Growth-oriented individual targeting Linux Administrator and Site Reliability Engineer roles in Bangalore, Hyderabad, Chennai, and Pune.
Proven ability to work collaboratively with Development Teams, Operations, and Stakeholders to achieve common goals and resolve complex issues.
Resourceful Site Reliability Engineer known for high productivity and efficient task completion. Skilled in automation, continuous integration and delivery (CI/CD), and cloud identity and access management. Excels at problem-solving, collaboration, and adaptability, ensuring seamless operations and system reliability.
Overview
7 years of professional experience
Work History
Linux L1 Admin and Site Reliability Engineer
VMware
Bangalore
08.2017 - 04.2024
Integral member of the VMware Command Center team.
Spearheaded development and enhancement of service monitoring tools through advanced integrations, automation, and collaborative efforts.
Managed Network Services DNS/DHCP alerts, Infoblox DNS requests, and AVI load balancer (LB) alerts.
Designed load balancer VIPs and coordinated with L2 teams for architectural adaptations to meet application requirements.
Demonstrated proficiency in analyzing VIP and Pool statuses on F5 and AVI platforms; responsible for monitoring and investigating alert causes on both platforms.
Acted as the technical focal point for VMware's SaaS service success, providing critical expertise to the Command Center.
Elevated team technical skills, developed automation tools for VMware services, and facilitated automated problem resolution using Python.
Monitored network traffic, systems, and devices; maintained data networking systems using tools like BigPanda, Wavefront, and Grafana.
Experienced in establishing and managing Service Level Agreements (SLAs), defining Service Level Objectives (SLOs), and implementing Service Level Indicators (SLIs) for performance optimization.
Contributed to the creation of work instructions and knowledge base articles.
Utilized expertise to guide discussions and proactively identify and address issues or trends.
Addressed and resolved Linux L1 alerts.
Conducted training sessions for new team members.
Provided 24/7 remote support for BIND DNS, ISC DHCP servers, and Infoblox DDI administration.
Delivered 24/7 on-call support for incident monitoring and assistance for both internal and external customers.
Implemented automation tools to increase efficiency in deployment processes.
Monitored system performance using metrics such as latency, throughput, and availability.
Built CI/CD pipelines using tools such as Jenkins and GitLab CI.
Good understanding of resolving Kubernetes L1 alerts.