IT professional with 13 years of comprehensive experience in infrastructure management and operations, including 4 years specializing in Site Reliability Engineering (SRE). Proven expertise in designing, implementing, and maintaining highly available, scalable, and resilient systems. Skilled in automation, monitoring, incident management, and performance optimization to ensure seamless service delivery and improved system reliability.
Overview
13 years of professional experience
1 Certification
Work History
Senior Technical Solutions Engineer
Persistent Systems
Pune
01.2025 - Current
Containerized applications using Docker and orchestrated deployments with Kubernetes, improving scalability and resource utilization.
Implemented monitoring solutions with Prometheus and Grafana, enhancing system visibility and reducing downtime by 40%.
Managed CI/CD deployment pipelines using Jenkins and Docker, reducing deployment time by 50%.
Integrated Git for version control in CI/CD workflows, enabling seamless collaboration and code review processes, which improved code quality and team productivity.
Managed scalable Kubernetes clusters on AWS EKS, ensuring high availability and performance for production applications.
Managed source code repositories in Git, handling branching, commits, and merges to ensure smooth development workflows.
Cloud Engineer
Cisco
Pune
08.2022 - 11.2024
Served as on-call SRE for production support.
Used Jenkins for code and application deployments on Linux-based VMs.
Resolved API-related issues and ensured smooth deployments in production environments.
Used Grafana, Splunk, and AppDynamics dashboards to troubleshoot API-related issues.
Automated and streamlined deployment tasks using shell scripts and Ansible playbooks.
Managed CI/CD pipelines using Jenkins.
Deployed, managed, and troubleshot Amazon Elastic Kubernetes Service (EKS) clusters on AWS.
Created Grafana dashboards and integrated them with Prometheus.
Set up alerts in Prometheus for Kubernetes clusters.
Created dashboards in Splunk for production-related activities.
Analyzed transaction snapshots, diagnostic sessions, and performance metrics in AppDynamics to isolate root causes of slow response times and application errors.
Technical Support Lead
Persistent Systems
Pune
04.2019 - 07.2022
Managed over 20,000 Linux servers across multiple data centers, ensuring high availability and performance.
Provided 24x7 on-call Site Reliability Engineer support for critical production issues, ensuring the application's high availability and reliability.
Migrated Linux servers from on-premises to the cloud.
Managed user permissions, file systems, and security policies in accordance with organizational standards.
Supported the migration of on-premises infrastructure to OpenStack, improving scalability and reducing costs.
Created volumes and managed file systems using Logical Volume Manager (LVM), including extending and reducing file systems.
Led bridge calls with global IT teams and business partners to resolve critical business incidents and outages.
Collaborated with development teams to automate CI/CD pipelines, integrating Linux-based solutions with Jenkins and Git.
Drove knowledge management across the supported applications and ensured full compliance.
Used Splunk and AppDynamics for log analysis and to troubleshoot API issues.
Managed Kubernetes clusters using Rancher.
Continuously improved the reliability, scalability, and performance of systems and applications through root cause analysis, code and architecture review, and proactive monitoring.
Participated in and responded to critical incidents promptly and efficiently, performing troubleshooting and incident management as needed.
Used Grafana, Splunk and AppDynamics to monitor metrics in the production environment.
Monitored systems and applications for performance, availability, and security, and responded to issues quickly and efficiently.
Collaborated with development and product teams to ensure that applications and systems were designed and implemented with reliability, scalability, and performance in mind.
Managed Prometheus-based alerting systems for Linux, Kubernetes, and application-level issues in production environments.
Implemented configuration management using Ansible for efficient system administration.
Supported the resolution of incidents and problems within the team and assisted with the resolution of complex incidents.
Created and managed GitHub repositories, overseeing branching, code merges, tagging, and user permissions.
Technical Solutions Engineer
Mojo Networks
Pune
07.2017 - 01.2019
Monitored and provisioned wireless devices, including sensors and access points, on the Mojo Cloud Server.
Actively monitored cloud servers using Nagios for performance and uptime.
Generated reports from the cloud server to track active and inactive device counts.
Managed cloud infrastructure upgrades and maintenance services.
Troubleshot Linux server issues and performed Linux administration tasks.
Provided Level 1 customer support via calls and emails, resolving technical issues promptly.
Collaborated with the product management team to improve products based on customer feedback, addressing issues and implementing new feature requests.
Provided application support for projects such as Comcast Wi-Fi Pro business and WatchGuard Technologies, ensuring that all incidents were managed robustly and effectively and that any business impact was identified and minimized.
Application Support Analyst
ADP (Automatic Data Processing)
Pune
04.2014 - 01.2017
Provided Level 1 customer support via calls and emails, resolving technical issues promptly.
Held weekly calls with engineering/L3 teams to discuss pending issues.
Monitored application performance using Splunk and in-depth log analysis.
Deployed applications on enterprise servers.
Troubleshot web and application issues during production incidents and reported bugs to the L3 team.
Sr. Associate
WNS
Pune
02.2012 - 04.2014
Answered employees' inquiries in person, by email, and by telephone.
Installed, configured, and maintained Windows OS.
Troubleshot WLAN and LAN issues.
Diagnosed and resolved PC problems and software issues.
Key Account & Marketing Manager at Kuvos Tech (A DYNAMIK LABS GROUP OF COMPANY)