Site Reliability & DevOps Engineer with 8 years of experience automating and deploying large-scale cloud environments across AWS and Azure Kubernetes Service. Expertise in Terraform, CI/CD, Kubernetes, and observability, driving cloud cost reductions and ensuring high availability for microservices. Proficient in Ansible, Jenkins, ELK, and scripting to improve operational efficiency and security.
Overview
9 years of professional experience
2 Certifications
Work History
DevOps Engineer
Cognizant Technology Solutions
Hyderabad
03.2020 - Current
Migrated 30+ production servers from CentOS to Ubuntu with zero downtime through a phased rollout and rollback validation strategy, ensuring service continuity.
Managed and maintained Kubernetes workloads including Deployments, Services, ConfigMaps, Secrets, and Horizontal Pod Autoscalers for 20+ microservices in production.
Implemented Blue-Green and weighted routing deployment strategies in Kubernetes, ensuring zero-downtime releases and controlled traffic shifting.
Designed and built end-to-end CI/CD pipelines using Jenkins and GitHub Actions for Java-based microservices, covering build, unit testing, code quality analysis (SonarQube), Docker image creation, and EKS deployment using Helm.
Configured CloudWatch monitoring, SNS alerting, and custom dashboards for EC2, ALB, and application-level metrics to ensure proactive incident detection.
Automated SSL/TLS certificate renewals with Jenkins, S3, Ansible, and Auto Scaling Groups to eliminate manual intervention, enhancing security and reliability.
Provisioned and managed AWS infrastructure components including EC2, S3, CloudFront, IAM, and ALB using Terraform and CloudFormation following Infrastructure-as-Code best practices.
Executed structured decommissioning of 30+ AWS resources with dependency mapping, impact analysis, and rollback validation to ensure safe cleanup of legacy environments.
Developed AWS Lambda functions (Python) to optimize cloud costs by automatically identifying and deleting unused AMIs, snapshots, and orphaned resources; exposed APIs for controlled execution and tested them with Postman.
Site Reliability Engineer
Cognizant Technology Solutions
Chennai
04.2017 - 02.2020
Reduced incident resolution time by 25–30% while providing production support for over 20 microservices.
Achieved 99.95% uptime with proactive monitoring and automated healing scripts.
Developed custom health checks in Shell and Python to minimize false alerts and reduce MTTR.
Managed SSL/TLS certificates using AWS ACM and Azure Key Vault for secure communications.
Created CloudWatch and Kibana dashboards to monitor CPU, memory, error rates, and APIs.
Automated log cleanup on Linux servers, achieving a 60% reduction in disk usage.
Documented runbooks to expedite recovery processes for operations teams.
Facilitated on-call processes to monitor system performance and prevent disruptions.
Education
B.Tech - Computer Science and Engineering
Anurag University
Hyderabad
08.2016
Skills
Cloud platforms: AWS and Azure
Infrastructure as code: Terraform
Containerization and orchestration: Docker, Kubernetes
Continuous integration and delivery: Jenkins, GitHub Actions
Automation scripting: Shell scripting
Monitoring tools: Kibana, Grafana, Prometheus
Database management: MySQL, MongoDB, DynamoDB
Development tools: Git, JIRA, BMC Helix
Certification
AWS Cloud Practitioner
Accomplishments
Received appreciation for automating error-based alerts for our microservices, which significantly reduced the time needed to identify the failing service.