Shiva Kumar Paila

Hyderabad

Summary

Site Reliability and DevOps Engineer with 8 years of experience automating and deploying large-scale cloud environments on AWS and Azure Kubernetes Service. Expertise in Terraform, CI/CD, Kubernetes, and observability, with a record of reducing cloud costs and maintaining high availability for microservices. Proficient in Ansible, Jenkins, ELK, and scripting to improve operational efficiency and security.

Overview

9 years of professional experience
2 Certifications

Work History

DevOps Engineer

Cognizant Technology Solutions
Hyderabad
03.2020 - Current
  • Migrated 30+ production servers from CentOS to Ubuntu with zero downtime through phased rollout and rollback validation strategy, ensuring service continuity.
  • Managed and maintained Kubernetes workloads including Deployments, Services, ConfigMaps, Secrets, and Horizontal Pod Autoscalers for 20+ microservices in production.
  • Implemented Blue-Green and weighted routing deployment strategies in Kubernetes, ensuring zero-downtime releases and controlled traffic shifting.
  • Designed and built end-to-end CI/CD pipelines using Jenkins and GitHub Actions for Java-based microservices, covering build, unit testing, code quality analysis (SonarQube), Docker image creation, and EKS deployment using Helm.
  • Configured CloudWatch monitoring, SNS alerting, and custom dashboards for EC2, ALB, and application-level metrics to ensure proactive incident detection.
  • Automated SSL/TLS certificate renewals with Jenkins, S3, Ansible, and Auto Scaling Groups, eliminating manual intervention and improving security and reliability.
  • Provisioned and managed AWS infrastructure components including EC2, S3, CloudFront, IAM, and ALB using Terraform and CloudFormation following Infrastructure-as-Code best practices.
  • Executed structured decommissioning of 30+ AWS resources with dependency mapping, impact analysis, and rollback validation to ensure safe cleanup of legacy environments.
  • Developed AWS Lambda functions (Python) to optimize cloud costs by automatically identifying and deleting unused AMIs, snapshots, and orphaned resources; provided APIs for controlled execution and integrated with Postman for testing.

Site Reliability Engineer

Cognizant Technology Solutions
Chennai
04.2017 - 02.2020
  • Provided production support for over 20 microservices, reducing incident resolution time by 25–30%.
  • Achieved 99.95% uptime with proactive monitoring and automated healing scripts.
  • Developed custom health checks in Shell and Python to minimize false alerts and reduce MTTR.
  • Managed SSL/TLS certificates using AWS ACM and Azure Key Vault for secure communications.
  • Created CloudWatch and Kibana dashboards to monitor CPU, memory, error rates, and APIs.
  • Automated log cleanup on Linux servers, achieving a 60% reduction in disk usage.
  • Documented runbooks to expedite recovery processes for operations teams.
  • Facilitated on-call processes to monitor system performance and prevent disruptions.

Education

B.Tech - Computer Science and Engineering

Anurag University
Hyderabad
08-2016

Skills

  • Cloud platforms: AWS and Azure
  • Infrastructure as code: Terraform
  • Containerization and orchestration: Docker, Kubernetes
  • Continuous integration and delivery: Jenkins, GitHub Actions
  • Automation scripting: Shell scripting
  • Monitoring tools: Kibana, Grafana, Prometheus
  • Database management: MySQL, MongoDB, DynamoDB
  • Development tools: Git, JIRA, BMC Helix

Certification

AWS Cloud Practitioner

Accomplishments

Received appreciation for automating error-based alerts for our microservices, which significantly reduced the time needed to identify the failing service.

Timeline

DevOps Engineer

Cognizant Technology Solutions
03.2020 - Current

Site Reliability Engineer

Cognizant Technology Solutions
04.2017 - 02.2020

B.Tech - Computer Science and Engineering

Anurag University