DevOps & Site Reliability Engineer with 3 years of experience building and automating cloud-native infrastructure across AWS & GCP. Skilled in Kubernetes, CI/CD, Infrastructure as Code, and observability tools (Prometheus, Grafana, ELK, APM), with proven expertise in cost optimization, monitoring, and system reliability.
Disaster Recovery & Resilience – Replicated AWS infrastructure (EC2, ECS, EKS, RDS, Route 53, SNS, VPC, VPC Peering, CodeDeploy, CloudWatch, Lambda, ElastiCache, Auto Scaling Groups, Load Balancers, ECR) to the Hyderabad region using Terraform, ensuring business continuity and high availability.
Compliance & Security – Conducted infrastructure audits aligned with ISO 27000 & DPDP standards, enabling certification readiness and strengthening cloud security practices.
FinOps & Monitoring Optimization – Replaced New Relic with SigNoz on EKS, reducing monitoring costs by 55% while enhancing application performance visibility.
One-Click Environment Provisioning – Built one-click provisioning (server setup, RDS DB creation, Parameter Store, Nginx, DNS, CDN, deployment, and testing) using Terraform and Python Boto3 to streamline the release process.
Cloud Automation – Developed AWS Lambda + Python workflows for incremental RDS backups, scheduled start/stop of EC2 & RDS instances, daily AMI and snapshot creation, and automated cleanup to optimize cost and ensure disaster recovery readiness.
DevOps Engineer
ShopDeck (E-Commerce Company)
06.2024 - 02.2025
CI/CD Optimization – Migrated from GCP Cloud Build to Jenkins, lowering CI/CD costs and improving build pipeline efficiency. Automated deployments with ArgoCD for Kubernetes.
Infra as Code – Automated provisioning of Nginx VMs and domain onboarding for 1,500+ sellers using Terraform + Ansible, integrated with Certbot for SSL certificates.
Observability & Logging – Implemented ELK stack for Kubernetes pod logging with Logstash Grok filters and Index Lifecycle Management (ILM) to optimize log storage.
Data Engineering Automation – Automated Pub/Sub → BigQuery integration with Terraform for scalable analytics data flow.
Unified Monitoring & Alerts – Configured Prometheus + Grafana dashboards and migrated alerts from GCP to Grafana OnCall to improve incident response.
Database Monitoring – Deployed Percona Monitoring and Management (PMM) to detect slow queries, enhancing database performance visibility.
Site Reliability Engineer
Dukaan (E-Commerce Company)
02.2023 - 04.2024
Kubernetes & GitOps – Implemented GitOps with ArgoCD for multi-service deployments, improving release reliability and reducing manual overhead.
Cloud Migration – Migrated data from GCS to Wasabi using Rclone with strict bucket policies, optimizing storage usage and security.
Monitoring & Observability – Set up Prometheus + Grafana with custom dashboards and deployed SigNoz and Sentry to enhance application performance monitoring and reduce incident resolution time.
Redis Performance Optimization – Configured and optimized Redis key eviction policies, ensuring efficient memory usage and stable system performance.