Site Reliability Engineer / DevOps Engineer with 6+ years of experience building, operating, and scaling reliable systems on AWS and Kubernetes. Expert in observability (Datadog, Prometheus, Grafana), incident response, performance tuning, and Infrastructure as Code with Terraform. Delivered multiple zero-downtime migrations (Redis.io, CloudAMQP) and engineered disaster recovery (DR) for EKS to uphold strict SLAs and improve MTTA/MTTR.
Cloud: AWS (EC2, EBS, EFS, ALB/ELB, S3, Route 53, VPC, CloudWatch, API Gateway, ElastiCache), CloudAMQP