Senior Site Reliability Engineer with 7 years of experience managing large-scale cloud infrastructure, focusing on Azure Kubernetes Service (AKS), platform automation, and incident remediation. Experienced in improving platform performance, reliability, and observability through proactive monitoring and incident management. Strong problem solver who identifies service patterns, builds automation solutions, and collaborates with cross-functional teams to enhance cloud-based services. Passionate about optimizing platforms for developers and businesses while maintaining high availability and scalability.
Microsoft Azure
Kubernetes, Docker, Helm, AKS
Terraform, ARM Templates, Ansible
Prometheus, Grafana, Azure Monitor
Python, Bash, PowerShell
AZ-104
AZ-400