
Site Reliability Engineer with 3+ years of experience optimizing distributed systems and automating infrastructure. Expert in implementing observability frameworks with Prometheus and Grafana. Proficient in Python-driven automation, incident response, and high-availability architecture design. Proven track record of maintaining 99.9% uptime and scaling containerized environments with Docker and Kubernetes on cloud-native infrastructure.