

Principal Cloud Platform Engineer and Site Reliability Engineer with over 9 years of experience designing, operating, and scaling large, high-availability production systems across Microsoft Azure, Google Cloud Platform, and AWS. Proven expertise in SRE practices (SLOs, SLIs, SLAs), Kubernetes-based platforms, Infrastructure as Code, CI/CD automation, and Linux and networking fundamentals.
Strong background in cloud platform engineering, DevOps automation, and reliability ownership, including incident management, root cause analysis, MTTR reduction, and observability. Experienced in building secure, resilient, and compliant cloud architectures using policy-as-code, defense-in-depth security models, and governance frameworks. Demonstrated ability to lead DevOps and SRE initiatives, mentor engineers, and collaborate with cross-functional stakeholders to deliver reliability-first, scalable, and cost-optimized platforms.
Cloud: Microsoft Azure, Google Cloud Platform (GCP), AWS, VMware
Architecture: Multi-Cloud, Hybrid Cloud, High Availability, Scalability, Disaster Recovery
SRE: SLO, SLI, SLA, Error Budgets, Incident Management, On-Call, RCA, Post-Mortems, MTTR, Capacity Planning, High-Traffic Systems
DevOps & CI/CD: CI/CD Pipeline Design, Jenkins, GitHub Actions, Azure DevOps, Cloud Build, GitOps, Release Engineering
Containers: Kubernetes (AKS, GKE, EKS), Docker, Helm, Kustomize, Microservices, Ingress, Autoscaling, Rolling Deployments, Workload Identity
IaC & Automation: Terraform, Ansible, ARM Templates, Policy-as-Code, Governance Automation, Python, Bash/Shell Scripting
Observability: Prometheus, Grafana, Azure Monitor, Cloud Monitoring, Splunk, New Relic, Metrics, Logs, Alerts, Latency Monitoring
Networking & Security: Linux, TCP/IP, DNS, HTTP(S), Load Balancing, VPC/VNet, Hub-Spoke, Private Connectivity, IAM, RBAC, Secrets, Zero Trust, ISO, SOC, GDPR
Leadership: Technical Leadership, Mentoring, Stakeholder Communication, Architecture Reviews, Documentation, Agile, Scrum, DevOps Culture