Sourav Jha

Bangalore, KA

Work Preference

Work Type

Full Time

Location Preference

On-Site, Remote, Hybrid

Important To Me

Career advancement, Work-life balance, Company culture, Flexible work hours, Personal development programs, Healthcare benefits, Work from home option, Paid time off, Paid sick leave, Team building / Company retreats

Summary

Principal Cloud Platform Engineer and Site Reliability Engineer with 9.7+ years of experience designing, operating, and scaling large-scale, high-availability production systems across Microsoft Azure, Google Cloud Platform, and AWS. Proven expertise in SRE principles (SLO, SLI, SLA), Kubernetes-based platforms, Infrastructure as Code, CI/CD automation, and deep Linux and networking fundamentals.

Strong background in cloud platform engineering, DevOps automation, and reliability ownership, including incident management, root cause analysis, MTTR reduction, and observability. Experienced in building secure, resilient, and compliant cloud architectures using policy-as-code, defense-in-depth security models, and governance frameworks. Demonstrated ability to lead DevOps and SRE initiatives, mentor engineers, and collaborate with cross-functional stakeholders to deliver reliability-first, scalable, and cost-optimized platforms.

Overview

10 years of professional experience
1 Certificate

Work History

System Engineer III

Tesco
04.2025 - Current
  • Serve as Principal Cloud Platform Engineer for the Cloud Platform team, responsible for the architecture, reliability, security, and operations of enterprise-scale Azure cloud platforms supporting global workloads.
  • Act as both Site Reliability Engineer and Cloud Developer, enabling compute, networking, security, Kubernetes, and AI-enabled cloud services in production environments.
  • Design, build, and operate Azure-native infrastructure including VMs, VNets, AKS, Load Balancers, Private Endpoints, Virtual WAN (vWAN), and DNS, ensuring high availability, scalability, and resilience.
  • Own reliability and availability of business-critical, high-traffic cloud platforms, proactively addressing risks before customer impact.
  • Define and implement SLOs, SLIs, and alerting strategies to monitor latency, error rates, saturation, and availability across distributed systems.
  • Drive SRE best practices, including incident response, on-call operations, root cause analysis (RCA), post-incident reviews, and long-term reliability improvements.
  • Implement defense-in-depth security architectures using Azure Policy, Microsoft Defender for Cloud, RBAC, and industry-aligned security benchmarks to reduce vulnerabilities and enforce secure-by-default deployments.
  • Enforce Policy-as-Code and cloud governance frameworks, ensuring compliance, standardization, and audit readiness across environments.
  • Manage and support production Kubernetes platforms (AKS), including cluster hardening, secure workload identity, autoscaling, and controlled application rollouts.
  • Automate infrastructure provisioning and configuration using Terraform and Python, reducing manual operations, configuration drift, and operational risk.
  • Design and optimize secure cloud networking architectures (private connectivity, load balancing, DNS, segmentation) to support compliant and scalable workloads.
  • Implement end-to-end observability (metrics, logs, alerts) to improve visibility, reduce MTTR, and enhance operational decision-making.
  • Collaborate with application teams, security, and platform stakeholders to ensure reliability-first system design and secure application delivery.
  • Mentor engineers and contribute to runbooks, SOPs, architecture documentation, and operational best practices.

Senior DevOps Engineer

Luxoft
06.2022 - 04.2025
  • Led DevOps and Site Reliability Engineering initiatives for application teams, owning end-to-end cloud infrastructure, CI/CD processes, and production automation for enterprise applications.
  • Acted as SDE III for Tesco, a large-scale retail client, supporting engineering teams with reliable, secure, and scalable cloud platforms.
  • Designed and built CI/CD pipelines with automated testing, security scans, and infrastructure provisioning, improving deployment reliability and release confidence.
  • Established monitoring and alerting frameworks to ensure SLA compliance, proactively detecting issues and reducing incident recurrence.
  • Drove cloud and governance initiatives, aligning infrastructure and delivery pipelines with automation, compliance, and security standards.
  • Guided teams through cloud migration and modernization efforts, ensuring minimal production disruption and improved platform resilience.
  • Automated infrastructure provisioning and configuration using Infrastructure as Code principles, reducing manual effort and operational risk.
  • Partnered closely with Product Managers, Developers, Architects, and Security teams to ensure smooth delivery and alignment with business requirements.
  • Contributed to architecture decisions, cost-optimization strategies, and reliability improvements, balancing performance, security, and efficiency.
  • Provided hands-on production support and incident resolution, applying SRE principles and root cause analysis to prevent repeat issues.
  • Worked on-site for approximately two years in Warsaw, Poland, with the Central Europe team, enabling close collaboration with client stakeholders and faster decision-making.

Senior Cloud Analyst

Accenture
01.2021 - 06.2022
  • Worked as a Senior Cloud Infrastructure Analyst supporting large-scale enterprise production systems for H&M, an e-commerce client, across public cloud environments.
  • Engineered and optimized CI/CD pipelines using Azure DevOps and Ansible, improving deployment speed and reducing release failures.
  • Supported Kubernetes-based platforms across cloud environments, ensuring high availability, scalability, and performance for containerized workloads.
  • Established SRE practices including proactive monitoring, alerting, and incident response, ensuring high availability and SLA adherence.
  • Designed and optimized cloud networking components to improve application availability and performance.
  • Led cloud migration and modernization initiatives, ensuring minimal disruption to business-critical workloads.
  • Responded to production incidents, performing root cause analysis (RCA) and implementing preventive solutions to reduce recurrence.
  • Authored technical documentation, runbooks, and operational procedures, improving support efficiency and knowledge sharing.
  • Collaborated with stakeholders to align infrastructure solutions with business, security, and scalability requirements.

Senior Software Engineer

Think Future Technologies
10.2020 - 12.2020
  • Designed and implemented end-to-end CI/CD pipelines using Azure DevOps and Git, enabling automated build, test, and deployment workflows for microservices-based applications.
  • Containerized applications using Docker, standardizing runtime environments and reducing deployment inconsistencies across environments.
  • Deployed and supported Kubernetes-based workloads, managing manifests, services, ingress configurations, and rollout strategies.
  • Automated infrastructure provisioning and configuration using Terraform and shell scripting, improving repeatability and reducing manual setup errors.
  • Integrated artifact repositories into CI/CD pipelines to manage versioned application builds and dependencies.
  • Implemented deployment strategies to minimize downtime during releases.
  • Supported production and non-production environments, troubleshooting deployment failures, container crashes, and pipeline issues.
  • Implemented basic monitoring and logging to track application health, resource utilization, and deployment success.
  • Collaborated closely with developers and QA teams to streamline release cycles and resolve environment-related blockers.

Technical Specialist

IBM
07.2016 - 10.2020
  • Supported Linux-based production systems, performing system administration, monitoring, and troubleshooting activities.
  • Assisted in server provisioning, configuration, patching, and performance tuning across environments.
  • Wrote basic shell scripts to automate repetitive operational tasks and improve efficiency.
  • Supported infrastructure components such as web servers, databases, and middleware services, ensuring uptime and stability.
  • Participated in incident response and root cause investigations, escalating issues and implementing corrective actions.
  • Built strong foundational knowledge in operating systems, networking concepts, and system reliability, forming the base for advanced DevOps and SRE roles.

Education

B.Tech - Computer Science & Engineering

Galgotias University
Greater Noida, India
06-2016

Skills

  • Cloud: Microsoft Azure, Google Cloud Platform (GCP), AWS, VMware
  • Architecture: Multi-Cloud, Hybrid Cloud, High Availability, Scalability, Disaster Recovery
  • SRE: SLO, SLI, SLA, Error Budgets, Incident Management, On-Call, RCA, Post-Mortems, MTTR, Capacity Planning, High-Traffic Systems
  • DevOps & CI/CD: CI/CD Pipeline Design, Jenkins, GitHub Actions, Azure DevOps, Cloud Build, GitOps, Release Engineering
  • Containers: Kubernetes (AKS, GKE, EKS), Docker, Helm, Kustomize, Microservices, Ingress, Autoscaling, Rolling Deployments, Workload Identity
  • IaC & Automation: Terraform, Ansible, ARM Templates, Policy-as-Code, Governance Automation, Python, Bash, Shell
  • Observability: Prometheus, Grafana, Azure Monitor, Cloud Monitoring, Splunk, New Relic, Metrics, Logs, Alerts, Latency Monitoring
  • Networking & Security: Linux, TCP/IP, DNS, HTTP(S), Load Balancing, VPC/VNet, Hub-Spoke, Private Connectivity, IAM, RBAC, Secrets, Zero Trust, ISO, SOC, GDPR
  • Leadership: Technical Leadership, Mentoring, Stakeholder Communication, Architecture Reviews, Documentation, Agile, Scrum, DevOps Culture

Certification

  • AWS Certified Solutions Architect
  • Microsoft Azure DevOps Solutions Expert
  • Microsoft Azure Administrator
  • AWS Cloud Practitioner
  • ITIL® 4 Foundation

Languages

English
Bilingual or Proficient (C2)
Hindi
Bilingual or Proficient (C2)

Accomplishments

  • Captain – Computer Science Cricket Team: Led the team in inter-department tournaments, demonstrating leadership and teamwork.
  • Hospitality & Management Coordinator – BookMyShow (IPL 2014 & 2015): Coordinated large-scale event operations, ensuring smooth execution and stakeholder alignment.
  • Motorsport Marshal – BookMyShow: Supported high-profile events at Buddh International Circuit (Indian GP, JK Tyre Championship), gaining experience in high-pressure environments.
  • Image Encryption & Decryption (Blowfish Algorithm): Applied cryptography principles to secure digital data.
  • Criminal Investigation System: Contributed to a cloud-based project management solution tailored for investigation workflows, showcasing SaaS solution design.

Work Availability

Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
Morning, Afternoon, Evening

Affiliations

  • Strategic problem-solving and analytical mindset
  • Effective communicator, mentor, and collaborative leader
  • Results-driven, detail-oriented, and customer-focused
  • Proactive, adaptable, and innovative in leveraging emerging technologies
  • Strong stakeholder management with focus on security, compliance, and operational excellence

Quote

Even if you are on the right track, you’ll get run over if you just sit there.
Will Rogers

Software

Microsoft 365

ChatGPT

DrawIO

Rancher

Interests

Cricket

Travel

Technical Blogs

Mentoring and Coaching
