Radhika Pamarthi

Senior Site Reliability Engineer
Hanamkonda

Summary

Senior Site Reliability Engineer with 7 years of experience managing large-scale cloud infrastructure, focused on Azure Kubernetes Service (AKS), platform automation, and incident remediation. Experienced in improving platform performance, reliability, and observability through proactive monitoring and incident management. Skilled at identifying recurring service patterns, building automation, and collaborating with cross-functional teams to improve cloud-based services. Committed to optimizing platforms for developers and businesses while maintaining high availability and scalability.

Overview

6 years of professional experience
6 years of post-secondary education
2 Certifications
2 Languages

Work History

Senior Site Reliability Engineer

Nagarro
Hanamkonda
10.2022 - Current
  • Manage the health and performance of a global-scale Kubernetes platform (AKS), ensuring platform components are continuously updated and aligned with dependencies across other tech stacks.
  • Proactively monitor service alerts, identifying recurring issues and implementing automation for metadata enrichment and auto-remediation, reducing manual intervention by 50%.
  • Lead cross-functional collaboration with development teams to define, design, and deliver platform enhancements aimed at improving system stability and scaling capabilities.
  • Build and maintain advanced monitoring dashboards using Prometheus and Grafana, enabling real-time issue tracking and incident detection for quicker resolutions.
  • Implement regular upgrades and patches, ensuring seamless integration and reducing downtime by establishing effective testing and rollback strategies.
  • Enhance security compliance across the platform, working with security teams to ensure updates and configurations adhere to organizational policies and industry standards.
  • Evaluate new technologies and tools to improve overall system performance, stability, and security.
  • Develop custom scripts and tools to automate routine tasks, increasing team productivity and efficiency.

Site Reliability Engineer

Rapid Circle
03.2022 - 09.2022
  • Ensured the continuous health and scalability of cloud-native services deployed on Kubernetes, with a strong focus on Azure infrastructure.
  • Developed automated solutions for common platform issues, reducing downtime by creating self-healing scripts that triggered auto-remediation actions for specific incidents.
  • Improved the observability of the platform by designing custom monitoring solutions and alerting systems to quickly detect and troubleshoot system failures.
  • Conducted post-incident reviews to identify root causes of failures, and implemented preventive measures to mitigate future disruptions.
  • Maintained a high level of operational excellence through the management of Azure Kubernetes clusters, ensuring consistent deployment of services, application monitoring, and performance optimization.

Cloud Consultant

Value Momentum
Hyderabad
10.2018 - 02.2022
  • Led the deployment and management of cloud infrastructure, utilizing Azure services to support Kubernetes-based workloads and cloud-native applications.
  • Created and managed infrastructure-as-code (IaC) deployments with Terraform and ARM templates, ensuring consistency across development, staging, and production environments.
  • Developed custom dashboards and alerts to monitor system health and service performance, helping detect bottlenecks and prevent outages.
  • Assisted in defining best practices for Kubernetes deployments, upgrades, and service maintenance across multiple environments.
  • Supported infrastructure scalability by automating capacity planning and resource provisioning, optimizing performance while reducing costs.
  • Administered cloud and on-premises infrastructure for critical systems, ensuring 99.99% uptime and resolving system outages within predefined SLAs.
  • Monitored and managed system configurations, security patches, and performance metrics for a range of Linux and Windows-based environments.
  • Collaborated with development teams to implement automated monitoring solutions and logging tools, improving incident detection and response time.
  • Implemented network security protocols, managed firewalls and VPNs, and ensured compliance with regulatory standards.
  • Established CI/CD pipelines using Azure DevOps to streamline development workflows, from code deployment to monitoring in production environments.
  • Automated infrastructure provisioning and management using Azure DevOps, Terraform, and PowerShell, reducing manual intervention and accelerating project timelines.
  • Utilized Azure Monitor, Log Analytics, and Application Insights for proactive monitoring, troubleshooting, and optimizing the performance of deployed cloud applications.

Education

Bachelor of Technology

JNTUK
09.2013 - 05.2017

Intermediate

SRI SARADA JR COLLEGE
06.2011 - 03.2013

SSC

ZPHS
06.2010 - 04.2011

Skills

Microsoft Azure

Kubernetes, Docker, Helm, AKS

Terraform, ARM Templates, Ansible

Prometheus, Grafana, Azure Monitor

Python, Bash, PowerShell, Ansible

Accomplishments

  • Led the design and implementation of an auto-remediation system using Python, Bash, and Ansible to address frequent service failures, significantly reducing manual intervention time and improving platform uptime.
  • Automated Kubernetes cluster upgrades and scaling processes using Helm and Terraform, ensuring that infrastructure remained highly available and optimized for load balancing.
  • Developed a custom Grafana dashboard that provided insights into Kubernetes cluster health, reducing incident response times and enabling proactive detection of issues before they affected users.

Certification

AZ104

AZ400

Timeline

AZ104

03-2023

AZ400

01-2023
