
Sathish Nelavoy

Dallas

Summary

Seasoned DevOps Engineer with over 10 years of experience driving operational excellence, service resilience, and scalable infrastructure across complex, mission-critical systems. Proven track record in leading SRE and technical support teams, managing business collaboration platforms, and delivering innovative solutions to enhance availability, reliability, and performance. Expert in SRE best practices, including SLO/SLI definition, observability (AppDynamics, Datadog, Splunk), incident management, and automation of operational toil. Adept at architecting and deploying modern, containerized applications using Docker, Kubernetes, and AWS ECS, and building robust CI/CD pipelines with GitLab, Jenkins, and Terraform. Strong leadership in cross-functional collaboration across development, networking, end-user support, and cloud engineering. Passionate about fostering a culture of technical excellence, building self-service tools, and aligning engineering outcomes with business objectives. Demonstrated ability to manage hybrid environments, modernize legacy systems, and lead global teams in high-availability production environments. As a Senior Site Reliability Engineer, reduced incident response times by 40% and optimized deployment processes through automation.

Overview

12 years of professional experience
1 Certification

Work History

Senior Site Reliability Engineer

RitePros/COPART
Dallas
08.2025 - Current
  • Led a cross-regional SRE team supporting business collaboration applications, reducing incident response time by 40% through streamlined on-call rotations and real-time alerting improvements.
  • Improved system Mean Time to Resolution (MTTR) by 30% by optimizing alert routing and developing comprehensive runbooks.
  • Implemented Infrastructure as Code (IaC) using Terraform/CloudFormation for all AWS resources across multiple accounts, reducing environment provisioning time from 4 hours to 15 minutes.
  • Automated the entire deployment process using AWS CodePipeline/CodeDeploy and Jenkins, increasing deployment frequency by 6x while maintaining zero-downtime releases.
  • Implemented SRE best practices across development teams, including error budgets, SLI/SLO definition, and post-incident review frameworks, leading to a 30% drop in incident recurrence.
  • Copart is a global leader in 100% online car auctions featuring used, wholesale and repairable vehicles.

DevOps Engineer

RAND MERCHANT BANK
Johannesburg
05.2021 - 07.2025
  • Architected and deployed containerized microservices using Docker and Kubernetes on AWS ECS, improving system scalability and reducing deployment-related downtime by 60%.
  • Deployed and managed containerized applications using Kubernetes, ensuring high availability, scalability, and fault tolerance across environments.
  • Standardized infrastructure deployment across multiple environments (Dev, QA, Prod) by migrating legacy manual processes to Terraform (HCL), resulting in a 75% reduction in environment setup time.
  • Implemented archival and rotation policies via cron jobs to manage Linux disk space, saving 200 GB per month.
  • Integrated Terraform into CI/CD pipelines to automate the init, plan, and apply workflow, enabling GitOps principles for infrastructure changes.
  • Managed and maintained L7 traffic flow for a microservices platform using Ingress Controllers, Service Mesh and API Gateways, ensuring 99.99% uptime and high availability.
  • Established a unified troubleshooting process using AppDynamics to pinpoint slow transactions and Splunk to analyse application and infrastructure logs, increasing the efficiency of the DevOps team by 20%.
  • Utilized terraform import and state manipulation techniques to onboard and manage existing production infrastructure under IaC control without downtime.
  • Reduced Mean Time to Recovery (MTTR) by 45% by building robust monitoring dashboards and setting actionable alerts using Datadog and Splunk.
  • Built CI/CD pipelines using Jenkins and GitLab CI, decreasing deployment time from hours to under 15 minutes and enabling safe, frequent releases across environments.
  • RMB is a leading African Corporate and Investment Bank and part of one of the largest financial services groups (by market capitalisation) in Africa – FirstRand Limited.

Site Reliability Engineer

ACCENTURE/HASBRO
Pawtucket
08.2016 - 11.2020
  • Configured and managed Quality Gates to enforce code quality standards, ensuring all new code met a minimum threshold before being deployed to production.
  • Utilized GitHub Actions to automate build, test, and deployment workflows, reducing manual intervention and improving code quality.
  • Managed release cycles for major, minor, and hotfix deployments, strictly adhering to versioning standards and maintaining comprehensive audit trails.
  • Architected and managed Maven pom.xml files for large-scale enterprise applications, ensuring consistent deployment across Dev, QA, and Production environments.
  • Integrated Ant build scripts into Continuous Integration (CI) platforms such as Jenkins, for automated nightly and on-demand builds.
  • Set up and managed the SonarQube Scanner across multiple project repositories to provide continuous static code analysis and immediate feedback to developers.
  • Implemented automated alerts and notifications via Slack and email for build failures and deployment status.
  • Configured Datadog alerts and anomaly detection to proactively identify system issues, reducing mean time to detection (MTTD) by 40%.
  • Led the migration of 20+ microservices from on-prem to AWS ECS, enhancing deployment speed by 60% and reducing infrastructure costs by 25%.
  • Hasbro is a global play and entertainment company committed to creating the world's best play experiences, from toys and games to television, movies, digital gaming and consumer products.

Build and Release Engineer

IBM
Bangalore
03.2014 - 05.2016
  • Worked directly with the client to automate deployment processes according to their requests.
  • Updated Maven scripts for SonarQube.
  • Developed and maintained complex Ansible playbooks to automate configuration management.
  • Improved system reliability by automating security patching with Ansible.
  • Created branches and tags for each release and merged the branches after the code was deployed to production.
  • Maintained the Subversion repository and managed access controls for all users.
  • Used Ant and shell scripts to automate the build process.
  • Provided 24/7 technical support for production and development environments.
  • Troubleshot and resolved issues in build and deployment pipelines.
  • Maintained documentation for build and release procedures.
  • The client organization is responsible for welfare, pensions and child maintenance policy. As the biggest public service department, it administers the State Pension and a range of working age, disability and ill health benefits to over 22 million claimants and customers.

Education

Master of Science - Electronic and Mobile Communications

Glamorgan University
Wales, United Kingdom
09.2012

Skills

  • AWS
  • EC2
  • ECS
  • S3
  • RDS
  • CloudWatch
  • ELB
  • CloudFront
  • DynamoDB
  • ElastiCache
  • Lambda
  • CodePipeline
  • CodeBuild
  • CodeDeploy
  • Route 53
  • Secrets Manager
  • VPC
  • API Gateway
  • IAM
  • Security Groups
  • Docker
  • Kubernetes
  • EKS
  • Rundeck
  • Ant
  • Maven
  • RTC
  • Datadog
  • Splunk
  • AppDynamics
  • Prometheus
  • Grafana
  • Python
  • Shell
  • Terraform
  • GitHub Actions
  • GitLab CI/CD
  • Jenkins
  • MySQL
  • MongoDB
  • Oracle
  • SVN
  • Git
  • GitHub
  • GitLab
  • Bitbucket
  • PagerDuty
  • Jira
  • ServiceNow
  • Control-M

Certification

  • AWS Certified Solutions Architect – Associate
  • Kubernetes and Cloud Native Associate (KCNA)
  • Claude Code in Action
