
Sukumar Thangaraj

Site Reliability Manager
Chennai, TN

Summary

  • I'm an experienced IT professional with over 14 years of expertise in AI capabilities development, DevOps, cloud platforms, application reliability, and team leadership.
  • I have successfully built and enhanced internally developed applications by integrating AI-driven functionalities and designing resilience and reliability tools.
  • My technical portfolio spans Azure and AWS, where I have managed, created, and deployed static and dynamic web applications, databases, and data‑warehousing solutions. I have configured CI/CD pipelines in Azure DevOps, used a range of DevOps tools for operations, and implemented monitoring through AppDynamics and Grafana.
  • I have handled core operational responsibilities including incident management, DR activities (with reduced RTO), and Business Impact Analysis (BIA), ensuring high availability and business continuity. Additionally, I have managed a 36‑member team, driving performance, coordination, and smooth delivery across engagements.
  • Overall, I bring a strong blend of technical depth, cloud expertise, operational excellence, and leadership, making me a valuable contributor to any technology‑driven organization.

Overview

14
years of professional experience
2
Certifications
3
Languages

Work History

Site Reliability Engineering Manager

EY
09.2024 - Current
  • Developing AI capabilities for internally built applications to enhance automation and intelligence within the systems.
  • Designed and implemented a Resilience and Reliability Tool for internal use, improving system stability and operational efficiency.
  • Executed Disaster Recovery (DR) activities for a key engagement, successfully reducing the Recovery Time Objective (RTO) to ensure faster service restoration.
  • Managed, created, and set up static and dynamic web applications, databases, and data-warehousing solutions in an AWS environment, ensuring secure architecture, scalability, and high operational reliability.
  • Conducted a Business Impact Analysis (BIA) for an application to identify critical functions, dependencies, and risk mitigation priorities.
  • Configured CI/CD pipelines in Azure DevOps to automate build, testing, and deployment processes, ensuring faster and more reliable application delivery.

Associate Technical Architect

Mphasis Technologies
04.2023 - 04.2024
  • Configured and utilized DevOps tools across all operational activities, enabling streamlined workflows, improved automation, and efficient end-to-end delivery processes.
  • Enhanced FedEx applications through SRE and DevOps implementation, leading teams and resolving problems.
  • Oversaw IT governance strategy, iteration planning, and execution.
  • Handled incident management activities, ensuring timely resolution and minimal impact for business users.
  • Configured AppDynamics across all applications to enable proactive performance monitoring and issue detection.
  • Developed application performance visualizations using Grafana, improving observability and operational insights for stakeholders.

Senior DevOps Engineer

Gain Credit
06.2021 - 11.2022
  • Oversaw SRE duties and release management, handling common issues, service requests, and ad hoc requests.
  • Improved monitoring for business users.
  • Cost optimization: represented the team in financial forecasting for existing and upcoming products, and decommissioned and detached the unused UAT server and its resources.
  • Handled release tasks for continuous deployment, including patch updates and manual database changes.
  • Managed, created, and set up static and dynamic web applications, databases, and data-warehousing solutions, ensuring robust architecture, scalability, and reliable operational performance.

Senior Engineer

MindView
11.2018 - 06.2021
  • Prioritized and troubleshot issues to provide production support and ensure continuous application availability.
  • Managed on-call operations for the Critical Incident Management (CIM) team.

Senior Analyst

Red Beans
09.2011 - 2018

Education

Bachelor of Science - Electrical, Electronics and Communications Engineering

Madha Engineering College
Chennai, India
04.2001 -

Additional Information

Finalist in the EY GDS Tech Olympiad, recognized for developing an AI‑driven chatbot interface capable of automatically diagnosing and resolving issues based on user prompts.

Optimized a large‑scale codebase in the AWS environment, improving performance and significantly reducing operational costs for business stakeholders.

Software

Cloud : AWS & Azure Services

AI: AI agents, RAG

Ticketing : Jira, ServiceNow

Monitoring : New Relic, ELK, Datadog, Splunk, AppDynamics, Grafana, Sysdig, SignalFx

DB : MySQL, Postgres, Vector DB

IaC : Terraform

CI/CD: Jenkins, Azure DevOps, GitLab

SCM : Git, Bitbucket

Artifact : Nexus, Docker Hub

Web : Apache, Nginx, Tomcat

Container: Docker

OS: Linux, Windows

Certification

SRE Foundation

Timeline

SRE Foundation

06-2026

AI Agents & RAG

08-2025

Site Reliability Engineering Manager

EY
09.2024 - Current

Associate Technical Architect

Mphasis Technologies
04.2023 - 04.2024

Senior DevOps Engineer

Gain Credit
06.2021 - 11.2022

Senior Engineer

MindView
11.2018 - 06.2021

Senior Analyst

Red Beans
09.2011 - 2018

Bachelor of Science - Electrical, Electronics and Communications Engineering

Madha Engineering College
04.2001 -