SUMIT KUMAR

Bangalore

Summary

Experienced and results-driven Site Reliability Engineering Manager with a proven track record of building and managing highly scalable, secure, and resilient infrastructure. Led cross-functional SRE teams, automated operations, and improved deployment efficiency by 30% through CI/CD modernization. Skilled in AWS, Kubernetes, Terraform, incident management, and system observability. Known for driving reliability through automation, reducing MTTR, and implementing security best practices across cloud platforms.

Overview

10 years of professional experience

Work History

SRE Manager

Khoros
05.2023 - Current
  • Manage and mentor a high-performing SRE team of 10 people, driving a culture of ownership, collaboration, and continuous improvement.
  • Define and uphold SLOs/SLIs to ensure high availability and performance for critical services.
  • Oversee cloud infrastructure (AWS, EKS) with a focus on scalability, reliability, and automation via Terraform and ArgoCD.
  • Streamline incident management processes, resulting in a significant reduction in MTTR.
  • Drive observability practices using Datadog and Sumo Logic, optimizing alerting and monitoring strategies.
  • Enforce infrastructure and application security using mTLS, RBAC, and WAF rules.
  • Collaborate cross-functionally with Dev, Product, and Support teams for production readiness and smooth deployments.
  • Lead monthly reviews of cloud spend and implement DynamoDB cost optimization strategies.

Lead Site Reliability Engineer

Khoros India Pvt Limited
05.2022 - 05.2023
  • Implemented Web Application Firewall (WAF) protection for more than 1,000 load balancers, CDNs, and API gateways across AWS and GCP
  • Added security features such as AWS Shield Advanced for infrastructure hosted in AWS
  • Automated deployment of security features using GitHub
  • Developed and maintained a mission-critical application designed to mitigate and manage outages effectively
  • Migrated legacy build pipelines to the latest Jenkins-based CI/CD
  • Wrote end-to-end automation scenarios for many modules
  • Stabilized the smoke and functional test suites while working as an SDET
  • Covered API endpoints using Selenium and TestNG for different modules in the product

Senior SRE

Khoros India Pvt Limited
03.2021 - 05.2022
  • Led a team of SREs responsible for the design, deployment, and maintenance of critical infrastructure components
  • Conducted incident reviews, root cause analysis and implemented preventive measures to enhance system resilience
  • Provided technical guidance on the design, implementation, and maintenance of cloud infrastructure
  • Implemented automation tools to increase efficiency in deployment processes
  • Monitored system performance using metrics such as latency, throughput, and availability
  • Created automated scripts for software deployments and configuration management tasks
  • Maintained security policies for the organization's cloud services according to industry standards
  • Documented best practices and procedures for incident response activities

SRE-3

Khoros India Pvt Limited
10.2019 - 03.2021
  • Developed and implemented monitoring solutions to improve system reliability
  • Researched and evaluated new technologies to enhance platform reliability and stability
  • Optimized existing infrastructure components for cost savings while meeting compliance requirements
  • Performed capacity planning activities based on current usage trends and future projections
  • Conducted monthly progress meetings to inform senior leadership and stakeholders of project advancements

SRE-2

Khoros
10.2018 - 10.2019

Focused on improving system reliability, automating operations, and scaling cloud infrastructure using AWS, Kubernetes, and Terraform.

SDET II

Lithium Technologies Pvt Ltd
07.2015 - 01.2018
  • Automated web UI application testing with Selenium WebDriver and REST API testing with the TestNG framework
  • Technology: Java
  • Tools used: Selenium WebDriver with JUnit for UI testing; Java and Retrofit for REST API testing

Education

Bachelor of Engineering - Information Science

SJCE
Mysore
01.2015

AISSCE (12th) - PCM

MPS
Forbesganj
01.2010

AISSE (10th) - Science

Forbesganj
01.2008

Skills

  • Infrastructure as Code (IaC): CloudFormation, Terraform
  • Cloud Platforms: AWS, GCP
  • Security: Firewall, Shield Advanced, Rate Limiting, DDoS Mitigation
  • Containerization: Docker, Kubernetes
  • Monitoring and Logging: Datadog, Sumo Logic, Nagios
  • Scripting and Programming: Java, Shell, Apache Velocity, AWS CLI
  • CI/CD: Jenkins, GitHub Actions
  • Version Control: Git, Bitbucket, SVN
  • Networking: DNS, Load Balancing
  • Incident Management: PagerDuty
  • Collaboration: Jira, Confluence
  • Workforce management
