Kuldeep Surendra

Bengaluru

Summary

Accomplished Senior Site Reliability Engineer with 8+ years of expertise in architecting, migrating, and optimizing large-scale data platforms on Kubernetes (AWS EKS). Proven leader in spearheading organizational Kubernetes adoption, automating application onboarding, and migrating critical services (e.g., Airflow, Spark, JupyterHub) from SaaS to cost-effective self-managed infrastructure. Expert in engineering zero-downtime deployment strategies, building robust CI/CD pipelines (ArgoCD, Helm, Jenkins), and implementing comprehensive observability solutions. Adept at developing custom automation tools for resource optimization and establishing governance frameworks for streaming platforms such as Kafka. Experienced in scaling DevOps teams and mentoring engineers.

Overview

9 years of professional experience

Work History

SENIOR SITE RELIABILITY ENGINEER

CRED
01.2023 - Current
  • Spearheaded the adoption and implementation of Kubernetes (AWS EKS) from the ground up for the SRE-Data team, establishing a robust, self-managed infrastructure to support Data Engineering, Science, and Analytics platforms.
  • Led the strategic migration of critical data workloads from SaaS providers to a self-managed EKS environment, moving Apache Airflow off Astro and moving Databricks compute to JupyterHub and Spark on EKS, which increased control and reduced vendor dependency and licensing costs.
  • Architected and built the complete Kubernetes ecosystem, including Helm chart repositories (ChartMuseum), GitOps workflows with ArgoCD, and standardized deployment patterns for data-team services running on Kubernetes.
  • Engineered and deployed a self-managed, scalable metrics platform using VictoriaMetrics on EKS, successfully migrating from SignalFx to provide a cost-effective and highly available observability solution.
  • Enhanced Kafka platform maturity and optimized streaming data infrastructure by developing a Topic & Schema Catalog and Kafka Lens UI, establishing centralized governance, real-time cost attribution, and comprehensive visibility to ensure high-throughput, reliable data pipelines for real-time analytics.

SENIOR DEVOPS ENGINEER

Zeta
06.2020 - 12.2022
  • Designed and automated the onboarding of 100+ banking applications to Kubernetes (EKS), reducing the process from 5 days to under 2 hours using Python, Helm, and Jenkins.
  • Engineered a custom canary deployment process using Python, ArgoCD, and Jenkins, ensuring zero-downtime releases for services handling over 1,000 requests per minute.
  • Developed an internal resource optimization tool using Python and Prometheus that analyzed application resource usage on Kubernetes to correct over-provisioning, leading to significant cost savings.
  • Enhanced cluster observability by creating Grafana dashboards-as-code and Prometheus alerting rules, bundled as dependencies within application Helm charts.
  • Helped scale the DevOps team from 3 to 8 members by leading interviews, and mentored junior engineers in project planning and execution.

DEVOPS ENGINEER

Box8, Poncho Hospitality Pvt. Ltd
12.2018 - 06.2020
  • Architected and provisioned a fault-tolerant production infrastructure on Kubernetes (AWS EKS).
  • Designed and implemented CI/CD pipelines for Ruby on Rails and Angular applications using Jenkins and Capistrano.
  • Executed a zero-downtime migration of a mission-critical PostgreSQL database from the Singapore AWS region to the Mumbai AWS region, using primary-standby streaming replication.

DEVOPS ENGINEER

Qwinix Technologies
09.2016 - 12.2018
  • Developed web applications using Ruby on Rails and JavaScript frameworks.
  • Deployed and managed web applications on AWS, creating scalable infrastructure using Terraform and CloudFormation.
  • Established monitoring and alerting for applications and servers using New Relic and CloudWatch to ensure high availability.


Education

Bachelor of Engineering - Computer Science

Visvesvaraya Technological University
07.2016

Skills

  • Cloud: AWS infrastructure and management tools
  • Containerization & Orchestration: Kubernetes, Docker, Helm
  • CI/CD & GitOps: Jenkins, ArgoCD, GitLab CI
  • Infrastructure as Code (IaC): Terraform, CloudFormation, Pulumi
  • Monitoring & Observability: Prometheus, Grafana, VictoriaMetrics, New Relic, CloudWatch
  • Programming & Scripting: Python, Shell Scripting, Ruby, JavaScript, Groovy
  • Data & Streaming Technologies: Kafka, PostgreSQL, Airflow, Spark, JupyterHub
  • Operating Systems & Web Servers: Linux, Nginx
  • Version Control: Git
