SANDIP MANDAL

Bengaluru

Summary

Site Reliability Engineer with 3+ years of experience optimizing distributed systems and automating infrastructure. Expert in implementing observability frameworks with Prometheus and Grafana. Proficient in Python-driven automation, incident response, and high-availability architecture design. Proven track record of maintaining 99.9% uptime and scaling containerized environments with Docker and Kubernetes on cloud-native infrastructure.

Overview

4 years of professional experience

Work History

Site Reliability Engineer

JUSTDIAL
Bengaluru
05.2022 - Current
  • Engineered a monitoring ecosystem using Prometheus and Grafana, reducing Mean Time to Detect (MTTD) by 40% through automated alerting
  • Automated distributed API metric extraction by developing Python and shell scripts
  • Configured F5 load balancers and Nginx reverse proxies, maintaining 99.9% system availability during peak traffic
  • Directed Root Cause Analysis (RCA) of production incidents using ELK Stack (Elasticsearch, Logstash, Kibana) logs, preventing user-facing downtime
  • Standardized server hardening and security patching across 500+ Linux servers using Ansible configuration management and the YUM package manager, achieving 100% compliance
  • Migrated legacy services to Docker containers, increasing deployment velocity and improving resource utilization

Education

M.Tech - Computer Science and Data Processing

Indian Institute of Technology Kharagpur

M.Sc - Mathematics

Indian Institute of Technology Madras

Skills

  • Docker
  • Kubernetes
  • Ansible
  • Jenkins
  • Infrastructure as Code (IaC)
  • Prometheus
  • Grafana
  • ELK Stack (Elasticsearch, Logstash, Kibana)
  • Nagios
  • Alertmanager
  • Python
  • Bash/Shell Scripting
  • Automation Workflows
  • Nginx
  • F5 Load Balancers
  • TCP/IP
  • DNS
  • SSL/TLS Certificates
  • Linux Administration (RHEL, CentOS)
  • MySQL
  • C++
  • Data Structures & Algorithms

Personal Information

Title: Site Reliability Engineer
