Shiv Kamal Pandey

Bengaluru

Summary

Proactive and goal-oriented professional with excellent time management and problem-solving skills. Known for reliability and adaptability, with a swift capacity to learn and apply new skills. Committed to leveraging these qualities to drive team success and contribute to organizational growth.

Overview

13 years of professional experience

Work History

SRE Module Lead

Nous Infosystem
01.2024 - Current
  • Handling end-to-end SRE responsibility for the client BHHC
  • Maintaining several Azure servers and their services, and handling maintenance, website operations, automation, and troubleshooting of critical issues within SLO for BHHC
  • Creating RCAs for Severity 1 issues, writing help documents in Confluence, and handling deployments and change management
  • Building automation as required to eliminate toil

SRE

VMware
03.2020 - 12.2023

Handling SRE responsibilities

  • Handling multiple kinds of alerts from different kinds of servers and sources
  • Handled more than 30 alert-monitoring sources such as Dynatrace, Zabbix, Log Insight, and Dell EMC
  • Taking pager duty for critical issues and responding on time
  • Troubleshooting user issues related to downtime of provided services and production issues
  • Handling pager duty for critical issues, problem management, and end-to-end troubleshooting
  • Maintained filesystems and logging on more than 4,000 servers
  • Automating manual tasks using Python and shell scripting
  • Creating and modifying runbooks, and creating pager duty configuration for Severity 1 issues

SRE

Ascent
02.2019 - 09.2019
  • Handled SRE responsibility for the Kafka infrastructure team
  • Managed build and deployment for existing infrastructure components such as producer, broker, ZooKeeper, Sentinel, and MySQL
  • Capacity planning
  • Incident management
  • Automation

Senior SRE

MakeMyTrip
11.2016 - 02.2019
  • Part of the Site Reliability Engineering & website operations team, doing deployments for all MakeMyTrip LOBs such as Flights, Hotels, Payments, Holidays, and others, including the complex multi-tier backend
  • Infrastructure maintenance and critical incident handling for MakeMyTrip and Goibibo along with their various components
  • Python scripting to maintain existing automation code, develop new modules, and integrate them
  • Working in a combined Linux admin and DevOps engineer role
  • Configuration and monitoring of Zabbix alerts for more than 3,000 production servers
  • Creating Grafana dashboards and monitoring the production environment and related APIs, servers, etc.
  • Release management, change management, root cause analysis, problem management, runbook creation and modification, capacity planning, DR activities, and various on-demand tasks

IT Analyst

TCS
03.2016 - 10.2016

On bench (no project allocated)

Linux Admin

XEROX
12.2014 - 02.2016
  • Linux administration: Red Hat OS installation and upgrade (Red Hat 5-6), LVM management, user management, FTP, SSH, NFS, cron, data center management, Apache, configuration management
  • Printer solutions and maintaining the infrastructure for them
  • New server build and server decommissioning

Senior Technical Consultant

Hewlett Packard
07.2011 - 05.2014
  • Production and stage environment troubleshooting
  • Task automation
  • Server management
  • Tomcat, WebLogic, and Cacti configuration and monitoring
  • J2EE application deployment and bug fixes in various test environments
  • LVM configuration
  • Checking various subsystems (process, disk, application)
  • Performance monitoring
  • Server administration: disk cleanups, log rotation
  • Solving Remedy tickets
  • Creating change requests and implementing changes in production
  • Creating implementation plans and verification
  • Capacity planning
  • Partition, space, and file system management
  • Incident handling related to environment issues
  • Maintaining WebLogic servers in the event of any failure
  • Attending war rooms to investigate the root cause of site failures along with several teams
  • Implemented a number of automation scripts for vodafone.co.uk production support stability
  • Maintaining synchronization between Disaster Recovery (backup) and production servers

Education

Master of Computer Applications - Computer Engineering Technology

KIIT
Bhubaneswar
07.2011

Bachelor of Science - Information Technology

RU
Ranchi, India
07.2008

Skills

  • System Performance Analysis
  • Requirements Analysis
  • Root Cause Analysis and preparation
  • Incident management
  • Change management
  • Site reliability engineering
  • ITIL
  • Python
  • Shell script
  • Monitoring solution and configuration
  • Linux
  • Windows, Azure, RHCSA, RHCE, Docker, Kubernetes, Jira
  • Dynatrace
  • Zabbix, Grafana
