Palash Dange

Hyderabad

Summary

Progressive Site Reliability Engineer with 9 years of IT experience, focused on improving system reliability and reducing MTTR. Skilled in managing large-scale on-premises and cloud (vSphere, AWS) environments using Terraform and Ansible. Certified in AI, with hands-on experience building internal chat assistants to automate operations. Dedicated to creating scalable, self-healing systems using strong scripting and modern SRE/DevOps best practices.

Overview

9 years of professional experience
1 Certification

Work History

Site Reliability Engineer

Salesforce
12.2018 - Current
  • Monitored system performance using metrics such as latency, throughput, and availability.
  • Implemented and managed black-box (user-experience) and white-box (internal metrics) monitoring to ensure service reliability and enable rapid root-cause analysis.
  • Created automated scripts for software deployments and configuration management tasks.
  • Documented best practices and procedures for incident response activities.
  • Implemented automation tools to increase efficiency in deployment processes.
  • Provided training sessions on SRE principles and best practices to team members.
  • Performed capacity planning activities based on current usage trends and future projections.
  • Implemented systems automation using scripting languages such as Python.
  • Handled multiple P0-P2 incidents to drive toward resolution.
  • Participated in post-mortem reviews following major outages or incidents.
  • Researched and evaluated new technologies to enhance platform reliability and stability.
  • Set up monitoring tools such as AppDynamics, Apica, and internal tools to track performance metrics.
  • Managed infrastructure components including virtual machines, storage devices, and networks.
  • Analyzed SLI metrics to proactively identify trends, drive operational improvements, and maintain adherence to SLOs and SLAs.
  • Maintained version control systems such as GitLab for all software development projects.
  • Collaborated with developers to troubleshoot application-related issues quickly.

Programmer Trainee

Cognizant Technology Solutions
05.2017 - 12.2018
  • Implemented proactive measures to reduce downtime by identifying potential problems before they occurred.
  • Assisted in developing processes around incident response and service restoration.
  • Identified areas of improvement in existing monitoring systems and implemented changes accordingly.
  • Monitored server logs for errors or anomalies related to application performance or availability.
  • Configured monitoring tools such as SolarWinds and Nagios to monitor the health of applications and servers.
  • Provided technical support to help desk staff with escalated incidents related to monitored services.
  • Analyzed system performance and conducted root cause analysis to identify issues.

Desktop Support Engineer

Mphasis
06.2016 - 04.2017
  • Assisted in the setup of new workstations with appropriate operating systems, software applications and peripheral devices.
  • Installed and configured computer systems, printers, and other peripherals.
  • Collaborated with other teams within the organization to resolve complex incidents quickly.
  • Resolved network connectivity issues for local area networks and wide area networks.
  • Created detailed documentation for IT processes, procedures and troubleshooting steps.
  • Managed backup operations using Symantec Backup Exec and similar tools.
  • Configured user accounts, permissions and passwords according to company policies.
  • Managed Active Directory user accounts, groups, and permissions, enhancing system security.

Education

Bachelor of Science - Computer Science

Pune University
04-2015

Class 12th

Delhi University
04-2012

Class 10th

Army School
05-2009

Skills

  • Automation scripting and infrastructure automation
  • Incident management and service restoration
  • Capacity planning and performance optimization
  • Systems monitoring and monitoring tools
  • Configuration management and documentation practices
  • Root cause analysis
  • System monitoring
  • Virtualization technologies
  • System administration
  • Version control systems
  • Continuous integration / continuous delivery
  • Linux administration
  • Performance tuning
  • Configuration management
  • Load balancing
  • Software development

Certification

  • Salesforce Certified AI Associate
  • AWS Certified Solutions Architect – Associate
  • Ansible basics - beginners course
  • Kubernetes for the absolute beginners - hands-on tutorial

Skills

  • Database administration
  • Software development and scripting languages
