Senior Software Engineer
Palash Dange

Hyderabad

Summary

Skilled engineer with 5+ years of SRE experience. Hardworking and focused on completing work quickly and consistently. Maintains high availability through monitoring, automation, and resiliency initiatives. Responsible for reporting system failures and building observability tooling. Has managed multiple high-severity incidents over the years and led them to resolution.

Overview

8 years of professional experience
1 Certification

Work History

Site Reliability Engineer

Salesforce
Hyderabad
12.2018 - Current
  • Developed and implemented monitoring solutions to improve system reliability.
  • Performed root cause analysis of production incidents and provided recommendations for improvement.
  • Collaborated with development teams to ensure proper release engineering practices are followed.
  • Implemented automation tools to increase efficiency in deployment processes.
  • Monitored system performance using metrics such as latency, throughput, and availability.
  • Ensured high availability and scalability of applications across multiple environments.
  • Researched and evaluated new technologies to enhance platform reliability and stability.
  • Documented best practices and procedures for incident response activities.
  • Provided training sessions on SRE principles and best practices to team members.
  • Assisted with developing service level objectives for critical services.
  • Created automated scripts for software deployments and configuration management tasks.
  • Performed capacity planning activities based on current usage trends and future projections.
  • Conducted regular reviews of alerts generated by monitoring tools to identify potential issues.
  • Provided weekend on-call support.

Projects:

1. Slack command to auto-create tickets (Python)

  • Lets users create a ticket directly from a Slack command so the SRE team can act on customer-impacting servers, tasks, or SQL stored procedures.

2. Self-Service Bot on Slack (Python)

  • Designed to auto-reply in the thread with the most relevant options for questions asked on the public channel.
  • Could page teams through PagerDuty.
  • Assisted in creating tickets (using the Slack command from project 1) and change cases for the SRE team.
  • Reported the health status of specified databases, servers, and instances.

3. Alert Audits

  • Gathered all alerts and grouped them by service.
  • Collaborated with the team to validate each alert and its resolution steps.
  • Updated knowledge articles where needed.
  • Decommissioned unnecessary alerts or transferred them to service owners.
  • This significantly reduced the SRE team's alert volume and workload.

Programmer Trainee

Cognizant Technologies & Solutions
Pune
05.2017 - 12.2018
  • Acted as the primary tier for alert reporting and follow-ups.
  • Identified areas of improvement in existing monitoring systems and implemented changes accordingly.
  • Monitored server logs for errors or anomalies related to application performance or availability.
  • Developed automated scripts for alerting, reporting, and performance tuning.
  • Conducted regular reviews of system performance data collected from monitoring tools.
  • Maintained up-to-date documentation regarding configuration settings and procedures related to monitoring activities.
  • Collaborated with development teams to ensure proper implementation of monitoring solutions.
  • Monitored system activities and events such as system performance, CPU usage, disk space usage, and network statistics.

Technical Support Engineer

Mphasis
Pune
06.2016 - 04.2017
  • Installed, configured and maintained computer hardware, software and peripherals.
  • Provided technical assistance to users in person, via phone or email.
  • Diagnosed and resolved hardware and software issues efficiently.
  • Configured user accounts, permissions and passwords according to company policies.
  • Resolved printer problems remotely or onsite as needed by users.
  • Responded to assistance requests from users and directed individuals through basic troubleshooting tasks.
  • Documented repair processes and helped streamline procedures for future technical support actions.
  • Installed and performed minor repairs to hardware, software or peripheral equipment.

Education

B.Sc. - Computer Science

University of Pune
Nashik
01-2015

12th Class -

Delhi University
Nashik
01-2012

10th Class -

Army School
Jhansi
01-2009

Skills

  • System monitoring
  • Scripting Languages (Python)
  • Configuration Management
  • Incident Management
  • Operating Systems (Windows/Linux)
  • System Administration
  • Cloud Technologies (AWS)
  • Containerization Technologies (Kubernetes, Docker, Ansible)

Certification

  • Kubernetes for Beginners
  • Ansible
  • Docker for Beginners
  • AWS Certified Cloud Practitioner - 2022
  • AWS Certified Solutions Architect - Associate - 2022
  • Salesforce Marketing Cloud Email Specialist - 2017

Timeline

Site Reliability Engineer

Salesforce
12.2018 - Current

Programmer Trainee

Cognizant Technologies & Solutions
05.2017 - 12.2018

Technical Support Engineer

Mphasis
06.2016 - 04.2017

B.Sc. - Computer Science

University of Pune

12th Class -

Delhi University

10th Class -

Army School