Tilak Chauhan

Hyderabad

Summary

Site Reliability Engineer at ServiceNow focused on automation, reliability engineering, and developer enablement. Experienced in designing and improving monitoring and alerting systems, reducing alert fatigue, and building automation that eliminates operational toil. Cut false-positive alerts by 44 percentage points (from 67% to 23%) and saved 250+ SRE hours through scalable alert automation and workflow improvements. Integrates AI and agent-based tools to accelerate incident analysis, debugging, and automation development. Strong in Python, system troubleshooting, and reliability-focused engineering in fast-paced production environments.

Overview

3 years of professional experience

Work History

Site Reliability Engineer

ServiceNow
Hyderabad
03.2025 - Current
  • Led reliability engineering initiatives focused on reducing alert fatigue and improving monitoring signal quality, saving 250+ hours of SRE effort through automation.
  • Designed and implemented alert automation using scripting and platform integrations, improving incident response efficiency and reducing manual intervention.
  • Identified systemic bottlenecks in monitoring workflows and reduced the false-positive alert rate from 67% to 23% through tuning, deduplication, and logic improvements.
  • Partnered with software engineering teams to debug production issues and improve observability and alert coverage across services.
  • Building end-to-end automated alerting workflows that minimize human involvement during incident detection and response.
  • Applied AI-assisted and agent-based tools to accelerate incident analysis, debugging, and automation development in production environments.

Associate Site Reliability Engineer

ServiceNow
Hyderabad
06.2023 - 02.2025
  • Owned production stability for critical services by participating in 24x7 on-call rotations and handling high-severity incidents in live environments.
  • Troubleshot and remediated production issues across systems, ensuring availability, reliability, and rapid recovery during incidents.
  • Authored detailed root cause analyses and implemented preventive fixes and automation to reduce recurring incidents and operational toil.
  • Collaborated with cross-functional teams to improve incident handling processes, documentation, and operational workflows.
  • Built a strong foundation in monitoring, alerting, incident response, and reliability best practices while supporting large-scale production systems.

Education

B.Tech - Information Technology

Sathyabama Institute of Science and Technology
Chennai, India
05.2023

Skills

  • Python for automation and scripting
  • Incident response and on-call operations
  • Linux systems administration
  • MySQL and relational database fundamentals
  • Database management and basic performance troubleshooting
  • Alert automation (ServiceNow, scripting, integrations)
  • Monitoring and alerting systems
  • Production troubleshooting in live environments
  • Root cause analysis and post-incident reviews
  • Kubernetes fundamentals (cluster concepts, workloads, troubleshooting)
  • Containerization fundamentals (Docker concepts)
  • JavaScript for platform and workflow automation
  • AI-assisted incident analysis and log summarization
  • Prompt engineering for operational and automation workflows
  • Leveraging LLM tools to accelerate debugging, scripting, and documentation
  • Process improvement and reliability engineering
  • Operational toil reduction through automation

Languages

  • English
  • Hindi
  • Telugu

Timeline

Site Reliability Engineer

ServiceNow
03.2025 - Current

Associate Site Reliability Engineer

ServiceNow
06.2023 - 02.2025

B.Tech - Information Technology

Sathyabama Institute of Science and Technology