Accomplished Lead Site Reliability Engineer with a decade of expertise in Production Support Operations, SRE best practices, and large-scale application transitions. Proven ability to build and lead high-performing global teams, drive incident, problem, and change management, and ensure seamless platform reliability across cloud, hybrid, and on-prem environments. Expert in observability, automation, performance tuning, and database management, optimizing system uptime, security, and operational efficiency. Passionate about solving complex technical challenges, enhancing system resilience, and driving continuous improvement in production environments.
Additional Impactful Contributions
✔ Led migration activities across environments, ensuring seamless transitions.
✔ Conducted regular health checks (weekly/monthly) to maintain system stability.
✔ Managed patching activities and DST (Daylight Saving Time) adjustments.
✔ Handled XML conversions, data analysis, and system integrations.
✔ Ensured adherence to ITIL best practices, handling story-based and ServiceNow-based requests.
✔ Facilitated risk calls with client, onshore, and offshore teams to assess and mitigate potential threats.
Log analysis
Security best practices
Performance tuning
Network troubleshooting
Incident management
System monitoring
ITIL framework
Operations management
Database administration
Scheduling and planning
Infrastructure automation
Problem-solving
Unix/Linux
Windows
Oracle
Microsoft Azure
Chef
Ansible
GitHub
Jenkins
ADO
Bitbucket
Terraform
Shell Scripting
Python
DataStage
Informatica
CA Workload Automation ESP
Control-M
AutoSys
MAT
Splunk
New Relic
Sumo Logic
AppDynamics
Genios
Nagios
ServiceNow
Jira
WebSphere
WCS (WebSphere Commerce Server)
Cloudflare (CDN)
Web Servers (Apache Tomcat, IIS)
JBoss
Developer Tools
SQL Developer
TOAD
Kubernetes
Grafana
O365 Applications
IBM Db2
Postgres
MySQL
Azure SQL
MongoDB