Major Incident Manager with 10+ years of experience in service stability, incident resolution, and IT service delivery. Skilled in implementing ITIL best practices through internal training. Proficient in managing Severity 1 and 2 incidents, handling stakeholder communications, and conducting root cause analysis using the 5 Whys technique. Recognized for leadership, collaboration, and process improvements that reduce downtime and improve SLA compliance.
ITIL Specialist | Incident, Change and Problem Management | Service Delivery
Major Incident Management and Stakeholder Coordination:
• Served as the central point of contact within the Major Incident Management (MIM) framework, leading the coordination of engineering teams and stakeholders during Severity 1 and 2 incidents.
• Guided technical teams during active incidents by facilitating real-time bridge calls and ensuring business-critical incidents were prioritized per SLA commitments.
• Orchestrated horizontal and vertical communications across technology and business units during major incidents, ensuring timely updates and service restoration.
• Conducted detailed technical triage and analysis to drive incident investigations and improve mean time to resolve (MTTR).
• Continuously monitored service health and proactively informed engineering teams about impacted services and tenants.
• Collaborated with cross-functional teams to align incident response efforts with the nature and scale of the threat.
• Applied leadership, critical thinking, and analytical problem-solving to resolve complex incidents.
• Led monthly service failover drills and hosted weekly Teams bridges to support Business Continuity Planning (BCP) initiatives.
• Translated business and stakeholder needs into actionable strategies, minimizing disruption and enhancing client satisfaction.
• Improved service restoration time through efficient escalation and coordination.
• Mentored team members on incident management techniques and process adherence.
Problem Management and Root Cause Analysis:
• Managed Reactive Problem Management by raising P3 problem tickets following P1 incidents and driving end-to-end post-incident analysis.
• Performed Root Cause Analysis (RCA) using the 5 Whys methodology to uncover underlying issues and prevent recurrence.
• Initiated and tracked Post Corrective Actions (PCA) with engineering teams, verifying successful implementation before problem closure.
• Documented RCA findings, corrective measures, and outcomes to strengthen knowledge sharing and process improvements.
• Ensured timely follow-ups on open problems, reducing recurrence rates and enhancing service stability.
Change Management Support and Governance:
• Provided end-to-end support for the Change Management process, ensuring compliance with organizational and ITIL standards.
• Reviewed change records for accuracy and adherence to process requirements; rejected non-compliant requests and guided teams toward corrective action.
• Ensured all change requests received the necessary approvals prior to implementation, minimizing risk and promoting accountability.
• Supported the preparation and facilitation of Change Advisory Board (CAB) meetings, contributing to risk assessment and implementation strategies.
• Collaborated with technical teams to identify and streamline low-risk, standard change types, improving overall change efficiency.
• Helped prioritize change requests in line with organizational goals and assisted in execution planning and documentation.
• Led knowledge-sharing sessions to upskill teams on process adherence, new tools, and technology adoption.
Major Incident & Problem Management (5 Whys RCA)
ServiceNow
RemedyForce
TeamDynamix
Microsoft Teams / O365