Roles and responsibilities:
Team Lead:
- Inquisitive Lead Technical Support Engineer with 6.4 years of experience in a centralized team managing and providing services to clients to maintain smooth operations.
- Managed the team to meet SLAs by allocating tasks effectively and avoiding unnecessary escalations.
- Planned and implemented a client-onboarding readiness checklist covering parameters such as domain setup, application workflow, and server setup details, making issues easier to triage.
Incident and Problem Management:
- Strong working knowledge of Incident Management and Problem Management, adhering to ITIL processes and best practices.
- Working with clients and providing updates on blocker/critical incidents by call, email, and chat.
Network Operation Center:
- Expertise in monitoring AWS infrastructure and troubleshooting infrastructure and application components and metrics in operation.
- Hands-on experience debugging web and application server architectures (Apache, Tomcat).
- Good knowledge of the Kubernetes platform (microservice architecture, Docker, basic kubectl commands, etc.).
- Creating Standard Operating Procedures (SOPs) for issue analysis and quick resolution by simulating various failure scenarios in the lower environment, thereby shortening resolution times and reducing downtime and outages.
- Administration and configuration of Application Manager, 3scale management, Instana, and Pingdom tools.
Command Center: Good knowledge of troubleshooting real-time issues and analyzing logs from tools such as Instana, ELK, AWS, and other infrastructure components. Preparation of daily, weekly, and monthly reports.
Audit Tasks: Reviewing and auditing existing processes and enhancing them whenever required.
Other Tasks:
- Cross-skilling and training new team members on the process.
- Installed Kubernetes and backend components to set up new environments on blade servers, and trained the team on the new features.
- Created support documentation covering various features of applications and tools, enabling others to extend their skills, make use of tool features, and resolve problems quickly without escalation.
- Evaluating data trends and coordinating with the development teams on cleanup tasks to maintain servers and systems, keeping networks fully operational during peak periods.