Results-driven IT Infrastructure Monitoring and Automation Specialist with extensive expertise in SolarWinds, Nagios, and cloud-based monitoring platforms (AWS CloudWatch, Azure, GCP). Proven Subject Matter Expert (SME) in delivering scalable monitoring solutions, integrating tools into diverse environments, and optimizing system performance for reliability and efficiency. Hands-on experience in Linux administration, with a strong command of Red Hat Enterprise Linux and shell scripting for automation. Adept at SQL database management and query performance optimization, coupled with advanced Power BI reporting to create actionable insights and support decision-making. Seasoned practitioner of ITIL principles with a successful track record in Incident, Problem, and Change Management. Experienced in integrating monitoring environments with ServiceNow for automated alerting and streamlined operational workflows, reducing incident resolution times by 40%. Known for developing automation solutions using Ansible and Terraform to enhance efficiency and scalability.
Overview
10 years of professional experience
1 Certification
Work History
Software Engineer 4
Unisys India Private Ltd
05.2022 - 10.2024
Worked as a monitoring engineer on tools such as Nagios, CloudWatch, Azure, SolarWinds, Grafana, and Prometheus for various clients.
Set up and configured the Nagios monitoring system, including defining hosts, services, and alerts.
Tuned Nagios performance to ensure effective monitoring and alerting.
Configured and managed alerts, ensuring they reached the appropriate users and channels, and integrated the client Nagios environment with ServiceNow for automated alerting.
Identified and resolved issues within the Nagios system and the monitored infrastructure.
Generated reports on system performance and availability.
Managed user accounts, permissions, and access to the Nagios interface.
Integrated Nagios with other tools and systems, such as ticketing systems and configuration management tools.
Administered SolarWinds tools (NPM, NTA, SAM) for proactive network and system monitoring.
Optimized monitoring configurations to enhance system performance and reduce downtime.
Developed automation scripts to improve efficiency in alerting and reporting.
Generated advanced reports and alerts using SWQL/T-SQL for actionable insights.
Diagnosed and resolved network issues using SNMP, Syslog, and other protocols.
Managed and maintained Linux (RHEL, Ubuntu) servers, ensuring high availability and performance.
Implemented and maintained security measures, including firewalls, intrusion detection systems, and access controls.
Troubleshot and resolved complex system issues, minimizing downtime.
Created documentation that describes the client monitoring infrastructure.
Modified monitoring scripts (shell, Python) according to customer requirements.
Created performance data reports using Power BI and SQL queries.
Used Ansible for orchestration to configure systems, deploy software, and create scripts.
Experienced with Requests for Proposals (RFPs), supplier selection processes, service requirements, and Service Level Agreements (SLAs).
Able to work within various project and service methodologies, such as Agile, Scrum, DevOps, and ITIL.
Strong understanding of the project lifecycle and governance, with experience mapping Service Design and Transition activities against project governance.
Transition SME
IBM India Private Ltd
07.2020 - 05.2022
Led the end-to-end transition of monitoring systems for a major automobile client, ensuring minimal disruption and seamless integration.
Implemented automated alerting solutions by integrating Nagios Core with ServiceNow, enhancing incident response efficiency.
Administered and optimized Linux-based infrastructure (RHEL, Ubuntu) to support monitoring environments with high availability and performance.
Strategized and executed the migration plan from Nagios Core to WhatsUp Gold, optimizing monitoring capabilities and system reliability.
Coordinated cross-functional teams during service transition to ensure alignment with client objectives and compliance with ITIL guidelines.
Authored comprehensive documentation outlining client monitoring infrastructure and transition processes to facilitate knowledge transfer and future reference.
Developed and executed comprehensive transition plans, mitigating risks and ensuring seamless migration of monitoring tools and business operations.
Collaborated with client stakeholders to tailor monitoring solutions that aligned with specific operational requirements and business goals.
Applied ITIL best practices and transition frameworks to manage service design, risk, and quality during monitoring system transitions.
Developed custom shell and Python scripts to enhance monitoring capabilities and automate routine tasks within the transition environment.
Monitored and analyzed performance data using Power BI and SQL queries to provide actionable insights during the transition phase.
Coordinated knowledge transfer sessions and training programs to ensure smooth adoption of new monitoring tools and processes by client teams.
Provided expert guidance on monitoring system configurations and customizations to optimize performance and meet evolving client requirements.
Systems Management Specialist
IBM India Private Limited
02.2015 - 05.2022
Implemented enterprise-class monitoring platforms (WhatsUp Gold, Nagios Core, Nagios XI, ScienceLogic, SolarWinds).
Responsible for the design, support, and maintenance of a large monitoring infrastructure.
Relevant knowledge of monitoring industry standards and the use of analytics in an operational setting.
Implemented distributed Nagios with active and passive monitoring of over 100 production metrics through a mixture of SSH, SNMP, NRPE, NSCA, NCPA, and WMI.
Deployed & managed end-to-end SolarWinds monitoring stack for 500+ nodes.
Tuned SNMP, Syslog, NetFlow to expose traffic and device-health blind spots.
Devised techniques for visualizing large amounts of complex, real-time data in a simple, elegant manner for users.
Assisted in implementing scalable monitoring and logging architecture and systems.
Working knowledge of ITIL processes and supporting procedures.
Managed and maintained Linux (RHEL, Ubuntu) infrastructure servers, ensuring high availability and performance.
Resolved complex network/server issues via PerfStack, NetPath, and log analytics.
Automated routine tasks using shell scripting and configuration.
Scripted onboarding, alerting, and report jobs (PowerShell + SWIS API).
Compliance Focal
IBM India Private Limited
05.2015 - 11.2016
Acted as the primary compliance focal for the Global Asset and Configuration Data Warehouse (GACDW) team, ensuring adherence to regulatory and organizational standards.
Led efforts to standardize data quality and reduce inconsistencies, contributing to compliance improvements and operational efficiency within the SO Delivery environment.
Coordinated cross-functional teams to align data management practices with compliance requirements across Global Asset and Configuration Data Warehouse initiatives.
Developed and maintained compliance documentation and reporting interfaces to support audit processes and regulatory submissions.
Conducted regular audits and assessments to identify compliance risks within asset and configuration data, implementing corrective actions promptly.
Education
PG-DITISS - Computer And Information Systems Security
CDAC
Noida
05.2014
Master of Computer Applications - Computer Application
GGSIPU
New Delhi
05.2013
Bachelor of Computer Applications - Computer And Information Sciences
Application Test Engineer [Cargo Portal Services] at Unisys India Private Limited