Experienced Solutions Engineer/Site Reliability Engineer with a strong background in Observability and Management tools, including Oracle O&M (APM, Database Management, Logging Analytics, Operations Insights), Splunk administration, New Relic, Cacti, Nagios, and Elasticsearch. Proficient in DevOps practices using GitHub, Jenkins, continuous integration/continuous deployment (CI/CD), and Observability principles. Committed to delivering inventive solutions to complex technical challenges with a customer-centric approach. Over 8 years of progressive experience in the IT industry, focusing on IT Operations, Infrastructure, and Services Management.
Worked as a Solutions Engineer, where primary roles and responsibilities included the following:
✓ Driving the Manageability products business across the APAC region, with a primary focus on Oracle Cloud Observability and Management.
✓ Delivering POCs, RFPs, demos, and workshops to customers for Oracle Cloud Observability and Management products.
✓ Experienced in DevOps practices, including GitHub, Jenkins, Kubernetes, and CI/CD.
✓ Built and managed a 4-member team that drove an increase in Oracle O&M services consumption worth more than 1 million.
✓ Implemented Observability and Management solutions in more than 25 customer environments.
Part of the SRT team, where primary roles and responsibilities included the following:
Installation and Maintenance:
✓ Installing different tools (Splunk, New Relic, Nagios, Cacti) and upgrading their agents to the latest versions, following ITIL processes (incident, problem, and change management) in ServiceNow.
✓ Patching production and test servers with pre-validation and post-validation checks.
✓ Using meaningful metrics to monitor environment performance.
✓ Validating log sources and indexed data, and searching through indexed data to optimize search criteria.
✓ Integrating Splunk with Elasticsearch using add-ons, and upgrading agents for different tools using Ansible.
Configuration:
✓ Setting up alerts in different tools, including New Relic, Cacti, Nagios, Devo, Splunk, and Kibana.
✓ Managing and configuring New Relic containerized private minions using Docker for setting up synthetic monitors.
✓ Editing and maintaining configuration files and apps for different tools.
✓ Identifying and onboarding new data sources into Splunk, including analysis and design, and building dashboards that highlight key data trends.
Administration:
✓ Working with application owners to set/modify thresholds, alert conditions, and parameters for the services.
✓ Interacting with the Cybersecurity Engineering team and SRE members to gather requirements, perform troubleshooting, and assist with the creation of Splunk search queries and dashboards.
Development:
✓ Developing a Splunk health check script in Python and an interactive Splunk dashboard using SPL for validation of Secure work migration.
✓ Writing Node.js scripts for setting up synthetic monitors in New Relic.
Splunk Analyst Role
✓ Implementing, architecting, and administering Splunk, and performing data ingestion and data visualization in Splunk.
✓ Developing Splunk queries to create and use lookups, reports, dashboards, scheduled searches, and alerts.
✓ Creating and using regular expressions for field extraction at search time.
✓ Onboarding logs for different applications into Splunk.
Splunk Admin Role
✓ Developing Splunk queries to generate reports, scheduled searches, and alerts.
✓ Developing high-end Splunk dashboards using XML, HTML, and drill-down techniques.
✓ Handling monthly Splunk maintenance activities and deployments, with an understanding of Splunk configuration to debug Splunk-related issues.
✓ Handling incidents, problems, and change requests related to applications using ServiceNow and the BMS PAC 2000 tool.
✓ Ensuring resolution of infrastructure-related defects and identifying code break points in the production environment.
✓ Analyzing and implementing Emergency Change Requests (ECRs) and weekend maintenance involving patch upgrades on servers, server failovers, and reboots.
✓ Supporting production implementation and post-production issues, and providing 24/7 (99.99%) application monitoring support.
Splunk Admin/Developer Role
✓ Developing Splunk queries to generate reports, scheduled searches, and alerts.
✓ Developing Splunk dashboards using JavaScript, XML tokens, and drill-down techniques.
✓ Managing indexers, forwarders, and indexes, and importing data into Splunk through Splunk configuration.
Application Support Role
✓ Responsible for building and supporting applications of various technologies including Tomcat, UNIX, Wily Introscope, and Splunk.
✓ Resolving infrastructure-related defects and identifying code break points in the production environment using IBM Thread Dump Analyzer.
✓ Ensuring timely notification and escalation of possible issues/problems, options, and recommendations for prompt resolution.
✓ Following change management procedures through ServiceNow, ensuring proper testing, sign-off, and monitoring.
✓ Handling application releases for smooth deployment and production behavior, and coordinating and performing validations of maintenance activities involving server reboots and release-related activities.
Python Programming
Splunk Admin
New Relic Admin
Oracle Cloud Admin, Oracle Cloud Observability and Management
Docker, Kubernetes
Linux
GitHub
ITIL
Elasticsearch, Prometheus Node Exporter
ServiceNow
Application Support
Machine Learning
· Recipient of Accenture APEX award in Q3FY15 and Q4FY16
· Recipient of Wells Fargo Champion Award in Quarter I - 2018
· Recipient of FY23 Fast Start award from Oracle in 2022
Jul 2023: Oracle Cloud Infrastructure 2023 Certified DevOps Professional
Mar 2023: Oracle Cloud Infrastructure Data Science 2022 Certified Professional
Jan 2023: Oracle Cloud Infrastructure 2022 Observability and Management Certified Professional
Apr 2022: Oracle Cloud Infrastructure 2021 Certified Architect Associate
Sept 2021: ITIL® v4 Foundation Certificate in IT Service Management
Aug 2021: Python Basics
May 2020: New Relic Performance Monitoring Fundamentals
May 2020: AWS Certified Solutions Architect - Associate 2020 Course Completion
Jan 2017: Splunk Certified Admin, Splunk Certified Power User, Splunk Certified Advanced Dashboards and Visualization