Experienced DevOps/SRE Engineer with 7.11 years in DevOps/SRE roles, including building CI/CD pipelines with Git, GitHub, and Jenkins. Hands-on with Python scripting, AWS services, and IaC with CloudFormation and Terraform. Skilled in Docker, Kubernetes, and monitoring tools like Logz.io, Opsgenie, and Grafana. Strong incident management with JIRA, documentation in Confluence, and participation in SCRUM meetings. Proven track record in AWS infrastructure monitoring, network protocols, and ITIL-based incident resolution. Collaborative and detail-oriented, ensuring optimal performance and SLA compliance.
Overview
24 years of professional experience
3 Certificates
Work History
DevOps Engineer
Sophos Technologies Pvt Ltd
06.2022 - Current
Implemented CI/CD pipelines using Git, GitHub, and Jenkins to automate software delivery processes.
Completed Python scripting tasks and automated infrastructure provisioning. Managed AWS resources using CloudFormation for infrastructure as code (IaC) implementations.
Orchestrated containerized applications with Docker and Kubernetes for scalability and reliability.
Handled administrative tasks on AWS services including IAM, S3, EC2, CloudWatch, Auto Scaling, RDS, and Lambda.
Managed infrastructure using Terraform, AWS CloudFormation, and Ansible.
Configured Grafana for observability and monitored logs with Logz.io.
Worked with Git for version control management, including branching strategies, code reviews, and pull requests, facilitating collaborative development workflows and ensuring code quality.
Used monitoring tools such as Opsgenie and LogicMonitor for alert monitoring, configuring alerting rules and notification mechanisms to ensure timely and accurate alert delivery, reducing incident response time by 30%.
Managed Kubernetes workloads and orchestrated containerized applications.
Performed AWS Production account setup using Infrastructure as Code (IaC) with YAML files.
Spearheaded efforts with cross-functional teams to define and streamline incident management workflows and perform IR testing, decreasing incident resolution time by 25%.
Used JIRA for ticketing and incident tracking, improving tracking accuracy and response time by 20%.
Documented incident reports and operational procedures using Atlassian Confluence, enhancing team knowledge sharing and reducing recurring incidents.
Participated in SCRUM meetings to discuss sprint planning, epics, stories, and tasks, contributing to a 20% increase in project completion rates.
Led and managed severity incidents, facilitated RCA meetings, followed up with vendors, and documented findings to ensure quick recovery and prevent recurrence.
Monitored automated build and continuous software integration process to drive build/release failure resolution.
Collaborated closely with product development teams and other stakeholders.
Created proofs of concept for innovative new solutions.
Technology System Engineer
CGI Info System and Management Consultants Pvt Ltd
08.2016 - 06.2022
Monitored AWS infrastructure in Integrated Operations Centre.
Administered IAM, AWS Redshift, EC2, S3, and other AWS services for optimal performance.
Managed incident tickets and ensured SLA compliance for production issues.
Demonstrated expertise in network protocols and services, including DNS, HTTP, SSH, and more.
Generated service-related reports and managed incidents based on ITIL guidelines.
Collaborated with cross-functional teams to resolve production issues promptly.
Fine-tuned alerts to reduce noise and improve system performance.
Worked with ticketing tools such as ITSM and BMC Remedy; proficient in incident management.
Worked in 24/7 environments and fine-tuned alerts, reducing noise by almost 40%.
Implemented monitoring tools and processes to track database status, system health, and compliance. Generated regular reports on alert fatigue and monitoring activities.
Coordinated with the Incident Management team while troubleshooting server and application outages.
Published periodic progress reports on infrastructure monitoring for key projects.
Monitored Oracle DB infrastructure alerts and requests within SLAs.
Resolved issues and escalated problems with knowledgeable support and quality service.
Compiled data and generated graphs to interpret results and suggest key operational improvements.
Good understanding of network protocols and services (DNS, HTTP, HTTPS, SSH, FTP, SMTP, DHCP, TCP, IP, etc.).
Ability to work in teams and independently with minimal supervision to meet deadlines.
Education
Post Graduate Diploma in Advanced Computing - Computer and Information Sciences