Manpreet Singh

Pune

Summary

Experienced Site Reliability and Cloud DevOps Engineer with a strong background in designing, implementing, and optimizing scalable infrastructure solutions. Proven ability to lead teams and drive initiatives to ensure the reliability, availability, and performance of critical systems. Adept at leveraging cloud technologies and automation to streamline operations and enhance efficiency.

Overview

12 years of professional experience
6 years of post-secondary education
6 Certifications

Work History

SENIOR SITE RELIABILITY ENGINEER

IBM India Software Labs
11.2023 - Current
  • Design and architect highly available, scalable, and reliable infrastructure solutions for IBM Qsuite SaaS, ensuring optimal performance and security.
  • Ensured high availability of services by developing comprehensive disaster recovery plans and backup procedures.
  • Evaluated new technologies and tools to enhance overall system performance, stability, and security.
  • Built a Python-based AWS monthly cost report exporter and integrated it into a Jenkins job, scheduled weekly, that generates reports on dev and prod QRadar spending in AWS and uploads them to an S3 bucket.
  • Wrote a Python script that migrates a single tenant's event data from an old ClickHouse backup file to the new ClickHouse; it downloads the tenant's backup inventory and reloads events per day from each shard.
  • Created a Python script to check the git-release from Vault for all clusters that are not flagged as decommissioned.
  • Wrote a script to monitor RabbitMQ major version upgrade status; run during an upgrade, it reports whether the RabbitMQ brokers are upgraded and running the expected version.
  • Technical Leadership: Provide technical leadership and mentorship to junior members of Site Reliability Engineering team, fostering a culture of collaboration, innovation, and continuous learning.
  • Documentation and Knowledge Sharing: Maintain comprehensive documentation of infrastructure configurations, deployment procedures, and incident response processes, facilitating knowledge sharing and onboarding of new team members.

CLOUD SUPPORT ENGINEER 2 - DEVOPS

Amazon Web Services
08.2021 - 10.2023
  • Worked primarily on AWS managed Kubernetes and Docker services (EKS and ECS), as well as other services such as Fargate, Cloud Map, X-Ray, and ECR.
  • Reduced downtime for clients by proactively monitoring and troubleshooting cloud-based issues.
  • Conducted training sessions for junior team members and new hires, fostering a culture of continuous learning and skills development.
  • Hands-on experience troubleshooting Kubernetes clusters and AWS services integrated with Kubernetes, such as AWS Load Balancers, EC2, S3, VPC, and IAM.
  • Created a Python script to back up 5 microservices that were managed via Custom Resource Definitions in OpenShift.
  • Created a Python script to run version upgrades of ROSA (OpenShift on AWS) clusters and check their status on successful completion.
  • Installed the AWS Load Balancer Controller on an AWS EKS cluster with Terraform.
  • Installed Cluster Autoscaler on an EKS cluster with Terraform.
  • Set up AWS EKS monitoring and logging using kubectl and Terraform.
  • Upgraded AWS EKS with zero downtime using Terraform.
  • Provisioned multiple RDS instances using Terraform.
  • Set up a VPC peering connection between two VPCs using Terraform.
  • Created ECS clusters with Fargate tasks using Terraform.
  • Hands-on experience troubleshooting and configuring fully managed and hybrid Kubernetes environments such as EKS Anywhere and ECS Anywhere.
  • Worked on Kubernetes and its associated open-source projects such as AWS Load Balancer Controller, AWS EBS and EFS CSI volume drivers, Cluster Autoscaler, VPA, and HPA.
  • Troubleshot Kubernetes cluster issues involving worker nodes, IAM authorization, RBAC, and service accounts; implemented IRSA; optimized and reserved kubelet resources via user data.
  • Experience working with AWS services such as IAM, VPC, EC2, ECS, EBS, EFS, RDS, S3, Lambda, ELB, Auto Scaling, Route 53, CloudFront, CloudWatch, CloudTrail, SQS, and SNS.
  • Good understanding of the OSI model and the TCP/IP protocol suite (IP, ARP, TCP, UDP, SMTP, FTP, and TFTP).
  • Experience in all aspects of the software life cycle (build, release, deploy) with AWS tools such as CodeBuild and CodeDeploy and open-source tools such as Jenkins.
  • Assisting customers with configuring and managing Kubernetes cluster control plane and data plane and handling escalations.
  • Actively involved in improving documentation and writing Kumo articles.
  • Raising service level issues to development teams and raising public facing GitHub issues.
  • Identified issues, analyzed information and provided solutions to problems.

TECHNOLOGY ANALYST

Airbus India Pvt Ltd
12.2019 - 08.2021
  • Testing and cost optimization of Abaqus application on AWS and on-premise servers.
  • Reduced technical debt by refactoring legacy code and implementing modern development methodologies.
  • Resolved complex technical issues through rigorous troubleshooting and root-cause analysis, minimizing downtime and disruptions to business operations.
  • POC for NFS4 ACLs.
  • Created a sanity-check script using Ansible ad hoc commands.
  • Resolved an issue of VDIs failing to join AD by creating an Ansible playbook.
  • Resolved black-screen issues in RHEL 7.7 VDIs.
  • Created a shell/Bash script for printing files from Linux servers.
  • Used Git to manage source code for applications.
  • Defining, building and automating CI/CD build pipeline using Jenkins.
  • Deploying applications to VMs and VDI nodes.
  • OpenSCAP (security tool) configuration.
  • SSSD configuration and rollout to Devel, Val and Prod nodes.
  • Patched servers via Ansible Tower.
  • Implemented hpn-ssh for scientific computing environment.
  • Created a shell-script tool to disable the ibus daemon and speed up processing in the HyperWorks application.
  • Responsible for identifying, troubleshooting, and resolving problems with the Jenkins build process and ensuring that each release was accepted by all parties.
  • Periodically monitored logs in Splunk for optimal performance.
  • Containerized applications using Docker.
  • Configured server monitoring in Splunk.
  • Worked on EC2, VPC, S3, IAM, Route53 services in AWS.
  • Worked closely with Developers, QA and project management for smooth scheduled releases.
  • Participated in application builds and deployments to Dev, QA, Preprod and Prod environments.
  • Involved in release process and deployed applications (WAR, EAR and JAR).
  • Troubleshot build issues and coordinated with the development team to resolve them.
  • Maintained a knowledge base to track known issues and their resolutions.
  • Updated release notes for every release.
  • Involved in documentation of all processes and procedures.
  • Joined bridge calls and provided necessary information to the teams involved.

TECHNICAL SERVICES ENGINEER

Fujitsu Consulting India Pvt. Ltd.
01.2016 - 12.2019
  • Monitored automated build and continuous software integration process to drive build/release failure resolution.
  • Fostered strong relationships with clients through excellent communication skills when addressing their technical inquiries or concerns.
  • Contributed to successful project completions by serving as reliable point of contact for technical expertise.
  • Researched and identified new technologies and tools helping to grow agile development environment.
  • Provide End-to-End support of Linux servers to multiple clients as part of shared support.
  • Installing and configuring Docker and running containers.
  • Installing, configuring and maintaining on-premise Kubernetes cluster.
  • Maintained security and mitigated threats as new ones were identified.
  • Built multiple server systems and security hardening.
  • Installation, configuration and maintenance of Linux OS and Open-source applications.
  • Experience in building Production Servers and validation for new build releases.
  • Managed virtualisation environments with OVM, Hyper-V, and VMware.
  • Performed deployments and patch updates on all Linux servers.
  • Performed OS patching, release updates, and vulnerability fixes.
  • Performance analysis and troubleshooting.
  • Managed SAN (FC, iSCSI) and NAS storage volumes.
  • Used LVM extensively for creating LUNs, building volume groups, and creating and maintaining file systems.

SENIOR SYSTEM ENGINEER

ATOS India Private Ltd
11.2014 - 01.2016
  • Used LVM extensively for creating LUNs, building volume groups, and creating and maintaining file systems.
  • Worked with stakeholders to determine implementation and integration of system-oriented projects.
  • Reduced downtime for critical systems by proactively identifying potential issues and conducting preventative maintenance.
  • Managed storage volumes using LVM with Red Hat multipath.
  • Administration of NAS filer for Atos customized SAP and application environment.
  • New server onboarding and server decommissioning.
  • Configured and implemented an automation tool in the client environment to reduce manual, repetitive work.
  • Worked on high-priority incidents and incidents escalated from L2.
  • Performed troubleshooting for various problems, logging calls with vendors for hardware issues.
  • Knowledge of ITIL roles and responsibilities.

UNIX ADMINISTRATOR

HP India Sales Pvt. Ltd
05.2012 - 11.2014
  • Streamlined workflow processes, automating repetitive tasks with custom shell scripts and tools.
  • Improved system performance by optimizing Unix server configurations and streamlining processes.
  • Administration of HP-UX 10i and 11i operating systems.
  • Maintenance of HP-UX (IA & PA-RISC) and Superdome (SD32) servers.
  • Troubleshooting O/S (UNIX) related problems.
  • Reconfiguration of kernel parameters.
  • Advanced User/Group Administration.
  • Served as an escalation point for complex technical issues related to Unix administration, providing expert guidance to resolve incidents quickly while minimizing impact on end users.
  • Reduced downtime by implementing effective backup strategies and disaster recovery plans for critical systems.

Education

B.E - Electronics & Communication

Institute of Information Technology & Management
Gwalior
08.2007 - 07.2011

PCMB -

AISSCE CBSE XII, K.V NO.2, C.B.S.E
Gwalior
04.2006 - 04.2007

All Subjects -

AISSCE CBSE X, K.V NO.2, C.B.S.E
Gwalior
04.2004 - 03.2005

Skills

Linux

Certification

AWS Certified Solutions Architect Associate

Accomplishments

  • Awards at Airbus: Collaboration Hero Award, Da Vinci Award for most innovations, Spot Award for driving innovations.
  • Awards in Fujitsu: Accredited Champion of service excellence.


Additional Certifications

Gremlin Enterprise Chaos Engineering Certified
RHCSA 7
ITIL V3
Lean Sigma Yellow Belt Certified
HP CSA