
Pankaj Thapliyal

New Delhi

Summary

Lead DevOps Engineer with 8+ years of experience designing and deploying scalable AWS cloud infrastructure. Achieved a 50% reduction in downtime and a 25% increase in system performance through automation, which also cut manual effort by 60%. Improved team productivity by 40% through team leadership, and optimized large-scale infrastructure management with expertise in CI/CD pipelines and AWS services. Proficient in security management and cost optimization, ensuring efficient and secure operations.

Overview

9 years of professional experience
3 Certifications

Work History

Associate Technical Lead

Appinventiv Technologies Pvt Ltd
Noida
04.2022 - Current
  • Designed and implemented a highly scalable and secure cloud infrastructure, reducing downtime by 50% and improving system performance by 25%
  • Installed and upgraded Kubernetes clusters using tools like kubeadm, eksctl, or managed services (EKS).
  • Managed cluster configuration, API server access, and control-plane components.
  • Created and maintained namespaces, resource quotas, and limit ranges for multi-tenant clusters.
  • Managed RBAC by creating roles, cluster roles, role bindings, and service accounts.
  • Configured and maintained ConfigMaps and Secrets for application configuration.
  • Deployed and managed workloads using Deployments, StatefulSets, DaemonSets, and Jobs.
  • Performed rolling updates, rollbacks, and handled pod restarts with zero or minimal downtime.
  • Monitored cluster health, node status, and pod lifecycle issues.
  • Troubleshot pod failures, CrashLoopBackOff, ImagePullBackOff, and networking issues.
  • Managed node scaling, cordon, drain, and uncordon during maintenance activities.
  • Integrated HPA and cluster autoscaler for auto-scaling workloads.
  • Configured ingress controllers and managed ingress rules for traffic routing.
  • Managed persistent storage using PVs, PVCs, and storage classes.
  • Implemented network policies to control pod-to-pod and namespace-level communication.
  • Collected logs and metrics using Prometheus, Grafana, and centralized logging tools.
  • Performed cluster security hardening and followed Kubernetes best practices.
  • Backed up and restored cluster resources and etcd (where applicable).
  • Supported CI/CD integrations with Kubernetes for automated deployments.
  • Architected and implemented a disaster recovery (DR) solution with 5-minute RTO and RPO, ensuring high business continuity
  • Developed and maintained automation scripts for deployment, monitoring, and maintenance, reducing manual effort by 60% and improving system reliability by 30%
  • Led end-to-end migration of DEV, QA, UAT, and Production environments from on-premises to AWS using Amazon EKS, ECR, and Docker, significantly streamlining deployment workflows
  • Implemented Infrastructure as Code (IaC) using Terraform, enabling consistent and repeatable provisioning across all environments
  • Automated configuration management with Ansible, improving efficiency in software deployment and server configuration management
  • Designed and established CI/CD pipelines using GitLab CI/CD, automating build, test, and deployment processes to reduce manual intervention and increase release reliability
  • Deployed Kubernetes applications using Helm and implemented Argo CD for GitOps-based continuous delivery
  • Integrated centralized monitoring and alerting using Prometheus, Grafana, and Loki, enabling proactive incident detection and faster response times
  • Optimized cloud costs using AWS Reserved Instances and Spot.io, reducing infrastructure costs during non-peak hours while maintaining scalability
  • Mentored and led a team of 5 senior DevOps engineers, improving team productivity by 40% and reducing production issues by 15%

Senior DevOps Engineer

Appinventiv Technologies Pvt Ltd
Noida
04.2019 - 03.2022
  • Designed cloud infrastructure and architecture diagrams; configured Amazon EKS clusters using eksctl
  • Implemented monitoring and alerting to ensure high availability, performance, and system reliability
  • Integrated Amazon MSK for message streaming and implemented Spot.io for cloud cost optimization
  • Built and maintained CI/CD pipelines using a Jenkins master-slave architecture, reducing release time by 70% and improving code quality by 40%
  • Deployed and managed Kubernetes workloads using Helm
  • Designed and implemented backup and disaster recovery strategies, reducing data loss by 90% and improving availability by 20%
  • Collaborated with development teams, AWS support, and vendors to deliver scalable solutions, increasing customer satisfaction by 30% and reducing support tickets by 25%

AWS Cloud Engineer

Appinventiv Technologies Pvt Ltd
Noida
01.2018 - 03.2019
  • Spearheaded the end-to-end design and deployment of the Moxy App infrastructure using Nginx, Node.js, Kafka, ElastiCache, CloudFront, WebSocket, AWS DevOps services, serverless architecture, and ECS clusters
  • Played a key role in establishing CI/CD pipelines, configuring serverless components and EC2 instances, and implementing robust security controls
  • Designed system architecture and integrated centralized monitoring and alerting solutions to ensure operational efficiency and reliability
  • Implemented comprehensive security policies and compliance procedures, eliminating security incidents and improving compliance with industry standards by 50%
  • Researched and evaluated emerging technologies to enhance DevOps processes, achieving a 20% reduction in operational costs and a 15% improvement in team efficiency
  • Directed and contributed to critical DevOps initiatives, improving project success rates by 25% and increasing team morale by 10%

Linux Administrator

Site Learning Of India
New Delhi
02.2017 - 12.2017
  • Installed, configured, and maintained Linux servers (RHEL, CentOS, Ubuntu) across physical and virtual environments
  • Managed user accounts, groups, permissions, and sudo access
  • Performed system monitoring, performance tuning, and capacity planning to ensure optimal server operations
  • Handled patching, upgrades, and security updates of Linux systems
  • Configured and maintained critical services: SSH, Apache/Nginx, FTP, DNS, NFS, Samba
  • Implemented security best practices, including firewalls (iptables/firewalld), SELinux, and access control policies
  • Managed disk partitions, LVM, file systems, and storage mounting
  • Automated routine administrative tasks using Bash scripting
  • Troubleshot system, network, and application-level issues
  • Monitored logs using journalctl, rsyslog, and log rotation strategies
  • Managed backups, restores, and disaster recovery procedures
  • Configured cron jobs for scheduled tasks and system maintenance
  • Ensured system hardening and compliance with organizational standards
  • Collaborated with SRE and application teams for deployments, troubleshooting, and operational support

Education

Bachelor of Computer Applications - Computer Application, Computer Engineering

Indira Gandhi National Open University
06.2015

Skills

  • Cloud Platforms: AWS services including EKS, VPC, IAM, EC2, Lambda, S3, API Gateway, RDS, CloudFront, and WAF
  • Infrastructure as Code (IaC): Terraform and Ansible for automated and repeatable infrastructure provisioning
  • CI/CD & Deployment: Jenkins, GitLab CI/CD, GitHub Actions, Argo CD, Argo Rollouts, AWS CodePipeline, and AWS CodeBuild
  • Containerization & Orchestration: Docker and Kubernetes (EKS, kubeadm, Minikube) with Helm for package management
  • Service Mesh & Networking: Istio, Kubernetes Ingress, ALB, NLB, Auto Scaling Groups (ASG)
  • Monitoring & Logging: Prometheus, Grafana, Loki, AWS CloudWatch, ELK Stack, Datadog, New Relic, and Site24x7
  • Security & Compliance: SonarQube, Veracode, Dockle, OWASP practices, WAF, IAM, DevSecOps, and Zero Trust security model
  • Scripting & Automation: Python, Shell, Bash scripting, and AWS CLI
  • Version Control & Collaboration: Git, GitHub, GitLab, JIRA, and Confluence
  • Databases: MySQL, PostgreSQL, MongoDB, and ScyllaDB
  • Artifact and image management: AWS CodeArtifact and Amazon ECR
  • AI-assisted Development & Ops: ChatGPT, Cursor, GitHub Copilot (code generation, troubleshooting, automation scripts)

Certification

AWS Certified DevOps Engineer - Professional

Languages

  • Bash
  • Python

Languages

English: Upper Intermediate (B2)

Accomplishments

Star Performer of the Quarter 2023
