Faisal Hussian Shah

Kupwara

Summary

Senior DevOps / Site Reliability Engineer with five years of experience building and operating automated, cloud-native platforms across AWS, GCP, and Azure. Strong hands-on expertise in Kubernetes, GitOps, service mesh, Terraform, and automation. Designed and scaled systems handling millions of requests per second, with a focus on availability, security, disaster recovery, observability, and cost efficiency. Experienced in multi-cloud architectures and GenAI platform enablement.

Overview

5 years of professional experience
1 Certification

Work History

Senior Site Reliability Engineer

Axelerant
Remote
01.2025 - Current
  • Designed and deployed multi-cluster Kubernetes infrastructure across AWS regions, ensuring high availability and zero-downtime deployments.
  • Replaced NGINX ingress with Istio service mesh, enabling mTLS, advanced L7 traffic routing, traffic shifting, and service-level observability.
  • Engineered GitOps-oriented CI/CD pipelines employing GitHub Actions, Helm, and ArgoCD, enabling blue-green and canary deployments alongside automated rollback mechanisms.
  • Implemented centralized observability using OpenTelemetry, Prometheus, Grafana, Loki, and SigNoz for metrics, logs, and tracing.
  • Established incident response and alerting workflows using PagerDuty, AWS SNS, and Slack, improving MTTR and SLO adherence.
  • Drove AWS cost reduction using Savings Plans, Karpenter, rightsizing, Spot Instances, KEDA, and Compute Optimizer.
  • Strengthened platform security by implementing IAM, RBAC, pod security standards, and secrets management with AWS Secrets Manager and HashiCorp Vault.
  • Performed RCA and postmortems, promoting continuous improvements in reliability.

DevOps Engineer

Meesho
Bangalore
03.2022 - 01.2025
  • Scaled and secured Azure OpenAI models, implementing secure access controls and private connectivity for production workloads.
  • Established private connectivity between GCP workloads and Azure OpenAI services using NAT and controlled egress, ensuring secure, low-latency cross-cloud communication.
  • Deployed and integrated AI voice platforms including Deepgram and Smallest.ai, supporting real-time speech-to-text and voice-driven applications.
  • Designed, operated, and scaled 20+ production Kubernetes clusters across AWS and GCP, supporting business-critical workloads.
  • Migrated the CDN from Cloudflare to Akamai, optimizing caching strategies, WAF rules, and edge performance to reduce latency and improve reliability.
  • Engineered infrastructure supporting millions of requests per second, ensuring high availability, fault tolerance, and multi-region disaster recovery.
  • Orchestrated transition of 300+ microservices from VM-based infrastructure to Kubernetes, boosting scalability and deployment velocity.
  • Executed large-scale AWS → GCP migration using Terraform with zero downtime.
  • Designed and implemented multi-region DR strategies, including backups, failover testing, and recovery runbooks.
  • Built and maintained end-to-end monitoring and alerting using Prometheus, Grafana, OpenTelemetry, CloudWatch, and centralized logging.
  • Designed secure cloud networking, including site-to-site VPNs, VPC peering, private connectivity, and controlled east-west traffic.
  • Implemented node pool–level and workload-level cost optimization, autoscaling strategies, and capacity planning.
  • Designed and enforced resource tagging strategies across AWS and GCP, enabling accurate cost attribution, chargebacks, and governance.
  • Reduced overall infrastructure cost by ~30% through tagging automation, rightsizing, reporting, and savings strategies.
  • Managed production databases and messaging systems (RDS, backups, DR, Kafka/MSK) with performance tuning and reliability controls.
  • Collaborated with security and platform teams to enforce IAM, RBAC, and network security best practices.
  • Played a key role in IPO readiness, ensuring infrastructure scalability, compliance, reliability, and cost visibility.

DevOps Engineer

SpeakX
Gurugram
08.2021 - 03.2022
  • Managed AWS infrastructure including EC2, RDS, IAM, and networking components.
  • Built and maintained CI/CD pipelines using Jenkins and Docker.
  • Implemented centralized logging and monitoring (ELK, metrics, alerts).
  • Worked closely with developers to improve performance, reliability, and cost efficiency.

CloudOps Engineer Intern

vpods.ai
Remote
01.2021 - 08.2021
  • Assisted in cloud infrastructure setup including IAM policies, load balancers, and networking.
  • Monitored system performance and supported operational reliability initiatives.
  • Helped track engineering expenses and evaluate infrastructure trade-offs.

Education

Master of Computer Applications (MCA) - Computer Engineering

Lovely Professional University
Jalandhar, Punjab
07-2021

Bachelor of Computer - Computer Engineering

University of Kashmir
Jammu and Kashmir
01-2019

High School Diploma

Jammu and Kashmir State Board of Education
Jammu and Kashmir

High School Diploma

Jammu and Kashmir State Board
Jammu and Kashmir

Skills

  • Cloud Platforms: AWS, GCP, Azure
  • Containers & Orchestration: Kubernetes, Docker, Helm
  • GitOps & Service Mesh: ArgoCD, Istio
  • Infrastructure as Code & Automation: Terraform, GitHub Actions, Jenkins, Atlantis
  • Observability & Monitoring: OpenTelemetry, Prometheus, Grafana, Loki, SigNoz, CloudWatch
  • Networking & Security: VPC, Load Balancers, VPN (Site-to-Site), VPC Peering, IAM, RBAC, mTLS, WAF
  • CDN & Edge: Akamai, Cloudflare
  • Databases & Messaging: Amazon RDS, Redis, Kafka (MSK)
  • Scripting & Automation: Python, Bash
  • Engineering Practices: SRE, GitOps, Blue-Green & Canary Deployments, Disaster Recovery (DR), Cost Optimization

Certification

AWS Certified Solutions Architect – Associate
