Results-driven Senior Site Reliability Engineer with a passion for leveraging emerging technologies to drive operational excellence. Seeking to apply and enhance my expertise in a dynamic environment where I can execute innovative strategies that align with both organizational goals and personal growth. Eager to contribute to impactful projects while continuously advancing skills in cloud automation, DevOps practices, and team leadership.
Professional Summary
Results-oriented Senior Site Reliability Engineer with over 8 years of experience building fault-tolerant, highly available, scalable, and automated solutions. Proven expertise in leading platform teams, driving cloud migrations, and optimizing CI/CD pipelines. Demonstrated success in mentoring junior engineers, implementing HA/DR strategies, and reducing infrastructure costs. Skilled in leveraging cloud platforms such as AWS, automating deployments, and enhancing monitoring for high-traffic environments. Adept at collaborating across teams to troubleshoot and resolve issues and to deliver robust, resilient systems. Passionate about troubleshooting and root-cause analysis, and regularly relied on by the platform team to resolve complex problems.
Core Competencies
• Cloud Platforms: AWS (EC2, S3, Lambda, CloudFormation, CDK, ECS)
• CI/CD & DevOps: Jenkins, Nexus, Chef, Docker, Kubernetes
• Programming Languages: Python, Ruby, TypeScript
• Infrastructure as Code: CDK, Terraform
• Monitoring & Incident Management: Datadog
• Leadership & Mentoring: Team Management, Training, and Onboarding
Professional Experience
Senior Site Reliability Engineer
Cvent, India | 2019 – Present
• Led and owned the Build and Deploy Platform team, managing all deliverables, including Nexus management and CI/CD pipeline enhancements.
• Implemented Disaster Recovery (DR) and High Availability (HA) strategies for Nexus Pro in staging environments, resulting in improved service reliability.
• Migrated Nexus from a management account in us-east-1 to a dedicated CI/CD account in us-east-2, improving resource usage and security.
• Automated Jenkins patching processes to improve efficiency, and supported team members during resource constraints.
• Spearheaded the sharding of Chef Nexus, reducing dependencies on the main Nexus and ensuring seamless infrastructure builds.
• Delivered over 30% cost savings through optimization of Nexus instance types, reduction in S3 storage costs, and refinement of backup strategies.
• Mentored and onboarded junior engineers, leading to successful project deliveries and consistent support for Nexus and Jenkins.
• Developed and implemented SQL linting, Sonar code coverage, and static code analysis for .NET pipelines, driving significant quality improvements.
• Proactively resolved escalations, documented best practices, and contributed to the overall success of key projects.
Key Achievements:
• Reduced artifact-specific issues on npm/Git packages by 90% by resolving proxy-related errors and implementing routing rules.
• Enhanced platform observability by migrating Nexus logs to Datadog and improving SLI metrics for critical repositories.
• Managed urgent and complex tasks, including the Jifflenow Legacy Account Migration, delivering high-quality outcomes under stringent deadlines.
Additional Responsibilities and Projects:
• Managed Cvent's Conference product applications, ensuring high availability and implementing comprehensive monitoring solutions.
• Troubleshot and resolved production issues across various environments, collaborating with multiple teams for efficient solutions.
• Set up DR environments for Conference products in AWS and managed cross-account microservices migration in production.
• Built the architecture plan for migrating the Silos infrastructure from on-premises to AWS.
• Co-owned the Nexus platform, focusing on patching, automation, and monitoring to enhance functionality and efficiency.
• Set up Datadog monitoring for Conference applications and deployed ECS clusters with Auto-Scaling Groups for optimal scaling and management.
• Designed and managed the deployment lifecycle for lower, pre-prod, and production environments using Octopus Deploy.
• Automated Jenkins pipelines and workflows, utilizing Bitbucket integrations for streamlined CI/CD processes.
• Configured and managed AWS services including ECS, ASG, API Gateway, Route 53, CloudFront, and Load Balancers.
• Implemented a self-healing automation process for Jenkins EC2 instances using a Ruby-based bot that responds to resource-crunch alerts automatically.
Professional Experience
Client: US Foods, USA
US Foods is one of America’s largest food companies and leading foodservice distributors, partnering with approximately 250,000 restaurants and foodservice operators. With nearly 25,000 employees and over 60 locations, they provide customers with a broad, innovative food offering and a comprehensive suite of e-commerce technology and business solutions.
Role: Consultant (Infrastructure as Code & Configuration Management)
Key Responsibilities:
• Infrastructure as Code (IaC) Implementation: Developed and automated infrastructure setup using Chef, enabling consistent and scalable deployments.
• Chef Server Setup: Installed and configured both public and private Chef servers, and synchronized them with Chef nodes using knife configuration for seamless management.
• Chef Workstation Configuration: Installed and configured Chef Workstation (Development Kit) on local machines (Windows, Ubuntu) and AWS EC2 instances, running in local mode for efficient testing without relying on a central Chef server.
• Chef Node Management: Bootstrapped Chef nodes using knife configuration from ChefDK, ensuring synchronization with Chef servers and applying necessary configuration policies.
• Cookbook & Recipe Management: Created, managed, and uploaded cookbooks to the Chef server, defining roles and policies to automate infrastructure provisioning and configuration.
Client: AT&T Inc., USA
AT&T is the largest provider of fixed telephone services and the second-largest mobile telephone provider in the USA, also offering broadband subscription TV services through DirecTV. The IAB project involves end-to-end application support for AT&T’s Myatt/MyWorld applications across various platforms including desktop (OLAM), mobile (DSS), SMB, and backend services.
Role: Software Engineer (Application & Environment Support)
Key Responsibilities:
• Installed and configured WebLogic servers and deployed applications (EAR, WAR, JAR) via Jenkins and manual processes.
• Integrated web servers with WebLogic application servers and managed deployments across multiple environments.
• Provided 24/7 support for UAT, development, and production environments, including incident resolution and test environment deployments.
• Automated critical tasks using shell scripts, optimizing performance and reducing manual interventions.
• Coordinated and executed hot-fix, minor, and major release activities, ensuring smooth deployments with minimal downtime.
• Led knowledge transfer initiatives, stabilizing support services with detailed documentation and well-defined processes.
• Planned and coordinated non-production activities, audits, and compliance tasks while supporting POCs for new technologies.
• Regularly reported status updates and task prioritization plans to management, ensuring transparency and effective resource allocation.
Datadog Monitoring: Set up comprehensive monitoring for Conference applications
AWS ECS & Auto-Scaling: Deployed and managed ECS clusters with auto-scaling for high availability
Deployment Automation: Designed and implemented CI/CD pipelines across environments using Octopus Deploy
Production Support: Proactively resolved production issues and optimized performance
Nexus Administration: Managed artifact repositories, ensuring seamless deployments
Jenkins & CI/CD: Led Jenkins setup, including master-slave configurations, plugin management, job automation, and pipelines with Bitbucket integration
Infrastructure as Code (IaC): Automated infrastructure with Chef (server, workstation, nodes), Terraform, and Docker containerization
AWS Configurations: Set up instances, security groups, S3, API Gateway, Route 53, CloudFront, and load balancers
Scripting & Automation: Developed utilities and automation scripts using Python, Shell, and Groovy
Build & Release Management: Expertise in Maven builds, Git branching strategies, and automated deployments
Technical Stack: Jenkins, Chef, Docker, AWS, GitHub, Nexus, Datadog, WebLogic, Python, Shell Scripting