SIBA SANKAR SAMAL

Summary

  • 10+ years of experience in infrastructure development (IAS) and maintenance with AWS (Amazon Web Services), DevOps, MLOps, platform engineering, and SRE.
  • Architecting enterprise-grade AWS infrastructure, CI/CD ecosystems, and machine learning operations platforms. Deep expertise spans AWS DevOps, Infrastructure as Code (Terraform, CloudFormation), containerisation (Docker/Kubernetes), CI/CD automation (Jenkins, GitHub Actions, GitLab), and MLOps with SageMaker, MLflow.
  • Managed end-to-end DevOps lifecycle including infrastructure provisioning, CI/CD automation, deployment governance, monitoring, incident management, and platform optimization.
  • Implemented the AWS Cloud platform and its services, including EC2, VPC, EBS, AMI, SNS, RDS, CloudWatch, CloudTrail, CloudFormation, AWS Config, Auto Scaling, CloudFront, IAM, and S3.
  • Wrote templates for AWS infrastructure as code using Terraform to build staging and production environments. Drove infrastructure standardisation and reusable Terraform module development across environments.
  • Managed cross-functional DevOps and Platform Engineering teams to deliver secure, scalable, and highly available cloud infrastructure solutions by implementing CI/CD automation using GitHub Actions and Jenkins while ensuring operational excellence, governance, and SLA compliance.
  • Led technical design discussions, architecture reviews, and platform modernisation initiatives across AWS cloud environments.
  • Collaborated with enterprise architecture, security, compliance, and infrastructure teams to ensure platform alignment with governance standards.
  • Managed Agile sprint planning, technical backlog prioritization, and release coordination activities.
  • Maintained code quality in DevOps by implementing Git-based branching strategies, peer code reviews, automated testing, static code analysis, and vulnerability scanning with Wiz and Snyk through Jenkins and GitHub Actions.
  • Experienced in using configuration management tools such as Ansible to control changes to all servers across all environments.
  • Built Docker container images and deployed containers to the production environment.
  • Managed Docker orchestration and Docker containerisation using Kubernetes.
  • Designed and implemented Kubernetes RBAC policies across 50+ namespaces, improving security posture and ensuring least-privilege access controls.
  • Used Kubernetes to orchestrate the deployment, scaling and management of Docker Containers.
  • Architected and implemented Argo CD-based GitOps platform supporting 100+ microservices across development, staging, and production environments.
  • Implemented Prometheus and Grafana monitoring solutions for 100+ Kubernetes workloads across multiple environments.
  • Worked on Python scripting.
  • Technically proficient in Unix commands, shell scripting, and job scheduling on UNIX/Linux platforms.
  • Led enterprise production monitoring, incident management, and platform reliability operations across 24x7 mission-critical environments, ensuring high availability and SLA compliance.
  • Drove incident management, root cause analysis (RCA), and continuous improvement initiatives to enhance operational stability.
  • Directed deployment planning and release activities for production applications and infrastructure with minimal downtime and operational risk.
  • Collaborated with cross-functional engineering, infrastructure, and support teams to enhance operational stability, automation, and platform performance across enterprise environments.
  • Ability to interact effectively with clients, understanding their requirements and providing optimum solutions and support.
  • Monitored team performance, delivery KPIs, SLA adherence, and operational metrics to ensure high-quality service delivery.
  • Promoted automation-first culture by reducing manual operational effort through Terraform, Ansible, Docker, and Kubernetes automation.
  • Managed cross-functional offshore/onshore teams supporting enterprise-scale AWS cloud and platform engineering programs.
  • Provided production support for applications following ITIL practices while maintaining SLAs.

Overview

13 years of professional experience
1 Certification

Work History

Senior Platform Eng.

Commonwealth Bank of Australia
01.2023 - Current
  • Managed 500+ repositories and 2,000+ users across multiple GitHub Enterprise organizations.
  • Implemented GitHub Actions pipelines that decreased deployment duration by 60%.
  • Improved security compliance by 95% through organization-wide branch protection and security scanning policies.
  • Migrated 1,000+ repositories from legacy SCM platforms to GitHub Enterprise with zero critical downtime.
  • Integrated GitHub with enterprise identity providers (SSO) for centralized authentication and authorization.
  • Automated provisioning of SageMaker environments using Infrastructure-as-Code (IaC) tools like Terraform and AWS CloudFormation.
  • Architected and deployed infrastructure using Terraform and CloudFormation on AWS, automating deployment processes and reducing manual configuration errors by 40%.
  • Architected and developed an MCP server integration for the ML-Vigil framework, which is designed to continuously track and evaluate the performance of machine learning models after deployment.
  • Built MCP infrastructure integrating LLMs (Claude 3.5 Sonnet, GPT-4) with enterprise-grade guardrails to ensure safe and governed AI usage.
  • Worked on various POCs for the platform and later implemented them in non-production and then production.
  • Developed features for the platform and integrated EMR and EKS clusters.
  • Worked on training various AI/ML models using metadata.
  • Handled all BAU activities for the models deployed in AWS.
  • Developed various AWS-based utility stores used by data scientists.
  • The Commonwealth Bank of Australia (CBA), or CommBank, is an Australian multinational bank with businesses across New Zealand, Asia, the United States, and the United Kingdom. It provides a variety of financial services, including retail, business and institutional banking, funds management, superannuation, insurance, investment, and broking services. The Commonwealth Bank is the largest Australian listed company on the Australian Securities Exchange, with brands including Bankwest, Colonial First State Investments, ASB Bank (New Zealand), Commonwealth Securities (CommSec) and Commonwealth Insurance (CommInsure).

Technology Lead/AVP

NatWest group
01.2022 - 12.2022
  • Developed an efficient platform for data science users, called Kepler, using Terraform and CloudFormation in the AWS Cloud.
  • Developed various Service Catalog products using CloudFormation for the platform so that data science users can launch them to build, train, and deploy models as per use-case requirements.
  • Implemented various cloud-native services for the platform, such as EMR, Athena, and Glue, enabling data science users to prepare models for their AI/ML implementations.
  • Managed various environments for the product, such as dev, test, and prod, and ensured smooth code promotion.
  • Onboarded a multi-model inference product and implemented Airflow to orchestrate multiple pipelines.
  • Implemented real-time inference and batch inference in the Kepler environment.
  • Prepared various system diagrams and implemented them in the Kepler environment.
  • Collected requirements from the data science team and implemented them in the Kepler environment.
  • Created various Docker images to bring AI/ML monitoring tools such as Arthur and Comet into the platform.
  • Created various ADFS roles and policies to grant and restrict access to services.
  • Collected all non-compliance rules and fixed them for the platform.
  • Worked using the Agile Scrum method.
  • Worked on code promotion and releases for the platform.
  • NatWest Group plc is a British banking and insurance holding company, based in Edinburgh, Scotland. The group operates a wide variety of banking brands offering personal and business banking, private banking, investment banking, insurance and corporate finance.

Snr. Associate

AT&T
03.2020 - 01.2022
  • Managed and improved build systems and milestone builds across multiple streams of development and assisted developers with the timely resolution of any build failures.
  • Coordinated with the project management, development, and QA teams to resolve configuration and deployment issues and provide a smooth release process.
  • Monitored the required queues for incoming build requests, took the necessary action to resolve each request, and troubleshot build-related issues.
  • Created servers, AMIs, storage in S3, snapshots, VPCs, subnets, load balancing, and auto-scaling in AWS.
  • Played a key role in automating the deployments on AWS using GitHub, Terraform, Ansible and Jenkins.
  • Created a Continuous Delivery process that supports building Docker images and publishing them to Docker Hub.
  • Managed containers with Docker by writing Dockerfiles and setting up automated builds on Docker Hub; installed and configured Kubernetes.
  • Deployed microservices using Kubernetes and Istio.
  • Designed and implemented a Continuous Integration (CI) system: configured Jenkins servers and Jenkins nodes.
  • Deployed various J2EE and PHP applications.
  • Coordinated post-deployment support and delivery of critical fixes.
  • Service requests: handled tickets regarding infrastructure changes, such as increases in memory, hard disk, and number of CPUs, as well as security issues.
  • Change management: created change tickets for infrastructure changes and presented them to the client change management team.
  • Implemented changes to resolve incidents.
  • Problem management: resolved problem tickets based on severity with justified root cause analysis.
  • AT&T Inc. is an American multinational conglomerate holding company, Delaware-registered. It is the world's largest telecommunications company, and the second largest provider of mobile telephone services.

Technology Analyst

Infosys
02.2017 - 03.2020
  • Handling client calls.
  • Attending the daily stand-up calls and facilitating the resolution of issues.
  • Monitoring the daily batches in all the regions: Tokyo, London and New York.
  • Working on the alerts and CDR breaches.
  • Analysing job failures, checking for SLA violations, and escalating them to the appropriate team.
  • Performing sanity checks of the application every Monday before business hours.
  • Checking for database locks, if any.
  • Creating or updating the runbook when an issue is not already known.
  • Giving an appropriate handover to the next shift for any outstanding issues that need priority attention.
  • Checking the logs daily before the start of business hours.
  • Coordinating post-deployment support and delivery of critical fixes.
  • Preparing the User Guide for the application.

Senior Associate Consultant

NTT DATA
11.2014 - 01.2017
  • The project deals with reporting applications, including RDS (Reporting Distribution System), RGS (Reporting Generation System), Comet, REX (Report Explorer), Piper, etc.
  • We provide infrastructure support and application support for these products.
  • These products are hosted on Linux and Windows servers.
  • For infrastructure support, a monitoring system raises alerts, which we resolve following standard procedures based on their severity and impact.
  • For application-related issues, users can raise tickets via the ticketing tool based on severity and impact.
  • For all user-escalated tickets, we perform L2 investigation following our Kw and escalate to the concerned team if needed.
  • As part of L2, we host outage calls, send out communications to users, and get the relevant teams involved (e.g., DBAU, storage ops).
  • We also work on incident, problem, and change management.
  • We also perform impact analysis for DBAU and storage work to check whether our applications are impacted.
  • We prepare runbooks for weekend work related to TCM, power-down activities, etc., and get appropriate approvals before executing them.

Trainee

Vidal HealthCare TPA Pvt Ltd
12.2013 - 10.2014
  • Provided timely acknowledgements to policyholders on claim process status, serving a diverse range of policies including corporate, individual, and non-corporate.
  • Maintained comprehensive databases of policyholder and policy details, issued identity cards with unique identification numbers, and handled post-policy issues, including claim settlements.
  • For every hospitalization, the policyholder is made aware of whether the planned treatment is covered under the policy.
  • If covered, the policyholder can use the cashless facility at any hospital empanelled with the TPA without paying a single rupee.
  • A Third Party Administrator (TPA) provides services to health insurance policyholders across the country.

Education

B-Tech - Computer Sc. Engg

BPUT
01-2012

Skills

  • AWS DevOps & IaC
  • AWS (EC2, VPC, S3, RDS)
  • Terraform & CloudFormation
  • Ansible & Configuration Management
  • AWS Cost Optimization
  • CI/CD & Orchestration
  • Jenkins & Jenkins Pipeline
  • GitHub Actions
  • GitLab CI/CD
  • Docker & Kubernetes
  • Prometheus
  • Grafana
  • Observe

Certification

AWS Certified Solutions Architect - Associate, 2025-01-07, 2028-01-07

Custom

To serve the organization as an efficient DevOps professional, contributing my acquired technical skills and further enhancing my technical abilities through continuous learning that stimulates professional and personal growth.
