SAURABH SINHA

Bangalore

Summary

Accomplished Senior Machine Learning Engineer with 11 years of overall experience, currently working at Tata Consultancy Services. Skilled in architecting MLOps pipelines and optimizing cloud resources. Proven track record of deploying high-quality models with CI/CD practices while mentoring teams to improve collaboration. Expertise in Python and a strong commitment to continuous improvement drive impactful results in machine learning projects.

Overview

11 years of professional experience
1 Certification

Work History

Senior Machine Learning Engineer

Tata Consultancy Services
Bangalore
10.2022 - Current

Implemented an end-to-end MLOps pipeline using Azure ML CLI v2 and Azure DevOps for environmental sensor anomaly detection

Designed CI/CD pipelines for continuous integration and deployment of ML models using Azure DevOps YAML pipelines. Implemented quality gates and testing frameworks for ML code and model validation.

Optimized Compute Resources: Implemented dynamic compute scaling in Azure ML clusters with min-max instance configuration, balancing performance needs with cost efficiency during model training.

Enabled Continuous Improvement: Built MLflow-integrated pipelines that automatically track experiments, metrics, and model artifacts, facilitating model governance and performance analysis.

Maximized Reproducibility: Implemented versioned environments using Conda specifications and Docker containers, ensuring consistent execution across development and production.

Streamlined Deployment Workflows: Created automated deployment pipelines with traffic allocation controls, enabling safe model rollouts with zero-downtime updates to production endpoints.

Azure Resources Utilized: Azure ML Studio, Compute Cluster, Online Endpoint, ACR, Blob Storage, Key Vault, DevOps, Pipeline, etc.

Built High-Performing Teams: Cultivated a DevOps mindset within engineering teams, mentoring and guiding junior engineers to adopt best practices in MLOps.

Implemented an end-to-end MLOps solution using AWS CDK and GitHub Actions that automated the full machine learning lifecycle

Leveraged AWS SageMaker for model training, evaluation, and registry, while implementing CI/CD pipelines through GitHub Actions workflows triggered by AWS Lambda functions.

Designed a template-based approach with standardized project structures for build and deploy repositories, enabling consistent ML workflows.

Incorporated infrastructure-as-code principles using AWS CDK (Python) to provision cloud resources including SageMaker endpoints, IAM roles, and S3 buckets.

Utilized XGBoost for model training with automated quality gates that conditionally register models to the SageMaker Model Registry.
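
A quality gate of the kind described above can be sketched in a few lines of Python. This is a minimal illustration only: the metric names, the threshold values, and the function name passes_quality_gate are assumptions made for the example, not details of the actual project, and the SageMaker registration call itself is left out.

```python
from typing import Mapping, Optional

# Hypothetical minimum scores; the real gate criteria are not stated in this document.
DEFAULT_THRESHOLDS = {"auc": 0.80, "f1": 0.75}


def passes_quality_gate(
    metrics: Mapping[str, float],
    thresholds: Optional[Mapping[str, float]] = None,
) -> bool:
    """Return True only if every required metric is present and meets its minimum."""
    required = DEFAULT_THRESHOLDS if thresholds is None else thresholds
    return all(
        name in metrics and metrics[name] >= minimum
        for name, minimum in required.items()
    )


if __name__ == "__main__":
    evaluation = {"auc": 0.91, "f1": 0.80}  # illustrative evaluation output
    if passes_quality_gate(evaluation):
        print("Gate passed: the model would be registered.")
    else:
        print("Gate failed: registration is skipped.")
```

A missing metric is treated as a failure, so an incomplete evaluation report cannot slip a model into the registry.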

Productionized MLflow tracking and deployment server on AWS.

Objective: deploy and host an MLflow dashboard with a backend (tracking server, database, artifact store), integrate it with SageMaker for model deployment, and give users and developers secure access through AWS infrastructure.

Infra Provisioning: Provisioned infrastructure via CloudFormation to deploy an MLflow dashboard for monitoring model metrics across multiple projects through a single browser access point. Used the CDK toolkit to deploy the stack (ECS, ECR, VPC, ELB, S3, RDS, etc.) as Infrastructure as Code. Designed a serverless MLflow deployment on AWS Fargate with auto-scaling capabilities.

Secure Storage: Created an S3 bucket for MLflow artifact storage with appropriate access controls.

ML Workflow: Tracked experiment runs in SageMaker using MLflow.

Model Deployment: Created a deployment pipeline from MLflow to SageMaker endpoints. Added a model registry workflow for versioning and promotion, and configured SageMaker endpoints with auto-scaling for production traffic.

Security: The infrastructure is backed by a secure VPC architecture with public, private, and isolated subnets, a load balancer for high availability and fault tolerance, and security groups.

Enhanced Collaboration: Provided visibility into experiment runs across teams and standardized the workflow, reducing dependency on individual team members.

Automated Risk Classification System Using NLP for HSSE Compliance.

Developed an ML-powered solution to automate the classification of HSSE-related incidents using textual data, improving consistency and response time across the QGC production site.

Trained multiple NLP models (TF-IDF + Logistic Regression, BERT, etc.) and selected the best model based on evaluation metrics.

Integrated the deployed model with Power BI to expose predictions and enable real-time risk dashboards for stakeholders.

Software Engineer - 2

Concentrix – Convergys India Services Private Limited
02.2021 - Current

Collaborated with the team to enhance the machine learning pipeline.

  • Developed features per requirements using deep learning and machine learning techniques (YOLO and XGBoost), achieving 86% efficiency.
  • Well versed in ML pipelines, integration of ML code, and model quantization using ONNX for better efficiency.
  • CI/CD pipeline: deployed models and code to Dev, QA, and Production.

Model Improvement and Evaluation for Object Detection Using YOLOv4.

  • The goal of the project was to detect violations by agents working from home through face detection, facial authentication, and object detection, and to raise a ticket to security for further investigation.
  • Collected data and images from different sources and wrote a Python script to convert video to frames using hashing techniques.
  • Evaluation Metrics: TP, FP, FN, F1-Score.
  • Model Performance: Developed a model with 97% precision and 98% recall, with 100 TP, 5 FP, and 2 FN images.
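
For reference, the evaluation metrics listed above follow directly from the confusion counts. The sketch below shows the standard formulas; the function name classification_metrics and the counts in the usage lines are illustrative assumptions, not figures from the project.

```python
def classification_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    # Guard against division by zero when there are no predicted or no actual positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Illustrative counts only.
    scores = classification_metrics(tp=90, fp=10, fn=30)
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```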

Senior Technical Analyst

Cerner HealthCare India Private Limited
02.2015 - Current

Predictive modelling on Clinical EHR data

  • Preprocessing, statistical analysis, and feature engineering. Created visualizations in Python.
  • Developed models with hyperparameter tuning; the best-performing model achieved an AUC of 0.66 and caught 60% of readmissions.

Early Incident Detection

  • Aimed to eliminate client outages by identifying parameter and instance combinations using ML.
  • Collected, pre-processed, and analyzed data from different sources, including SQL Server and MS Excel.
  • Data mining and data cleansing: deduplication, record matching, and column segmentation.
  • Logistic Regression gave the best result compared with SVM and Random Forest.
  • Worked with the DevOps team to deploy the model on an AWS EC2 instance.

EM Metrics Analysis

  • The goal was to reduce SEV 1 and SEV 2 MTTR misses. Cerner has a client agreement to provide 90% MTTR.
  • Responsibilities:
  • Extracted historical data and analyzed it.
  • Optimized data collection procedures and generated weekly, monthly, and quarterly reports based on the metrics.
  • Presented data and conclusions to the team to improve strategies and operations.

System Engineer

Siemens India Private Limited
03.2014 - 02.2015

Remote Infrastructure Management System

  • Responsibilities:
  • Failed over network traffic from primary to secondary and back to primary as required, with the help of our service provider, AT&T.
  • Monitored SQL job activity and maintained the jobs responsible for OLAP and OLTP transactions.
  • Ensured the client network stayed up and running without congestion by monitoring the environment and acting proactively on any anomalies.

Education

Master in Computer Application

Birla Institute of Technology
Mesra, India
01.2013

Skills

  • Programming languages: Python, R
  • ML libraries: NumPy, Matplotlib, Seaborn, Scikit-learn, XGBoost, SHAP (model explainability), TensorFlow, Keras, PyTorch
  • Data Preprocessing: Pandas, SQL
  • Automation, Deployment & Version Control: Git, DVC, GitHub, GitHub Actions, GitLab, CI/CD
  • MLOps tools: MLflow, Docker, Kubernetes, Jenkins, Feature Store
  • Cloud Platform: Azure (Azure DevOps, Azure ML Studio, Compute, Key Vault, Azure Blob Storage), AWS (SageMaker, EC2, S3); cloud resource management and cost optimization
  • Data Science: EDA, Feature Engineering, Model Development & Evaluation, Deployment, Labelling
  • Collaboration and Communication: Mentoring and guiding technical teams, fostering collaborative problem-solving, and promoting knowledge exchange
  • Operating Systems & Developer Tools: Linux (Ubuntu), Visual Studio Code, Jupyter Notebook, PyCharm
  • Data Visualization & Reporting: Power BI, Azure ADX; interactive dashboard creation and advanced data visualization

Educational Qualification And Certification

  • Master in Computer Application, Birla Institute of Technology, Mesra, Jharkhand, 08/01/10 - 05/31/13
  • Bachelor’s in Mathematics, St. Columba’s College, Hazaribagh, Jharkhand, 08/01/06 - 05/31/09
  • Certification in Machine Learning from Applied AI
  • Specialization Course in MLOps

Certification

  • Machine Learning Specialist from Applied AI
  • Specialization Course in MLOps from Psitron Tech.
  • Microsoft Azure Data Scientist Associate from Udemy
  • Certified at GenAI E0 and E1 levels by the TCS AI practice team.
