Yash Singh

Lead Data Scientist
Nagpur

Summary

With 8.6 years of experience in machine learning and analytics, I specialize in GenAI and running pretrained models. My expertise includes building end-to-end AI/ML solutions for applications such as sentiment analytics, NLP using NLTK, supervised and unsupervised machine learning, and data analytics. I am proficient in algorithms such as linear regression, k-NN, decision trees, random forest, k-means clustering, and XGBoost, and in GenAI techniques and tools including RAG models, GPT-2, prompt engineering, agentic workflows, LangChain, LangSmith, and more.

Overview

9 years of professional experience
4 years of post-secondary education
4 Certifications

Work History

Data Science Team Lead

NTT DATA
Pune
08.2024 - Current
  • Enhancing the features of AWS Bedrock: identifying the limitations of the AWS Bedrock service
  • Built the entire RAG architecture on AWS-native services
  • Enhancing AWS Bedrock features such as custom data chunking, metadata extraction, text parsing, and sync functionality
  • Using the AWS SDK for Python to build custom RAG models
  • Implementing an agentic workflow in the RAG use case
  • Technologies used: AWS Bedrock, AWS Lambda, AWS Textract, AWS EventBridge, AWS EKS, AWS S3, AWS SageMaker, Docker, Kubernetes, LangChain, LangSmith, OpenSearch vector database, FAISS, Azure AKS, Terraform scripts, etc.

Data Science Team Lead / Consultant

Accenture
Pune
07.2021 - 08.2024
  • Question-and-answer chatbot for a client: built the chatbot using RAG and an LLM
  • Tested the LLM and RAG output using the RAGAS framework
  • Used OpenAI GPT-4o, prompt engineering, and an agentic workflow
  • Job description and profile development for AmerisourceBergen (healthcare client): created job profiles for different job roles
  • Performed data wrangling and text data cleaning
  • Created NLP models using transformer LLMs (BERT, BART), NER, summarization, paraphrasing, IOB tagging, etc. Libraries used: NLTK, Transformers, scikit-learn, PyTorch, Word2Vec, etc.
  • Created the summarized responsibilities and summary for different job profiles
  • Automated the model-building and summarization process
  • Handled a team of 3 members
  • Data lake and data warehouse for McDonald's (ETL/data engineering role): extracted the client's data from the URL using an API and AWS services (EventBridge, SNS, SQS, Lambda, Step Functions, and Glue)
  • Built the whole project using a CI/CD pipeline (Jenkins and CodeCommit)
  • Saved the extracted data in an S3 bucket and transformed it from there using Python, SQL, and PySpark
  • Saved the transformed data in AWS Redshift, where it feeds the Power BI report
  • Handled a team of 6 members
  • Customer opportunity identification for one of the largest US-based jewelry companies: handled big data (232 million customers) using PySpark, Python, and SQL
  • Created ML models using PySpark (MLlib) and Python to identify high-potential customers
  • Created the Customer Analytical Report (CAR) along with derived variables
  • Automated the model-building and scoring processes using AWS (Glue and Lambda)

Data Science Team Lead

Tata Motors
Pune
04.2018 - 07.2021
  • Optimized existing predictive models through thorough testing and evaluation, leading to increased accuracy rates.
  • Enhanced data analysis efficiency by implementing new machine learning algorithms and techniques.
  • Implemented a standardized project management framework to ensure consistent delivery of high-quality data science solutions across all team initiatives.
  • Mentored junior data scientists, providing guidance in technical skills development and career growth opportunities.
  • Worked as Team Lead at Tata Motors and handled AI/ML-related projects
    Key responsibilities included partnering with business teams and key stakeholders to understand business requirements and helping them define the roadmap for data analytics initiatives, enabling better data-driven decisions through data science capabilities and end-to-end ML solution implementation
    Identifying high-potential customers for TML vehicles. Objective: to identify leads generated at the dealer's end and the customers who are more likely to purchase a vehicle, including commercial and passenger vehicles
    The old model covered only a few dealers in different regions and BS4 models; our objective was to create a new model for BS6 vehicles at the pan-India level
    Data points were collected from the CRM to prepare the training dataset
    Random forest was used for model preparation
    Sentiment analysis on Twitter data. Objective: to determine people's sentiment about TML and its competitors on different topics (brand related, service quality, vehicle performance, stocks, workshop performance (pre-sales and post-sales), and miscellaneous) on a daily, weekly, and quarterly basis
    Collected and stored daily tweets using the Twitter API and the Get Old Tweets (GOT) library
    Once the data was downloaded, data preprocessing and cleaning were done (tokenization and noise removal: lemmatization, stemming, stop-word removal) using the NLTK library

Analyst

Edulocus Consultant
Nagpur
04.2016 - 11.2017
  • Worked as a data analyst, handling sales data, performance forecasting, advanced analytics, etc.

Education

B.E - Electronics and Telecommunication

RTMNU
Nagpur
01.2012 - 01.2016

12th Standard

KV Ambajhari

10th Standard

KV Ambajhari

Skills

  • Machine Learning

  • Python

  • PySpark

  • Big Data

  • AWS/Azure

  • Data Engineering

  • CI/CD

  • MLOps

  • SQL

  • Architecture Design

  • Generative AI (GenAI)

Certification

Certificate in Data Science Masters from iNeuron Technologies (Physics Wala)
