Vaibhav Shrivastav

Pune

Summary

Lead Data Scientist and Solutions Architect with 11+ years of end-to-end experience delivering enterprise-scale AI, Machine Learning, Deep Learning, and Advanced Analytics solutions across Banking & Finance, Healthcare, Manufacturing, Sales, Recommendation Systems, Fraud Analytics, Anomaly Detection, Time Series Forecasting, Decision Support Systems, Optimization, NLP, and Computer Vision domains.

Also worked onsite as Lead Data Scientist at Avangrid, New York, driving high-impact data science initiatives, leading cross-functional and global teams, and translating complex business problems into scalable, production-ready AI solutions. Proven ability to bridge data science, architecture, and business strategy to deliver measurable outcomes.

Possess 1.5+ years of hands-on experience in Generative AI and Agentic AI, including large language models (LLMs), intelligent agents, retrieval-augmented generation (RAG), attention-based architectures, and AI-driven automation to enhance decision-making and operational efficiency.

Highly skilled in Python-based AI/ML development and API design, with strong expertise in Flask, Kubernetes, MLOps, and cloud-native deployments on Azure, enabling robust, scalable, and production-grade AI platforms. Extensive experience designing and deploying supervised, unsupervised, deep learning, and reinforcement learning models using TensorFlow, Keras, PySpark, Scikit-Learn, NLP libraries, and Computer Vision frameworks.

Strong background in data engineering, data warehousing, and big data analytics, leveraging SQL, Spark, Hadoop, Hive, and cloud services to manage structured and unstructured data. Adept at advanced modeling techniques including CNNs, RNNs, LSTM/Bi-LSTM, autoencoders, BERT, topic modeling, dimensionality reduction, clustering, classification, and regression models.

Recognized for technical leadership, excellent communication, and stakeholder management, with extensive experience working onsite and offshore. Proven track record of mentoring teams, driving technical delivery, building analytical dashboards (Power BI, Qlik Sense), and delivering data-driven insights that support revenue growth and strategic decision-making.

Work Authorization: Currently on H-1B visa, authorized to work in the United States.

Work History

Lead Data Scientist

Wipro Technologies LTD
Pune
10.2025 - Current

Banking Domain Chatbot using Gen AI and LLM

• Developed an intelligent chatbot powered by Gen AI and LLMs for the banking domain. The chatbot was fine-tuned with domain-specific data, including banking policies, loan documents, and general banking FAQs, to accurately understand and respond to user queries in a conversational manner.

• Technologies Used: Llama 2, GPT-4, Python, PyTorch, Hugging Face Transformers, Flask (API integration), Docker, Kubernetes, Azure Cloud, NLP.

Lead Data Scientist

Wipro Technologies LTD
Pune
07.2021 - Current
  • Developed machine learning models for fraud analysis across BOA transaction types (bill payment, ACH, wire, P2P). Evaluated model performance on new datasets using visualization plots such as AUC-ROC curves, and presented comprehensive reports to BOA clients.
  • Implemented real-time monitoring to detect anomalies in the features used by the ML alert model, leveraging PySpark and descriptive modeling techniques.
  • Currently converting the Python encoding scripts into PySpark scripts for all BOA fraud ML models, enabling deployment to the production server via Git.
  • As Data Scientist Tech Lead, handled end-to-end project management, including development, support, and team leadership, using AI, ML, Python, PySpark, Hive, ML visualization plots, and descriptive modeling.
  • Managed the entire application development lifecycle using Python, PySpark, Hadoop, and Hive. Oversaw all data science activities, delegated tasks within the data science team, and managed production activities in the client environment.

Lead Data Scientist

Wipro Technologies LTD
Pune
07.2021 - Current
  • Developed a chatbot that gives users information about their account details, loans, mortgages, etc.
  • MS Azure-based chatbot that handles multiple user intents and free-form questions.
  • Chatbot built using LUIS, App Service, bot channel, Application Insights, and QnA Maker from MS Azure cloud services.
  • Chatbot designed with different conversation flows per tenant ID.
  • Deployed in production for Comerica Bank on MS Azure Cloud.
  • Developed the admin dashboard using the Azure Application Insights service.
  • Role - Data Scientist Tech Lead, Developer, Support, Team Handling.
  • Technology - Data Science using Python, Azure Cloud Services, Git, ML libraries, Flask API, Docker deployment, and standard Python packages.
  • Responsibilities - Handled and managed the end-to-end project along with the team.
  • Complete architecture design and application development using Python.
  • Handled UAT, production deployment, and support.

Lead Data Scientist

Wipro Technologies LTD
Binghamton
03.2025 - 12.2025

Obligation Extraction from Contract Data using GPT-4.0

• Developed a GenAI-powered obligation extraction solution using GPT-4.0 to identify contractual obligations, responsibilities, and conditions from contract documents.

• Applied domain-specific prompt engineering and NLP techniques to extract structured obligation data from unstructured legal text.

• Designed structured outputs (JSON-based responses) to enable downstream integration with compliance, workflow, and reporting systems.

• Technologies Used: GPT-4.0, Gen AI, Python, NLP, Prompt Engineering, Flask API, Azure Cloud.

Data Science Lead

Wipro Technologies LTD
Binghamton
03.2025 - 08.2025

AI-Powered Customer Default Prediction and Risk Segmentation.

• Built an end-to-end ML model to predict customer default risk using transactional and behavioral data, enabling early risk identification.

• Implemented risk segmentation (Low/Medium/High) and monitored model performance using AUC-ROC, KS, and feature stability metrics (PSI/FSI).

• Enabled early warning signals for business teams to support collections strategy, credit risk management, and loss prevention.

• Designed the solution for production scalability, supporting proactive credit risk and collections strategies.

• Technologies Used: Python, ML, SQL Server, Scikit-learn, XGBoost, Feature Engineering, model monitoring, Azure Cloud.
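
The segmentation and stability monitoring described above can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the band thresholds (0.2 and 0.6), the bin count, and the sample data are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
from typing import List, Sequence


def risk_band(p_default: float, low: float = 0.2, high: float = 0.6) -> str:
    """Map a predicted default probability to a band (thresholds are hypothetical)."""
    if p_default < low:
        return "Low"
    if p_default < high:
        return "Medium"
    return "High"


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the baseline range; fall back to 1.0 if the range is zero.
    width = (hi - lo) / bins or 1.0

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp so out-of-range values land in the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


if __name__ == "__main__":
    baseline = [i / 100 for i in range(100)]   # made-up training-time feature values
    shifted = [v + 0.3 for v in baseline]      # made-up drifted production values
    print(risk_band(0.45), round(psi(baseline, shifted), 3))
```

A PSI of zero means the two distributions fall into the bins identically; larger values indicate more drift. The cut-off at which drift warrants retraining is a project decision and is not stated in this document.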

Data Scientist/AI/ML

Capgemini Technology Services
Pune
11.2021 - 07.2021
  • Extracted AI/ML-based insights from unstructured incident data to find the most frequently occurring incidents, published the results on a Qlik Sense dashboard server, and automated the process.
  • Role - Data Scientist, Developer.
  • Technology - Python, NLP, Cloud SQL, BigQuery, Data Mining, Excel, Multiprocessing, Flask API, Qlik Sense, ICP, GCP, Airflow.
  • Responsibilities - Handled the end-to-end project from requirement gathering to production deployment.
  • Handled all client discussions.
  • Developed NLP-based similarity algorithms in Python to identify the most frequently occurring incidents.
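
One simple way to picture similarity-based incident grouping is the sketch below. It swaps in token-level Jaccard similarity as a stand-in, since the document does not say which similarity measure was used. The function names, the 0.5 threshold, and the sample incidents are hypothetical.

```python
import re
from typing import FrozenSet, List, Tuple


def tokens(text: str) -> FrozenSet[str]:
    """Lowercase a description and split it into a set of alphanumeric tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Share of tokens common to both sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_frequent_incidents(descriptions: List[str],
                            threshold: float = 0.5) -> List[Tuple[str, int]]:
    """Greedily group near-duplicate descriptions and rank the groups by size."""
    groups = []  # each group keeps its first description as the representative
    for d in descriptions:
        t = tokens(d)
        for g in groups:
            if jaccard(t, g["tokens"]) >= threshold:
                g["count"] += 1
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append({"text": d, "tokens": t, "count": 1})
    return sorted(((g["text"], g["count"]) for g in groups), key=lambda x: -x[1])


if __name__ == "__main__":
    sample = [
        "Disk space low on server A",
        "disk space low on server B",
        "Password reset request",
        "Disk space low on server C",
    ]
    for text, count in most_frequent_incidents(sample):
        print(count, text)
```

Greedy grouping against the first description of each group keeps the sketch short. A production pipeline would more likely compare vectorized text (for example TF-IDF or embeddings) and cluster it.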

Data Scientist

Capgemini Technology Services
Pune
11.2021 - 07.2021
  • Extracted signature images from PDF and image versions of bank loan and mortgage forms.
  • The bank authority uses this information to build its database.
  • Role - Data Scientist and Developer.
  • Technology - Python, OpenCV, YOLO, Tesseract, EasyOCR, PDFBox, skimage, Kubernetes deployment.
  • Handled client discussions and the end-to-end project.
  • Using Python libraries such as OpenCV and YOLO, developed the algorithm to detect and extract signature images from PDF and image bank forms.

Data Scientist

Capgemini Technology Services
Pune
11.2021 - 07.2021
  • ZOps: an AI/ML web-based end-to-end solution that automatically handles system-generated incidents using a classification model.
  • Role - Data Scientist and Developer.
  • Technology - Python, ML, NLP, Data Mining, Excel, Flask API, Oracle DB, Integration with External API, Docker.
  • Responsibilities - Performed lemmatization, stemming, and n-gram extraction to pull the required words from the Description column.
  • Performed word2vec vectorization, tried multiple text classification algorithms, and evaluated the models using accuracy metrics and the confusion matrix.
  • Finalized the XGBoost algorithm and created the pickle file.
  • Created the Flask API and deployed it in a Linux environment.
  • Shared and integrated the Flask API with the UI application.

Data Scientist - NLP Expert

Optra Health System
Pune
06.2017 - 09.2019
  • Healthcare researchers wanted an automated process to find medical terminology in research papers and the relationships between different terms, such as polarity, genes, and medical terms, along with their confidence scores.
  • Role - Data Scientist - NLP Expert and Developer.
  • Technology - Python, spaCy, NLTK, NLP, MySQL, Data Mining, Flask API, Kubernetes.
  • Responsibilities - Used Python data mining libraries such as spaCy, NLTK, and TextBlob.
  • Developed the dependency tree, identified the text relationships between sentences, and created a rule-based algorithm for new text.
  • Created the Flask API and deployed it in a Linux environment.
  • Shared and integrated the Flask API with the UI application.

Data Scientist / NLP Expert

Optra Health System
Pune
06.2017 - 09.2019
  • Developed a voice-based chatbot for the medical domain and medical reports that lets end users extract relevant answers from their medical reports and provides general information about the medical field.
  • Role - Data Scientist - NLP Expert and Developer.
  • Technology - Python, spaCy, NLTK, NLP, MySQL, Data Mining, Flask API.
  • Responsibilities - Built a classification model to identify the intent of a question.
  • Used NLP to identify the important terms in a question.
  • Crawled the web using Python and created the database for the chatbot.
  • Built the component that identifies the answer to the question asked.
  • Created the Flask API and deployed it in a Linux environment.
  • Shared and integrated the Flask API with the UI application.

Jr Data Scientist

Clodura System Pvt Limited
Pune
02.2017 - 05.2017
  • Found leads for a B2B sales product using Machine Learning, Data Mining, NLP, and Python.
  • Extracted important leads and news from different data sources and processed the data with the help of EDA.
  • Used topic modeling algorithms to identify the best topics from the unstructured dataset.
  • Created a classification model to predict topics for new data and reduce processing time.
  • Developed APIs for all the ML and NLP features.
  • Deployed the ML and NLP features on the cloud and on a Linux server.
  • Integrated the Python API with the UI and other technologies.

Software Engineer

Atos-Syntel Pvt LTD
Pune
10.2014 - 01.2017
  • Identified patients who will be admitted to a hospital within the next year using historical claims data.
  • Predicted the length of stay of patients who will be admitted to a hospital within the next year using historical claims data.
  • Role - Developer and Data Analyst. Tools and Skills - R, SQL, MS Excel.

Data Scientist

Atos-Syntel Pvt LTD
Pune
10.2014 - 01.2017
  • Predicted patterns and relationships among spare part failures that occur over time in a warranty database for a manufacturing client.
  • Role - Data Scientist and Developer.
  • Technology - SparkR, Association rule mining algorithm, MySQL.
  • Using SparkR, developed association rule mining algorithms that predict patterns and relationships among spare parts in the warranty database.
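
The idea behind association rule mining on warranty claims can be shown with a small sketch. The original work used SparkR. This illustration uses plain Python instead and only covers pairwise rules. The part names, the support and confidence thresholds, and the function name are made up for the example.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple

Rule = Tuple[str, str, float, float]  # (lhs, rhs, support, confidence)


def pair_rules(transactions: Iterable[Iterable[str]],
               min_support: float = 0.3,
               min_confidence: float = 0.6) -> List[Rule]:
    """Derive 'lhs -> rhs' rules between pairs of parts that fail together."""
    item_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    n = 0
    for t in transactions:
        n += 1
        items = sorted(set(t))  # de-duplicate parts within one claim
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules: List[Rule] = []
    for (a, b), together in pair_counts.items():
        support = together / n  # share of claims containing both parts
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: of the claims containing lhs, how many also contain rhs.
            confidence = together / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


if __name__ == "__main__":
    claims = [["pump", "seal"], ["pump", "seal", "valve"], ["pump"], ["valve"]]
    for lhs, rhs, support, confidence in pair_rules(claims):
        print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Full algorithms such as Apriori or FP-Growth extend the same support and confidence logic to larger item sets and prune the search space, which matters for large warranty databases.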

Lead Data Scientist/AI/ML

Wipro Technologies LTD
Pune
  • Spearheaded the design and implementation of cutting-edge fraud analytics solutions as a Solution Architect, leveraging advanced machine learning algorithms for enhanced detection capabilities.
  • Led a dynamic team as the Lead Data Scientist and Developer, crafting and deploying robust ML models to fortify fraud prevention strategies.
  • Architected and implemented end-to-end solutions, showcasing a unique blend of technical expertise and strategic vision in developing fraud analytics systems.
  • Played a pivotal role in crafting and fine-tuning machine learning algorithms, contributing to a significant reduction in fraudulent activities and strengthening overall security measures.
  • Demonstrated proficiency in driving innovation by integrating state-of-the-art technologies into fraud analytics, establishing a reputation for delivering high-impact, data-driven solutions.
  • Orchestrated the successful development and deployment of fraud detection mechanisms, showcasing a comprehensive skill set encompassing architecture design, data science, and software development.

Education

Bachelor of Engineering -

RAJIV GANDHI TECHNICAL UNIVERSITY
Madhya Pradesh, India
06.2014

Skills

  • Data science
  • GenAI
  • LLM - GPT-4, RAG, GANs
  • Linux
  • EasyOCR
  • Data Analytics
  • NumPy
  • scikit-image
  • Artificial Intelligence
  • Pandas
  • Azure cloud
  • Machine Learning
  • Scikit-learn
  • Azure cloud Services
  • Deep Learning
  • NLP
  • Airflow
  • NLTK
  • MLOps
  • Python
  • Word2Vec
  • Qlik Sense Visualization tool
  • Python API
  • Pytesseract
  • Predictive Modelling
  • R programming
  • pdfminer.six
  • Data Extraction
  • PySpark
  • Topic Modeling
  • Image Processing
  • Flask or FastAPI
  • CNN
  • Anaconda
  • Django
  • RNN
  • Jupyter Notebook
  • Docker
  • Keras
  • VSCode
  • Kubernetes
  • spaCy
  • PyCharm
  • Kubernetes Operators
  • TensorFlow
  • Sublime
  • SQL
  • Matplotlib
  • Oracle
  • Seaborn
  • BigQuery
  • Git
  • Hive
  • OpenCV
  • Hue
  • YOLO
  • Windows
  • Tesseract

Certifications

  • Completed GCP CCAI certificate
  • MLOps Fundamentals: CI/CD/CT Pipelines of ML with Azure
  • Azure Databricks and Spark for data engineers (PySpark, SQL)
  • Deploying AI and machine learning models for business
  • Python, Executive Briefing: Computer Vision
  • Python for Computer Vision with OpenCV and Deep Learning
  • Completed Coursera Sequence Models course
  • Completed Coursera Neural Networks and Deep Learning
  • Completed RapidMiner Certification
  • Completed Professional NLP certification
  • Introduction of Gen AI
  • Gen AI Fundamentals
  • Azure OpenAI for Business
  • Prompt Engineering for ChatGPT
  • Started with Generative AI APIs Specialization
  • Image with DALL-E
  • Build AI Apps with ChatGPT, DALL-E and GPT-4
  • WINGS-Basic
