Prabhat Shukla

Bhopal

Summary

Data science expert with over 9 years of dedicated data science experience and more than 12 years of overall professional experience. Specializing in machine learning, NLP, and predictive modeling, I convert complex datasets into actionable insights that improve decision-making and business strategy. Proficient in Python, R, SQL, and leading data visualization tools, with a proven track record of improving operational efficiency and driving revenue growth. Skilled in stakeholder management and in leading cross-functional teams, with experience collaborating directly with third-party partners to integrate AI solutions.

Work History

Sr. Data Scientist

Yash Technologies Pvt Ltd

Pune

02.2020 - Current

Client: John Deere

Automated Warranty Responsibility Code Assignment: Designed and deployed a hybrid ML-driven solution leveraging Large Language Models (LLMs) to automatically assign responsibility codes to warranty claims (Supplier/Deere liability). The LLMs analyzed the complaint, cause, and correction text along with the entire warranty worksheet data to predict the responsible party.
Impact: Recovered over $55 million from previously rejected or pending claims. Accurate, automated responsibility assignment significantly increased the claim acceptance rate and reduced rejections.
Machine Translation Quality Evaluation Using LLMs: Developed an automated system leveraging LLMs (ChatGPT) and Galileo evaluation to assess machine-translated text quality. The system's prompts were enriched with metadata from a vector database, including approved terminology and Deere-specific style guides, ensuring adherence to brand and domain standards. Evaluated translations across fluency, accuracy, terminology, style, and local conventions, categorizing quality and severity (major, minor, critical, neutral).
Impact: Phase 1 achieved a 20% reduction in manual effort by linguistic experts, with a target of at least a 50% reduction to achieve significant cost savings in translation quality assessment.
Enhancing Extended Warranty Data in Palantir Foundry: Improved the quality and accessibility of extended warranty data, and the insights drawn from it, using Palantir Foundry's capabilities. Designed and implemented robust data ingestion (ETL) pipelines using Palantir Foundry and PySpark for complex data enrichment.
Impact: Enhanced data quality and accessibility for key stakeholders, leading to more informed decision-making and a 25% reduction in time-to-insight for warranty claims analysis.
Attachment Parts Forecasting (Time Series): Implemented an ML-driven time series forecasting mechanism for 2,477+ attachment parts. The models (Prophet, ARIMA, Holt-Winters) accounted for factors such as historical sales and base-coded versus non-base-coded attachments, with performance evaluated using RMSE and MAPE. The initiative aimed to improve overall forecast accuracy and to define a new strategy for feeding these forecasts into the Advanced Planning and Optimization (APO) system as business scenarios change.
Impact: Achieved 90% attachment forecast accuracy, reduced inventory discrepancies by 15%, and established new strategies for integrating forecasts into the APO system.
Automated Reporting and Data Pipelines (Power BI & Power Apps): Built automated data collection and cleaning pipelines using Databricks to streamline various reporting workflows across operations.
Impact: Reduced manual reporting efforts by 40% and improved data freshness, enabling quicker access to critical business insights through timely and accurate dashboards.
Key Initiatives under this program include:
Automated Order Response and Execution Report: Developed an automated weekly Power BI report to monitor the "hit and miss" of invoice orders, providing insights through multiple filter types (slicers, dropdowns, multiple tabs).
Impact: Saved ~450 human hours annually and ensured on-time delivery of critical data for proactive decision-making and action.
Plant Site Logistics Shipping Report: Automated a Power BI report for plant site logistics, identifying machines to be picked and shipped. Features include on-demand CSV export and flagging orders without serial numbers or shipping modes.
Impact: Automated reporting saved ~350 human hours annually by streamlining logistics monitoring.
Automated Factory Delivery Date Report of Finished Goods: Created an automated Power BI report with scheduled refresh capabilities, serving as a Key Performance Indicator (KPI) for factory performance. This report helps monitor factory delivery date volatility.
Impact: Saved ~300 human hours annually and provided crucial insights for monitoring and managing factory delivery timelines.

Client: PALL Corporation

Flowstar (Filter Integrity Test Instrument): Visual and Predictive Analytics: Created data pipelines to collect, clean, and feature-engineer XML log files from Flowstar machines. Developed visual reporting and predictive analytics solutions, including an ML model (Scikit-learn), to predict integrity test outcomes at an early stage.
Impact: Improved operational insights, enabled proactive issue identification for medicinal drug manufacturing, and reduced potential production downtime by 10% by predicting integrity test failures.

Sr. Data Scientist

Accenture Solutions Pvt Ltd

Gurgaon

07.2018 - 02.2020

Client: Google, Inc.

Overall Contribution (Google Retail): Applied machine learning and statistical modeling to solve complex business problems in Google Retail, translating insights into actionable recommendations. Built and deployed prototype solutions using Google Cloud applications and Python scripting.
Market Basket Analysis: Generated association rules (Apriori algorithm) to link products, identifying high-confidence relationships in customer purchasing patterns.
Impact: Provided actionable insights for cross-selling strategies, product bundling, and optimizing product placements to enhance sales.
Email Send Time Optimization: Developed and deployed regression and classification models to predict optimal email send times.
Impact: Maximized email open rates and click-through rates, improving campaign effectiveness and customer engagement.
Segmentation of Site Visitors Using Spark: Developed K-means clustering models in PySpark to segment website visitor data, identifying distinct and natural user types.
Impact: Enabled the creation of highly targeted marketing campaigns and personalized user experiences based on identified visitor behaviors and preferences.

Data Scientist

Northout Solutions

Indore

12.2017 - 07.2018

Client: John Hancock Financial

Spending Habits Analysis

Description: Analyzed user spending habits and transactional behavior from bank and credit card data, integrating demographic and lifestyle information. Created visualizations and predicted user transactions for upcoming months using ARIMA and regression models.
Impact: Provided clients with deeper insights into their financial behavior, enabling more effective budgeting, personalized financial advice, and improved long-term financial planning.

Machine Learning Engineer

Bonsmat Group

Ludhiana

09.2017 - 11.2017

ChatBot Assistant (bgpay.in)

Description: Developed a conversational dialogue system for mobile recharge offers and services, accessible via a REST API. Incorporated additional features such as weather forecasting, news summarization, and a question-answering system. The solution involved building and deploying ML models for text classification and Named Entity Recognition (NER).
Impact: Enhanced user engagement and provided instant access to information and services, streamlining the user experience for mobile recharge offers and beyond.

Data Engineer

Constalytics

Mohali

04.2017 - 08.2017

Knowledge Graph Platform

Description: Designed and developed a platform for processing unstructured text data, capable of named entity extraction, topic modeling, and sentiment analysis. The platform integrated Neo4j for knowledge graph creation, enabling the extraction and visualization of complex relationships between entities. Custom ML models were developed for text classification and Named Entity Recognition (NER).
Impact: Transformed raw, unstructured text into actionable, interconnected insights, significantly improving data discovery, relationship analysis, and enabling more informed decision-making from large volumes of textual information.

Machine Learning Researcher

Data Science Research Institute

Bengaluru

08.2016 - 03.2017

Cluster Analysis Using Spark: Performed cluster analysis on weather data for California (2011-2014) using the K-means algorithm in Spark to identify significant patterns and groupings.
Cricket Prediction and Analysis: Scraped extensive player data and developed a predictive model using the cricketr package in R to analyze and forecast player performance.

Software Developer

Predictive Research

Bengaluru

03.2015 - 08.2016

Contributed to diverse software development projects, including coding, testing, and feature implementation to achieve overall project objectives.

Web Developer

Freelancer

Bhopal

01.2012 - 02.2015

Designed, built, and maintained websites as a freelance web developer.

Education

PGD - Big Data Analytics

Siddaganga Institute of Technology

Tumkur

01.2017

B.E. - Computer Science

RKDF College of Engineering

Bhopal

01.2011

Skills

Programming & Fundamentals

Python and R Programming
Python Libraries: Pandas, NumPy
SQL (MySQL, DB2)
Version Control: Git, GitHub

Data Engineering & Big Data

Big Data Processing (PySpark)
Distributed Computing (Dask)
Real-time Data Streaming
Databricks/Delta Lake

Core AI/ML & Analytics

Statistical Analysis
Data Mining Techniques
Machine Learning Techniques
Deep Learning & Neural Networks
Natural Language Processing
Time Series Analysis

Generative AI & AI Agents

Generative AI Applications

Large Language Model (LLM) APIs & Models (GPT series, Gemini)
Vector Databases (Qdrant)
Hugging Face Transformers
Prompt Engineering
Model Fine-tuning
Retrieval Augmented Generation (RAG)

Enterprise AI Platforms & MLOps

C3 AI (C3 AI Platform, C3 AI Applications)
Palantir (Foundry, AIP)

Data Visualization & BI

BI Tools: Power BI, Tableau
Interactive Dashboarding: Streamlit
Python Visualization Libraries: Matplotlib, Seaborn, Plotly/Dash

Cloud Platforms & Deployment

Cloud Providers: AWS, GCP
Containerization & Orchestration: Docker, Kubernetes
Infrastructure as Code (IaC): Terraform
CI/CD for MLOps (e.g., GitHub Actions)
AWS Services: EC2, S3, Lambda

Websites

https://www.linkedin.com/in/prabhatcs

Certification

C3.AI V8 Data Science
Deep Learning Specialization (deeplearning.ai, Andrew Ng)
Machine Learning with Big Data (University of California, San Diego - Coursera)
Graph Analytics for Big Data (University of California, San Diego - Coursera)

Languages

Hindi

First Language

English

Proficient (C2)

Accomplishments

Star Achiever Award
Prime Player Award
