VIVEK SINGH

Data Scientist

IIT Kharagpur

Summary

I grew up at IIT Kharagpur, and solving real-time problems is what fills my dopamine void. I'm skilled in data science, analytics, software development, and end-to-end deployment. I currently work for a company in the quick-commerce industry, solving demand forecasting and replenishment problems using data science.

I've interned at top MNCs such as Mahindra Groups and Yatra.com, and at some really cool startups like LoveLocal. These internships helped me grow my technical skill set, made me a strong real-time problem solver, and gave me a deep sense of responsibility towards my work. I've applied data science to problems across a diverse set of domains, including e-commerce, retail, travel and hospitality, marketing, healthcare, and finance, which has also taught me how businesses in various industries work.

Overview

years of professional experience

years of post-secondary education

Work History

Data Scientist

Blinkit

02.2022 - Current

Project 1: Demand Forecasting Engine using ML:

Built an end-to-end ML demand forecasting engine scaled to ~34 stores (pan-India), responsible for replenishing ~11k unique items (~0.1 Mn unique time series) from the warehouse to dark stores, leading to significant automation of the process
Prepared the train, validation, and inference datasets for the 34 dark stores, covering ~1 lakh unique time series with 92 features
Implemented splitting logic in which the training data covers the last 5 months and the validation and inference data cover 7 days
Created logic to segregate high-selling and low-selling items based on each item's mean quantity sold per slot
Formulated an accuracy metric based on business logic, i.e., either the percent difference should be between 80 and 120, or the value difference should be less than 1
Implemented Facebook Prophet for ~5k time series, improving overall accuracy by 25% and savings on dump by 5 cr per week
Tuned Prophet hyperparameters (prior scales and Fourier orders) using Optuna, leading to a further ~10% improvement in accuracy
Experimented with Prophet's growth (logistic) and trend parameters, along with a log transformation, and improved the model accuracy
Implemented LightGBM (LGBM) and CatBoost models for the ~1 lakh low-selling time series and improved overall model accuracy by 15%
Improved the item availability logic by adding weights for each item at the city level, which improved availability and decreased dump
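
The business-logic accuracy metric from the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the production code: it assumes "percent diff" means the forecast expressed as a percentage of the actual value, and the names `is_accurate` and `accuracy` are hypothetical.

```python
def is_accurate(forecast: float, actual: float) -> bool:
    """Return True if a forecast passes either business rule.

    Assumption: "percent diff" is read as forecast / actual * 100.
    """
    # Rule: the absolute value difference is less than 1 unit.
    if abs(forecast - actual) < 1:
        return True
    # The percentage rule cannot be evaluated when actual is zero.
    if actual == 0:
        return False
    # Rule: the forecast lies between 80% and 120% of the actual value.
    percent = forecast / actual * 100
    return 80 <= percent <= 120


def accuracy(forecasts, actuals) -> float:
    """Share of (forecast, actual) pairs that pass either rule."""
    pairs = list(zip(forecasts, actuals))
    if not pairs:
        return 0.0
    hits = sum(is_accurate(f, a) for f, a in pairs)
    return hits / len(pairs)
```

The value-difference rule keeps low-volume items from being penalized when a small absolute miss is a large percentage miss.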

Project 2: INSIDER-Residual Analysis Tool

Built a tool to identify time series on which the model performs poorly, along with the reason: noise in the data or a problem in the model
Applied statistical tests and domain logic to tag high residuals, and a rolling-mean-based approach to find reasons for poor forecasts
Implemented residual-analysis-based logic to select the model that gives the best forecasts for each of the ~5k time series
Automated the process, including generation of a report with interactive plots and metric values, to track models through residual analysis
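
Per-series model selection from residuals could look something like the sketch below. The selection criterion is an assumption: the tool's actual rule is not stated here, so the sketch simply uses the lowest mean absolute residual. The function name `select_best_model` and the model names in the usage note are hypothetical.

```python
from statistics import mean


def select_best_model(residuals_by_model: dict) -> str:
    """Pick the model with the lowest mean absolute residual.

    residuals_by_model maps a model name to the list of residuals
    (actual - forecast) that model produced for one time series.
    Assumption: mean absolute residual is the selection criterion.
    """
    scores = {
        name: mean(abs(r) for r in residuals)
        for name, residuals in residuals_by_model.items()
        if residuals  # ignore models that produced no residuals
    }
    if not scores:
        raise ValueError("no residuals supplied")
    # The model with the smallest score wins for this series.
    return min(scores, key=scores.get)
```

For example, `select_best_model({"prophet": [1, -2, 3], "lgbm": [0.5, -0.5, 1]})` picks `"lgbm"`, since its residuals are smaller in magnitude on average.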

Data Scientist

CRED

10.2021 - 01.2022

Eagle-Anomaly Detection Tool:

Built an end-to-end time-series-based anomaly detection tool that tags anomalous points based on several time-series factors
Implemented statistical logic to find the contribution of the feature categories leading to the anomaly on a specific day
Formulated data-distribution-based logic for the deviation in each feature category and its contribution to the deviation in the target metric
Assigned a weight to each feature category based on how strongly changes in that category correlate with the target metric
Built a dashboard to monitor the metrics, displaying each anomalous point along with its reasons in priority order

Data Science Intern

LoveLocal

06.2021 - 08.2021

Project 1: Product recommendation system and market basket analysis

Built a product recommendation system using user-item (interactions between customers and products) and item-feature (features of the items) sparse COO matrices (using SciPy) for hybrid collaborative filtering with the LightFM library, achieving an AUROC of 85%
Obtained frequently-bought-together product combinations (~1000 optimum association rules) using association rule mining (ARM) over a market basket dataset
Designed a custom dictionary to correct product names using the Levenshtein distance and used inflexion point analysis to choose the threshold
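
Dictionary-based name correction with the Levenshtein distance can be sketched as below. This is a minimal illustration, not the original implementation: `correct_name` is a hypothetical helper, and `threshold` is a plain parameter standing in for the value that inflexion point analysis would supply.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def correct_name(name: str, dictionary: list, threshold: int) -> str:
    """Return the closest dictionary entry within threshold, else name."""
    if not dictionary:
        return name
    # Compare case-insensitively so casing alone never counts as an edit.
    best = min(dictionary, key=lambda e: levenshtein(name.lower(), e.lower()))
    if levenshtein(name.lower(), best.lower()) <= threshold:
        return best
    return name
```

A name with no dictionary entry inside the threshold is returned unchanged, which avoids forcing unrelated products onto a dictionary entry.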

Project 2: Binary Classification Model For Uninstallation Of The Application:

Implemented a binary classification model to identify the retailers who are likely to uninstall the LoveLocal retail application
Clustered the stores using k-means clustering with a haversine distance matrix to fetch the centroid data from the Google Places API
Implemented random forest (RF) and XGBoost classifiers on an imbalanced dataset, achieving an AUROC of 87%; the model went into production
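
The haversine distance matrix mentioned for store clustering can be sketched as follows. This covers only the distance computation, under assumptions: points are `(latitude, longitude)` pairs in degrees, the Earth is treated as a sphere with a mean radius of 6371 km, and the function names are hypothetical. The clustering step and the Google Places lookup are not shown.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(p, q) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def distance_matrix(points):
    """Pairwise haversine distances as a list of lists."""
    return [[haversine_km(p, q) for q in points] for p in points]
```

The resulting matrix is symmetric with a zero diagonal, so a clustering routine that accepts precomputed distances can consume it directly.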

Data Science Intern

Mahindra Groups

05.2021 - 06.2021

Created a propensity model via supervised learning techniques, used to create target groups for cross-selling marketing campaigns
Selected optimal features and trained the model; the best performance (recall: 84, AUROC: 85) was achieved by a weighted XGBoost algorithm
Calculated the true churn rate per group; the top decile contains the 10% of the population most likely to buy SCV Cargo

Software Developer Intern

Yatra.com

01.2021 - 04.2021

Yatra support email classifier bot:

Implemented multi-class text classification that categorizes a customer’s query into 5 classes (ticket cancellation, refund, etc.)
Applied LDA (gensim) to cross-verify misclassified labels for the email body and corrected the labels when they were not in the top 5 optimum words
Cleaned the email body, followed by basic pre-processing, tokenization, stop-word removal, and lemmatization using spaCy and gensim
Vectorized the text data using TF-IDF, selected optimal features using Chi2, and trained the model using a random forest classifier
Achieved an AUROC of 82% (macro average), then dockerized and deployed the text classification application using Flask

Education

Bachelor of Science - Geophysics

IIT Kharagpur

Kharagpur

07.2017 - 07.2021

Master of Science - Geophysics

IIT Kharagpur

Kharagpur

07.2021 - 07.2022

High School Diploma

Green Wood School

India

06.2015 - 06.2017

Skills

Python

Git

SQL

Machine learning

Time series analysis

AWS

AWS S3

Deep Learning

Statistical analysis

Intelligence gathering

Agile framework understanding

Data Mining

Data Structures and Algorithms

Probability and statistics

Active Listening

Decision-Making

Flask

Django

REST APIs

Docker

Timeline

Data Scientist

Blinkit

02.2022 - Current

Data Scientist

CRED

10.2021 - 01.2022

Master of Science - Geophysics

IIT Kharagpur

07.2021 - 07.2022

Data Science Intern

LoveLocal

06.2021 - 08.2021

Data Science Intern

Mahindra Groups

05.2021 - 06.2021

Software Developer Intern

Yatra.com

01.2021 - 04.2021

Bachelor of Science - Geophysics

IIT Kharagpur

07.2017 - 07.2021

High School Diploma

Green Wood School

06.2015 - 06.2017
