Senior Data Scientist with over 14 years of industry experience, the last 9 specializing in data science. Possesses a strong understanding of machine learning algorithms, techniques, and concepts. Proven track record in managing multiple projects and teams of over 15 members. Skilled in client engagement and delivering quality solutions.
Experience includes a focus on Pricing & Promo analytics in the CPG & Retail industry. This summary emphasizes recent accomplishments.
Certified Gen AI Developer with working experience on a real-time Gen AI use case.
Member Acquisition
Designed a comprehensive ML solution to target potential members for our wholesale client. Applied data science principles and ML models such as a Trip Propensity Model and a Spend Propensity Model, and used CI/CD tools such as Jenkins and Bitbucket to deploy the end-to-end solution on AWS.
Gen AI Driven Sales Analytics
Created a proprietary solution integrating Gen AI technology into the system, enabling users to extract meaningful sales data insights via a user-friendly chat interface.
Implemented multiple Gen AI concepts such as RAG, LLMs, and Chat Completions to improve the effectiveness of the language models.
Revenue Growth Management (RGM) - Base Value Drivers (BVD)
Utilized Double Machine Learning to identify the significant contributors and their respective shares in driving product sales growth.
Applied preprocessing approaches such as EDA, feature selection, and feature elimination.
Employed smoothing techniques such as exponential smoothing and Savitzky-Golay filtering to refine time series data.
Utilized algorithms such as Random Forest regression and log-log regression.
Deployed the PySpark solution in a distributed environment for effective implementation and productivity.
Increased solution performance through the application of PySpark concepts like broadcast join, pandas UDF, partition by, and window functions.
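As an illustration of the exponential smoothing step mentioned in this project, here is a minimal sketch in plain Python. The function name, the `alpha` value, and the sample series are hypothetical and are not taken from the project itself.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for value in series:
        if not smoothed:
            # Seed the recursion with the first observation.
            smoothed.append(float(value))
        else:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical weekly sales figures, for illustration only.
weekly_sales = [100, 120, 80, 130, 90]
smoothed_sales = exponential_smoothing(weekly_sales, alpha=0.5)
print(smoothed_sales)
```

A smaller `alpha` gives a smoother series that reacts more slowly to recent changes.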
Revenue Growth Management - Price Elasticity
Developed a data science solution to analyze the impact of price changes on volume sales.
Leveraged expertise in data preprocessing techniques, including exploratory data analysis, creating new features, transforming existing ones, treating missing values, and eliminating features.
Implemented different machine learning algorithms, including log-log regression, multiple linear regression, and random forest.
Evaluated model performance by analyzing the regression summary with statistical metrics such as Standard Error, P-Value, Adjusted R-squared, and MAPE.
Deployed the PySpark solution in a distributed environment for effective implementation and productivity.
Increased solution performance through the application of PySpark concepts like broadcast join, pandas UDF, partition by, and window functions.
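A minimal sketch of how a log-log regression yields a price elasticity, assuming a single price variable and ordinary least squares. The function name and the synthetic data are hypothetical; the project itself used PySpark and additional features.

```python
import math


def log_log_elasticity(prices, volumes):
    """Fit ln(volume) = a + b * ln(price) by OLS and return (a, b).

    The slope b is the estimated price elasticity of volume.
    """
    if len(prices) != len(volumes) or len(prices) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    x = [math.log(p) for p in prices]
    y = [math.log(v) for v in volumes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("prices must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data generated from volume = 1000 * price ** -1.5 (illustrative only).
prices = [1.0, 1.5, 2.0, 2.5, 3.0]
volumes = [1000 * p ** -1.5 for p in prices]
intercept, elasticity = log_log_elasticity(prices, volumes)
print(round(elasticity, 3))
```

Under this model, an elasticity of b means a 1% price increase is associated with roughly a b% change in volume.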
Revenue Growth Management - Demand Forecasting
Developed a forecasting model for high-demand products that predicts sales based on historical data.
Applied preprocessing techniques to enhance data quality, including exploratory data analysis, creating new features, transforming existing ones, EWMA smoothing, and Savitzky-Golay smoothing.
Applied Auto ARIMA, a neural network model, and Facebook's Prophet model to forecast unit sales and dollar sales.
Created an AWS data lake architecture to pull 1.5 million customer support records from a SQL database.
Saved the data in AWS S3 buckets, then organized and accessed it through the AWS Athena service.
Performed EDA and data preprocessing. Applied various NLP techniques such as tokenization, stemming, lemmatization, noun phrase extraction, word clouds, sentence vectors, word vectors, and topic modelling.
Experimented with clustering algorithms such as K-Means, DBSCAN, and hierarchical clustering.
Competitive Intelligence - Text Analytics
An NLP use case to collect and analyze competitor information.
Determined data sources: Google News, tweets, and job portals. Collected data from these sources using web scraping tools in Python, then stored the data in an AWS S3 bucket.
Extracted data from S3 buckets and ran text analytics models to get meaningful insights from the data.
Conceptualized and implemented a sentiment analysis tool to rate competitors' tweets.
Categorized tweets and news related to competitors. Built topic models.
Developed a cosine similarity model to find similar tweets and news. Derived various reports from these processed data.
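A minimal sketch of cosine similarity between two short texts, assuming a simple bag-of-words representation with whitespace tokenization. The function name and sample sentences are hypothetical; the project may have used different vectorization such as TF-IDF or embeddings.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors of two texts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Only tokens present in both texts contribute to the dot product.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical headlines, for illustration only.
score = cosine_similarity("rival launches new product", "rival launches loyalty app")
print(round(score, 2))
```

Scores range from 0 (no shared tokens) to 1 (identical token distributions), so a threshold can be used to group similar tweets and news items.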
A supervised learning use case where the aim was to predict the test result of a manufacturing unit, given signals/features received from sensors. The dataset was highly imbalanced, with 1600 instances and 600 numerical attributes.
Performed data cleaning, preprocessing, and EDA. Used dimensionality reduction algorithms to reduce the number of dimensions.
Derived feature importance scores and selected the top few features. Used techniques to handle class imbalance.
Experimented with building classification models such as SVM, Random, SGD Classifier, etc.
Certificate of Deposit Prediction - Classification Model
A supervised learning use case where the aim was to predict the sentiment of high-value customers toward availing a certificate of deposit from the insurance service provider.
Collected customer data from a NoSQL database. The dataset was small and highly imbalanced.
Applied EDA and data preprocessing techniques and handled the data imbalance. Experimented with classification algorithms such as Logistic Regression, Decision Tree, and SVM. Compared model performance using various metrics. Served the model through a Flask REST API.
AWS Data Pipeline
Provisioned an end-to-end data lake and data warehouse architecture using AWS services such as Redshift, Lake Formation, S3, and Glue.
Built a data pipeline to collect on-premises data from multiple sources and move it to the AWS cloud.
Involved in the end-to-end design, architecture, and implementation of the solution.
Worked as a backend developer for the product CRAMER.
Developed new features and enhanced existing modules in the product. Extensively involved in UAT support.
Technical Skills
Domain Skills
Management Skills