
Nazrul Miya

Bengaluru

Summary

Experienced Senior Data Scientist with over 14 years of industry experience, specializing in data science for the last 9 years. Possesses a strong understanding of various machine learning algorithms, techniques, and concepts. Proven track record in managing multiple projects and teams of over 15 members. Skilled in client engagement and delivering quality solutions.


Recent work has focused on Pricing & Promo analytics in the CPG & Retail industry.


Certified Gen AI Developer with working experience on a real-time Gen AI use case.

Overview

14 years of professional experience
1 Certification

Work History

Assistant Vice President

Genpact India Pvt Ltd
10.2021 - Current

Member Acquisition:

Designed a comprehensive ML solution to target potential members for a wholesale client. Applied data science principles and ML models such as a Trip Propensity Model and a Spend Propensity Model, and used CI/CD tools such as Jenkins and Bitbucket to deploy the end-to-end solution on AWS.


Gen AI Driven Sales Analytics

Created a proprietary solution integrating Gen AI technology into the system, enabling users to extract meaningful sales data insights via a user-friendly chat interface.

Implemented Gen AI concepts such as RAG, LLMs, and Chat Completions to enhance the effectiveness of language models.

Revenue Growth Management (RGM) - Base Value Drivers (BVD)

Utilized Double Machine Learning to identify the significant contributors and their respective shares in driving product sales growth.

Applied preprocessing approaches such as EDA, feature selection, and feature elimination.

Employed diverse smoothing techniques like Exponential smoothing and Savitzky-Golay filtering to refine time series data.

Utilized algorithms such as Random Forest regression and Log-Log regression.

Deployed the PySpark solution in a distributed environment for effective implementation and productivity.

Increased solution performance through the application of PySpark concepts like broadcast join, pandas UDF, partition by, and window functions.

Revenue Growth Management - Price Elasticity

Developed a data science solution to analyze the impact of price changes on volume sales.

Leveraged expertise in multiple data preprocessing techniques, including exploratory data analysis, creating new features, transforming existing ones, treating missing values, and eliminating features.

Implemented different machine learning algorithms, including Log-Log regression, multiple linear regression, and random forest.

Evaluated model performance by analyzing the regression summary with statistical metrics such as standard error, p-value, adjusted R-squared, and MAPE.

Deployed the PySpark solution in a distributed environment for effective implementation and productivity.

Increased solution performance through the application of PySpark concepts like broadcast join, pandas UDF, partition by, and window functions.

Revenue Growth Management - Demand Forecasting

Developed a forecasting model for high-demand products that predicts sales from historical data.

Applied preprocessing techniques to enhance data quality, including exploratory data analysis, creating new features, transforming existing ones, EWMA smoothing, and Savitzky-Golay smoothing.

Applied Auto ARIMA, a neural network model, and Facebook's Prophet model to forecast Unit Sales and Dollar Sales.

Senior Data Scientist

Tata Consultancy Services
08.2019 - 10.2023
Customer Support Data Clustering

Created an AWS data lake architecture to pull 1.5 million customer support records from a SQL database.

Saved the data in AWS S3 buckets, then organized and accessed it through the AWS Athena service.

Performed EDA and data preprocessing. Applied various NLP techniques such as tokenization, stemming, lemmatization, noun phrase extraction, word clouds, sentence vectors, word vectors, and topic modelling.

Experimented with various clustering algorithms such as K-Means, DBSCAN, and hierarchical clustering.

Competitive Intelligence - Text Analytics

An NLP use case to collect and analyze competitor information.

Identified data sources: Google News, tweets, and job portals. Collected data from these sources using web scraping tools in Python, then stored the data in an AWS S3 bucket.

Extracted data from S3 buckets and ran text analytics models to derive meaningful insights.

Conceptualized and implemented a sentiment analysis tool to rate tweets by competitors.

Categorized tweets and news related to competitors. Built topic models.

Developed a cosine similarity model to find similar tweets and news. Derived various reports from these processed data.

Data Scientist

Aricent Technologies Pvt Ltd
05.2016 - 08.2019
Semiconductor Manufacturing Unit Test - Classification Model

A supervised learning use case where the aim was to predict the test result of a manufacturing unit from signals/features received from sensors. The dataset was highly imbalanced, with 1600 instances and 600 numerical attributes.

Performed data cleaning, preprocessing, and EDA. Used dimensionality reduction algorithms to reduce the number of dimensions.

Derived feature importance scores and selected the top few features. Used techniques to handle class imbalance.

Experimented with classification models such as SVM, Random Forest, and SGD Classifier.

Certificate of Deposit Prediction - Classification Model

A supervised learning use case where the aim was to predict the sentiment of high-value customers toward availing a certificate of deposit from the insurance service provider.

Collected customer data from a NoSQL database. The dataset was small and highly imbalanced.

Applied EDA and data preprocessing techniques and handled the data imbalance. Experimented with classification algorithms such as Logistic Regression, Decision Tree, and SVM. Compared model performance using various metrics. Served the model through a Flask REST API.

AWS Data Pipeline

Provisioned an end-to-end data lake and data warehouse architecture using AWS services such as Redshift, Lake Formation, S3, and Glue.

Built a data pipeline to collect on-premise data from multiple sources and move it to the AWS cloud.

Involved in the end-to-end design, architecture, and implementation of the solution.

Software Engineer

Wipro
06.2011 - 11.2015
Amdocs CRAMER

Worked as a backend developer on the CRAMER product.

Developed new features and enhanced existing modules in the product. Extensively involved in UAT support.

Education

Master of Science - Computer Science

BITS Pilani
2015

Master of Science - BCA

The University of Burdwan
2011

Skills

    Technical Skills

  • Python
  • PySpark
  • SQL
  • Data Science & Machine Learning
  • Machine Learning
  • Large Language Models (LLMs) / Gen AI
  • NLP
  • Bitbucket
  • Jenkins
  • Git
  • AWS

    Domain Skills

  • CPG Analytics
  • Retail Analytics
  • Pricing and Promo Analytics
  • Membership Analytics
  • Customer Analytics

    Management Skills

  • Project management
  • Interview management
  • Team Leadership
  • Client Handling

Certification

  • Microsoft Certified: Azure Fundamentals (AZ-900)
  • Gen AI Developer Track Certified by Genpact's CoE Talent Academy
