Senior Data Scientist with 5 years of experience, specializing in PySpark, Python, SQL, Machine Learning, MLOps, Azure ML, Predictive Modeling, H2O.ai, Generative AI, and Statistical Analysis.
Overview
5 years of professional experience
1 Certification
Work History
Senior Data Scientist
CitiBank
Bengaluru
03.2023 - Current
Developed an XGBoost-based response model on PySpark for an ECM Retail Bank campaign, optimizing direct mail targeting. Achieved a ~10% increase in response rate and generated approximately $150 million in incremental MMA deposits.
Designed a regression model to predict MMA balances for customer rank ordering, integrated into an optimization engine to drive approximately $100 million in annualized MMA deposits.
Partnered with a vendor to create a predictive mortgage model that identified customers likely to pursue a mortgage within 30-120 days, optimizing direct mail strategy and achieving ~$400K in expense savings.
Developed an end-to-end automated solution on PySpark using H2O Sparkling Water to build benchmark models for the team, reducing the time required to build them.
Built standardized, scalable data pull code in PySpark, eliminating manual effort and enhancing team efficiency.
Collaborated with the strategy team to design test plans for an ECM Retail Bank campaign, and launched a response model to optimize customer targeting.
Collaborated with the strategy team to recommend customer targeting approaches based on Digital Engagement Index analysis.
Data Scientist
Infosys Ltd
Bengaluru
07.2020 - 03.2023
Developed a machine learning model on Azure ML using Python to predict employees at high risk of attrition, leading to cost optimization, improved employee retention, and better resource planning for the client.
Developed a regression model in Python to forecast spot prices, assisting transportation planners in procuring carrier capacity from spot market brokerage services for product transportation between origins and destinations.
Developed an automated Python framework to track the performance of newly launched products in the European market, enabling the client to discontinue underperforming products.
Constructed an Azure ML pipeline using DevOps to automate and streamline model deployment to production.
Education
MA - Economics
Madras School of Economics
Chennai
06.2020
BA (Hons) - Economics
Deshbandhu College, Delhi University
Delhi
06.2018
Skills
Programming languages: Python, PySpark, SQL
Machine learning and AI: Linear Regression, Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, XGBoost, LightGBM, Statistical Modeling, Predictive Modeling, Generative AI
Data Processing and Automation: PySparkling H2O, Turing Hub
Big Data and Cloud: Hadoop, Azure
Tools and Platforms: H2O.ai, MS Office, Azure Machine Learning, MLOps
Certification
Completed a training course on PySpark and Python on Udemy
Microsoft Certified Azure Data Scientist Associate
Generative AI for Beginners from Udemy
Accomplishments
Silver Award for Citi Ignite Innovation Challenge, 11/01/2024