Data Scientist II with over 3 years of experience delivering production-grade machine learning solutions for customer analytics in the automotive domain. Adept at working with large-scale customer data to build predictive models that optimize marketing performance, retention strategies, and personalization. Hands-on experience in Python, SQL, AWS, Databricks, and end-to-end ML pipeline automation. Proven ability to translate complex data into high-impact business strategies, driving measurable revenue gains.
KEY PROJECTS
In-Market Propensity Model – Predicting Purchase Readiness
Developed a model to predict which existing customers were most likely to buy a new vehicle in the next 90 days using behavioral, demographic, and ownership data.
Helped the marketing team prioritize outreach to high-propensity customers
Significantly improved offer efficiency and conversion rates
Model scores used in Personalized Private Offer (PPO) campaigns and ongoing A/B tests
Incentive Sensitivity Modeling – Personalizing Offer Strength
Created regression-based models to predict how responsive a customer is to different incentive types (e.g., discounts, promotions).
Enabled dynamic allocation of incentive budgets based on predicted sensitivity
Personalized offers improved conversion and reduced incentive spend waste
Served as a core component in generating $14.2M in incremental sales margin (in combination with the In-Market model)
Customer Defection Prediction – Reducing Attrition Risk
Built ML models to identify customers likely to leave the brand by disposing of all their vehicles within 90 days.
Supported segmentation of customers based on likelihood to defect
Informed targeted outreach strategies to encourage retention or trade-in
Outputs actively used to define personalized intervention plans by campaign teams
Vehicle Affinity Model – Recommending the Right Car
Designed a multi-class classification system to predict the top 3 vehicle nameplates each customer is most likely to buy, based on past preferences and lead history.
Used Random Forest or XGBoost, depending on customer segment
Enabled hyper-personalized product recommendations for PPO and sales campaigns
Final scores combined with churn and in-market predictions for 360° customer targeting
Unified Customer Scoring – Normalization Pipeline
Built and deployed a cross-population scoring system that consolidated outputs from multiple ML models (defection, in-market, churn) into a single comparable score.
Unified scores across diverse customer types (new, lease, used, non-owners)
Scores consumed by downstream models, including Customer Lifetime Value and accessory upsell campaigns
Fully automated via AWS S3, SageMaker, Databricks, and CloudWatch
Customer Churn Prediction
Developed a predictive model to identify customers at risk of not renewing or repurchasing within a 30- or 90-day horizon.
Daily customer-level scoring integrated with CRM workflows
Helped campaign teams focus retention efforts where most needed
Achieved strong performance in capturing likely churners, supporting timely and cost-effective interventions
Machine Learning & Forecasting
Logistic Regression, Random Forest, XGBoost, KNN, Naive Bayes, K-Means
Time Series: ARIMA, Prophet, Holt-Winters, LSTM, RNN, TFT
Programming & Data
Python, SQL, Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib, Plotly
Platforms & Cloud
Treasure Data, Databricks, AWS (EC2, S3, SageMaker), Jupyter, VS Code
MLOps & DevOps
Git, GitHub, CI/CD pipelines, model deployment
Visualization & Business Tools
PowerPoint, Excel, Loop, Figma
Soft Skills
Analytical Thinking, Client Empathy, Storytelling, Communication, Problem Solving