Lead Data Scientist | Commercial Analytics (Pharma)
- Led an end-to-end ML initiative to identify future prescribers (HCPs) for a flagship pharmaceutical brand, driving targeting optimization, revenue growth, and market share expansion.
- Owned delivery from requirements and competitive desk research through model development, validation, and production deployment, leading a team of data scientists and analysts.
- Engineered ~20K features from 60–70M patient-level claims records (Rx, Dx, Px, labs, HCP attributes, SDOH, geography, specialty) using PySpark on Databricks.
- Applied advanced feature selection and stability techniques (variance filtering, multicollinearity checks, Boruta, mutual information, and domain-driven curation) to improve interpretability and robustness.
- Built and evaluated XGBoost and LightGBM classifiers, generating prescriber-level propensity scores, with temporal back-testing on holdout months.
- Deployed a production-grade ML pipeline on Databricks with Snowflake integration, enabling identification of approximately 2,000 incremental writers per month at a 10% lift.
Lead Data Scientist | Patient Risk and Outcome Modeling (Pharma)
- Led the design and delivery of a patient-level ML solution to predict future LDL-C value ranges, enabling patient-based targeting and proactive treatment identification.
- Owned end-to-end execution, leading a team of three data scientists across problem framing, feature engineering, modeling, inference, and deployment.
- Built scalable PySpark pipelines on 50–60M+ claims records (Rx, Dx, Px, labs, longitudinal histories), engineering and selecting approximately 3K patient-level features using Boruta and domain-driven curation.
- Designed a two-step hierarchical framework (At Goal vs. Not At Goal → Near Goal vs. Far From Goal) and applied LightGBM regression to generate actionable LDL-C predictions.
- Iterated and productionized inference pipelines to ensure model stability, clinical relevance, and scalable patient scoring.
Senior Data Scientist | HCP Prioritization & Alerting
- Supported a commercial analytics program focused on driving incremental prescriptions from existing HCP writers through targeted engagement and insight-driven alerts.
- Leveraged large-scale NBRx claims data to analyze prescribing behavior across the brand drug and competing classes such as SGLT2 inhibitors and GLP-1 receptor agonists.
- Engineered HCP-level metrics, including drug class mix, prescribing intensity, historical trends, and market potential, to enable differentiation among existing writers.
- Helped design logic to identify high-priority HCPs for the upcoming three-month window, balancing current performance with future growth opportunity.
- Developed contextual insight narratives (e.g., strong class adoption, but under-indexing on brand) to support targeted sales messaging and alert generation.
- Worked under the guidance of a Lead Data Scientist, contributing to model logic, feature design, and business interpretation of results.