
Passionate Lead Data Scientist/Engineer who has worked on data-intensive, high-throughput projects throughout my career.
Modeling
Graph ML, Gradient Boosting, Markov Models, Monte Carlo Simulation, Probabilistic Risk Modeling, Cost-Sensitive Learning, Calibration, Imbalanced Classification
Statistical Methods
Sampling & Bias Correction, Propensity Score Methods, Bootstrapping, Scenario Simulation
Technologies
Python (Pandas, Polars), Apache Spark, SQL, PyTorch (PyTorch Geometric), XGBoost
Designed A/B testing frameworks to evaluate network-aware alert prioritization strategies, comparing traditional probability-based ranking against exposure- and impact-aware scoring approaches.
Measured improvements using investigator action rates, alert precision, and downstream case conversion metrics to ensure model changes translated to operational value.
Applied for a US patent (Application No. 62/561,967).
Currently, I lead the Quantitative Data team within the Financial Crime space at Barclays, where I run quantitative and high-profile data projects including a network-aware AML system, statistical segmentation of payments, and modelling of cross-border payments. As team lead, I manage the data science work and also contribute to data engineering decisions.
I lead a team of 5 that includes Data Engineers, Data Scientists, Quants, and Business Analysts.
I regularly conduct 1-1s with my reports, and coach and mentor them.
These high-profile projects are pivotal to delivering accurate forecasts in the Fincrime/Quant space.
The following is the flagship project that I am currently leading:
Graph-Based AML Risk Modelling System
Core Technologies: Python, Apache Spark, SQL, PyTorch (PyTorch Geometric), XGBoost
This diverse and extensive experience underscores my ability to deliver robust, low-latency applications, tackle complex engineering challenges, and ensure compliance with regulatory standards in the dynamic and demanding environment of financial technology at Barclays.
1. Worked on building a Risk and Forecasting solutions platform on the Cloudera ecosystem from scratch.
2. As lead developer on this project, worked extensively in Python across architecture, design, coding, deployment, and maintenance.
3. Designed and implemented a complex multi-host, load-balanced product with high availability and fast response times.
4. Worked with several components of the big data platform, including HDFS, Impala, and Spark.
5. Built complex business logic involving 400+ macroeconomic variables.
6. Supported and developed the quant libraries used in economic variable calculation.
7. The numbers generated for the economic variables were used in CCAR submissions.
8. Supported successful review programs such as CCAR, CECL, and ICAAP.
9. Used different ML libraries to develop mathematical models for various risk programs.
10. Extensively involved in code walkthroughs, peer code reviews, design discussions, and new technology incubations.
Client: Product development targeted at retail clients such as Mars and Walmart.
Project: Large Valued Operations and Enhancements (LVOE)
• LVOE is an ambitious big data project undertaken by JDA. The work involves replacing the existing ETL logic of the data warehouse with Hadoop for ease of operations and lower cost, and implementing Hadoop-based solutions for retail problems involving huge, diverse data sets of around 10-100 TB.
Primary Roles and Responsibilities
• As a Senior Developer, worked with the large customer/onsite team to gather the big data problems that needed to be addressed.
• Set up the Hadoop ecosystem and managed day-to-day deliverables.
• Problems addressed from a MapReduce perspective: used numerical summarization to aggregate the large product volumes delivered/sold across Walmart stores around the globe.
• Detected fraudulent coupon codes among the millions of codes generated every second using an advanced data structure called a Bloom filter.
• Used Bloom filtering extensively in scenarios where an input value must be checked against a huge existing dataset to determine whether the value exists.
• Transformed row-based, structured RDBMS data into hierarchical JSON or XML data.
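As an illustration of the Bloom filter membership check mentioned above, here is a minimal Python sketch. It is not the production implementation from the project: the class name, bit-array size, number of hash functions, and sample coupon codes are all assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 5):
        # Sizes are illustrative assumptions, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive the bit positions from two 64-bit halves of one SHA-256
        # digest (double hashing: h1 + i * h2).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not present"; True means "probably present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical coupon codes, used only to exercise the filter.
issued = BloomFilter()
for code in ("SAVE10-AAA", "SAVE10-BBB"):
    issued.add(code)

print(issued.might_contain("SAVE10-AAA"))  # True: added items are always found
```

A code that the filter reports as absent was definitely never issued, so it can be rejected without touching the full dataset; a code reported as present still needs an exact lookup, because a Bloom filter can return false positives.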
At National Grid, I worked as a senior SAP NetWeaver Portal Java developer and was instrumental in developing a dashboard to track the in and out timings of workers at GDFO (Gas Distribution Front Office).
Roles and Responsibilities
• As a senior developer, I was responsible for mentoring and guiding junior developers.
• Gained a thorough understanding of the SAP Java plugin and picked up NetWeaver Portal concepts quickly.
• Was instrumental in quickly developing a dashboard that became one of the most extensively used dashboards at National Grid.
• Extensively used Core Java and Hibernate concepts along with other J2EE components such as EJB and Servlets.
• Actively involved in architectural discussions and the new project roadmap.
• Acted as onshore lead, making decisions and carrying out code reviews in a timely manner.
• Responsible for integrating the Google Maps API with the existing dashboard solution to make it easier to find the exact location of field workers.
• Involved in developing use cases and test case scenarios, and in thorough unit testing.
Received constant appreciation for delivery at both JP Morgan and Barclays.