Email Phish Detection:
Designed and deployed a high-scale email phishing detection pipeline processing over 5 billion emails per day using a funnel-based ensemble approach. Combined tree-based models on metadata features with Transformer-based semantic models such as BERT and with fine-tuned, locally hosted LLMs served through Ollama. Automated data labeling with a Retrieval-Augmented Generation (RAG) workflow. Built scalable inference pipelines on AWS SageMaker, integrated with Kubernetes and Airflow for orchestration and deployment.
Low Volume Fraud Detection:
Architected a real-time fraud detection system tailored for low-volume, high-risk events, using anomaly scoring based on the Beta distribution to enhance detection granularity. Combined statistical modeling with deep learning to set adaptive thresholds. Deployed on AWS with Airflow for scheduling and Kubernetes for scalability, and integrated real-time alerts via Datadog. Reduced false positives by over 35% while maintaining high detection sensitivity.
Mentoring & Planning:
Led ideation, stakeholder alignment, and roadmap planning across ML initiatives. Authored detailed approach documents and technical roadmaps to guide system design and delivery. Hired and mentored junior engineers, conducted technical reviews, and ensured cross-functional collaboration through regular updates and demos.
Cleartrade – Intelligent Document Processing Platform:
Led the end-to-end design and development of Cleartrade, the company’s flagship AI platform for automating trade document workflows. Architected a deep learning pipeline using BERT and LayoutLM to extract structured data from complex, scanned trade documents. Delivered high-accuracy entity extraction across diverse document types, enabling significant operational cost savings for clients.
Team Building & Execution:
Built and scaled a high-performing team of 10 data scientists and ML engineers. Fostered a culture of technical excellence, peer collaboration, and experimentation. Guided project execution from requirement gathering through production rollout, while ensuring alignment with compliance and business objectives.
Technical Planning & Roadmap:
Defined the ML architecture and authored detailed approach documents, technical roadmaps, and delivery plans. Partnered with stakeholders across compliance, product, and engineering teams to translate business needs into scalable AI features.
Mortgage Document Digitization – Computer Vision:
Designed and implemented computer vision algorithms to automate the digitization of mortgage documents, including handwritten forms and scanned PDFs. Used CNN-based models for layout detection and OCR correction, significantly reducing manual processing time and error rates in the mortgage approval workflow.
Batch Process Automation:
Created robust Unix shell scripts to automate daily and monthly batch processing tasks, improving data availability SLAs and reducing manual intervention.
Business Reporting Automation:
Developed and maintained PL/SQL procedures for generating production-grade business reports critical to underwriting, claims, and policy analysis workflows.