Soutrik Chowdhury

Bangalore

Summary

  • Nearly 8 years of experience as a Machine Learning Engineer, excelling in both individual contributor (IC) roles and leading Data Science and Machine Learning Engineering teams.
  • Expertise in conceptualizing, building, and deploying robust machine learning solutions that seamlessly integrate with complex business requirements.
  • Proficient in data pre-processing, feature engineering, model selection, and hyperparameter tuning, with a strong focus on large language models (LLMs), small language models (SLMs), and prompt engineering.
  • Skilled in leading machine learning frameworks such as TensorFlow, PyTorch, and Scikit-learn, with emerging expertise in tools like LangChain.
  • Extensive experience in managing end-to-end ML pipelines using tools like DVC for version control and CML for continuous model training and integration.
  • Hands-on expertise in AWS services for scalable and secure cloud-based deployments, including infrastructure optimization and automation.
  • Proficient in distributed computing technologies and leveraging Azure Cloud Services for building scalable solutions.
  • Strong proficiency in GitHub workflows, fostering efficient collaboration and deployment.
  • Proven ability to deploy large-scale ML systems using FastAPI, Docker, and Kubernetes.
  • Demonstrates critical thinking, problem-solving skills, and attention to detail, with a commitment to staying ahead of emerging technologies and methodologies.
  • Passionate about delivering innovative machine learning solutions using cutting-edge technologies to drive impactful business outcomes and advance the field of AI.

Overview

11 years of professional experience
1 Certification

Work History

LLM Application Engineer

AB InBev
04.2023 - Current

End-to-End NL to SQL and Deep Analysis Pipeline for Business Intelligence (BI) teams
• Objective:
Simplify database interactions for CXO-level reporting and decision-making by enabling natural language querying of structured data.
• Leadership:
Led a cross-functional team of Data Scientists and ML Engineers to deliver a scalable, high-performance system.
• Key Contributions:
• Multi-layered NL to SQL Conversion System:
• Implemented a query decomposition module to break down complex natural language questions into sub-queries for parallel and accurate processing.
• Built a query classification mechanism using a fine-tuned BERT model to route simple vs. complex queries along optimized paths.
• Enhanced Named Entity Recognition (NER) with fine-tuned BERT for precise entity detection, enabling granular Redis-based searches.
• Performance Optimization:
• Integrated a Redis caching layer to accelerate query response times and reduce database access costs.
• System Architecture:
• Delivered a microservices-based architecture that maintains session context for multi-turn interactions and follow-ups.
• Deployed services using FastAPI, Redis, and RabbitMQ, containerized with Docker, and hosted on VM infrastructure.
• Outcome:
Streamlined BI workflows with a robust, scalable, and fault-tolerant NL to SQL system that significantly improved access to structured data and enabled faster CXO-level insights.

LLM-Powered Retrieval-Augmented Generation (RAG) System for Logistics, HR and Supply
• Objective:
Enhance unstructured document processing and retrieval for operational teams by leveraging advanced ML and LLM techniques.
• Leadership:
Led a team of Data Scientists and ML Engineers in the design, development, and deployment of the system.
• Key Contributions:
• Document Classification & Parsing:
• Developed a YOLO-based document classifier to identify and separate simple vs. complex pages.
• Applied regular parsers for simple pages; leveraged GPT-4 for complex table and chart extraction.
• Integrated LayoutLM for accurate table extraction in structured document layouts.
• Data Storage & Hybrid Retrieval:
• Generated vector embeddings from documents and indexed them in Redis Vector DB (RedisVL) for hybrid semantic search.
• Employed ColBERT to re-rank retrieved results using fine-grained, token-level similarity scoring.
• Query Handling & User Interaction:
• Enabled NER-driven search refinement and GPT-based summarization chains for in-depth query responses.
• Maintained session-based context to support multi-turn queries and dynamic follow-ups.
• Scalability & Feedback Loop:
• Used the Ray framework for parallelized embedding generation.
• Integrated a real-time feedback loop for model fine-tuning and iterative performance enhancement.
• System Deployment:
• Delivered as a microservice-based architecture deployed via FastAPI, Redis, and RabbitMQ, containerized using Docker and hosted on VMs.
• Outcome:
Delivered a robust, efficient, and user-friendly system that significantly improved data access, operational decision-making, and productivity across Logistics and HR domains.

Agentic Framework for Multi-Tool Orchestration using LangGraph
• Objective:
Enable dynamic, context-aware query resolution and multi-hop task execution using an agentic architecture across multiple tools and data modalities.
• Key Contributions:
• Developed an agentic framework with LangGraph to support:
• Multi-hop query resolution via dynamic state transitions.
• Adaptive workflows for query refinement, sub-task chaining, and contextual task switching across API-bound tools.
• Implemented Agentic RAG (Retrieval-Augmented Generation) with automatic query regeneration for ambiguous or under-specified queries.
• Built a Corrective RAG pipeline to improve retrieval precision, calling external tools (e.g., calculators, search APIs, summarizers) to support answer generation.
• Leveraged MCP (Model Context Protocol) to streamline and standardize structured tool invocation across agents.
• Introduced an Agentic Supervisor layer to manage complex workflows involving both structured (SQL, APIs) and unstructured (PDFs, text) sources, enabling unified handling across modalities.
• Ensured session-based state management for real-time, contextual continuity and seamless user interactions.
• Outcome:
Delivered a powerful, extensible agentic system capable of intelligent tool orchestration, contextual multi-step reasoning, and adaptive problem-solving across both structured and unstructured data domains.

Deployment & Infrastructure Engineering
• Objective:
Deliver scalable, modular, and production-ready ML and application services through standardized, efficient deployment practices.
• Key Contributions:
• Modular Service Architecture:
• Architected each major module as an independent framework with sub-services exposed via FastAPI-based APIs, ensuring clean separation of concerns and reusability across projects.
• VM-Based Deployment & Docker Compose:
• Conducted isolated service testing and end-to-end integration in VM environments using Docker Compose, enabling reproducible local and staging deployments.
• Load Handling via Queueing Mechanisms:
• Integrated RabbitMQ and Celery to decouple components and efficiently handle high-throughput workloads across multiple services.
• Scaled Orchestration with Kubernetes:
• Deployed containerized services in Kubernetes clusters for horizontal scaling, high availability, and fault-tolerant service orchestration with environment-specific Helm charts and autoscaling policies.
• CI/CD with GitHub Actions:
• Managed deployment and release pipelines using GitHub Actions for automated testing, container builds, image publishing, and versioned deployments to staging and production environments.

ML Engineer

AB InBev
10.2021 - 03.2023

Invoice Duplicate Detection Solution Development

Objective:

• Led a high-performing team to develop a robust solution for detecting duplicate invoices, reducing financial risks, and enhancing operational efficiency.

Key Contributions:

• Semantic/Text-Matching Engine Development:

• Built an advanced semantic/text-matching engine to identify duplicate invoices, improving fraud prevention and financial integrity.

• Integrated a machine learning layer to enhance predictive accuracy and reduce false positives, streamlining invoice processing.

• Solution Architecture and Development:

• Designed a scalable solution on Azure ML Studio, adhering to best coding and architectural practices.

• Implemented DVC for data versioning, MLflow for model lifecycle management, and Deepchecks/EvidentlyAI for data quality assessments.

• Enhanced model interpretability using the SHAP (Shapley values) framework.

• CI/CD Pipeline and Performance Monitoring:

• Architected CI/CD pipelines on Azure DevOps/GitHub Actions for automated Azure ML pipeline deployment, accelerating delivery.

• Instituted Azure Application Insights for performance monitoring, enabling proactive bottleneck resolution.

Outcome:

• Successfully delivered a solution that improved invoice detection and fraud prevention, leading to reduced financial risks and enhanced operational efficiency.

Forecasting Framework for Accurate Predictions

Objective:

• Designed a versatile forecasting framework integrating models like ARIMA, Prophet, and LSTM for accurate predictions.

Key Components:

• Anomaly Detection & Correction:

• Developed an anomaly detection and correction framework using Prophet + ADTK and regularized ARIMA, improving forecast reliability.

• Scalable Forecasting with Parallel Computing:

• Implemented parallelized and vectorized computing on Databricks clusters for scalable, high-performance forecasting.

• Temporal Regression Framework:

• Architected a temporal regression framework leveraging LightGBM and XGBoost, addressing large-scale forecasting challenges with intelligent grouping and parallelized training.

• Automated Feature Extraction:

• Automated feature extraction pipelines, optimizing precision for product clusters.

• Platform Flexibility:

• Designed the solution for seamless operation on Azure ML Studio or Databricks, providing platform flexibility.

Outcome:

• Enabled precise, high-performance forecasting, enhancing decision-making across the organization.

Additional Contributions & Tools:

• Managed development and deployment workflows via Azure DevOps, ensuring efficiency and scalability.

• Delivered multiple POC projects showcasing innovative solutions.

• Proficient in Power BI and Tableau for impactful data visualizations.

• Created an open-source framework (Docker, GitHub Actions, MLflow, PostgreSQL) to mimic Azure ML ecosystems for on-premise clients.

• Developed FastAPI templates for quick plug-and-play analytics solutions.

Senior Data Scientist

o9 Solutions
11.2020 - 09.2021
  • Implemented end-to-end Demand Planning and Sensing solutions, leveraging advanced statistical algorithms, machine learning algorithms, and deep neural networks.
  • Successfully executed multiple proof-of-concept (POC) projects aimed at addressing complex forecasting challenges, including hierarchical forecasting.

Data Scientist

Fractal Analytics
06.2018 - 11.2020
  • Location: Bengaluru
  • Framed and solved complex business problems as a Data Scientist/Decision Scientist.
  • Performed data manipulation and exploration, and developed data science and analytical algorithms and solutions.
  • Deployed analytical models into production environments and integrated them with complex operational systems.

Business Development Officer

Life Insurance Corporation of India Ltd
09.2015 - 12.2016

Junior Engineer Maintenance

Deys Power System (P) Ltd.
07.2014 - 07.2015

Education

PGPBA - Data Science & Machine Learning

Praxis Business School
Kolkata, India
03.2018

Bachelor of Technology -

West Bengal University of Technology
Suri
05.2014

Skills

  • Python
  • Docker
  • SQL
  • MLOps
  • Gradio
  • AI-Agents
  • PyTorch
  • FastAPI
  • GitHub Actions
  • Hugging Face
  • Minikube
  • Git
  • LangChain
  • LLM & SLM
  • AWS
  • EKS
  • Streamlit
  • LLM Agents

Certification

  • Deep Learning Specialization, Coursera
  • LangChain: Chat with Your Data, DeepLearning.AI
  • Docker Training Course for the Absolute Beginner, KodeKloud
  • EMLO, The School of AI
  • ChatGPT Prompt Engineering for Developers, DeepLearning.AI
  • Functions, Tools and Agents with LangChain, DeepLearning.AI
  • FastAPI in Hard Way, Udemy
  • LangChain for LLM Application Development, DeepLearning.AI
  • GitHub Actions, ERA-V2

Personal Information

Date of Birth: 04/04/91

History enthusiast

I am passionate about history and constantly explore different eras and historical events. I enjoy researching and learning about the past to gain insights into the present.
