
Mitanshu Khambayte

Bengaluru

Summary

Data Scientist | Generative AI | RAG | LangGraph | LLM Systems

Data scientist with experience in Generative AI, Retrieval-Augmented Generation (RAG), and multi-agent AI systems. Skilled in building LLM-powered applications with LangChain and LangGraph, including AI chatbots, NL2SQL systems, and enterprise knowledge assistants.

Overview

2 years of professional experience

Work History

Senior Associate Data Scientist

Publicis Sapient
Bengaluru
04.2025 - Current

Project: Autonomous AI Data Analyst

  • Designed and implemented an autonomous AI data analyst system using a LangGraph multi-agent architecture to automate dataset analysis and insight generation.
  • Built specialized agents for analysis planning, dataset retrieval, Python code generation, execution, and business insight generation.
  • Implemented LLM-driven Python code generation for dynamic analysis of structured datasets using Pandas and SQL queries.
  • Developed self-healing workflows using LangGraph loops, enabling automatic regeneration of analysis code when execution errors occur.
  • Integrated natural language data querying, allowing users to ask business questions without writing SQL or Python code.
  • Built a secure code execution environment for running AI-generated analysis scripts safely.
  • Created REST APIs with FastAPI for seamless integration with enterprise analytics platforms, improving accessibility of data resources.
  • Implemented data visualization generation and automated reporting for key business metrics.
  • Optimized system performance using caching and efficient data retrieval techniques.
  • Containerized the system using Docker for scalable deployment in production environments.

Project: Bodhi Agentic NL2SQL System

  • Designed and implemented an agentic NL2SQL system using a LangGraph multi-agent architecture, enabling automatic conversion of natural language queries into optimized SQL statements.
  • Orchestrated a multi-agent workflow (Supervisor, Column Selection Agent, SQL Generation Agent) to collaboratively process user queries and generate production-ready SQL.
  • Developed advanced prompt engineering and state management logic to improve reasoning accuracy and reliability of AI agents.
  • Implemented semantic search using Elasticsearch to intelligently identify relevant database columns based on user intent.
  • Built a Query Validation framework using PySpark to verify SQL syntax, schema compatibility, and execution feasibility before deployment.
  • Designed a testing and benchmarking pipeline to automatically evaluate generated SQL queries against expected outputs for business analytics scenarios.
  • Architected the backend system with FastAPI and asynchronous APIs, enabling scalable and flexible integration with enterprise platforms.
  • Engineered multi-tenant architecture with client and project isolation, facilitating secure data access and enabling scalable deployment.
  • Implemented real-time communication using WebSockets to support interactive query generation and feedback loops.
  • Containerized the platform using Docker and set up CI/CD pipelines, enabling efficient deployment and environment configuration.

Data Scientist

Digital Suncity
Jaipur
07.2024 - 03.2025

Project: Medical Chatbot using RAG

  • Developed an AI-powered medical chatbot using Retrieval-Augmented Generation (RAG) to deliver accurate and context-aware responses from medical knowledge sources.
  • Implemented semantic search with Pinecone vector database to enhance retrieval of relevant medical documents and research content.
  • Integrated large language models (LLMs) to generate grounded responses from retrieved context, reducing hallucinations.
  • Designed and optimized document ingestion and embedding pipelines for medical datasets including research papers and clinical guidelines.
  • Implemented secure authentication mechanisms (OAuth, SSL encryption) to ensure safe access and protect sensitive medical information.
  • Built scalable AI APIs and pipelines and deployed them on cloud infrastructure to support high availability and reliable performance.
  • Enhanced response relevance through context-aware prompt engineering and retrieval optimization techniques.

Education

B.Tech - Bachelor of Technology / Engineering - Computer Engineering

Rajiv Gandhi Proudyogiki Vishwavidyalaya (RGPV)
India
07.2024

Skills

Programming Languages

  • Python
  • SQL

ML & NLP

  • NLP
  • Semantic Search
  • Embeddings

AI Frameworks

  • LangChain
  • LangGraph
  • RAG techniques
  • Prompt Engineering
  • Multi-Agent Systems
  • LLM Orchestration
  • Model Evaluation

Vector Databases

  • Pinecone
  • Vector Similarity Search

Data Analysis

  • Data analytics
  • Pandas
  • PySpark

Backend Development

  • FastAPI
  • REST API Development
  • WebSockets
  • Asynchronous Programming

Cloud & Infrastructure

  • Docker

Other Projects

  • Multi-Agent Research Assistant (LangGraph)
  • Intelligent Document QA System (RAG)
  • AI-Powered Data Insights Generator
  • Automated Resume Screening System

Languages

English: Proficient (C2)
Hindi: Native
Marathi: Proficient (C2)

Accomplishments

Gold Medalist in Kho Kho at the National Level
