KESHAV KUMAR

New Delhi

Summary

Results-driven Senior Data Scientist and Applied AI Engineer with over 8 years of experience in designing and deploying production-scale AI systems across diverse domains, including Computer Vision, OCR, Machine Learning, and Generative AI. Expertise in developing LLM-powered applications utilizing LangChain, LangGraph, RAG architectures, vector databases, and Gemini models, complemented by a strong foundation in computer vision frameworks such as YOLO. Proven track record of architecting scalable AI solutions on AWS using tools like Lambda, API Gateway, ECR, Terraform, and Docker. Successfully delivered intelligent document processing systems, automated vehicle inspection solutions, Text-to-SQL assistants, and enterprise AI applications that significantly enhanced operational efficiency and expedited business decision-making.

Overview

1 Certification
8 years of professional experience

Work History

Senior Data Scientist

TVS Digital Limited
05.2024 - Current

Computer Vision, OCR Projects, Generative AI, and NLP Projects.

  • Built YOLOv11-based computer vision models to detect image quality issues, including blur, low-light conditions, and glare, while classifying vehicle viewpoints (front, rear, speedometer, side view), improving annotation quality by 30%, and significantly reducing manual review effort.
  • Designed and deployed scalable MLOps pipelines using AWS Lambda, Amazon ECR, API Gateway, and Terraform, enabling automated infrastructure provisioning and reliable, production-grade model serving.
  • Developed an OCR automation solution using AWS Textract for Philippine national ID documents, extracting and structuring key information from unstructured text, and reducing manual form-entry effort by 60%.
  • Implemented intelligent post-processing and field-mapping logic to transform OCR outputs into validated, structured records, improving downstream system integration accuracy.
  • Led the development of an end-to-end vehicle inspection platform, leveraging YOLOv11 for image quality assessment and viewpoint classification, improving dataset reliability by 30%, and accelerating model training workflows.
  • Delivered an OCR-driven document automation system for Philippine ID processing using AWS Textract, reducing manual data-entry efforts by 60%, and streamlining customer onboarding workflows.
  • Used Docker, Lambda, API Gateway, ECR, and Terraform to enable scalable and reproducible ML model deployments across environments.
  • Engineered a RAG-powered Text-to-SQL assistant using LangChain, Gemini 1.5 Flash, and ChromaDB, enabling business users to query enterprise data in natural language, and reducing ad-hoc reporting turnaround time by 80%.
  • Improved SQL generation accuracy to over 95% through semantic retrieval-based few-shot prompting, dynamically injecting contextually relevant historical queries using ChromaDB.
  • Designed a multi-layer security framework incorporating query validation, access controls, syntax verification, and EXPLAIN-based auto-retry mechanisms, preventing destructive SQL execution, while ensuring reliable query generation.
  • Built a production-grade RAG-based Text-to-SQL solution using LangChain, Gemini 1.5 Flash, and ChromaDB, achieving over 95% SQL accuracy while reducing business reporting turnaround time by 80%.

Data Scientist

Cipher Square Technologies (subsidiary of RNFI)
05.2022 - 05.2024
  • Cleaned and enhanced ID images using custom preprocessing techniques for an OCR pipeline handling Indian government IDs.
  • Fine-tuned YOLOv8 to locate text and integrated Tesseract and PaddleOCR for extraction.
  • Developed a Flask API delivering structured JSON responses from ID images.
  • Built an ETL pipeline to consolidate, transform, and visualize operational data, cutting processing time by 50%.
  • Created ML-driven insights dashboards using Plotly and SSRS for cross-functional teams.
  • Built segmentation models from scratch, analyzing data and training Random Forest and XGBoost models to predict high-risk customers.
  • Reduced marketing cost by 15% through targeted segmentation and data-driven features.

Associate Data Scientist

Cognizant (3rd-party payroll)
09.2020 - 05.2022
  • Processed and analyzed 6 months of transactional data to identify key customer segments for a transaction-based customer segmentation project.
  • Created segmentation models revealing 6 distinct classes of user behavior.

Software Engineer

Cognizant (3rd-party payroll)
07.2018 - 09.2020
  • Built a Flask-based API to scrape Gmail data using Google OAuth and deployed it with Docker and Nginx for a Gmail Data Mining API project.
  • Enabled analytics to extract customer purchase patterns for personalized marketing.

Education

Master of Science - Data Science and AI

Woolf University
San Francisco, California
12-2027

Bachelor of Engineering - Electronics & Communication

RGPV
Bhopal, India
06-2014

Skills

  • Programming & Databases: Python, SQL
  • Machine Learning & Deep Learning: Scikit-learn, XGBoost, Random Forest, TensorFlow, PyTorch
  • Computer Vision & OCR: YOLOv8, YOLOv11, OpenCV, Image Processing, OCR, AWS Textract, PaddleOCR, Tesseract
  • Generative AI & LLMs: LangChain, RAG, Prompt Engineering, Vector Databases, ChromaDB, Gemini, LLM Application Development, Text-to-SQL Systems
  • Cloud & MLOps: AWS Lambda, API Gateway, ECR, SageMaker, AWS Glue, Terraform, Docker, CI/CD
  • Backend Development: FastAPI, Flask, REST APIs, Microservices
  • Data Engineering & Analytics: ETL Pipelines, Data Modeling, Feature Engineering, Data Visualization, Plotly, SSRS
  • Tools & Platforms: Git, Linux, Azure, AWS

Certification

  • AI Workshop: Advanced Chatbot Development - LinkedIn Learning
  • Fine-Tuning for LLMs: from Beginner to Advanced - LinkedIn Learning
