AYUSH SINHA

Noida, UP

Summary

AI/ML Engineer specialising in production LLM and GenAI systems. At AventIQ, shipped a multimodal RAG support assistant over structured and unstructured enterprise data, cutting manual document search effort by 35%, and built an AWS Textract business card scanner that reduced extraction latency from 16s to 5s. Fine-tuned and deployed domain-specific transformer models (BERT, GPT-2, T5, Pythia) using LoRA/QLoRA, and integrated Azure OpenAI to automate ticket resolution at scale. Experienced across the full GenAI stack: RAG pipelines, vector databases, multi-agent orchestration, and AWS inference deployment.

Overview

1 year of professional experience
4 years of post-secondary education

Work History

Independent AI/ML Researcher

Self-Employed
09.2025 - Current
  • Building a production E-commerce Product Recommendation RAG system using AWS S3, SageMaker, and OpenSearch Serverless as a portfolio project.
  • Deepening expertise through advanced coursework in quantization and multi-agent systems (deeplearning.ai, CrewAI).
  • Contributing to open-source projects and running applied GenAI experiments across LLM fine-tuning and retrieval pipeline optimisation.

AI/ML Engineer - LLM & GenAI Systems

AventIQ
New Delhi
10.2024 - 08.2025
  • Worked on a production multimodal RAG support assistant querying PDFs, Excel, CSV, and image documents using PGVector, FAISS, and Hugging Face embeddings, reducing manual document search effort by 35% across enterprise teams.
  • Integrated Azure OpenAI APIs for automated ticket resolution, cutting average resolution time from ~8 min to ~3 min across the support pipeline.
  • Fine-tuned and quantized (FP16/8-bit, LoRA/QLoRA) 12 domain-specific transformer models, including BERT, GPT-2, T5, and Pythia, on internal datasets; deployed 4 of them to production via Hugging Face Hub with improved inference speed and reduced compute costs.
  • Built an AI business card scanner with AWS Textract and FastAPI, reducing extraction latency from 16s to 5s (68% improvement) in production.
  • Designed an AI-driven resume parsing pipeline using AWS SQS for async message processing, improving ingestion throughput by 40% under peak load.
  • Evaluated models using Accuracy, Precision, Recall, F1, and BLEU metrics across 6 production use cases to validate deployment readiness.

Data Science Intern

Celebal Technologies
05.2023 - 07.2023
  • Built and evaluated ML/DL classification and object detection models in Python and TensorFlow; implemented full data preprocessing and feature engineering pipelines for 3 internal datasets.

Education

B.Tech - Computer Science & Engineering (AI & ML)

Galgotias University
08.2020 - 05.2024

Skills

  • Python
  • PyTorch
  • Transformers
  • FastAPI
  • Pythia
  • BERT
  • RoBERTa
  • T5
  • GPT-2
  • Whisper
  • RAG pipelines
  • LoRA/QLoRA fine-tuning
  • Agentic AI
  • Prompt Engineering
  • PGVector
  • FAISS
  • Pinecone
  • ChromaDB
  • AWS (Lambda, Bedrock, S3, SQS, Textract, IAM)
  • Azure OpenAI
  • Accuracy
  • Precision
  • Recall
  • F1-Score
  • BLEU
  • C
  • Node.js
  • TensorFlow
  • Scikit-learn
  • Git
  • Postman
  • VS Code

Leadership Experience

  • Led a 4-person engineering team delivering 3 AI application features to production, coordinating design, development, and cloud deployment across a 6-month cycle.
  • Mentored 2 junior developers in Python, LLM fine-tuning workflows, and AWS-based AI deployment; both now work independently on production systems.

Certifications and Awards

  • Employee of the Month, AventIQ, 2025-06-01
  • CrewAI Multi-Agent Systems, Course Credential
  • Fundamentals of Quantization, deeplearning.ai

Projects

  • E-Commerce Product Recommendation RAG System (Python, AWS S3, SageMaker, OpenSearch Serverless): End-to-end production RAG pipeline for personalised product recommendations; AWS-native data ingestion with SageMaker-hosted embeddings and OpenSearch Serverless for vector retrieval. Currently in active development.
  • AI Business Card Scanner Pro (Python, AWS Textract, FastAPI): Intelligent document extraction backend with optimised inference pipeline; reduced extraction latency from 16s to 5s (68% improvement) in production.
  • LLM Fine-Tuning & Quantization Pipeline (Python, PyTorch, Transformers, Hugging Face): Applied FP16 and 8-bit LoRA/QLoRA quantization to reduce model size for production deployment. Fine-tuned on domain-specific datasets; shipped 4 models to Hugging Face Hub.
  • AI-Driven Resume Parser (ATS) (Python, AWS SQS, NLP): Async resume parsing and document classification pipeline using AWS SQS for message queuing; improved ingestion throughput by 40% under concurrent load through non-blocking architecture.
