
Yash Tripathi

Kanpur

Summary

Data Scientist specializing in AI-driven solutions, query generation, retrieval-augmented generation (RAG), and real-time fraud detection. Proficient in Python, TensorFlow, and cloud platforms (GCP, AWS). Skilled in Machine Learning, Computer Vision, NLP, and MLOps. Certified GCP Professional ML Engineer and Associate Cloud Engineer. Published research on landmark detection using Variational Autoencoders (VAEs).

Overview

1 year of professional experience
1 Certification

Work History

Data Scientist

LUMIQ
Noida
07.2023 - Current
  • Developed AI-driven solutions for automating processes and improving data accuracy.
  • Gained expertise in query generation, intent classification, and retrieval-augmented generation.
  • Designed domain-specific chatbots with safeguards for reliable information delivery.
  • Built real-time fraud detection models using OCR and embeddings, improving processing efficiency on large datasets.
  • Strengthened skills in API development, database management, and AI-based automation.

Education

B.Tech - Electrical Engineering

National Institute of Technology
Silchar, Assam, India
05.2023

XII (CBSE)

Archies Higher Secondary School Kanpur
Kanpur, Uttar Pradesh, India
05.2018

Skills

  • Programming: Python, SQL, C
  • Gen AI: RAG, GraphRAG, Transformers, LLMs, GPT
  • AWS Services: S3, EC2, RDS, Bedrock, Redshift, SageMaker (for ML model training)
  • GCP Services: BigQuery, Vertex AI, AutoML, AI Hub, Dataflow, Cloud Storage
  • Vector Stores: ChromaDB, Milvus, Pinecone, Knowledge Graph
  • AI/ML: Computer Vision, NLP, Deep Learning, Machine Learning (Supervised, Unsupervised, Reinforcement Learning), Time Series Analysis, Predictive Modeling
  • Data Science: Data Wrangling, Feature Engineering, Data Visualization (Matplotlib, Seaborn), Statistical Analysis (Hypothesis Testing, Regression)
  • ML Frameworks: TensorFlow, PyTorch, Scikit-learn, XGBoost
  • DevOps: Git, Docker, CI/CD for ML (MLOps), MLflow
  • APIs: FastAPI

Certification

  • Google Cloud Platform Associate Cloud Engineer
  • Google Cloud Platform Professional ML Engineer

Projects

Text to SQL: Developed a system that processes user input, classifies query intent, and filters for business-related requests. Generated SQL queries against a Postgres server with 200+ tables using ChromaDB, Claude 3 Sonnet, and GraphRAG search. Improved query accuracy by 85% by populating a data dictionary up front.

Guardrail Chatbot: Designed a chatbot to deliver accurate, domain-specific information from a curated dataset. Integrated guardrails to prevent off-topic responses, increasing response accuracy by 60% and improving user trust.

Negative News Search: Built a user profile creation system using advanced tagging (negative tags) for personalized recommendations. Used FastAPI and GPT-3.5, improving recommendation accuracy by 67% and reducing profile processing time by 25%.

Claim Dedupe: Deployed a fraud detection system for insurance claims that uses OCR together with image and text embeddings to detect alterations. Integrated a vector database, improving fraud detection accuracy by 65% and reducing redundant embeddings by 25%, which cut processing time by 20%.
