Anmol Jain

Pune

Summary

Dynamic AI Engineer with experience building scalable, production-grade AI systems across healthcare, insurance, and B2B SaaS domains. Skilled in Python, LLMs, and distributed architectures, with a strong focus on RAG pipelines, multi-agent systems, and document intelligence. Proven track record of deploying end-to-end AI solutions at scale, processing millions of documents and supporting thousands of users. Passionate about designing high-impact AI products that combine deep technical innovation with real-world business outcomes.

Overview

4 years of professional experience
2023
years of post-secondary education

Work History

AI Engineer

Telomere
Columbia
08.2025 - Current
  • Built an agentic AI platform to process large-scale surrogacy documents (up to 2GB per case) and generate structured, citation-backed clinical summaries
  • Designed multi-stage pipelines for document ingestion, parsing, embedding, retrieval, and summarization across fragmented medical records
  • Implemented agentic workflows to extract pregnancy timelines, family medical history, prenatal and delivery summaries, and vaccination records
  • Generated high-impact, structured 4-page clinical summaries, significantly reducing manual chart review effort
  • Engineered robust parsing pipelines for unstructured medical data (PDFs, scans, reports) with high accuracy
  • Developed a contract-aware RAG-based system to validate surrogate expense requests against complex legal agreements
  • Designed multi-agent decision pipelines combining retrieval, contract understanding, and reasoning for real-time validation
  • Built logic to evaluate expense eligibility based on financial limits, timelines, and contractual conditions
  • Automated DR request workflows, enabling real-time expenditure validation and reducing manual intervention
  • Leveraged LLM-based retrieval pipelines to interpret legal documents dynamically and enforce compliance
  • Deployed and managed scalable infrastructure using AWS CDK and CloudFormation
  • Configured load balancers and implemented horizontal scaling to handle high-throughput workloads
  • Built monitoring and alerting systems to track performance, latency, and cost spikes in production environments

AI Engineer

Accacia
Bangalore
01.2025 - Current
  • Designed and deployed an AI-driven outreach automation platform for hyper-personalized B2B messaging:
    - Built a full-stack architecture using FastAPI, Supabase, and PostgreSQL, with scalable cloud-native deployment
    - Implemented multi-agent pipelines for lead research, tone control, and qualification logic, enabling modular personalization at scale
    - Onboarded 10k+ users and tracked adoption with PostHog
    - Applied entity extraction and semantic scoring to tailor messages dynamically to company, industry, and persona context
  • Engineered an AI-powered Chrome extension for sales workflows:
    - Automated LinkedIn and Gmail outreach
    - Enabled real-time, context-aware message generation powered by OpenAI and Perplexity, improving outreach efficiency and conversion
  • Built an AI-powered pitch deck generation system combining graph databases with generative AI:
    - Modeled company, industry, and persona relationships in Neo4j, enabling GraphRAG-style enrichment
    - Queried the graph to assemble structured context, then orchestrated AI pipelines to generate tailored PowerPoint sales decks automatically
  • Delivered production-ready AI pipelines with polling, chunking, and fallback logic, ensuring reliability, fault tolerance, and scalability in customer-facing applications
  • Shaped AI application architecture by integrating modern stacks (vector search, multi-agent workflows, CRM APIs) with real-world customer needs

Software Engineer

Turtlemint
Pune
08.2022 - 10.2024
  • Architected and deployed a document understanding system:
    - Fine-tuned a LayoutLM transformer, achieving a 95% F1-score on 103 entity classes
    - Engineered data preprocessing with attention mechanisms and spatial embeddings
    - Processed 5000+ diverse document formats with spatial-aware token classification
  • Developed an extraction pipeline using the Claude Haiku LLM:
    - Built a prompting system that reached 98% extraction accuracy across 40+ insurers
    - Engineered a configurable framework for prompts, extraction keys, and key types
    - Optimized token usage through document chunking
    - Processed 10M+ policy documents
  • Built a custom in-house PDF parser using PyMuPDF and pdfminer:
    - Implemented preprocessing and token optimization techniques
    - Achieved 99% accuracy across motor, life, and health insurance documents
    - Unified data from 40+ insurers into structured formats
  • Led end-to-end MLOps implementation and deployment:
    - Deployed models using TorchServe with AWS SageMaker integration
    - Built a scalable, high-throughput architecture with RabbitMQ and Kafka
    - Handled 10K+ requests per hour with automated retraining and performance monitoring
  • Engineered a preprocessing pipeline:
    - Used SVM-based text classification for page ranking
    - Integrated AWS Textract OCR for extraction
    - Reduced processing time from 11 minutes to 11 seconds on 80-page documents
    - Processed over 1 million documents with 99% accuracy

Tech Intern

Turtlemint
Pune
02.2022 - 08.2022
  • Developed dashboards for multiple business verticals using Apache Superset and Metabase, reporting data accuracy metrics and insights backed by complex MySQL and PostgreSQL queries
  • Designed and implemented APIs that calculate data accuracy from OCR confidence levels and string matching against ground-truth data, improving data quality and reliability

Education

BTech - Computer Science

Vishwakarma Institute of Technology
Pune, India

Skills

Languages & Frameworks:

Python, Node.js, TypeScript, Django, FastAPI, React

AI / ML:

LLMs, RAG, Multi-Agent Systems, NLP, OCR, BERT, Word2Vec, XGBoost, Random Forest, Logistic Regression, LSTM, RNN

AI Engineering & Data Systems:

Prompt Engineering, Document Intelligence, GraphRAG, Graph Databases, LangChain

Cloud & Infrastructure:

AWS (EC2, Lightsail), AWS CDK, CloudFormation, Docker, Kubernetes, Load Balancing, Distributed Systems

Data & Backend Systems:

PostgreSQL, MySQL, MongoDB, Neo4j, Apache Kafka, RabbitMQ, Celery

Analytics & Monitoring:

Supabase, Supabase Edge Functions, Sentry, Kibana, Apache Superset, Metabase

Languages

English
Advanced
C1
