KARTHIK BOINEPALLY

Senior AI Engineer
Bengaluru

Summary

Senior AI Engineer with 5+ years across backend engineering and applied GenAI, owning production-grade RAG and agentic systems in regulated legal and finance workflows. Led end-to-end development of a B2C/B2B GenAI legal research product serving ~700 daily users, architecting hybrid retrieval at scale (Milvus, TF-IDF + dense embeddings, RRF, cross-encoder reranking) to deliver high precision and low-latency performance. Strong in LLM reliability engineering (schema-safe outputs, rate-limit resilience, multi-model fallbacks, token optimization), OCR-driven document pipelines, and decision automation systems with human-in-the-loop controls. Proven technical leader who authors architecture docs, drives design reviews, mentors engineers, and ships measurable outcomes.

Overview

5 years of professional experience
4 Certifications
1 Language

Work History

Consultant – Tax Tech Transformation

Ernst & Young LLP
03.2025 - Current
  • Owned the end-to-end build of taxmann.ai, a production-grade GenAI legal research assistant (B2C & B2B), designing a layered Agentic RAG architecture (hybrid sparse+dense retrieval, cross-encoder reranking, context windowing) to reduce legal research time from hours to ~4 minutes for ~700 daily active users.
  • Architected large-scale hybrid retrieval on Milvus (TF-IDF + BGE embeddings) with Reciprocal Rank Fusion, improving Top-3 precision from 65% to 92% over a ~4M-document corpus while maintaining p95 latency under 800 ms under production load.
  • Productionized multi-agent workflows using LangGraph (plan → retrieve → reason → validate) to deliver auditable, citation-backed responses at 94% decision accuracy, aligned with Big-4 legal and regulatory compliance requirements.
  • Re-architected LLM inference and orchestration for production reliability, cutting per-query token usage from ~400K to ~100K tokens through adaptive chunking, schema-validated outputs (Pydantic), exponential backoff for rate limits, and multi-model fallbacks; the result was lower latency, stable responses, and predictable costs at scale.
  • Designed and built a near-real-time Input Tax Credit (ITC) recommendation system, ingesting Excel-based line items, validating transactions via Kafka streams from internal ERP systems, and combining heuristics, embedding-based categorization, LLM confidence scoring, and LLM-as-Judge workflows with human-in-the-loop escalation; currently processes ~50K line items/day.
  • Architected a production Legal Notice Submission system supporting Hindi and English, incorporating OCR-based document ingestion, RAG, and tool-calling workflows to extract verbatim citations and generate legally compliant submissions, significantly reducing manual drafting effort for legal teams.
  • Led and mentored a team of 6 junior engineers (intern → full-time) by authoring architecture documents, driving design reviews, decomposing ambiguous problem statements into executable systems, and owning critical features end-to-end across multiple GenAI products.
  • Reviewed PRs and guided execution across 3 concurrent GenAI projects, ensuring architectural consistency, production readiness, and high code quality through pair programming and hands-on ownership of complex components.
  • Implemented persistent user context and long-term memory using Mem0 + Neo4j to enable persona-aware recommendations, and instrumented RAG pipelines with RAGAS-based evaluation for both development-time decisions and production monitoring (retrieval quality, drift, response performance).
  • Built MCP-based tool-calling infrastructure to enable research-oriented agent workflows, coordinating multiple tools via structured schemas and remote MCP servers for scalable, modular reasoning pipelines.

Intern – GenAI Engineer – Tax Tech Transformation

Ernst & Young LLP
10.2024 - 03.2025
  • Developed a GenAI-enabled PPT generator tool that automates the creation of presentations compliant with Big 4 standards.

Software Applications Engineer – Wireless Backend Engineer

Extreme Networks
02.2023 - 04.2024
  • Implemented Apache Flink for stream processing, enhancing real-time data processing and anomaly detection by 25%.
  • Led the development and optimization of gRPC and REST APIs, improving system integration and scalability, and collaborated with four global teams to align backend services with front-end user interfaces.
  • Optimized Elasticsearch queries, reducing data load time on customer screens from 1 minute to 5 seconds.
  • Led the development of a Q&A Chatbot, automating response generation and reducing user escalations by 40%.

Software Engineer – Java Full Stack Developer

Societe Generale Global Solution Centre
12.2020 - 01.2023
  • Migrated back-office data from HBase to MongoDB, increasing data processing efficiency by 60%.
  • Automated processes with Selenium and Cucumber, increasing business analysts' testing efficiency by 50%.
  • Migrated 1M records from legacy Oracle databases to newer versions, saving $10M annually and boosting performance.
  • Led migration of client data from on-premises servers to Azure using Azure Data Factory and Databricks, saving the bank $300K annually and improving data accessibility.
  • Developed REST APIs using Spring Boot, JPA, and Hibernate for upstream and downstream applications, achieving response times of 100 ms to 1 s as measured with Apache JMeter.
  • Developed a Python-based tool using Pandas to automate data cleaning and structuring, saving $25K per sprint.

Education

Post Graduation - Artificial Intelligence And Data Science

Jio Institute
Navi Mumbai
03-2025

Bachelor of Engineering - Computer Science And Engineering

Sir M Visvesvaraya Institute of Technology
Bengaluru, India
08-2020

Skills

GenAI / Agents / RAG: Agentic RAG, LangChain, LangGraph, Tool Calling, MCP, A2A, OpenAI, Claude, Gemini, Mistral

Certification

Microsoft Certified: Azure Fundamentals - Issued by Microsoft
