KARTHIK BOINEPALLY

Senior AI Engineer

Bengaluru

Summary

Senior AI Engineer with 5+ years across backend engineering and applied GenAI, owning production-grade RAG and agentic systems in regulated legal and finance workflows. Led end-to-end development of a B2C/B2B GenAI legal research product serving ~700 daily users, architecting hybrid retrieval at scale (Milvus, TF-IDF + dense embeddings, RRF, cross-encoder reranking) to deliver high precision and low-latency performance. Strong in LLM reliability engineering (schema-safe outputs, rate-limit resilience, multi-model fallbacks, token optimization), OCR-driven document pipelines, and decision automation systems with human-in-the-loop controls. Proven technical leader who authors architecture docs, drives design reviews, mentors engineers, and ships measurable outcomes.

Work History

Consultant – Tax Tech Transformation

Ernst & Young LLP

03.2025 - Current

Owned the end-to-end build of taxmann.ai, a production-grade GenAI legal research assistant (B2C & B2B), designing a layered Agentic RAG architecture (hybrid sparse+dense retrieval, cross-encoder reranking, context windowing) to reduce legal research time from hours to ~4 minutes for ~700 daily active users.
Architected large-scale hybrid retrieval on Milvus (TF-IDF + BGE embeddings) with Reciprocal Rank Fusion, improving Top-3 precision from 65% → 92% over a ~4M-document corpus while maintaining p95 latency < 800ms under production load.
Productionized multi-agent workflows using LangGraph (plan → retrieve → reason → validate) to deliver auditable, citation-backed responses at 94% decision accuracy, aligned with Big-4 legal and regulatory compliance requirements.
Re-architected LLM inference and orchestration for production reliability, reducing per-query token usage from ~400K → ~100K tokens by implementing adaptive chunking, schema-validated outputs (Pydantic), exponential backoff for rate limits, and multi-model fallbacks, resulting in lower latency, stable responses, and predictable costs at scale.
Designed and built a near-real-time Input Tax Credit (ITC) recommendation system, ingesting Excel-based line items, validating transactions via Kafka streams from internal ERP systems, and combining heuristics, embedding-based categorization, LLM confidence scoring, and LLM-as-Judge workflows with human-in-the-loop escalation; currently processes ~50K line items/day.
Architected a production Legal Notice Submission system supporting Hindi and English, incorporating OCR-based document ingestion, RAG, and tool-calling workflows to extract verbatim citations and generate legally compliant submissions, significantly reducing manual drafting effort for legal teams.
Led and mentored a team of 6 junior engineers (intern → full-time) by authoring architecture documents, driving design reviews, decomposing ambiguous problem statements into executable systems, and owning critical features end-to-end across multiple GenAI products.
Reviewed PRs and guided execution across 3 concurrent GenAI projects, ensuring architectural consistency, production readiness, and high code quality through pair programming and hands-on ownership of complex components.
Implemented persistent user context and long-term memory using Mem0 + Neo4j to enable persona-aware recommendations, and instrumented RAG pipelines with RAGAS-based evaluation for both development-time decisions and production monitoring (retrieval quality, drift, response performance).
Built MCP-based tool-calling infrastructure to enable research-oriented agent workflows, coordinating multiple tools via structured schemas and remote MCP servers for scalable, modular reasoning pipelines.

Intern – GenAI Engineer – Tax Tech Transformation

Ernst & Young LLP

10.2024 - 03.2025

Developed a GenAI-enabled PPT generator that automates the creation of presentations compliant with Big 4 standards.

Software Applications Engineer – Wireless Backend Engineer

Extreme Networks

02.2023 - 04.2024

Implemented Apache Flink stream processing, improving real-time data processing and anomaly detection by 25%.
Led the development and optimization of gRPC and REST APIs, improving system integration and scalability, and collaborated with four global teams to align backend services with front-end user interfaces.
Optimized Elasticsearch queries, reducing data load time on customer screens from 1 minute to 5 seconds.
Led the development of a Q&A chatbot, automating response generation and reducing user escalations by 40%.

Software Engineer – Java Full Stack Developer

Societe Generale Global Solution Centre

12.2020 - 01.2023

Migrated back-office data from HBase to MongoDB, increasing data processing efficiency by 60%.
Automated processes with Selenium and Cucumber, increasing business analysts' testing efficiency by 50%.
Migrated 1M records from legacy Oracle databases to newer versions, saving $10M annually and boosting performance.
Led migration of client data from on-premises servers to Azure using Azure Data Factory and Databricks, saving the bank $300K annually and improving data accessibility.
Developed REST APIs using Spring Boot, JPA, and Hibernate for upstream and downstream applications, achieving response times of 100 ms to 1 s as measured with Apache JMeter.
Developed a Python-based tool using Pandas to automate data cleaning and structuring, saving $25K per sprint.

Education

Post Graduation - Artificial Intelligence And Data Science

Jio Institute

Navi Mumbai

03-2025

Bachelor of Engineering - Computer Science And Engineering

Sir M Visvesvaraya Institute of Technology

Bengaluru, India

08-2020

Skills

GenAI / Agents / RAG: Agentic RAG, LangChain, LangGraph, Tool Calling, MCP, A2A, OpenAI, Claude, Gemini, Mistral

Retrieval / Search / Ranking: Milvus, pgvector, Elasticsearch, TF-IDF, Reciprocal Rank Fusion, Voyage Embeddings, Voyage Cross-Encoder Reranking

Backend / Data / Streaming: Python, Java, SQL, FastAPI, Flask, Spring Boot, REST, gRPC, Apache Kafka, Apache Flink

Cloud / Datastores / OCR / DevOps: Azure OpenAI, AI Foundry, Blob Storage, Azure Document Intelligence, AWS Lambda, AWS S3, Vertex AI, PostgreSQL, MongoDB, Neo4j, Mistral OCR, Git, Jenkins

Certification

Microsoft Certified: Azure Fundamentals - Issued by Microsoft
