Manish Kumar Maurya

Senior Technical Lead | Deloitte
Bangalore

Summary

AWS Certified Solutions Architect Associate and Certified AI Practitioner with 15+ years of experience designing and delivering large-scale, cloud-native applications using Java, Spring Boot, microservices, Kafka, Docker, Kubernetes, and AWS. Expertise in architecting and implementing Generative AI and Agentic AI applications with LLMs, RAG, and MCP servers, using frameworks such as LangChain, LangGraph, OpenAI Agents SDK, OpenAI Agent Builder, and Amazon Bedrock AgentCore. Built enterprise AI/ML solutions including AI-powered multi-use chatbots, enterprise RAG-based apps, and intelligent multi-agent Agentic AI systems. Uses GitHub Copilot to accelerate development, improve code quality, and enhance productivity, especially in financial services applications.

Overview

15 years of professional experience
6 Certifications
2 Languages

Work History

Senior Technical Lead | Cloud & AI Solutions

Deloitte USI
04.2019 - Current

Project: Multi-Use AI Assistant

Designed and implemented a multi-agent Agentic AI platform that serves multiple enterprise use cases (FAQs, order tracking, recommendations, refunds) through a single conversational interface.

The system uses embedding-based intent classification, agent orchestration, Retrieval-Augmented Generation (RAG), and LLMs, and is deployed on Kubernetes with authentication, scalability, and observability built in.

Tools & Technologies Used:

Languages & Frameworks

  • Python, FastAPI

AI / LLM

  • OpenAI GPT models (gpt-4o-mini)
  • Embedding-based intent classification
  • Retrieval-Augmented Generation (RAG)

Data & Caching

  • Redis (response & intent caching)
  • Vector Database

Security

  • JWT Authentication
  • PII masking
  • Rate limiting

DevOps & Cloud-Native

  • Docker
  • Kubernetes
  • Kubernetes Secrets
  • CI/CD

Observability

  • Langfuse

Project: Intelligent Invoice Processing (IDP)

  • Designed and implemented an AI-driven Intelligent Document Processing (IDP) solution that uses Large Language Models (LLMs) to automatically extract structured JSON data from unstructured, vendor-specific PDF invoices, ensuring high accuracy, schema compliance, and minimal manual intervention for finance workflows.
  • Architected a modular invoice ingestion pipeline to process multi-page, vendor-specific PDF invoices with robust text extraction and normalization
  • Designed a schema-first extraction approach for invoice fields such as Invoice Number, Vendor Name, Invoice Date, Line Items, Tax, and Total Amount
  • Engineered constraint-based prompts to enforce strict JSON output, null handling, numeric normalization, and ISO date formats for financial accuracy
  • Integrated OpenAI GPT models as the core extraction engine, optimizing for cost, latency, and token usage through prompt control and content truncation
  • Implemented defensive JSON parsing and fallback mechanisms to ensure pipeline reliability for financial systems
  • Designed APIs to expose structured invoice data for downstream systems.
  • Reduced manual invoice data entry effort by 70%
  • Improved invoice extraction accuracy by 40% compared to OCR + regex solutions
  • Accelerated invoice processing turnaround time by 50%
  • Enabled scalable automation for high-volume finance and accounting operations

Tools & Technologies

Python, OpenAI GPT (gpt-4.1-mini), Prompt Engineering, Pandas, dotenv, Langfuse (Observability)

Project: Context-Aware LLM Optimization Platform

Designed and implemented an enterprise-grade Retrieval-Augmented Generation (RAG) platform that supplies Large Language Models (LLMs) with relevant context to improve accuracy while reducing cost and latency. The solution focuses on intelligent context management rather than model selection, enabling scalable, cost-efficient, and reliable LLM adoption for enterprise knowledge systems.

Roles and Responsibilities:

  • Architected a multi-stage RAG pipeline incorporating semantic chunking, query rewriting, reranking, and context compression to improve response accuracy and efficiency.
  • Implemented semantic chunking based on document structure (policy-level sections) to preserve business meaning and improve retrieval recall.
  • Built a high-recall vector retrieval layer using embeddings and Chroma Vector DB to ensure relevant knowledge is retrieved consistently.
  • Developed an LLM-based query rewriting mechanism to normalize user queries into enterprise policy language and handle ambiguity (e.g., SLA response vs resolution).
  • Integrated a cross-encoder reranking layer to improve precision by selecting the most semantically relevant context before answer generation.
  • Implemented context compression to extract only query-relevant information from retrieved documents, significantly reducing prompt size while maintaining correctness.
  • Enforced strict answer grounding and guardrails to prevent hallucinations.
  • Designed and implemented a metrics and observability layer to track token usage, latency, and cost per request.
  • Conducted comparative analysis (with vs without context compression), demonstrating measurable improvements in system efficiency in terms of cost and latency.

Tools & Technologies

  • Languages: Python
  • LLMs: OpenAI GPT-4o models
  • Embeddings: OpenAI Embeddings
  • Vector Database: Chroma
  • Reranking: Sentence-Transformers Cross-Encoder
  • Metrics: Token accounting, latency measurement, cost analysis

Project: FALCM (Edward Jones)

  • Modernized a Fortune 500 financial services platform for better advisor/branch experience.
  • Designed and implemented Java 17 + Spring Boot microservices on AWS.
  • Built REST APIs and CI/CD pipelines.
  • Contributed to architecture, development, and integration of large-scale systems.
  • Managed a team of 12 members.
  • Mentored junior developers for improved coding skills and understanding of best practices, leading to increased overall code quality within the team.
  • Collaborated with cross-functional teams to design and implement innovative technical solutions, fostering streamlined communication channels and efficient problem resolution.
  • Developed technical solutions.

Project: M&H Digital Transformation (AXA)

  • Led end-to-end delivery of integration systems for insurance digital transformation.
  • Designed and delivered 8+ APIs interacting with core systems.
  • Set up build & release pipelines with DevOps teams.

Project: CXP (Kroger)

  • Led design and delivery of a customer experience platform.
  • Coordinated with business teams for loyalty, pricing, and promotions modernization.

Earlier Projects: IBM, Etisalat, Skidata, Infinite, Capgemini, Torry Harris

Education

B.Tech - Computer Science

UCER
Naini, Allahabad

Skills

Cloud & Architecture: AWS, Cloud-Native Architecture, Microservices, API Design, DevOps, CI/CD

Programming: Java (17), Python

AI/ML & GenAI: LangChain, LangGraph, OpenAI Agents SDK, OpenAI Agent Builder, Amazon Bedrock AgentCore, TensorFlow, Keras, RAG, LLMs, Amazon Bedrock, SageMaker, MLOps, MLflow, Vector DB, MCP, Langfuse, LangSmith, Arize, GitHub Copilot, Multi-Agent Systems, Embeddings, NLP, Prompt Engineering, Context Engineering

Containers & Infra: Docker, Kubernetes

Databases: Oracle, MySQL, DB2, MongoDB

Tools: Git, Bitbucket, Maven, Jenkins, SonarQube, JIRA, Splunk, Grafana, Dynatrace

Agile project management

Certification

AWS Certified Solutions Architect – Associate

Key Highlights

  • 15+ years in Cloud-Native Architecture & Enterprise Application Development
  • Deloitte-certified AI Solution Architect
  • Expertise in designing and implementing multi-agent systems
  • Hands-on with Generative AI, Machine Learning, and Agentic AI frameworks
  • Strong Financial Services & Insurance domain experience
  • Experienced in leading teams, mentoring developers, and driving digital transformation

Languages

English | Hindi
