Vaibhav Phutane

Summary

Sr. Lead Engineer at CommerceIQ with 9 years of experience in AI-driven product development, specializing in GenAI, multi-agent systems, and scalable platform architecture. I designed and led the development of a multi-agent platform and built a Generative AI-powered insights engine that transformed data analysis and automated business decisions. Adept in Python, JavaScript, and TypeScript, with a strong track record of technical leadership, cross-functional collaboration, and delivering production-ready AI solutions that drive measurable business outcomes.

Overview

9
years of professional experience

Work History

Sr. Tech Lead

CommerceIQ
Bangalore
08.2021 - Current

Multi-Agent Platform

  • Spearheading the design and development of a scalable multi-agent platform using LangGraph, LangSmith, and Python, enabling rapid orchestration and deployment of complex AI workflows.
  • Engineered a Supervisor Agent to manage sub-agent interactions, enforce task hierarchies, and coordinate autonomous agent behavior under human-in-the-loop or fully autonomous settings.
  • Designed and developed a prompt management system with prompt versioning and evaluation controls.
  • Designed and developed a Data Agent that interprets natural language queries to retrieve data from multiple sources using a custom domain-specific language (DSL).
  • Designed and developed an eCommerce RCA Agent to analyze root causes behind drops in KPIs.

Agent Evaluation & Quality Frameworks

  • Designed a comprehensive evaluation strategy combining automated and LLM-based techniques:
  • Developed an evaluation dataset generation strategy, building datasets both manually and from LangSmith traces.
  • Built custom code evaluators for domain-specific use cases.
  • Implemented LLM-as-a-Judge evaluation, enabling nuanced, context-aware assessment of agent outputs.
  • Conducted extensive agent and sub-agent testing, including agent mocking to isolate specific behaviors.
  • Defined evaluation metrics such as Correctness, Precision, Recall, and F1 Score to benchmark performance across workflows.
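
As an illustration of the benchmark metrics listed above, the following is a minimal sketch of how precision, recall, and F1 can be computed for a single agent output against a labeled reference set. The function name and the sample labels are hypothetical and are not taken from the production evaluators.

```python
def precision_recall_f1(expected: set[str], predicted: set[str]) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for one agent output against a reference set."""
    true_positives = len(expected & predicted)
    # Guard against empty sets so the ratios never divide by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical example: root causes reported by an agent vs. the labeled reference.
expected = {"price_increase", "out_of_stock", "lost_buy_box"}
predicted = {"price_increase", "out_of_stock", "ad_budget_cut", "seasonality"}
p, r, f1 = precision_recall_f1(expected, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Treating outputs as sets keeps the metric independent of ordering; a ranked or weighted variant would need a different formulation.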

Multi-Agent Platform

  • Developed an on-premise agent deployment pipeline with integrated CI/CD workflows, ensuring scalable, secure, and compliant rollouts across enterprise environments.

Insights Generation Platform

  • Architected and delivered a GenAI-powered Insights Generation Platform capable of interpreting any data visualization by extracting semantics from metrics (columns) and dimensions (rows), enabling contextual and actionable data analysis using Python, LangChain, LangSmith, JavaScript, and TypeScript.
  • Developed a Neo4j-based metric relationship graph, modeling interdependencies between business metrics (e.g., ROAS → Ad Sales, Ad Spend). Each metric node encapsulated rich metadata: name, description, priority, and business goal, facilitating traceability and automated causal inference.
  • Engineered dynamic insight generation using LLMs, combining structured graph data and semantic context to uncover key drivers and anomalies across multiple business dimensions. The system could intelligently expand dimensional hierarchies, such as navigating from category → subcategory → SKU level, to produce fine-grained insights.
  • Created a suite of transformer functions to translate complex JSON data structures into fluent, human-readable sentences, improving the interpretability and accuracy of AI-generated insights.
  • Implemented adaptive metric inference by integrating metric relationships during insight construction, allowing the platform to auto-supplement missing but relevant metrics based on learned dependencies.
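
The idea of auto-supplementing related metrics can be sketched as a traversal over the metric relationship graph. The example below is a simplified illustration only: it uses a plain in-memory dictionary as a stand-in for the Neo4j graph, and the names METRIC_DRIVERS and supplement_metrics are hypothetical. Only the ROAS → Ad Sales, Ad Spend relationship comes from the description above.

```python
# Hypothetical in-memory stand-in for the metric relationship graph:
# each metric maps to the metrics that drive it.
METRIC_DRIVERS: dict[str, list[str]] = {
    "ROAS": ["Ad Sales", "Ad Spend"],
    "Ad Sales": [],
    "Ad Spend": [],
}


def supplement_metrics(requested: list[str], graph: dict[str, list[str]]) -> list[str]:
    """Return the requested metrics plus every driver reachable in the graph (depth-first order)."""
    result: list[str] = []
    seen: set[str] = set()
    stack = list(reversed(requested))
    while stack:
        metric = stack.pop()
        if metric in seen:
            continue  # skip metrics already collected, which also protects against cycles
        seen.add(metric)
        result.append(metric)
        # Push drivers in reverse so they are visited in their listed order.
        stack.extend(reversed(graph.get(metric, [])))
    return result


# A view that only requests ROAS is expanded with the metrics that explain it.
expanded = supplement_metrics(["ROAS"], METRIC_DRIVERS)
print(expanded)
```

Metrics that are absent from the graph are passed through unchanged, so an unknown metric never blocks insight generation in this sketch.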

Text to SQL

  • Designed and led the development of a Text-to-SQL agent with modular components for table metadata retrieval, query generation, validation, and human-in-the-loop feedback using Python, LangGraph, LangChain, OpenAI (GPT-4o), and LLaMA models.
  • Evaluated several off-the-shelf offerings, including Genie from Databricks, Defog, and StarCoder.
  • Enabled natural language to SQL conversion with robust validation and reflection mechanisms for improved accuracy and usability.

Content Recommendation

  • Built a GenAI content recommendation system to analyze trending keywords, PDP content (titles, descriptions, FAQs), and retailer rules using Python, LLMs (GPT-4o models), LangChain, and TypeScript.
  • Generated optimized titles, bullets, and descriptions for Amazon, Walmart, and Target to improve relevance and compliance.

Sentiment Analysis

  • Designed and developed a multilingual sentiment analysis Chrome plugin to help category managers monitor product quality, shipping issues, and customer feedback using LangChain, LLMs, Python, and JavaScript.
  • Enabled real-time sentiment insights across languages to drive faster decision-making and issue resolution.

Software Engineer

ThoughtWorks
Pune
10.2020 - 10.2021
  • Led front-end systems for large-scale enterprise products using JavaScript, TypeScript, Vue 3, Angular, and an Nx mono-repo.
  • Partnered with AI teams to support front-end integration of ML outputs into configurable UI widgets and dashboards.
  • Led the transition from a monolithic frontend to a scalable micro-frontend architecture, improving modularity and team autonomy.

Senior Software Engineer

Nihilent Ltd.
Pune
09.2018 - 10.2020
  • Led development for AI-powered analytics tools visualizing neuroscience signals (EEG, facial emotion, eye tracking) using Python, D3.js, and Google Charts.
  • Built a reusable visualization and editing UI for the internal media R&D platform, leveraging Angular, Node.js, and Python microservices.
  • Supported backend integration with early ML models using REST-based pipelines and experimentation environments.

Front-End Developer

DJ Alexander
Pune
03.2017 - 08.2018
  • Delivered Angular 2+ migration for legacy real estate platforms, optimizing loading time and user interaction for property searches.
  • Implemented UI/UX enhancements that improved the search-to-selection conversion rate for real estate clients.

Application Developer

BNY Mellon
Pune
07.2016 - 02.2017
  • Developed batch reporting systems on IBM Mainframe using COBOL and JCL, supporting high-throughput, big data pipelines.
  • Designed and developed anti-money laundering rules using Actimize (Oracle).

Education

Bachelor of Engineering - E&TC

Pune Institute of Computer Technology (PICT)
Pune
05.2016

Skills

  • LLM and multi-agent systems: LangChain, LangGraph, LangSmith, RAG pipelines, OpenAI, LLaMA 2, prompt engineering, text-to-SQL, agentic platforms
  • AI infrastructure: Pinecone, Neo4j, GraphDB, Databricks Delta Lake, model serving, GCP, AWS
  • Programming and frameworks: Python, JavaScript, TypeScript, Node.js, Vue 3, React
  • Cloud and DevOps: AWS (Lambda, EC2, Redis, CloudFront, S3), Terraform, Jenkins, GitHub Actions, and Bitbucket Pipelines
  • Testing and tooling: New Relic, Grafana, Jest, Cypress, Vue Testing Library, MSW
  • Architecture: micro frontends, mono-repo (Nx), scalable APIs

Blog Post

  • Evaluating AI Agents: https://vap1231.medium.com/evaluating-ai-agents-5fa61b8a815

Timeline

Sr. Tech Lead

CommerceIQ
08.2021 - Current

Software Engineer

ThoughtWorks
10.2020 - 10.2021

Senior Software Engineer

Nihilent Ltd.
09.2018 - 10.2020

Front-End Developer

DJ Alexander
03.2017 - 08.2018

Application Developer

BNY Mellon
07.2016 - 02.2017

Bachelor of Engineering - E&TC

Pune Institute of Computer Technology (PICT)