Vaibhav Phutane

Sr. Tech Lead

CommerceIQ

Bangalore

08.2021 - Current

Multi-Agent Platform

Spearheading the design and development of a scalable multi-agent platform using LangGraph, LangSmith, and Python, enabling rapid orchestration and deployment of complex AI workflows.
Engineered a Supervisor Agent to manage sub-agent interactions, enforce task hierarchies, and coordinate autonomous agent behavior under human-in-the-loop or fully autonomous settings.
Designed and developed a prompt management system with prompt versioning and evaluation controls.
Designed and developed a Data Agent that interprets natural language queries to retrieve data from multiple sources using a custom domain-specific language (DSL).
Designed and developed an eCommerce RCA Agent to analyze root causes behind drops in KPIs.

Agent Evaluation & Quality Frameworks

Designed a comprehensive evaluation strategy combining automated and LLM-based techniques:
Developed an evaluation dataset generation strategy, building datasets both manually and from LangSmith traces.
Built custom code evaluators for domain-specific use cases.
Implemented LLM-as-a-Judge evaluation, enabling nuanced, context-aware assessment of agent outputs.
Conducted extensive agent and sub-agent testing, including agent mocking to isolate specific behaviors.
Defined evaluation metrics such as Correctness, Precision, Recall, and F1 Score to benchmark performance across workflows.
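
As an illustration of the Precision, Recall, and F1 metrics named above, here is a minimal sketch. It assumes an agent's output and its reference answer can each be reduced to a set of items; the function name and the sample values are hypothetical and are not taken from the actual evaluators.

```python
def precision_recall_f1(predicted: set, expected: set) -> tuple[float, float, float]:
    """Score a predicted item set against a reference item set."""
    # Items the agent returned that also appear in the reference answer.
    true_positives = len(predicted & expected)
    # Guard against empty sets so the ratios never divide by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the agent returned three SKUs, two of which are correct.
scores = precision_recall_f1({"sku-1", "sku-2", "sku-3"},
                             {"sku-2", "sku-3", "sku-4", "sku-5"})
```

Set-based scoring is only one option; free-text outputs would need a different comparison, such as the LLM-as-a-Judge approach mentioned above.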

Multi-Agent Platform

Developed an on-premise agent deployment pipeline with integrated CI/CD workflows, ensuring scalable, secure, and compliant rollouts across enterprise environments.

Insights Generation Platform

Architected and delivered a GenAI-powered Insights Generation Platform capable of interpreting any data visualization by extracting semantics from metrics (columns) and dimensions (rows), enabling contextual and actionable data analysis using Python, LangChain, LangSmith, JavaScript, and TypeScript.
Developed a Neo4j-based metric relationship graph, modeling interdependencies between business metrics (e.g., ROAS → Ad Sales, Ad Spend). Each metric node encapsulated rich metadata: name, description, priority, and business goal, facilitating traceability and automated causal inference.
Engineered dynamic insight generation using LLMs, combining structured graph data and semantic context to uncover key drivers and anomalies across multiple business dimensions. The system could intelligently expand dimensional hierarchies, such as navigating from category → subcategory → SKU level, to produce fine-grained insights.
Created a suite of transformer functions to translate complex JSON data structures into fluent, human-readable sentences, improving the interpretability and accuracy of AI-generated insights.
Implemented adaptive metric inference by integrating metric relationships during insight construction, allowing the platform to auto-supplement missing but relevant metrics based on learned dependencies.
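
The metric relationship graph and the auto-supplementing of related metrics can be sketched as follows. This is a simplified illustration that uses a plain in-memory dictionary in place of Neo4j; the class name, function name, and the priority and business-goal values are assumptions made for the example, not details of the production system.

```python
from dataclasses import dataclass, field


@dataclass
class MetricNode:
    # Metadata fields mirror those listed above; the values used below are hypothetical.
    name: str
    description: str
    priority: int
    business_goal: str
    depends_on: list[str] = field(default_factory=list)


GRAPH = {
    "ROAS": MetricNode("ROAS", "Return on ad spend", 1, "Ad efficiency",
                       ["Ad Sales", "Ad Spend"]),
    "Ad Sales": MetricNode("Ad Sales", "Sales attributed to ads", 2, "Revenue"),
    "Ad Spend": MetricNode("Ad Spend", "Advertising cost", 2, "Cost control"),
}


def supplement_metrics(requested: list[str], graph: dict[str, MetricNode]) -> list[str]:
    """Return the requested metrics plus any metrics they depend on (breadth-first)."""
    result = list(requested)
    seen = set(requested)
    queue = list(requested)
    while queue:
        node = graph.get(queue.pop(0))
        if node is None:
            continue  # Metric is not modeled in the graph; nothing to add.
        for dependency in node.depends_on:
            if dependency not in seen:
                seen.add(dependency)
                result.append(dependency)
                queue.append(dependency)
    return result


print(supplement_metrics(["ROAS"], GRAPH))  # ['ROAS', 'Ad Sales', 'Ad Spend']
```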

Text to SQL

Designed and led the development of a Text-to-SQL agent with modular components for table metadata retrieval, query generation, validation, and human-in-the-loop feedback using Python, LangGraph, LangChain, OpenAI GPT-4o, and LLaMA models.
Evaluated several off-the-shelf offerings, including Genie from Databricks, Defog, and StarCoder.
Enabled natural language to SQL conversion with robust validation and reflection mechanisms for improved accuracy and usability.

Content Recommendation

Built a GenAI content recommendation system to analyze trending keywords, PDP content (titles, descriptions, FAQs), and retailer rules using Python, LLMs (GPT-4o models), LangChain, and TypeScript.
Generated optimized titles, bullets, and descriptions for Amazon, Walmart, and Target to improve relevance and compliance.

Sentiment Analysis

Designed and developed a multilingual sentiment analysis Chrome plugin to help category managers monitor product quality, shipping issues, and customer feedback using LangChain, LLMs, Python, and JavaScript.
Enabled real-time sentiment insights across languages to drive faster decision-making and issue resolution.

Software Engineer

ThoughtWorks

Pune

10.2020 - 10.2021

Led front-end systems for large-scale enterprise products using JavaScript, TypeScript, Vue 3, Angular, and an Nx monorepo.
Partnered with AI teams to support front-end integration of ML outputs into configurable UI widgets and dashboards.
Led the transition from a monolithic frontend to a scalable micro-frontend architecture, improving modularity and team autonomy.

Senior Software Engineer

Nihilent Ltd.

Pune

09.2018 - 10.2020

Led development for AI-powered analytics tools visualizing neuroscience signals (EEG, facial emotion, eye tracking) using Python, D3.js, and Google Charts.
Built a reusable visualization and editing UI for the internal media R&D platform, leveraging Angular, Node.js, and Python microservices.
Supported backend integration with early ML models using REST-based pipelines and experimentation environments.

Front-End Developer

DJ Alexander

Pune

03.2017 - 08.2018

Delivered Angular 2+ migration for legacy real estate platforms, optimizing loading time and user interaction for property searches.
Implemented UI/UX enhancements that improved the search-to-selection conversion rate for real estate clients.

Application Developer

BNY Mellon

Pune

07.2016 - 02.2017

Developed batch reporting systems on IBM Mainframe using COBOL and JCL, supporting high-throughput big data pipelines.
Designed and developed anti-money laundering rules using Actimize (Oracle).

Education

Bachelor of Engineering - E&TC
