DIGVIJAY PHUTANE

Data Scientist
Mumbai

Summary

Data Scientist with 2.6+ years of experience designing and deploying AI-driven solutions and machine learning models. Skilled in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and AI agent frameworks such as LangChain and LangGraph. Proven ability to build scalable, production-grade applications using Python, FastAPI, and Flask on AWS. Experienced in modern AI development workflows, leveraging tools like Claude and GitHub Copilot for rapid prototyping and efficient development, and proficient in designing and automating AI agent workflows using n8n.

Overview

3 years of professional experience
2 certifications

Work History

Data Scientist

Neosoft Technologies Pvt Ltd
Mumbai
10.2023 - Current
  • Designed and implemented AI agent-based systems using LangChain, LangGraph, CrewAI, and n8n to automate complex multi-step workflows and enhance model-driven decision-making.
  • Built end-to-end Retrieval-Augmented Generation (RAG) pipelines integrating GPT and LLaMA models with vector databases (Qdrant, Chroma) for intelligent document retrieval and context-aware responses.
  • Developed end-to-end AI/ML pipelines following industry best practices, including data ingestion, preprocessing, model training, evaluation, deployment, monitoring, and retraining.
  • Developed real-time computer vision solutions using OpenCV and YOLO for object detection, image classification, and video analytics, including custom dataset preparation from diverse image and video sources.
  • Built and deployed scalable REST APIs using FastAPI and Flask to serve machine learning models in production environments.

Skills

  • Python
  • Machine Learning
  • Deep Learning
  • Large Language Models (LLMs)
  • Retrieval-Augmented Generation (RAG)
  • Natural Language Processing (NLP)
  • Transfer Learning
  • Model Fine-tuning
  • LangChain
  • LangGraph
  • CrewAI
  • TensorFlow
  • PyTorch
  • Keras
  • Scikit-learn
  • OpenCV
  • YOLO
  • FastAPI
  • Flask
  • Microsoft SQL Server
  • MySQL
  • MongoDB
  • Qdrant
  • Chroma
  • HTML
  • CSS
  • Gradio
  • Streamlit
  • AWS
  • Docker
  • n8n (AI agent workflows)
  • Claude
  • GitHub Copilot

Education

Bachelor of Engineering - Information Technology

VPP College of Engineering
01.2023

HSC 12th - Science

S.S Jr College Seawoods
01.2019

Projects

  • GenAI Employee Profile & Job Description Matching System:

    • Built an AI employee-job matching system using LLMs (Groq/Ollama) for semantic analysis, improving match accuracy.
    • Implemented RAG pipelines with Qdrant/Chroma for context-aware candidate evaluation.
    • Developed LangChain-based agents for JD parsing, profile analysis, and scoring.
    • Created a weighted scoring system for explainable candidate ranking.
    • Built scalable Python (Flask) APIs for file ingestion and real-time processing.
    • Optimized multi-LLM workflows for low latency and cost efficiency.

  • AI Video Indexer / Video Analyzer:

    • Developed a full-stack GenAI-powered video indexing and analysis platform using Python, FastAPI, and Next.js to enable intelligent search, faster video insights, and reduced manual review effort.
    • Built an end-to-end video understanding pipeline leveraging LLMs, embeddings, and vector search (ChromaDB) to enable semantic retrieval of relevant video segments from natural language queries.
    • Implemented multimodal AI capabilities using YOLO, OpenCV, and speech-to-text models for object detection, scene segmentation, face recognition, and transcription, transforming unstructured video into structured, searchable metadata.
    • Designed scalable backend architecture with MongoDB, REST APIs, and asynchronous processing to handle large-scale video ingestion, processing, and storage efficiently.

  • SBI Mutual Fund Smart Bot:

    • Developed a RAG-based GenAI chatbot using GPT-4 and FastAPI to enable intelligent query handling across financial documents and APIs, improving information accessibility and response accuracy.
    • Built end-to-end document ingestion pipelines including PDF parsing, chunking, and embedding generation using text-embedding-ada-002 for efficient semantic search.
    • Implemented context-aware retrieval using FAISS/Chroma vector databases to deliver accurate, domain-specific, and compliant responses.
    • Enhanced system performance by optimizing RAG workflows and prompt engineering, improving relevance for financial domain queries.

  • GenAI Resume Screener Application:

    • Developed a GenAI-powered resume screening system using LLaMA 3.2 (Ollama) for automated parsing, summarization, skill evaluation, and question generation.
    • Built an end-to-end pipeline for PDF ingestion, text extraction, embedding generation, and semantic analysis to enable structured candidate evaluation.
    • Implemented LLM-driven summarization and interview question generation with parallel processing for faster response times.
    • Designed a custom scoring framework to classify skills and generate fit-based decisions (Fit/Partial/Not Fit).
    • Developed an interactive UI with MongoDB integration for real-time processing and result management.

  • Ingenero Digitization:

    • Built and deployed predictive ML models (Linear Regression, SVM, Random Forest) for operational forecasting and data-driven decision support.
    • Developed Autoencoder-based anomaly detection systems for early identification of abnormal operating conditions and failure prediction.
    • Designed automated feature selection pipelines for high-dimensional datasets (100+ features), improving model performance and robustness.
    • Implemented a Live Benchmarking Model (LBM) to derive optimal benchmark values from historical plant data for performance comparison.
    • Developed a CoilSim-based simulation model to estimate missing process variables (X components) for accurate prediction of target outputs (Y components).
    • Built a ranking and optimization model to evaluate furnace efficiency and optimize production by adjusting feed rates and COT (Coil Outlet Temperature).

  • SmartServe AI:

    • Developed a RAG-based food ordering chatbot for a hotel, leveraging LangChain, Groq LLM, and Qdrant to enable intelligent menu search and natural language interactions.
    • Designed and implemented intent classification and a guided, multi-step order management workflow (item selection, customization, quantity, delivery details) with validation and error handling.
    • Built an end-to-end retrieval pipeline using embeddings and vector search to provide context-aware responses and accurate menu recommendations.
    • Integrated WhatsApp communication via Twilio for real-time order confirmations and customer notifications.
    • Engineered a scalable FastAPI backend with session management, REST APIs, and a responsive web interface for seamless user experience.
    • Enabled dynamic menu updates through PDF/DOCX ingestion, automated text processing, and vector indexing for up-to-date knowledge retrieval.

Certification

  • Data Science with Python, Simplilearn Certification
  • Python, IBM Global Certification

Accomplishments

  • Patent, Developed an innovative pothole detection and filling system that uses recycled plastic waste for sustainable infrastructure.
  • E-Summit Maharashtra Hackathon, Runner-Up

Interests

IoT, Painting, Badminton
