Sudharshan Sunder Gowrisankar

Chennai

Summary

Data Scientist with five years of professional experience, currently at Blackstraw.AI, building AI-driven solutions with Python and machine learning, including an invoice data extraction system that reaches 80% accuracy. Experienced in leading the fine-tuning and optimization of deep learning models, with strong analytical skills and a commitment to innovation in data processing and analysis.

Overview

5 years of professional experience

Work History

Data Scientist

Blackstraw.AI
Chennai
05.2024 - Current
  • Developed AI-driven solutions for automating invoice data extraction, leveraging LLM fine-tuning, HTML-to-text conversion, and advanced scraping techniques to achieve 80% accuracy in parsing diverse email formats into structured JSON for analytics.
  • Built an AI-based email parsing system using APIs, OCR models, and rule-based engines, automating email-to-JSON conversion, attachment processing, and error handling for efficient data management and reporting.
  • Designed a vector-based system to extract and validate UPC IDs from mobile-captured images, implementing custom computer vision models (ResNet-15) to improve retrieval accuracy and reduce memory usage, achieving 70% mean average precision (mAP) and an 80% Top-K retrieval rate.
  • Created a robust pipeline for processing large-scale email invoice data, utilizing Python, Pandas, Beautiful Soup, and PyTorch for scalable and efficient automation across diverse datasets.
  • Led data augmentation and fine-tuning initiatives for deep learning models (Mistral 8B, Llama 3.1, Qwen 7B), optimizing hyperparameters and model performance for complex enterprise data extraction tasks.

Software Engineer - Data Science

Expleo
04.2022 - 05.2024
  • Developed and Deployed HR Policy Chatbot Using AI Foundational Technologies: Built a Retrieval-Augmented Generation (RAG) chatbot leveraging LLMs (GPT-3.5), LangChain, and vector databases to provide accurate, contextually relevant answers to HR-related queries (e.g., leave, insurance). Processed HR policy documents (PDFs) by chunking, vectorizing, and storing them in a vector database for efficient retrieval. Integrated the solution into a Streamlit/Chainlit interface with real-time streaming for enhanced user interaction.
  • Implemented RAG Pipeline with Advanced Prompt Engineering: Designed a RAG pipeline to retrieve relevant document chunks using vector databases and feed them to GPT-3.5 for generating responses. Utilized prompt engineering techniques to optimize output quality and evaluated pipeline performance using RAGAS metrics, ensuring high context relevance and accuracy.
  • Developed SQL Schema Comparator Tool with Advanced SQL/Python: Created a Python-based web tool for comparing schemas and detecting anomalies (e.g., missing data, duplicates, mismatches) during SQL database migrations. Leveraged Advanced SQL and Python to process and analyze complex data models, ensuring data integrity. Generated detailed reports with visualizations and exported results in Excel for validation.
  • Built Synthetic Data Generation Tool Using AI and Data Modeling: Designed a Streamlit-based web tool to generate synthetic data from CSV, TSV, SQL tables, and other formats. Utilized the Synthetic Data Vault (SDV) framework and Advanced Python to replicate patterns in original data, enabling realistic data generation for testing, training, and analysis. Enabled easy download or direct database upload for seamless integration.
  • Leveraged AI Tools for Intelligent Automation and Data Analysis: Demonstrated a strong ability to implement intelligent assistants and automation using LLMs, LangChain, and RAG. Processed and analyzed complex data models using Advanced SQL/Python and other programming tools to deliver scalable, customizable, and user-friendly solutions.
  • Enhanced Data Accessibility with AI-Driven Solutions: Focused on improving user experience by integrating AI foundational technologies like LLMs, vector databases, and prompt engineering into tools. Provided real-time interaction, detailed visual reports, and downloadable outputs, ensuring accessibility and usability across all developed solutions.

Data Analyst

Moving Walls
Chennai
10.2020 - 03.2022
  • Clustered unknown latitude-longitude points with known latitude-longitude points to separate them into groups.
  • Performed segment and demographic analysis of consumers.
  • Handled CRUD operations on summary information.
  • Produced campaign reports and data visualizations.
  • Created automated reports on the platform.
  • Generated insights from the reports on audience movement data and audience profiles.
  • Wrote automated Python scripts connecting to Redshift and MongoDB.
  • Worked on research study projects.

Data Science Research Intern

Tech Mahindra
Hyderabad
02.2020 - 08.2020

Education

Post Graduate Program - Data Science And Engineering

Great Lakes Institute of Management
Chennai
12.2019

Bachelor of Engineering - Electrical And Electronics Engineering

Panimalar Engineering College
Chennai
05.2018

Master of Technology (pursuing) - Artificial Intelligence

SRM Institute of Technology
Chennai

Skills

  • Python
  • SQL
  • Machine Learning
  • Natural Language Processing
  • Deep Learning
  • Generative AI

Accomplishments

WOW Award for Best Performance, Q2 2022
