Prateep Sengupta

Kolkata

Summary

Lead Data Scientist focused on Generative AI, LLMs, and Deep NLP. At IBM, I enhanced live captioning for Sinclair Broadcast Group, drawing on my Python expertise and problem-solving skills. My work has delivered significant efficiency improvements and client satisfaction, combining technical proficiency with strategic acumen.

Overview

14 years of professional experience
23 years of post-secondary education
3 languages

Work History

Lead Data Scientist

IBM
10.2021 - Current
  • Led the "Watson Live Captioning" project for Sinclair Broadcast Group, focusing on Speaker Change Detection using deep networks and LLM-based Punctuation Restoration for the live captions generated during news broadcasts.
  • Developed sentence-transformer-based F1 evaluation algorithms for both Speaker Change Detection and Punctuation Restoration, improving accuracy and efficiency.
  • Over the past 1.5 years, I have concentrated on Generative AI, leading numerous internal POCs and client projects based on LLMs. My role has covered both architectural design and hands-on development, using platforms such as IBM watsonx and Databricks; models ranging from BERT, DistilBERT, and RoBERTa to OpenAI's GPT-3.5 and GPT-4, Google's Gemini, and Meta's Llama 2 and Llama 3; frameworks such as LangChain, LlamaIndex, and DSPy; retrieval techniques such as RAG and Hierarchical Index Retrieval; and vector databases such as Chroma, FAISS, and Milvus.
  • GenAI clients I have worked for in the past year and a half include Toyota, Nestle, Keppel, Unilever UK, Google, Diageo, and Electrolux.
  • I am currently working as a Data Science Manager in a GenAI project for one of our clients, Lloyds Banking Group, UK.

Solution Integrator - Data Analytics

Ericsson
05.2021 - 10.2021

I worked in the BDGS department at Ericsson until early October 2021 as part of the OSS Analytics team. I served primarily as a subject matter expert in Python and Machine Learning, with a focus on Google Cloud Platform (GCP) development. I contributed to the Ericsson Real-Time Performance Management (RTPM) system for MBNL UK, which uses AI for alarm detection and monitoring.

Software Engineer - Machine Learning

Ericsson
05.2019 - 05.2021
  • As a proficient AI/ML Expert, I contributed significantly to the development of Ericsson's innovative product, VILA (Visually Impaired Life Assistant). Within this role, I worked on various key initiatives, including:
    1. Implementation of Unknown Face Clustering algorithms to enhance user identification capabilities.
    2. Deployment of advanced Face Recognition and Facial Analytics techniques, ensuring robust performance and accuracy.
    3. Development of a Face Image upload and aggregation pipeline using Kafka/Spark, optimizing data handling and processing efficiency.
    4. Design and implementation of a Menu Card Reading system utilizing OCR technology, facilitating accessibility for visually impaired users.
    5. Utilization of SIFT, SURF, and ORB based pattern recognition algorithms to enhance visual understanding and interpretation.
    6. Creation of RESTful services using Django-REST Framework to support seamless integration and communication within the system.
  • I worked as an AI/ML expert on the Ericsson product CSD (Cognitive Support Desk), using clustering techniques to group customer emails, which improved operational efficiency and customer service quality.
  • I also served as a Python Subject Matter Expert on British Telecom's customization of the Ericsson DevOps product Rosetta (Django), delivering solutions tailored to client requirements.
  • Additionally, I served as an integral member of the core team in Kolkata for the Ericsson product EEA (Ericsson Expert Analytics), contributing to operational strategies and providing occasional L2 support.

Python Developer

Sirchend Softwares
11.2018 - 04.2019

As a Python Developer and Machine Learning Engineer at Sirchend Softwares, the software development division of Incorp Infotech, I spearheaded projects showcasing specialized expertise in Data Science, Natural Language Processing (NLP), and Chatbot development:

1. Led Data Visualization and backend development initiatives for a Health report generator app. Employed Flask (Python), ChartJS, and MySQL (Flask-SQLAlchemy) to enhance user experience and data management efficiency.

2. Conducted Research and Development (R&D) utilizing the IBM Watson Stack, including Watson Assistant and Tone Analyzer, to craft an innovative customer bot tailored for a real estate website.

3. Designed and executed the development of a CRUD (Create, Read, Update, Delete) service integrated with the chatbot utilizing the Sails framework (Sails.js), ensuring streamlined communication and enhanced user interactions.

Associate Engineer - NLP

Web Spiders
06.2018 - 10.2018


  • Advanced Email Bot Framework: Led R&D efforts on a proof-of-concept to enhance an existing email bot (Zoe-Email). Leveraged Flask (Python) and Google Cloud AutoML to achieve improved automation and efficiency.
  • Rasa Chatbot Development: Led the design and development of a chatbot from the ground up using the Rasa stack (Rasa NLU & Rasa Core), demonstrating proficiency in natural language understanding and conversational AI.
  • Chatbot Optimization & Deployment: Maintained, trained, and tested chatbots built on proprietary platforms (Zoe, e2m) for a diverse client base, ensuring high performance and user satisfaction.
  • Client Engagements & Impact: Successfully deployed and managed chatbots for major brands including Visa, Hammerson UK (Les 3 Fontaines Mall), and Reed Exhibitions (Vision Expo West, G2E 2018, Chicago Comic Conference, etc.)
  • Contributed to additional Python development and machine learning initiatives as needed.

Data Science Developer

PatientMD
04.2017 - 05.2018

As a Data Science Developer at PatientMD (InnovationStrat Consulting LLC), I worked mainly in NLP, Chatbot Development and Web Data Mining, utilizing Python and Scala for coding. Key projects include:

  • Web Data Mining for PatientMD's Doctor App: Utilized Scrapy and Selenium for efficient extraction.
  • Concurrent Streaming Application: Developed a real-time stock details streaming app using the Akka Framework (Scala).
  • Sentiment Analyzer Development: Employed Python libraries such as TextBlob, NLTK, and spaCy for sentiment analysis research and implementation.
  • Financial News Crawler: Designed and implemented a crawler to extract financial news articles from various finance websites.
  • Speech Recognition Module: Developed a Speech Recognition Module integrated into the healthcare chatbot as a Flask REST service.
  • Core Backend Functionality: Assisted in debugging and occasional development of core backend functionalities. Developed a preliminary version of the shopping cart API for genomics products using Play framework (Scala).

Senior Technology Associate

IBC Consultants
09.2016 - 03.2017

In this position, I undertook the following key responsibilities:

1. Led project management initiatives.
2. Successfully managed and handled client interactions.
3. Conducted extensive research in technology.
4. Executed email marketing campaigns using Python with platforms such as Zoho and MailChimp.
5. Utilized Python for web data mining and web content extraction.
6. Developed Excel VBA solutions and implemented automation processes to streamline workflows.

Technology Associate

IBC Consultants
01.2016 - 08.2016

In this position, my duties encompassed:

1. Conducting web scraping and crawling operations using Python.
2. Employing Python for email marketing strategies and automation, ensuring targeted outreach and increased engagement.

3. Crafting macros utilizing Excel VBA to enhance efficiency and streamline processes.
4. Optionally, engaging in secondary research endeavors as needed.

Junior Project Assistant

Indian Institute of Technology, Kharagpur
11.2010 - 08.2011
  • I served as a Junior Project Assistant at IIT Kharagpur, contributing to the "Automatic Speaker Recognition on VoIP" project supported by Vodafone Essar Ltd. My responsibilities included developing a preliminary version of a speaker recognition system compatible with VoIP platforms such as Skype.
  • Key contributions involved the implementation of algorithms such as MFCC feature extraction for speech files and the GMM-UBM Algorithm to construct a universal speaker model.
  • Throughout this role, I worked in bash scripting, Perl, and MATLAB, with some C programming when needed.

Education

M.Tech - Electronics And Communication Engineering

SurTech
04.2001 - 08.2015

B.Tech - Electronics And Instrumentation Engineering

Academy of Technology
04.2001 - 07.2010

Skills

  • Generative AI & LLMs

  • Machine Learning Theory

  • Deep Natural Language Processing

  • Python

Some Notable Generative AI Projects:

Google: SAP Test Cases Generation

The use-case involved generating test cases from SAP process design documents (PDD) using GenAI.

A large repository of SAP best-practice documents was used to implement Retrieval-Augmented Generation (RAG).


Diageo: WRICEF Test Cases Generation

The use-case involved generating test cases from SAP functional specification design documents (FSD/WRICEF) using GenAI.

A large repository of SAP best-practice documents was used to implement Retrieval-Augmented Generation (RAG).

A customized prompt approach was also implemented for each WRICEF type.


Unilever UK: GenAI for Artwork

1. The tool's user-friendly interface automatically verifies artwork by comparing it against a range of predetermined elements.
2. The tool identifies any text or elements that are absent but should be present based on the original reference document.
3. Machine learning is employed to detect inaccurate translations.
4. The tool verifies claims to determine their validity.
5. It examines the artwork to ensure the inclusion of company-specific branding elements.
6. The tool detects inaccuracies in durability dates, nutrition information, and other mandatory elements.
7. The second part of the use-case was to generate the package design from the requirements, logo, and other inputs.


Keppel Singapore: GenAI Bot for Project Management

1. A chatbot was developed to answer questions related to Schedule, Contracts/Suppliers, and Risk.
2. It leveraged a Retrieval-Augmented Generation (RAG) knowledge base created using LangChain.
3. It used LLMs to find suitable answers to the questions asked by project managers.


Toyota: Warranty Analytics using GenAI

1. Toyota business users ask warranty-related queries in natural language through a chatbot.
2. The chatbot uses an LLM under the hood and answers those queries based on documents (PDF, Word, Excel, etc.) stored in Box, SharePoint, and similar repositories.
3. The system used a Retrieval-Augmented Generation (RAG) knowledge base to answer the queries.
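
The retrieval step of such a RAG chatbot can be illustrated with a minimal sketch. This is not the project's implementation: it swaps the vector embeddings and vector database a production system would use for a plain term-frequency cosine similarity, and the function names (`retrieve`, `build_prompt`), the prompt wording, and the sample chunks are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (toy stand-in for a vector search)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the LLM in the retrieved chunks (hypothetical wording)."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical document chunks, for illustration only.
SAMPLE_CHUNKS = [
    "Battery warranty covers 8 years.",
    "Paint warranty covers 3 years.",
    "Dealers open on Sundays.",
]
```

Calling `build_prompt("How long is the battery warranty?", SAMPLE_CHUNKS, k=1)` yields a prompt whose context holds only the battery chunk; the resulting string would then be passed to the LLM.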


Nestle: Test Coverage Advisor

The use-case consisted of two parts. The first part was to generate test cases from the SAP process flow diagrams.

The diagrams were first converted into BPMN XML files, and the XML files were then converted to text. The text was fed into the LLM to generate the test cases.
The second part was to compare the LLM-generated test cases with the standard test cases for the specific SAP process flow and calculate the test coverage percentage.
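
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: it assumes BPMN 2.0 XML with the standard model namespace, flattens sequence flows into "source -> target" lines, and counts a standard test case as covered only when a generated test case matches it exactly after case and whitespace normalization. The function names, the sample diagram, and the matching rule are hypothetical, and the LLM call between the two steps is omitted.

```python
import xml.etree.ElementTree as ET

# Standard BPMN 2.0 model namespace.
BPMN_NS = {"bpmn": "http://www.omg.org/spec/BPMN/20100524/MODEL"}

# Hypothetical sample diagram, for illustration only.
SAMPLE_BPMN = """<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL">
  <process id="p1">
    <startEvent id="s" name="Order received"/>
    <task id="t" name="Create sales order"/>
    <endEvent id="e" name="Order confirmed"/>
    <sequenceFlow id="f1" sourceRef="s" targetRef="t"/>
    <sequenceFlow id="f2" sourceRef="t" targetRef="e"/>
  </process>
</definitions>"""


def bpmn_to_text(bpmn_xml: str) -> str:
    """Flatten a BPMN diagram into one 'source -> target' line per sequence flow."""
    root = ET.fromstring(bpmn_xml)
    # Map element ids to readable names, falling back to the id itself.
    names = {
        el.get("id"): el.get("name") or el.get("id")
        for el in root.iter()
        if el.get("id")
    }
    lines = []
    for flow in root.iterfind(".//bpmn:sequenceFlow", BPMN_NS):
        src, tgt = flow.get("sourceRef"), flow.get("targetRef")
        lines.append(f"{names.get(src, src)} -> {names.get(tgt, tgt)}")
    return "\n".join(lines)


def coverage_percentage(generated: list[str], standard: list[str]) -> float:
    """Percentage of standard test cases matched by a generated one (exact match, assumed rule)."""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())

    if not standard:
        return 0.0
    generated_set = {normalize(g) for g in generated}
    covered = sum(1 for s in standard if normalize(s) in generated_set)
    return 100.0 * covered / len(standard)
```

In practice, exact matching is likely too strict for LLM output; a semantic similarity threshold would be a natural replacement for the `normalize` comparison.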



