SUBHAJIT GUHA THAKURTA

Kolkata

Summary

Senior Data Science Engineer with over 3 years of experience in Artificial Intelligence and Machine Learning, including the past 1.9 years focused on building solutions with Generative AI and Large Language Models (LLMs). Proficient in applying advanced AI technologies to deliver business outcomes and data-driven insights.

Overview

3 years of professional experience

Work History

Senior Systems Engineer / ML Engineer

Infosys Ltd.
10.2023 - Current
  • Client: US-based technology company
  • The client needed efficient document parsing for real estate files (PDFs, images, text) and real-time natural language queries against data stores such as Snowflake and AWS S3
  • Implemented a Retrieval-Augmented Generation (RAG) system for document parsing with LLMs, achieving ~22% accuracy
  • Developed Text-to-SQL models using LLMs and PandasAI, improving query efficiency by 30% and enabling real-time insights in plain English
  • Developed an ARIMA time-series model to improve short-term sales forecasting for the client's products, capturing immediate trends and improving forecast precision by 37.5%
  • Implemented LSTM-based models for long-term predictions, better capturing complex seasonal patterns in client transaction data and improving forecasting accuracy by 40.83%

Systems Engineer

Infosys Ltd.
09.2021 - 09.2023
  • Client: US-based technology company
  • The client required precise forecasting models for global and client-specific COVID-19 cases to enhance public health and business planning
  • Applied regression techniques to analyze trends and used FbProphet for time-series forecasting, enabling reliable predictions for informed decision-making
  • Collaborated with clients to interpret model outputs, providing actionable insights and recommendations for public health strategies and business continuity planning
  • The client required an automated solution to accurately detect business-specific sentiment in customer reviews and feedback
  • Developed and implemented a sentiment analysis bot using NLP techniques and a BERT model, integrating it with existing customer service platforms for real-time tracking
  • Improved feedback processing speed by 25% and increased sentiment detection accuracy to 43%
  • Developed a data extraction function using OpenCV to efficiently extract structured data from PDFs and text files, significantly reducing errors
  • Collaborated with data engineers to automate the ETL process and optimize data loading into Snowflake, achieving 20% extraction accuracy and cutting data processing time by 10%

Education

M.TECH - Information Technology

Kalyani Government Engineering College, MAKAUT
Kalyani, West Bengal
01.2021

B.TECH - Information Technology

Guru Nanak Institute of Technology, MAKAUT
Kolkata, West Bengal
01.2019

Skills

  • NumPy
  • Pandas
  • Matplotlib
  • PySpark
  • Snowflake
  • MongoDB
  • SQL
  • TensorFlow
  • PyTorch
  • Scikit-Learn
  • FbProphet
  • BERT
  • TextBlob
  • OpenCV
  • EasyOCR
  • Streamlit
  • Flask
  • Docker
  • AWS S3
  • Falcon-7B
  • Client LLM
  • Ollama
  • Milvus
  • Chroma
  • PandasAI

Side Projects

  • SDE Salary Prediction: Built a Python-based application that uses the Random Forest algorithm to predict global salaries of software developers for 2023, using data from the Stack Overflow Developer Survey. Hosted the application on Streamlit, giving users an interactive way to explore salary predictions with a single click.
  • Self-Driving Car: Engineered a machine learning model that closely predicts the steering angle for a self-driving car. Used a custom Convolutional Neural Network (CNN) architecture to analyze visual data and translate camera feeds into steering commands.
  • World Cup Tracker: Developed an interactive application that offers up-to-date statistics and information on Cricket and Soccer World Cups. Used LangChain to handle complex natural language queries, ChromaDB to store and quickly retrieve large datasets, and the Gemini model to generate precise, context-aware answers.

Certifications and Achievements

  • Infosys-certified Machine Learning Programmer.
  • Infosys-certified Gen AI Professional.
  • Infosys RISE Award: Led a team of three members on a sales forecasting project, delivering improved prediction accuracy within the stipulated time.
  • Infosys INSTA Award: Completed a POC on fine-tuning and RAG-based solutions, demonstrating how they can be integrated into the project; the work was well received by the business.
