
ABHRANIL DAS

Kolkata

Summary

Interested in and actively learning Data Science, Machine Learning and Artificial Intelligence, especially Computer Vision and Natural Language Processing. Skilled in Python programming and libraries such as NumPy, Pandas, Scikit-learn, PyTorch, TensorFlow and NLTK.

Currently exploring Generative AI, with a strong emphasis on LLMs.

Overview

1 year of professional experience

Work History

Data Scientist Intern

TVS Motor Company Ltd
05.2023 - 07.2023
  • Designed a dataset-agnostic Data Profiling and Feature Correlation Mining Tool that generates a comprehensive data profile for any dataset, irrespective of size
  • Performed data cleaning, preprocessing and exploratory data analysis using NumPy and Pandas on various datasets averaging 50000+ entries and 20+ features
  • Used Scikit-learn and Statsmodels to extract trends and relationships between features in each dataset, achieving a score of over 90% on various metrics across all datasets
  • Displayed results on an interactive dashboard built with Matplotlib and Streamlit, achieving an average profiling time of 4.36 minutes

Machine Learning Engineer Intern

Saptang Labs Pvt Ltd
05.2022 - 07.2022
  • Developed Named Entity Recognition (NER) and Entity Linking components to disambiguate named entities in a dataset of 50000+ news articles
  • Fine-tuned a DistilBERT model to extract named entities from a corpus of 1000+ articles, achieving an average inference time of 4.5 s
  • Leveraged entity attributes for disambiguation, mapped entities to unique identifiers in the DBpedia knowledge base, and queried it using PySPARQL to retrieve RDF triples
  • Used the information in the RDF triples to build a knowledge graph with spaCy, storing entities and their relationships for efficient information retrieval and semantic analysis

Education

Bachelor of Technology - Electronics and Electrical Engineering

Indian Institute of Technology
Guwahati, India
07.2024

Skills

  • Python programming
  • SQL databases
  • Linear algebra
  • Probability theory
  • Statistical analysis
  • Data Science
  • Machine learning
  • Deep learning and neural networks
  • Computer vision
  • Natural language processing
  • Large language models (LLMs)
  • Generative AI

Projects

LLM Prompt Recovery
An automated pipeline that uses LLMs and prompt engineering to recover prompts from AI-generated text

  • Used the Gemma 2-2B-Instruct model to recover prompts from AI-generated text
  • Created a custom fine-tuning dataset from the Wikipedia Movie Plots dataset, consisting of movie plots rewritten using a wide range of prompts
  • Fine-tuned the model using PEFT (QLoRA), achieving a ROUGE-L score of 0.37 and a Sharpened Cosine Score (SCS) of 0.59


Anomaly Detection in Surveillance Footage
An automated ML pipeline for the detection of violence, theft, and other anomalies in surveillance footage

  • Built an end-to-end CNN + LSTM neural network model for detecting anomalies in surveillance footage using TensorFlow, and deployed it using Gradio
  • Used a pre-trained VGG16 network to extract spatial features from video frames and an LSTM network to model the temporal relationship between frames
  • Trained the model on clips from the Hockey Fight Detection Dataset, achieving an accuracy of 94% in anomaly detection on the validation dataset
  • Extended the model for precise localization of anomalies in longer videos, with scope for extension to real-time monitoring and surveillance


CaptionCraft: Leveraging Transformers for Image Captioning
Enhancing visual storytelling by using transformers to generate intelligent and detailed image captions

  • Created an image captioning application using PyTorch and Transformers (Hugging Face)
  • Used a pre-trained Vision Transformer (ViT) to extract high-quality features from images, and a GPT-2 decoder to generate comprehensive and contextually rich captions
  • Fine-tuned the model end-to-end on the Flickr 8K Dataset, achieving a ROUGE-L score of 0.28 on the validation split

Publications

Anirban Dasgupta, Abhranil Das, Parishmita Deka, Soham Das, "Smart Embedded Systems - Advances and Applications", Smart Cabin for Office using Embedded Systems and Sensors, CRC Press, Taylor and Francis Group, pp. 360 [2023]

Additional Information

  • Notebooks Expert, Kaggle: Best-ever rank of 3053 among more than 19.6 M users on the platform
  • JEE Advanced 2020: Secured AIR 1871 among 1,50,000 candidates who appeared for the test
  • KVPY SX 2019: Awarded the KVPY fellowship after securing an AIR of 303
