MD MEHEBUB SHAIKH

Analyst Data Science
Kolkata, WB

Summary

Data Scientist with a strong foundation in data analysis, machine learning, and statistical modeling. Developed predictive models and data-driven solutions to optimize business performance. Proficient in Python and SQL, and uses data visualization tools to communicate insights effectively.

Overview

1 year of professional experience
5 Certifications
3 Languages

Work History

Analyst Data Science

Infosys Ltd.
08.2025 - Current
  • Dexapp

Building a testing environment for the demand planning screen in Dexapp. Created a new page for the Truck Dispatch plan UI. Resolved technical issues in the demand planning books and helped team members resolve related issues. Documented the work and delivered knowledge transfer (KT) sessions to team members.

  • RAG-based Document Search Engine

This project provides a user-friendly web interface for uploading documents, asking a question, and retrieving the documents relevant to that query. It consists of an HTML/JavaScript frontend and a Python/Flask backend that simulates the retrieval part of a RAG system. It uses two foundation models: one for vector embeddings and another for summarizing the retrieved documents.

Data Scientist Intern

Amazon
01.2025 - 06.2025
  • Statistical Impact Assessment Framework Development

The objective of this study is to create an automated framework that identifies the key drivers of changes in AWS customer support metrics between two periods. It takes a metric name and a period (week, month, or year) as user input and returns the key drivers and data-driven business recommendations as output. The analysis uses statistical tests (parametric and non-parametric), correlation analysis, central tendency and dispersion analysis, an AutoGluon ML model, and a language model.

  • Escalation prediction for AWS customer support cases

This is a binary classification task to predict whether a support case needs escalation to support engineers. New features are created from the first interaction between the customer and the support engineer using prompt engineering techniques. Because the dataset is highly imbalanced, undersampling is used. AutoGluon is used to find the best-performing model on the training dataset, and that model is then evaluated with a confusion matrix on the validation dataset.

Education

M.Tech - QROR

Indian Statistical Institute
Kolkata
06.2025

B.E - Mechanical Engineering

Jadavpur University
01.2020

Skills

Programming Languages & Software: Python, MySQL, MS Excel, MS Word, MS PowerPoint

Tools & Libraries: ML, DL, Time Series, NLP, GenAI, Amazon S3, Amazon SageMaker, Amazon Bedrock, Azure Databricks, Microsoft Foundry

Interests: Machine Learning, Deep Learning, Generative AI, Time Series Modelling and Forecasting, Cloud Platforms (Azure, GCP, AWS)

Extracurricular Activities

Solved Python and SQL problems on HackerRank; honored as an FFE scholar by the Foundation for Excellence; certified by Aspiring Minds as a Data Processing Specialist and Corporate Sales Manager; earned a Certificate of Proficiency from CII-IPATE.

Certification

Python For Beginners | Machine Learning with Python [Simplilearn]

Relevant Courses

Statistical Methods | Probability | Operations Research | Programming Techniques & Data Structures | Business Analytics | Advanced Multivariate Analysis | Linear Algebra [ISI-K]

Timeline

Analyst Data Science

Infosys Ltd.
08.2025 - Current

Data Scientist Intern

Amazon
01.2025 - 06.2025

M.Tech - QROR

Indian Statistical Institute

B.E - Mechanical Engineering

Jadavpur University