Soumyajit Sahu

Data Science Trainee
Bengaluru, KA

Summary

Data Scientist with hands-on experience in ML, SQL, GenAI, and MLOps, including building RAG pipelines, forecasting models, and customer analytics solutions with measurable business impact. Skilled in statistical modeling, hypothesis testing, and end-to-end ML lifecycle delivery.

Overview

1 year of professional experience
1 Certification

Work History

Data Scientist

Sigmoid Analytics
08.2025 - Current

RAG Document Search Application

  • Engineered a hybrid RAG pipeline for internal document search that combines FAISS semantic vector indexing with BM25 lexical retrieval, an approach selected based on research showing that hybrid retrieval outperforms pure vector search on domain-specific corpora.
  • Achieved 75% retrieval relevance across internal PDF documents and served results through a Streamlit interface powered by a Groq LLM.

Retail Customer Analytics and Demand Forecasting

  • Built customer segmentation models using RFM-based K-Means clustering on 525K+ retail transactions and applied the Apriori algorithm to identify frequent product combinations, enabling targeted cross-sell strategies.
  • Developed SARIMA/Prophet time-series forecasting models for retail demand planning; A/B testing of segment-based campaigns versus random targeting showed a 22% higher conversion rate and an 18% improvement in cross-sell purchases.

Data Science Intern

Sigmoid Analytics
01.2025 - 08.2025

Telecom Churn Case Study

  • Conducted EDA on telecom customer data to analyze churn patterns, identifying a 26.5% churn rate and key drivers including contract type, tenure, and monthly charges.
  • Built and optimized an AdaBoost churn prediction model using RandomizedSearchCV, achieving 73% accuracy and the highest F1-score among the evaluated models, enabling proactive retention interventions.

Education

Bachelor of Technology - B.Tech

NIT ROURKELA
Rourkela, Odisha
04.2025

Skills

Cloud: AWS, Microsoft Azure

Languages: Python, SQL, C

MLOps: DVC, MLflow, Docker

GenAI / LLM: LangChain, LangGraph, RAG

Frameworks: FastAPI, Flask, Streamlit, PyTorch, PySpark, TensorFlow

Databases: MySQL, PostgreSQL, FAISS, Chroma, MongoDB Atlas

Statistics: A/B Testing, Hypothesis Testing, Regression, Probability Distributions, Statistical Inference

ML / Algorithms: K-Means, AdaBoost, LSTM, Random Forest, XGBoost, Decision Tree, Logistic Regression, CNN

Certification

All Certifications [Link]
