
SIDDHARTH JAIN

Data Scientist
Bangalore

Summary

Data Scientist with 3 years and 2 months of experience at Allstate India Private Limited, with expertise in data analysis, PySpark-based pipeline development, and NLP. Proficient in Python and PySpark, with beginner-level SQL. Delivered end-to-end solutions, including NER model creation and an end-to-end PySpark project, with some exposure to AWS deployment. Working knowledge of statistics, probability, and hypothesis testing, along with some data science case studies.

Overview

3 years of professional experience
2020 years of post-secondary education

Work History

Data Scientist

Allstate India Private Limited
04.2022 - Current
  • NER Model Development: Created Named Entity Recognition (NER) models from scratch to extract authentication entities from unstructured transcripts.
  • NLP & Preprocessing: Processed call and chat transcripts using techniques like chunking, buffer creation, and word embeddings.
  • Data Engineering with PySpark: Built scalable PySpark pipelines to handle large volumes of text and interaction data.
  • AWS Model Deployment: Partnered with senior data scientists to containerize and deploy models in AWS production environments.
  • Regex & Rule-based Logic: Implemented regex-driven authentication logic for chat channels, improving classification accuracy.
  • Multi-channel Coverage: Delivered tailored solutions for Chat, Voice Sales, and Voice Service contexts.

Education

B.Tech - Electrical and Electronics Engineering

RVCE

Data Science Master’s Program

Simplilearn

Skills

  • Python

  • SQL

  • PySpark

  • Git

  • Confluence

  • Domino

  • Pandas

  • NumPy

  • Data Visualisation

Accomplishments

  • Rising Star Award – Q1 2023
  • Star Performer of the Quarter – Q3 2024
  • 3rd Rank in Nagaland (Class 10) – Felicitated by Governor
  • 1st Prize in EMBARK Business Challenge, E-cell RVCE
  • Senior Associate – Entrepreneurship Cell, RVCE (2017–2018)
  • Man of the Tournament – College Hostel Cricket

Planned Learning & Upskilling

  • Microsoft Azure Cloud Certification (Fundamentals and Associate levels), 2025
  • Advanced Python Programming (OOP, modularization, testing, decorators), 2025
  • Modular Programming for ML Pipelines (scikit-learn, MLflow, FastAPI), 2025
  • AWS Production Deployment Best Practices (continuing from current experience), 2025

Certifications & Courses

Data Science & Machine Learning Program, Scaler Academy, Ongoing, 11/01/23
