Shubham Chaudhary

Pune

Summary

Data Scientist & Engineer with 5+ years of experience in data analytics, Python development, and server-based machine learning and data processing solutions.

Proficient in designing and deploying scalable ML pipelines, ETL processes, and automated workflows handling 200K–500K records daily.

Experienced in NLP, OCR, and document automation systems, reducing manual effort by up to 60% and enhancing operational efficiency.

Skilled in distributed computing, optimizing server-based data processing to achieve up to 40% faster performance.

Expert in web automation and data collection using Selenium, Playwright, and PyAutoGUI, including bypassing anti-bot mechanisms.

Adept at building interactive dashboards and visualizations with Django, Pandas, and Matplotlib to deliver actionable insights.

Proven track record in mentoring and leading teams, ensuring on-time project delivery while reducing timelines by 25%.

Strong ability to manage end-to-end project delivery, from data ingestion, cleaning, and transformation to ML modeling, visualization, and deployment.

Quick learner and problem solver, capable of adapting to evolving technologies and dynamic project requirements in fast-paced environments.

Overview

6 years of professional experience

Work History

Data Scientist

AltRr Software Services Limited
04.2025 - Current
  • Built and led a high-performing team of 8 data professionals within the Indian time zone, mentoring juniors and enabling 3 promotions.
  • Achieved 100% on-time project delivery while reducing timelines by 25%.
  • Designed and maintained server-based ML pipelines processing 200K–500K records daily with 99.9% uptime.
  • Deployed NLP-driven document processing systems, cutting manual effort by 60%.
  • Implemented distributed computing solutions on office servers, accelerating data processing by 40%.
  • Delivered data products that improved client retention by 35%, automated workflows saving $300K annually, and supported market expansion into 2 new geographies.

Data Engineer

AltRr Software Services Limited
08.2024 - 04.2025
  • Designed and maintained fault-tolerant ETL pipelines using Apache Spark, processing 100+ GB of data daily on in-house servers.
  • Developed automated data validation frameworks, enhancing accuracy by 45%.
  • Migrated legacy systems to modern server-based big-data frameworks, reducing infrastructure costs by 30%.
  • Optimized database queries and architectures for faster, more reliable reporting.
  • Modularized complex workflows to improve maintainability and scalability.
  • Implemented real-time monitoring and alert systems to proactively detect anomalies.
  • Mentored junior engineers, fostering a collaborative culture and technical growth.

Application Developer

QL2 Software Ltd
08.2022 - 07.2024
  • Specialized in domains including Airfare, Retail, Vacation, Car, Social Media, and Real Estate.
  • Optimized program performance and mitigated live website tracking mechanisms, including canvas and mouse movement detection.
  • Developed machine learning and deep learning solutions for CAPTCHA resolution, enhancing user experience.
  • Assembled large, complex datasets to meet business requirements and collaborated with analytics teams to improve system functionality.
  • Designed, structured, and implemented new websites while maintaining and updating existing ones.

Data Analyst

Ketsaal Retails LLP
12.2019 - 07.2022
  • Collected, cleansed, and analyzed large unstructured datasets to generate actionable business insights.
  • Delivered live market insights through web scraping and structured data analysis.
  • Developed high-quality dashboards using REST GUI to visualize company performance metrics.
  • Applied machine learning and data mining techniques for sales forecasting and back-testing stock market strategies.
  • Managed inventory for seasonal and fast-/slow-moving products to optimize stock levels.
  • Conducted customer sentiment analysis by scraping reviews and ratings from Amazon and Flipkart.
  • Built automated reporting dashboards to monitor company and employee performance.

Education

Computer Science Engineering

Rawal Institute of Engineering & Technology
Faridabad
08.2019

12th & 10th

D.A.V Public School
Faridabad
04.2015

Skills

  • Programming & Scripting: Python, PySpark, SQL
  • Frameworks & Automation: Django, Flask, REST APIs, Selenium, Playwright, PyAutoGUI
  • Data Engineering & Pipelines: PySpark, ETL Pipelines, Airflow, Dataflow, Data Cleaning & Transformation, Real-time Monitoring
  • Machine Learning & AI: NLP, Machine Learning, Deep Learning, OCR, Object Detection, Data Preprocessing
  • Data Visualization & Analytics: Pandas, Matplotlib, Seaborn, Interactive Dashboards, Reporting Automation
  • Databases & Storage: MySQL, Structured & Unstructured Data Management
  • Deployment & Infrastructure: Server-based deployments, Distributed Computing on Local/Office Servers, Basic GCP & AWS Familiarity

Additional Information

Server-Based ML Pipelines (2025 – Present)
Designed and deployed server-based ML pipelines processing 200K–500K records daily with 99.9% uptime, significantly improving data processing efficiency and reliability.

NLP Document Automation (2025 – Present)
Developed NLP-driven document processing systems to automate unstructured data extraction, reducing manual effort by 60%.

Distributed Computing Optimization (2025 – Present)
Implemented distributed computing solutions on office servers, achieving 40% faster data processing across multiple internal projects.

ETL Pipeline Development (2024 – 2025)
Built fault-tolerant ETL pipelines using Apache Spark, processing 100+ GB of data daily while ensuring high accuracy and reliability.

Data Validation Framework (2024 – 2025)
Developed automated data validation systems, improving accuracy by 45% and enhancing governance across in-house systems.

Legacy System Migration (2024 – 2025)
Migrated legacy systems to modern server-based big-data frameworks, reducing infrastructure costs by 30% without disrupting operations.

CAPTCHA Solving Automation (2022 – 2024)
Created machine learning and deep learning algorithms to efficiently solve CAPTCHA challenges, enhancing user experience and workflow automation.

Website Automation & Data Extraction (2022 – 2024)
Automated complex data collection workflows across domains such as Airfare, Retail, Vacation, Car, Social Media, and Real Estate, while mitigating website tracking mechanisms such as canvas and mouse movement detection.

Data Analytics & Visualization Dashboards (2019 – 2022)
Built interactive dashboards and automated reporting systems, enabling real-time insights and informed decision-making.

Sales Forecasting & Inventory Prediction (2019 – 2022)
Applied machine learning and data mining to forecast sales trends and manage seasonal inventory, improving operational efficiency.

Customer Sentiment Analysis (2019 – 2022)
Scraped and analyzed customer reviews from Amazon and Flipkart to generate actionable insights for marketing and product strategy.

Languages

Hindi
English
