Dynamic data scientist with experience at Celebal Technologies, adept at leveraging Python and SQL to extract insights from complex datasets. Proven ability in building machine learning models and enhancing operational efficiency through innovative solutions. Strong communicator, skilled in data visualization and analysis, driving impactful business decisions.
Languages: English, Hindi
Celebal Technologies | Manufacturing | Aug 2023
Use case: Automated spare part interchangeability (pre-sales) for Microsoft

Celebal Technologies | Internal Project | May 2023
Objective: HR attrition prediction
Tools: Seaborn, pandas, scikit-learn (classical ML)
Gathered and preprocessed diverse HR data, including employee performance, satisfaction, and tenure records.
Applied machine learning techniques, including logistic regression and ensemble methods, to create an accurate attrition prediction model.

Celebal Technologies | Internal Project | Oct 2023
Objective: Car price prediction
Tools: Seaborn, pandas, scikit-learn (classical ML)
Preprocessed raw data, conducted exploratory data analysis (EDA), and engineered relevant features.
Implemented and fine-tuned multiple regression algorithms, achieving a prediction accuracy of 68% on the test dataset.

upGrad | Case Study | May 2024
Objective: SQL, RSVP Movies Case Study
Tools: SQL
RSVP Movies, an Indian film production company, is planning a global release in 2022 and seeks data-driven insights from the past three years of movie data to guide its strategy.
Task: analyze this data through specific questions divided into four segments, writing SQL code for each question to draw meaningful insights and provide recommendations for the new project.
Deliverable: an SQL script with the solutions.

upGrad | Case Study | June 2025
Project: Credit Card Fraud Detection
Built an end-to-end machine learning pipeline to detect credit card fraud from a large, imbalanced dataset (1.85M transactions, 0.52% fraud rate), using Python, scikit-learn, and Google Colab.
Conducted thorough exploratory data analysis (EDA), feature selection, and preprocessing, including handling class imbalance with sampling techniques.
Developed baseline linear models and optimized ensemble models with hyperparameter tuning, validating performance via stratified k-fold cross-validation.
Evaluated models with metrics prioritizing fraud detection (high recall) and performed a business cost-benefit analysis comparing pre- and post-deployment losses, demonstrating significant potential savings by reducing undetected fraud through a two-factor authentication intervention.