Certified Data Science professional with hands-on experience in projects involving machine learning, data visualization, and statistical analysis. Skilled in Python, Pandas, Scikit-learn, and SQL. Passionate about extracting insights from data and solving real-world problems. Eager to contribute to data-driven decision-making in a collaborative team environment.
💳 Credit Card Consumption:
Business problem-
A leading bank wanted to predict future credit card spending by customers to optimize credit limits, reduce risk, and enhance marketing strategies.
Objective-
To build a machine learning model that forecasts future credit card consumption using historical transactions and demographic data.
Data availability-
· Customer transaction history
· Demographic details like age, income, location
Tools used-
· Python: Pandas, NumPy for data processing, Matplotlib and Seaborn for visualization
· Machine Learning: Linear Regression, Random Forest, Gradient Boosting
Metrics-
· R² score (achieved above 0.80)
Tuning parameters-
Used GridSearchCV for model tuning:
· Gradient Boosting: learning_rate, n_estimators
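A minimal sketch of the tuning step described above, assuming scikit-learn's GradientBoostingRegressor as the model and R² as the scoring metric. The synthetic data and the grid values are illustrative assumptions, not the project's actual data or settings.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the transaction and demographic features (not the bank's data).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical grid over the two parameters named above.
param_grid = {"learning_rate": [0.05, 0.1], "n_estimators": [50, 100]}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    scoring="r2",  # rank candidates by cross-validated R²
    cv=3,
)
search.fit(X_train, y_train)

best_params = search.best_params_
# R² of the refitted best model on held-out data.
test_r2 = search.score(X_test, y_test)
```

Scoring on a held-out split keeps the reported R² separate from the folds used to pick the parameters.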
Challenges-
· Handling missing values and imbalanced features
· Avoiding multicollinearity
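One common way to check for multicollinearity is the variance inflation factor (VIF). The sketch below is an illustration rather than the project's actual method: it uses NumPy only, and the feature names and synthetic values are hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are observations)."""
    corr = np.corrcoef(X, rowvar=False)
    # The j-th diagonal entry of the inverse correlation matrix equals
    # 1 / (1 - R²_j), where R²_j comes from regressing column j on the rest.
    return np.diag(np.linalg.inv(corr))

rng = np.random.default_rng(0)
income = rng.normal(size=500)
age = rng.normal(size=500)
# Hypothetical feature that is nearly a copy of income, so it should be flagged.
spend_limit = income + rng.normal(scale=0.1, size=500)

vifs = variance_inflation_factors(np.column_stack([income, age, spend_limit]))
```

A common rule of thumb treats a VIF above 5 or 10 as a sign that a feature is largely explained by the others and is a candidate for removal or merging.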
💬 Customer Review Analysis:
Business problem-
Understand customer sentiment and feedback patterns across products, categories, and demographics to improve product offerings and enhance customer satisfaction.
Objective-
Analyze and classify customer reviews to identify sentiment polarity and key drivers behind product recommendations.
Data availability-
· Customer reviews (text data)
· Associated metadata: category, sub-category, product name, customer age, and location
Tools used-
· Python libraries: NLTK, Scikit-learn for NLP and modeling, WordCloud, Matplotlib
· Text preprocessing: Tokenization, Stopword removal, Lemmatization
Metrics-
· Sentiment classification accuracy (achieved over 77%)
Challenges-
· Handling noisy and unstructured review text
· Balancing uneven sentiment classes (e.g., more positive than negative reviews)
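One simple way to counter uneven sentiment classes is to weight each class inversely to its frequency. The sketch below is an assumption about how that could be done, not a record of the project's approach; it uses the same formula as scikit-learn's class_weight="balanced" option, and the label counts are made up.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }

# Made-up label distribution: more positive than negative reviews.
labels = ["positive"] * 80 + ["negative"] * 20
weights = balanced_class_weights(labels)
```

With these counts the rarer negative class gets a weight four times that of the positive class, so its misclassifications count more during training.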
⚡ Electricity Demand Estimation:
Business problem-
Forecast electricity demand to support energy providers in making data-driven supply, pricing, and capacity planning decisions.
Objective-
Estimate future electricity consumption using historical demand data and time-related features to improve forecast accuracy and resource planning.
Data availability-
· Historical electricity usage data
· Time-based variables: hour, day, month, season, holiday indicators
Tools used-
· Python: Pandas, NumPy for preprocessing, Matplotlib, Seaborn
· Time series modeling: ARIMA, Facebook Prophet
Metrics-
· MAPE (Mean Absolute Percentage Error) for forecasting performance
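For reference, MAPE averages the absolute forecast error as a percentage of the actual value. A minimal sketch in plain Python, with made-up demand figures:

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Made-up demand values for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
error_pct = mape(actual, forecast)
```

Because each error is divided by the actual value, periods with very low demand can dominate the score, which is worth keeping in mind when reading a single MAPE figure.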
Tuning parameters-
· ARIMA: p, d, q values via ACF/PACF plots and auto_arima
· Prophet: changepoint prior scale, seasonality mode
Challenges-
· Ensuring stationarity of time series data
· Capturing multiple seasonalities (daily, weekly, yearly)
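Differencing is a standard way to move a series toward stationarity, and the d term in ARIMA counts how many times it is applied. The sketch below is illustrative only: the hourly series is synthetic, and the helper name is hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Made-up hourly series: a linear trend plus a repeating 24-hour pattern.
daily_pattern = [5, 3, 2, 2, 3, 6, 10, 14, 15, 14, 13, 13,
                 12, 12, 13, 14, 16, 18, 17, 15, 12, 9, 7, 6]
series = [0.5 * t + daily_pattern[t % 24] for t in range(24 * 7)]

# Subtracting the value from 24 hours earlier cancels the daily cycle
# and turns the linear trend into a constant (0.5 * 24).
seasonal_diff = difference(series, lag=24)
```

Real demand data also carries weekly and yearly cycles plus noise, so a single seasonal difference rarely suffices on its own.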
Certified Data Analyst – AnalytixLabs