Data Scientist with 6+ years of experience improving business operations. Experienced in working with very large data sets to break down information, extract the relevant points and solve advanced business problems. Skilled in machine learning, data wrangling and Tableau.
Machine learning
1. Sales Forecasting
The dataset contains five years of sales data for ten stores, with 50 items at each store; the goal is to predict sales for the next three months. We tried several methods, including regression, vector autoregression and deep learning. One method used in this project was to compute the month-over-month change in sales and then build the model on the difference between the previous month's and the current month's sales. Taking factors such as holidays and seasonality into account can further improve the performance of the model.
Languages Used: Python, SQL
Libraries Used: pandas, sklearn, seaborn
Visualization: Tableau
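The differencing step from the Sales Forecasting project can be sketched as follows. This is a minimal illustration rather than the project's actual code: the column names (`store`, `item`, `month`, `sales`), the helper `add_monthly_difference` and the toy values are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical monthly sales for a single store/item pair (made-up values).
sales = pd.DataFrame({
    "store": [1, 1, 1, 1],
    "item": [1, 1, 1, 1],
    "month": pd.to_datetime(
        ["2017-01-01", "2017-02-01", "2017-03-01", "2017-04-01"]
    ),
    "sales": [100, 120, 115, 140],
})


def add_monthly_difference(df: pd.DataFrame) -> pd.DataFrame:
    """Add the month-over-month change in sales for each store/item pair."""
    out = df.sort_values(["store", "item", "month"]).copy()
    # Within each store/item series: current month minus previous month.
    out["sales_diff"] = out.groupby(["store", "item"])["sales"].diff()
    # The first month of each series has no predecessor, so drop it.
    return out.dropna(subset=["sales_diff"]).reset_index(drop=True)


diffed = add_monthly_difference(sales)
print(diffed[["month", "sales", "sales_diff"]])
```

A model trained on `sales_diff` predicts changes, so a forecast is turned back into a sales figure by adding the predicted change to the last known month's sales.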
2. Customer Churn - Neural Network
The dataset, taken from bank records, stores each customer's name, credit score, geography, balance, tenure, gender, etc. The data is first preprocessed: missing values are imputed and categorical fields are label-encoded. The dataset then goes through feature selection, which eliminates less essential fields and makes the dataset more manageable and consistent. An artificial neural network (ANN), a deep learning model, was the preferred model for this project.
Languages Used: Python, SQL
Libraries Used: sklearn, seaborn, pandas, numpy
Visualization: Tableau
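The imputing and label-encoding steps from the Customer Churn project might look roughly like this. It is only a sketch under stated assumptions: the column names, the sample rows, the helper `preprocess` and the choice of median imputation are illustrative and are not taken from the project.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample of bank customer records (made-up values).
customers = pd.DataFrame({
    "credit_score": [619.0, None, 502.0, 699.0],
    "geography": ["France", "Spain", "France", "Germany"],
    "gender": ["Female", "Female", "Male", "Male"],
    "balance": [0.0, 83807.86, 159660.80, None],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Impute missing numeric values and label-encode categorical fields."""
    out = df.copy()
    # Assumption: fill missing numeric values with the column median.
    for col in ["credit_score", "balance"]:
        out[col] = out[col].fillna(out[col].median())
    # Map each category to an integer code (classes are sorted alphabetically).
    for col in ["geography", "gender"]:
        out[col] = LabelEncoder().fit_transform(out[col])
    return out


prepared = preprocess(customers)
print(prepared)
```

Label encoding imposes an arbitrary order on the categories, so for a neural network one-hot encoding of fields such as geography is a common alternative.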
3. Named Entity Recognition
The business wants to monitor the reference text for each book, which requires extracting the name of the responsible author, the organization, the location and the date.
The Named Entity Recognition (NER) task attempts to detect text expressions and classify them correctly into a set of predefined classes. The classes vary, but persons (PER), organizations (ORG) and locations (LOC) are among the most common.
Libraries Used: NLTK
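Grouping recognized entities by class can be sketched as below. This is an illustration, not the project's code: the helper `collect_entities` and the example sentence are hypothetical. In real use the chunk tree would come from `nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(text)))`, which needs NLTK's downloaded models; here the tree is built by hand in the same shape so the sketch runs without them.

```python
from collections import defaultdict

from nltk.tree import Tree


def collect_entities(chunk_tree: Tree) -> dict:
    """Group the named-entity phrases of an NLTK chunk tree by label."""
    entities = defaultdict(list)
    for node in chunk_tree:
        # Entity chunks are subtrees; ordinary tokens are (word, tag) tuples.
        if isinstance(node, Tree):
            phrase = " ".join(word for word, _tag in node.leaves())
            entities[node.label()].append(phrase)
    return dict(entities)


# Hand-built tree in the shape that nltk.ne_chunk returns (hypothetical text).
example = Tree("S", [
    Tree("PERSON", [("Jane", "NNP"), ("Austen", "NNP")]),
    ("was", "VBD"),
    ("published", "VBN"),
    ("in", "IN"),
    Tree("GPE", [("London", "NNP")]),
])

found = collect_entities(example)
print(found)
```

NLTK's default chunker uses labels such as PERSON, ORGANIZATION and GPE, which would need to be mapped onto the PER, ORG and LOC classes mentioned above.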
production setting of a bank.