Heena Shahanaz Shaik

Hyderabad

Summary

Innovative Machine Learning Engineer and Natural Language Processing Expert with around 2 years of experience in machine learning model design, development, testing, and deployment. Experienced in delivering end-to-end machine learning and data science projects, choosing the right technologies and algorithms to solve real-world use cases in the retail industry.

Overview

2 years of professional experience
1 Certification

Work History

Machine Learning Engineer and NLP Expert

INFOVILLE SOLUTIONS INDIA PRIVATE LIMITED
03.2022 - Current
  • Performed large-scale data analysis and developed effective statistical models using regression and classification algorithms, including Linear Regression, Logistic Regression, Decision Tree, and Random Forest.
  • Built Exploratory Data Analysis dashboards in Python using Seaborn and Matplotlib.
  • Experienced in Python data manipulation for loading and extraction, and with Python libraries such as NumPy and Pandas.
  • Researched and implemented appropriate ML algorithms and tools, selected appropriate datasets, benchmarked models, and performed statistical analysis and fine-tuning.
  • Strong coding ability, both in producing clean, efficient code and in debugging and understanding large code bases.
  • Extensive hands-on experience and high proficiency with structured, semi-structured, and unstructured data, using a broad range of data science programming languages and tools, including Python, SQL, and Scikit-Learn.
  • Highly skilled in using Pandas, NumPy, Seaborn, Matplotlib, Scikit-Learn, and NLTK in Python to develop machine learning models.


Education

B. Tech - Electronics And Communications Engineering

BVRIT HYDERABAD College of Engineering For Women
Hyderabad
07.2022

Skills

LANGUAGES:

  • Python
  • Oracle SQL

PYTHON PACKAGES:

  • NumPy
  • Pandas
  • Scikit-Learn
  • Statsmodels

VISUALIZATION:

  • Matplotlib
  • Seaborn

ALGORITHMS:

  • Linear Regression
  • Logistic Regression
  • Naive Bayes
  • Random Forest
  • Decision Tree
  • K-Nearest Neighbours
  • K-Means Clustering
  • Agglomerative Clustering

Title

MACHINE LEARNING ENGINEER AND NLP EXPERT

Projects

  • Classified invoice documents into multiple categories based on document type.

Roles and Responsibilities:

  • Collected multiple invoices related to multiple products.
  • Extracted data from invoices using regex patterns.
  • Used stemming, lemmatization, stop-word removal, and garbage-value removal for pre-processing.
  • Used count vectorization and TF-IDF to generate the sparse matrix for the analysis.
  • Built a Random Forest Classifier model and tuned its hyperparameters with GridSearchCV to find the best parameters for the algorithm.
  • Tested the model for 30 days and, after validating its accuracy over that period, pushed it to production.
  • After receiving new data, combined it with the old documents and used the Keras tokenizer and word2vec to vectorize the text corpus.
  • Built a Deep Neural Network model to classify the documents.
  • Evaluated the model using a confusion matrix and pushed it to production after one month of rigorous testing.


  • Detected customer emotions by analyzing their reviews, including sarcastic reviews.

Roles and Responsibilities:

  • Used stemming, lemmatization, stop-word removal, and garbage-value removal for pre-processing.
  • Created a word cloud for visualization and analyzed the sentiments from the keywords used in the reviews.
  • Used count vectorization and TF-IDF to generate the sparse matrix for the analysis.
  • Built a model using Naive Bayes.
  • Evaluated the model using a confusion matrix and accuracy.
  • Tested the model for 30 days and, after validating its accuracy over that period, pushed it to production.
  • After receiving new data, combined it with the old data and used the Keras tokenizer to vectorize the text corpus.
  • Built an LSTM classification model to predict the emotion expressed in each review.
  • Evaluated the model using a confusion matrix and pushed it to production after one month of rigorous testing.


  • Identified the right channel for driving more sales in retail stores.

Roles and Responsibilities:

  • Applied a design thinking approach, a prioritization matrix, and data validation techniques.
  • Performed Exploratory Data Analysis on the base data set to create the Analytical Data Set.
  • Checked the Linear Regression assumptions: linear relationship, no autocorrelation, no multicollinearity, homoscedasticity (absence of heteroskedasticity), and normally distributed errors.
  • Used the Ordinary Least Squares method to estimate the coefficients.
  • Applied a dimensionality reduction technique (Principal Component Analysis) to the base analytical data to improve accuracy by balancing the bias-variance trade-off.
  • Identified the top 5 and bottom 5 drivers, computed R², adjusted R², RMSE, MAE, and MSE, and shared the results with relevant stakeholders.

Certification

  • Coursera Certification in “Introduction to Web Development” with 93.8% in 2020.
  • Coursera Certification in “Programming for Everybody (Getting Started with Python)” with 99.08% in 2020.
  • Coursera Certification in “Crash Course on Python” with 89.50% in 2020.
  • HackerRank Certificate in “Python (Basic)” on 8 July 2020.
