Saurav Mehta

Data Scientist
Noida

Summary

Results-oriented data scientist with a demonstrated ability to deliver valuable insights, analytics, reports, and recommendations that enable effective strategic planning across all business units, distribution channels, and product lines. Working knowledge of advanced analytic and research approaches and supervised/unsupervised machine learning algorithms. Able to independently manage engagements from start to finish, delivering actionable insight within established timelines and budget. Proficient in statistical tools, predictive modelling, and machine learning techniques. Strong visualization skills.

Knowledge Purview: Data Analytics, Data Visualization, Statistical Analysis, Modelling (Supervised / Unsupervised), Text Analytics (NLP), Automation

Overview

4 years of post-secondary education
2 languages

Work History

Lead Analyst (Data Scientist)

CSC
  • CT Rescheduled Prediction and Decision Support Recommendation tool, Telstra Australia: designed and developed an end-to-end, automated, real-time recommendation engine to help workflow specialists make effective decisions, minimize handling time, and increase process efficiency
  • Over 3000 recommendations (more than 50 unique types) are generated per day in near real time to mitigate risks
  • Built a classification model to predict how many tickets will be rescheduled in each district
  • The objective was to automate the manual process of managing the workforce by using advanced machine learning techniques to predict reschedules
  • Recommendations are provided to the workforce team in advance via the DSR tool, so that they can manage their CTs ahead of time, avoid reschedules, and manage the workforce
  • Successfully deployed in more than 20 regions across Australia
  • Developed highly optimized Python/SQL-based scripts to transform and aggregate raw data from various sources (API integrations, SQL Server, flat files)
  • Designed and deployed a prediction model to identify potential risk based on various combinations of variables (e.g. supply, demand, reschedule, rainfall, urban/rural area)
  • Experienced in container-based programming using Kubeflow pipelines
  • Code repository maintained with Gitlab and CI/CD for UAT/Production changes
  • Scheduled the pipeline to run automatically every half hour and designed the solution so that all time zones are handled in a single run, saving computing and processing time
  • Took part in in-depth business and conceptual discussions with SMEs for future implementations/refinements
  • Designed and developed several Power BI reports to showcase the performance of DSR across Australia
  • These reports are used by principal-level executives on a daily basis
  • Maintained project-level documentation (e.g. solution design, business logic, technical document, deployment document) in Confluence
  • Followed agile methodologies throughout the project
  • Project management was handled via JIRA
  • Tech Stack - Kubeflow, Python, SQL, PowerBI, JIRA, Confluence
  • Technology Used: R Studio, Feature Engineering, Decision Trees, Random Forest, Multinomial Logistic regression
  • Role: Project Lead (Model Developer)
  • Predictive Maintenance Promise Level, Telstra Australia: built a model to predict whether a particular ticket will go into triage at 12 PM and 4 PM each day, so that the user can be informed beforehand as a goodwill gesture to improve customer satisfaction
  • Technology Used: R Studio, Feature Engineering, Decision Trees, Random Forest, Multinomial Logistic regression.

Project Lead (Model Developer), Lead Analyst (Data Scientist)

  • Mastercard: Entity Matching - Worked on the problem of identifying which records refer to the same real-world entity
  • It is an important data integration task that often arises when data originate from different sources
  • Set up source repository and deployed and managed infrastructure using CI/CD practices, GitHub, and Azure Pipelines
  • To automate the entity matching, directly worked
  • Technology Used: Python, Azure MLOps, Snowflake, Feature Engineering, Entity-py-matching, textual augmenter, Decision Trees, Random Forest, Logistic regression
  • Mastercard: Entity Matching - Understood the business requirements and implemented effective solutions, including the automation of logo validation
  • Detected the logo in an image and then generated a combined score from both text and image
  • Following the successful implementation of the solution, automated logo validation was carried out on more than 4 million data records
  • Technology Used: Python, image matching, OCR, YOLOv7, EasyOCR, structural similarity (SSIM), and other similarity scores for text matching

Lead Analyst (Data Scientist), Tech Lead, Lead (Data Scientist), Technical Lead

  • Saudi Telecom: Implemented AI models in the following areas to increase customer retention by analyzing customer behavior across different dimensions of their engagement and the events registered in their ecosystem
  • As a Business Analyst, gathered business requirements from stakeholders through a series of workshops, analyzed them, captured them as user stories in Agile tools such as JIRA and Confluence, and helped the team build Power BI dashboards accordingly
  • The churn identification models identify customers who are most likely to churn, allowing marketing teams to design a retention strategy based on cross-sell/upsell or the next best action for the customer
  • Areas: churn prediction, cross-sell/upsell, next best action (NBA)
  • Technology Used: SAS Viya, Feature Engineering, Decision Trees, Random Forest, Multinomial Logistic regression
  • Contact Optimization: Customer profiling - Customer segmentation and recommendation for offline retail stores, used to send promotional product messages to customers based on their profiles
  • Technology Used: R Studio, Feature Engineering, Decision Trees, Random Forest, KNN, and Collaborative filtering
  • Role: Model Developer
  • Predictive Maintenance: Mining companies make significant investments in equipment and expect high levels of availability
  • Traditional approaches to ensuring high availability, namely scheduled maintenance and on-demand maintenance support in the event of unscheduled downtime, are reactive and sub-optimal
  • Also, unscheduled downtime affects a mine’s productivity besides causing damage to expensive components
  • Technology Used: R-Studio, Feature Engineering, Robust K-means Clustering, R-Shiny app.

Technical Lead (Model Developer)

  • Incident Classification: Built an ML model for incident ticketing, used across many DXC projects
  • Almost every project faced the problem of distinguishing incident failure tickets from server failure tickets
  • Server failure tickets are more critical and need immediate attention to avoid an SLA breach
  • Created an NLP-based text classification model on the ticket description field to identify the ticket type and assign it to the correct team for resolution
  • Technology Used: R-Studio, Feature Engineering, text mining NLP, Tokenization, stemming, and n-grams, cosine similarity between documents, SVD.

Technical Lead (Model Developer), Senior Analyst, Data Analyst, Data Analyst (Model Developer)

CSC
  • Project Description: Unsupervised machine learning to simulate determining the priority of customer feedback based on the content of unstructured text
  • Built an AI designed to simulate the act of reading through instances of customer feedback (in the form of unstructured text) and automatically determine the relative priority of each
  • In the experiment, customer feedback was represented as a document-term matrix and prioritization was simulated using k-means clustering
  • Evaluated the experiment by using decision trees to profile the criteria for the different priorities
  • Technology Used: R Studio, Feature Engineering, NLP, Text mining, Decision Trees, K-means clustering
  • Role: Model Developer
  • Project Description: HR Analytics: predicting the attrition behavior of employees; this was an internal project for CSC
  • Technology Used: R Studio, Feature Engineering, Logistic Regression
  • Project Description: Credit Bill for Matson: predicting whether the client should be allowed the credit bill facility and how much credit should be allowed
  • Technology Used: R-Studio, Feature Engineering, Multiple linear regression, Logistic regression
  • Role

Analyst Programmer

Syntel

Programmer Analyst/ Software Engineer

  • Involved in the design, development, testing, implementation, maintenance, and support of various projects, such as Booking/Billing and Financial reporting, for clients like HUMANA Health Insurance USA using mainframe technology.

Education

B. Tech

PTU (SUSCET College)
01.2002 - 04.2006

Senior secondary school

CBSE board (MGN Public School)

CBSE board (SHCSD Public School)

Skills

Machine Learning, Text mining, AI, Deep learning, Python, R

Accomplishments

  • STATISTICAL ANALYSIS
  • Exploratory Analysis
  • Data Quality
  • Test of Hypothesis
  • Design of Experiments
  • Regression Models: Functional Models
  • Classification models
  • Forecasting models
  • Dimension Reduction: PCA
  • Cluster analysis
  • Recommendations
  • Associations
  • Model Assessing
  • Model Validation
  • Model Diagnostics: Validating assumptions and removing violations
  • MACHINE LEARNING
  • CHAID & CART
  • Random Forest
  • Text Mining/NLP
  • Sentiment Analysis
  • KNN
  • Neural Networks
  • Clustering
  • STATISTICAL TOOLS
  • R-Studio
  • R Shiny
  • GGPLOT2
  • PROGRAMMING
  • Python
  • DB2
  • SAS Viya
  • Azure ML

Additional Information

Completed a 1-year Executive Program in Business Analytics / Data Sciences at Great Lakes Institute of Management, Gurugram (top 10 students in a batch of 60) in 2015-16
