Shweta Nayak

Data Scientist
Pune

Summary

6+ years of industry experience in Data Science. AI/ML researcher with an interest in knowledge graphs, NLP, and recommendation systems. Meticulous Data Scientist accomplished in compiling, transforming, and analyzing complex information. Demonstrated success in identifying relationships and building solutions to business problems. Familiar with gathering, cleaning, and organizing data for use by technical and non-technical personnel. Advanced understanding of statistical, algebraic, and other analytical techniques.

Overview

8 years of professional experience

Work History

Data Scientist

Enlyft Inc
Pune
12.2022 - Current
  • Developed predictive models to classify organizations as B2B or B2C and deployed the models to the production environment.
  • Explored data to uncover hidden patterns and trends, providing valuable insights for business decisions.
  • Collaborated with cross-functional teams to develop data-driven solutions for business problems.
  • Created and implemented forecasting models to find data patterns for clients.

Principal Data Scientist

CutShort.io
Pune
06.2022 - 12.2022
  • Quality Grader for resume ranking: Streamlined the hiring process. Used machine learning techniques to develop an automated resume ranking system, reducing the manual effort of shortlisting candidates. The model was integrated into CutShort.io's hiring process, resulting in a decrease in time-to-hire and an increase in overall candidate quality.

Data Scientist

AlgoAnalytics Pvt. Ltd
Pune
02.2018 - 06.2022
  • Unsupervised Data Model Generation:
    The aim is to extract meaningful information, such as attributes and clauses, from legal documents.
    Technical Environment: Clustering, WharfCoefficient, NetworkX, pyvis
  • Patient Knowledge Graph:
    This project helps build knowledge graphs from patient discharge summaries.
    Technical Environment: Python, spaCy, scispaCy, ERNIE, regex, OrientDB, OpenKE, pandas, scikit-learn, pysbd, QuickUMLS
  • Medical Abbreviation Detection and Expansion:
    This project aims to build a system that detects medical abbreviations and their possible expansions in medical documents.
    Technical Environment: Python, spaCy, scispaCy, regex, pandas, scikit-learn, UMLS, RandomForest, TF-IDF, word embeddings, feature building
  • Hybrid RecSys for Document Recommendation:
    Hybrid recommendation system for documents, using user profiles and user answers.
    Technical Environment: Deep learning techniques, Clustering, kNN, LightFM, Flask, Docker
  • Personality Traits Prediction:
    Predicts personality traits from the answers a person gives.
    Technical Environment: Word2Vec, Sentence2Vec, Google sentence encoder, Flask, Docker
  • Content-based RecSys for Document Recommendation:
    Content-based recommendation system for documents, using user answers.
    Technical Environment: TF-IDF, Word2Vec, Flair, Clustering, Classification, Azure Functions
  • Grammatical error correction using a machine learning approach:
    A rule-based approach was combined with machine-learning-based POS tagging to identify and highlight grammatical errors in a sentence.
    Technical Environment: NLTK, spaCy, POS tagging
  • Product sales outlier detection for retail chain:
    This project ported a GCP-based outlier detection system for a retail chain to a Redshift-based system. Compatibility between the two environments was assessed, and the necessary changes were made to match the target environment.
    Technical Environment: Redshift, GCP, ANOVA, Clustering
  • Automatic financial transaction status reconciliation:
    Status reconciliation for financial transactions requires manual effort. This task was automated using a machine learning approach (XGBoost) to minimize that effort.
    Technical Environment: XGBoost, RandomForest, H2O.ai, Feature Engineering

Application Developer

INautix Technologies
Pune
11.2016 - 01.2018
  • GSS portal audit report generator. Technical Environment: Java
  • SWIFT Release 2017 Upgrade

Postgraduate Trainee

CDAC Research Center
Pune
08.2015 - 07.2016
  • GPU Enabled Implementation of OpenSEES:
    OpenSEES is open-source software for earthquake simulation; its THA performance was improved by making it GPU-enabled.
    Publication: Literature Survey on GPU Enabled Libraries for Improving Efficiency of OpenSEES on Linux Cluster

Education

Master of Engineering - Computer Science

Pune Institute of Computer Technology
Pune
08.2014 - 07.2016

Bachelor of Engineering - Computer Science

MGM's COE
Pune
07.2002 - 06.2006

Skills

  NLTK, spaCy


Additional Information

Publications:

  • Literature Survey on GPU Enabled Libraries for Improving Efficiency of OpenSEES on Linux Cluster
  • Entity Typing and Relation Classification for Knowledge Graph Building using ERNIE
