
SIRISHA PUNNAMRAJU

Full Stack Data Scientist
Hyderabad

Summary

Highly motivated, passionate Full Stack Data Scientist with 7+ years of overall working experience, 3+ years of global working experience, and 4+ years in the fields of AI, Data Science, Machine Learning, and Deep Learning, with a focus on Natural Language Processing (NLP). A combination of Data Science and Data Engineering skills, with practical knowledge spanning data collection, data preparation, and data preprocessing through setting up data pipelines, model training, fine-tuning, and application/model deployment. Strong knowledge of fundamentals, dedication to work, and a good team player.

Overview

5 years of professional experience
2 years of post-secondary education

Work History

Full Stack Data Scientist

Velocity Clinical Research
Hyderabad
04.2022 - Current

User Engagement:

  • An application that enhances the user experience by applying AI at every step of the process and engages the user through gamification. Users can enter their data in 4 different ways: speak to the app, upload an image or a file, enter free text, or select from a given set of choices

Responsibilities:

  • Designed & implemented the ML pipeline flow
  • Identified different data sources & collected appropriate data
  • Cleaned & Pre-processed the data
  • Prepared the Data dictionaries to train multiple models
  • Evaluated the trained models and selected the model that best fits the current requirement

Text Recommendation:

  • Displays a set of matching terms to the user as they type. The search for matching terms is not just a lexical search, but a contextual semantic search based on a set of dictionary terms
  • Tools & AI Models: Sentence BERT Transformer Models, Fast-AutoComplete, Python

Speech To Text Conversion:

  • Collected and Prepared the audio and text data to train the speech model
  • Custom-trained the Microsoft speech-to-text service for domain-specific terms
  • Tools & AI Models: Microsoft Custom Speech service, FastAPI, Python

Image to Text Conversion:

  • Performed image data collection and data augmentation to test and evaluate the selected cloud text-extraction services
  • Integrated the text-extraction service into the application flow
  • Tools & AI Models: Google Vision API, AWS Textract, Spark Vision NLP, FastAPI, Python

Custom Named Entity Recognition:

  • Trained and Integrated an NER model to extract entities from text.
  • Tools & AI Models: RASA & Clinical NER Model

Data/Application Migration with ETL:

  • Configurable data migration service to migrate data from disparate sources to CTMS
  • Implemented a config-driven approach to pre-process, transform, and map data into the target system
  • Tools & Frameworks: Python & AWS

Staff Utilization & Revenue Optimization:

  • Identified and Analyzed various database tables to get the staff schedules and the appointment durations
  • Performed EDA in Tableau to find key insights into the data
  • Calculated several metrics to assess current business performance
  • Compared actual appointment hours vs staff scheduled hours to assess staff performance
  • Estimated actual revenue vs the increased revenue achievable with optimum staff utilization
  • Tools & Frameworks: Tableau, Python for data-preprocessing

Campaign Analysis:

  • Performed EDA in Tableau and provided key insights to assess Campaign effectiveness
  • Derived the conversion rate for each site to estimate the number of users to contact based on several site metrics.
  • Tools & Frameworks: Tableau, Python for data-preprocessing

Machine Learning Engineer II

TrierHealth
Hyderabad
12.2019 - 04.2022

Provided Deep Learning AI solutions for various NLP tasks involved in the application flow. Worked well in a team setting, providing support and guidance.

MLOps pipelines:

  • Orchestrated custom MLOps pipelines for Deep Learning/Transformer models
  • Tools & AI Models: Azure MLOps, Transformer Models

Speaker recognition & Sentence Type Detection:

  • Collected and prepared the data to identify the speaker and sentence type from the text
  • Labeled the sentences for multi-class classification
  • Fine-tuned and deployed the models for better accuracy and performance
  • Tools & Models: PyTorch Framework, BERT Transformer Models

Custom Named Entity Recognition:

  • Annotated data to train an NER model from scratch
  • Trained an Entity Linker model to customize the NER output to deliver the required set of dictionary codes for domain-specific terms
  • Extracted domain-specific entities based on the context
  • Tools & Models: Prodigy (for annotating the dataset) & spaCy NER Models

Semantic Word Similarity Search:

  • Identified the most suitable high-level terms for a given term and associated the output with the relevant ICD dictionary codes
  • Tools & AI Models: Python, Sentence Transformer Models (SBERT)

Full Stack Data Engineer

RoundSqr
05.2019 - 12.2019
  • Real-time Spark Streaming application to migrate data from Azure SQL Server Database to AWS Redshift
  • Worked on the Smart Patient Data Processing Platform (PDP), which analyses patient spend patterns and predicts the probability of a patient availing the services of a hospital
    • Collected and stored patient data in HDFS and processed it using the Spark Core and Spark SQL modules in Python (PySpark)
    • Developed a predictive model using Spark MLlib
  • Developed several use cases and mini projects on big data, such as:
    • Uber dataset analysis using Spark SQL (in PySpark)
    • 911 emergency helpline data analysis using Spark Core (in PySpark)

Summer Intern

Gramener
05.2018 - 07.2018
  • Project: Bank Marketing Campaign Analysis using R and Tableau
  • Objective: develop the best classification model to predict potential clients for the Term Deposit Subscriptions offered by the bank
  • Thoroughly applied Exploratory Data Analysis techniques to gain insights into the data
  • Developed 5 classification models: Logistic Regression, SVM, Decision Trees, KNN, and Random Forest
  • Determined the best classification model for the data based on performance statistics such as prediction accuracy, AUC, and ROC curves
  • Academic projects on Machine Learning and Data Analytics:
    • Audit Analytics in Python: evaluating audit opinions to predict company performance
    • Loan Default Prediction, Stock Market and Portfolio Management in R

Education

MBA - Business Analytics

University of Hyderabad
Hyderabad, India
07.2017 - 05.2019

Bachelor of Technology - Mechanical Engineering

V.R. Siddhartha Engineering College
Vijayawada, Andhra Pradesh, India

  • Applied Machine Learning in Python from Coursera
  • Machine Learning Operation in Production from Coursera
  • Sun Certified Java Developer - NLOPS

Coursera

Skills

Machine Learning

Accomplishments

  • Gold Medalist (Topper) of the MBA Business Analytics batch (2017-2019) with a CGPA of 9.13, from University of Hyderabad

Timeline

Full Stack Data Scientist

Velocity Clinical Research
04.2022 - Current

Machine Learning Engineer II

TrierHealth
12.2019 - 04.2022

Full Stack Data Engineer

RoundSqr
05.2019 - 12.2019

Summer Intern

Gramener
05.2018 - 07.2018

MBA - Business Analytics

University of Hyderabad
07.2017 - 05.2019

Bachelor of Technology - Mechanical Engineering

V.R. Siddhartha Engineering College

  • Applied Machine Learning in Python from Coursera
  • Machine Learning Operation in Production from Coursera
  • Sun Certified Java Developer - NLOPS

Coursera