Lohith G N

Data Scientist
Bengaluru

Summary

With over 13 years of experience in Data Science, I specialize in Generative AI and have worked across sectors including telecom, retail, and healthcare. I am proficient in deploying LLMs, transformer models, and other NLP systems, and I focus on the quality and safety of LLM applications. My work in Data Science and Machine Learning began in 2012 with successful NLP and predictive modeling implementations. Having led diverse global teams, I bring expertise in LLMOps and Generative AI technology consulting, and in aligning solutions with business objectives.

As a Data Scientist with hands-on experience, I've actively contributed to global data analytics projects, addressing challenges in locations such as Shanghai, Shenzhen, Hanoi, Helsinki, Dhaka, Tel Aviv, Tunis, Bangkok, and Ho Chi Minh City. Holding a Master's in Data Science & Engineering from BITS Pilani and completing a Post Graduate Program in Data Science & Machine Learning at the University of Chicago, I remain committed to driving innovation in the ever-evolving field of Generative AI.

Overview

13 years of professional experience

10 years of machine learning experience

Technical Skills

AI technologies: Generative AI, NLP, Computer vision, Machine learning, Neural networks, Reinforcement learning, GAN, Transfer learning, Predictive analytics, Explainable AI, Edge AI, Recommendation engines

Gen AI models: GPT-3, BERT, T5, DALL-E, BART, Transformer-XL, XLNet, RoBERTa, Babbage, CLIP

Gen AI frameworks & DBs: Haystack, LangChain, Hugging Face Transformers, OpenAI GPT, PyTorch, Keras, Gensim, TextBlob, spaCy, GPT-Neo, AllenNLP, FAISS, Milvus, Pinecone, Elasticsearch, Deep Lake

Programming: Python, R, C, C++

ML algorithms:
  • Regression techniques: Linear, Multiple Linear, and Polynomial Regression
  • Regularisation techniques: Ridge Regression, Lasso Regression
  • Classification techniques: Naïve Bayes, Logistic Regression, K-Nearest Neighbours, Support Vector Machine, Decision Trees
  • Parallel and sequential ensemble methods: Random Forest, AdaBoost, Gradient Boosting, Extreme Gradient Boosting
  • Clustering techniques: K-Means Clustering, Hierarchical/Agglomerative Clustering
  • Reinforcement learning: Multi-Armed Bandit (Thompson Sampling, Greedy Algorithm with Decay), Markov Decision Problems
  • Deep neural networks: Convolutional Neural Networks, Keras Sequential and Functional neural networks, Recurrent Neural Networks, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU)
  • Time series analysis: Univariate and Multivariate Time Series Analysis
  • Natural Language Processing (NLP): Regular expressions, Tokenization, Stopwords, Stemming and Lemmatization, Converting Text into Numerical Data, Document Analysis, Text Classification, Sentiment Analysis, Autoencoders, OpenAI Transformer Models, One- and few-shot learning, PyTorch, Streamlit

BI Tools & API: Tableau, Power BI, DOMO, Qlik, IBM Cognos Analytics, Streamlit, FastAPI, Flask

Databases: MySQL, Hive, Cassandra, PySpark, Hadoop, Data lake

Exploratory Data Analysis: Data Extraction, Data Cleaning, Data Wrangling, Feature Engineering, Removing Outliers, Data Exploration, Probability distributions

Machine Learning Operations over Cloud: AWS SageMaker, Azure Databricks

Mathematics: Probability, Statistical analysis, Linear Algebra, Vector Calculus, Graphical modeling, Hypothesis testing, Bayesian theory, Cost functions, Gradient descent

Cloud (Amazon Web Services): AWS Glue, AWS Lambda, AWS EC2, AWS S3, AWS EBS, AWS ELB, AWS Bastion Hosts, AWS CloudWatch, AWS Networking, AWS Firewalls and NACL, Security Groups, AWS ECS and AWS EKS (Kubernetes Services), AWS Direct Connect, AWS CloudFront

GitHub Profile

https://github.com/lohith0501

Work History

Principal Data Scientist

Randstad Digital
Bengaluru, Karnataka
01.2024 - Current
  • Spearheading the development of an intelligent chatbot for Airbus France that understands aircraft-domain artifacts, including HTML, C code, domain descriptions, and requirements files. Built the solution with Metaflow, HTML parsers, Beautiful Soup, pandas, hybrid retrievers, and Llama 3.
  • Conducted research to optimize Large Language Model (LLM) inference speed and compute for production deployment, achieving significant performance gains through quantization, prompt tuning, model fine-tuning, and hyperparameter tuning.
  • Led the development of an HR chatbot for Randstad Digital using Retrieval-Augmented Generation (RAG) with LLMs such as GPT-4, Google PaLM 2, and open-source models from Hugging Face.
  • Deployed generative AI applications using Streamlit for quick prototyping, LLM Docker images on the cloud for fast inference, and REST APIs built with FastAPI.
  • Developed and led a Generative AI solution for legacy code modernization, using open-source LLMs for tasks such as PII masking, pseudocode generation, text-to-code translation, code critique, code testing, and documentation generation.

Role - Data Scientist - Gen AI, ML & Cloud

Rakuten Symphony
Bangalore
01.2022 - Current
  • Built a large language model (LLM) solution using Mistral 7B and a FAISS vector database to guide the analysis of highly utilized cells with high PRB utilization in the network. Used prompt engineering for the identification logic, achieving 82% accuracy. Deployed the solution on Rakuten Cloud (GDSP), an AI platform, for daily analysis of over 10,000 cells.
  • Developed a custom LLM chatbot for retrieving multivendor document data, improving user comprehension and interaction. The pipeline integrates Haystack, extractive QA, a hybrid retriever (BM25 & REALM), a FARM reader (RoBERTa), and a Milvus multimodal database, and uses the RAG technique to reduce hallucination in generative models. In daily use by more than 30 domain experts.
  • Built a recommendation system to suggest underperforming cells in a mobile network using collaborative filtering and content-based models, cutting the team's proactive measurement time by 80%.
  • Improved LLM quality and safety by reducing hallucinations, data leaks, toxicity, and jailbreaks/prompt injections. Implemented techniques such as SelfCheckGPT, the ToxiGen model, NER, and LLM metrics, and developed a monitoring system.
  • Participated in a GenAI project that used a time-series language model (LLM) to improve the precision of traffic growth forecasting.
  • Implemented 11 new analytical dashboards covering statistical distributions, Pearson correlations, Apriori techniques, and more, leading to a 90% reduction in network analysis time. Scheduled these analyses automatically with Apache Airflow.

Role - Data Scientist - NLP & Text Analytics

ITC Infotech
Bangalore
07.2020 - 12.2021
  • Initiated an industry-specific document prediction model using DistilBERT from Hugging Face to automate the identification of appeal or recall categories in claim documents. Used Apache Kafka for data ingestion. The model replaced the manual approach and reduced man-hours for UHG client processes by 70%.
  • Built a probabilistic model using Bayesian statistics to predict whether a patient is likely to undergo a C-section or a normal delivery. The model was trained on a combination of blood reports, scanning reports, and SDOH data, and reached 65% accuracy.
  • Improved cost efficiency by approximately 30% by implementing an OpenAI CLIP model for medical report identification. The workflow ran on the Azure cloud.
  • Conducted a Proof of Concept (POC) for detecting effusion in chest X-ray (CXR) images. Trained a ResNet-50 model, performed ablation, and used Keras callbacks for model evaluation. Implemented a strategy with weighted cross-entropy to reduce False Negatives, improving accuracy by 20%.

Role - Data Scientist - ML & Cloud

Reliance Jio Infocomm Limited
Mumbai
10.2016 - 07.2018
  • Developed a DBSCAN-based clustering solution to automatically split overloaded Tracking Area Codes (TACs), eliminating the need for manual intervention.
  • Built a forecasting model using SARIMAX to pinpoint 100 potential hotspots in the country where mobile network capacity can be expanded in the near future.
  • Built a daily dashboard to monitor 4G mobile network capacity, identified outlier cells using the Isolation Forest algorithm, and created a web app for streamlined monitoring.
  • Identified capacity-enhancement features and multi-carrier settings for Samsung vendor equipment using a random forest multi-class classifier, improving mute call rate and VoLTE drop rate across India.

Role - Senior Engineer - ML

ZTE Telecom India Pvt Ltd
Bangalore
01.2015 - 11.2015
  • Introduced real-time performance analysis for Idea Cellular's networks using MySQL, stream processing, and Tableau.
  • Utilized an exponential smoothing time series algorithm for predicting underperforming mobile network base stations, enabling the technical team to take proactive measures.
  • Achieved a 50% reduction in manual hours through the implementation of a Python script automation framework.
  • Conducted multivariate time series analysis using ARIMA and VAR models, visualizing daily, weekly, and monthly traffic for peak periods.

Role - Engineer - ML & Database

Ericsson
Bangalore
11.2012 - 12.2014
  • Algeria: Performed marketing analysis for Ooredoo, identifying optimal product categories for different seasons. Applied reinforcement learning (multi-armed bandit with Thompson sampling) to analyze price variations and used the beta distribution to finalize product choices.
  • Thailand: Applied predictive maintenance for DTAC, using K-means clustering and additive Holt-Winters forecasting to identify potential downtime and maintenance needs.

Role - Senior Data Analyst

Nokia Siemens Networks
Gurgaon
04.2010 - 06.2012

Role - Mobile Network Business Analyst

ZTE Telecom India Pvt Ltd
Gurgaon
07.2009 - 04.2010

Role - Network Performance Analyst

Nokia Siemens Networks
Patna
10.2007 - 07.2009

Education

MTech - Data Science & Engineering

Birla Institute of Technology and Science
Pilani
04.2019 - 02.2022

PGPDM - Data Science & Machine Learning

The University of Chicago Graham School
Chicago
06.2018 - 03.2019

Bachelor of Engineering - Electronics & Communication

SBMSIT
Bangalore
01.2000 - 01.2004

Timeline

Principal Data Scientist

Randstad Digital
01.2024 - Current

Role - Data Scientist - Gen AI, ML & Cloud

Rakuten Symphony
01.2022 - Current

Role - Data Scientist - NLP & Text Analytics

ITC Infotech
07.2020 - 12.2021

MTech - Data Science & Engineering

Birla Institute of Technology and Science
04.2019 - 02.2022

PGPDM - Data Science & Machine Learning

The University of Chicago Graham School
06.2018 - 03.2019

Role - Data Scientist - ML & Cloud

Reliance Jio Infocomm Limited
10.2016 - 07.2018

Role - Senior Engineer - ML

ZTE Telecom India Pvt Ltd
01.2015 - 11.2015

Role - Engineer - ML & Database

Ericsson
11.2012 - 12.2014

Role - Senior Data Analyst

Nokia Siemens Networks
04.2010 - 06.2012

Role - Mobile Network Business Analyst

ZTE Telecom India Pvt Ltd
07.2009 - 04.2010

Role - Network Performance Analyst

Nokia Siemens Networks
10.2007 - 07.2009

Bachelor of Engineering - Electronics & Communication

SBMSIT
01.2000 - 01.2004

Certifications

Computer Vision Nanodegree Program

• Udacity

Machine Learning Master's Program

• Teclov

Extensive Python for AI

• The School of AI

WHAT I AM PROUD OF


• Industry Project (for The Math Company, PGPDM program)

(Advanced regression analysis of price elasticity at the US state level)

• Integrated Python code in Tableau to forecast the t+n data points directly from Tableau.

Zomato Restaurant Ratings Prediction

• Finished in 5th place in the DLabs Data Science competition.

International onsite customer support

• Worked in Shanghai, Shenzhen, Hanoi, Helsinki, Dhaka, Tel Aviv, Tunis, Bangkok, and Ho Chi Minh City
