LALKRISHNA H. TRIVEDI

Navi Mumbai

Summary

Dynamic Data Scientist with a proven track record at Sagility India, specializing in advanced analytics and machine learning to enhance safety and decision-making. Achieved 89% accuracy in real-time surveillance analytics using ResNet50, demonstrating expertise in Python and leadership. Instrumental in driving advancements in BI applications and data-driven strategies through innovative AI solutions. Aiming to leverage analytical skills to further impact organizational performance.

Overview

8 years of professional experience
1 certification

Work History

Data Scientist

Sagility India Private Ltd (Formerly HGS)
Mumbai
03.2021 - Current
  • Delivered the most notable work of my professional career to date during my tenure at Sagility: the SensAI product.
  • Developed customized analytics for assessing agent interactions within customer care data, primarily for the healthcare insurance sector.
  • Created Python scripts for generating predictions of agent performance from call transcription data after speech-to-text (STT) conversion.
  • Gained a comprehensive understanding of workplace processes, uncovering multiple areas for improvement.
  • Led the creation of the SensAI product in a service-based company, transforming business processes for better customer experience management.
  • Enhanced current systems by architecting and implementing SensAI.
  • Designed multiple SensAI features to industry-grade standards, including a customizable, rule-based fuzzy NLP engine that extracts accurate insights from text data at n-gram granularity, and the AQEP audit page, a one-stop solution for AQEP audit process automation.
  • Architected further core features, including the NLP planner, Measure, ETL, Scheduler, User/Role management, Tenant management, and ground-truth management.
  • Enhanced efficiency and scalability of SensAI by implementing parallelization and batch processing.
  • SensAI proved a major success for the client: it increased audit volume by 155% (nearly three times manual throughput) at one-third the deployment cost, with a 50% reduction in overall audit time, despite being in its initial release phase.
  • Progressed from junior to senior data scientist in the NLP team, leading a team of six data scientists, and created and directed a separate team for SensAI web-app development, serving as the single resource bridging data science and web development.
  • Facilitated CI/CD pipelines, server setups, and SOC 2 security compliance for SensAI.
  • Collaborated with numerous cross-functional teams to deliver impactful client demos that sparked strong interest in the SensAI product.
  • Delivered SensAI features that compare favorably with market competitors.
  • Guided creation of dynamic word clouds and a generic NLP/data science pipeline.
  • Recently integrated LLMs into SensAI's automated AQEP audit process for greater robustness; for SensAI 2.0, facilitating the creation of Dify pipelines that implement AQEP audit NLP rules via LLMs and prompt engineering as separate microservice endpoints.

Freelancing

Ph.D. Research Assistance
Bhavnagar
06.2020 - 04.2021
  • Provided research assistance for a local Ph.D. project titled "Real-Time Human Violence Recognition and Location for Indoor Surveillance Systems."
  • Enhanced safety by identifying threats to human life in surveillance footage.
  • Conducted in-depth evaluation resulting in the selection of ResNet50 over other models such as CNN, LSTM, and VGG16.
  • Created a clean, annotated, and balanced dataset for the project comprising eight action classes, including one normal, non-violent class.
  • Performed comprehensive analysis of related studies in computer vision.
  • Secured 89% accuracy using a fine-tuned ResNet50 model for multi-class classification and localization tasks.

Associate Product Engineer

Entomo (Formerly KPISOFT Technologies)
10.2017 - 04.2020
  • Designed business intelligence (BI) web-app features for querying transaction data, aggregating facts across multiple dimensions and hierarchies within BI queries and rendering them in the UI as Highcharts visualizations and push notifications termed Insights.
  • Developed robust REST APIs and middleware services to interact between components.
  • Ensured modules were nearly error-free through JUnit test cases.
  • Mentored junior and peer-level software engineers.
  • Coordinated with the CS and ETL teams in designing data and view models.
  • Created POCs for product enhancements.

Internship

Ericsson R & D
Bengaluru
07.2016 - 08.2017
  • Enhanced monitoring capabilities at network operational centers through automation.
  • Utilized machine learning to analyze telecommunication network faults and anomalies.
  • Managed live streams of network alarm data to identify patterns.
  • Collected live streams of network alarm data with Apache Kafka, providing input to Apache Spark.
  • Utilized Apache Spark for parallel in-memory data processing.
  • Used Apache Hadoop for scalability.
  • Developed the web UI with Node.js.

Education

M.Tech - CSE

M.I.T
Manipal
06.2017

B.Tech - CE

C.S.P.I.T
Gujarat
05.2014

H.S.C./12th - Science

Gyanmanjari Vidhyapith
Gujarat
04.2010

Skills

  • Languages: Python, SQL, HTML, CSS, and JS
  • Frameworks: Django, Dify, Apache Spark, Apache Kafka, Hadoop, Apache Presto, Apache Drill
  • Databases: MySQL, MongoDB, MSSQL, Amazon S3, Redis
  • OS: Windows, Linux
  • Cloud Platform: AWS EC2
  • Code Version: Git, GitHub
  • Object-Oriented Programming and design patterns
  • Gen-AI/LLMs: Llama 3.2 8B, MiniLM, OpenAI, LangChain
  • Prompt engineering
  • Transformer models
  • DL models & tooling: BERT, RAG, Hugging Face, CNN, RNN, LSTM, Bi-LSTM, ResNet50, VGG16
  • Vectorizer: GloVe, BERT, Bag of Words, TF-IDF
  • NLP: SpaCy, NLTK, NER, PoS, stemming/lemmatization, stop words, regular expressions, topic modeling (LDA)
  • ML models: Linear regression, logistic regression, naive Bayes, SVM, KNN, K-means clustering, EDA, PCA, XGBoost classifier, and AdaBoost classifier
  • Testing: JUnit
  • Project Management
  • Leadership
  • Software Architecture and Design
  • Visionary

Affiliations

  • Spirituality
  • Meditation
  • Yoga
  • Table Tennis
  • Swimming
  • Acting/Drama
  • Writing

Accomplishments

  • Promoted twice in three years at Sagility through strong performance and innovation.
  • Led KM portal initiative at Entomo.
  • Served on the core team for ISMS/ISO 9001 certification at Entomo.
  • Class Representative during M.Tech.
  • Placement Coordinator during B.Tech.

Languages

Gujarati: First Language
English: Proficient (C2)
Hindi: Upper Intermediate (B2)

Certification

  • Mastering Git & GitHub Program
  • Complete Data Science Bootcamp 2024 - Udemy
  • AWS Cloud Practitioner

Projects

SensAI - NLP Rule Based Audit Analytics Solution (2.5 years, At Sagility)

  • Domain: NLP, Web App Development
  • Client: Blue Shield California (3 LOBs), Talispoint
  • Tools & Tech: spaCy, Pandas, FuzzyWuzzy, NER, Django, Python, HTML, CSS, MySQL, Redis, regular expressions, pyparser, Dify, LLMs (Llama)
  • Description: SensAI is a multi-purpose application offering a wide variety of features for both data scientists and clients. For clients, it provides a one-stop audit page for auditing healthcare insurance customer-care call (agent-customer interaction) data to improve customer experience, based on client-specific parameters such as whether the agent showed empathy during the call, whether the agent greeted the customer, reason for calling, authorization, etc. For data scientists, SensAI offers admin pages such as Measures (to define and maintain client-specific NLP rules with unbounded hierarchy and associations), NLP Planner, ETL, Scheduler, User/Role management, Tenant management, Data Source, and Data Engine. Different types of NLP rules (rule-based, ML/DL-based, LLM-based) can be created and configured through the Measures page. The NLP core engine is built on FuzzyWuzzy with token-sort-ratio matching against patterns defined by client-specific rules, using multi-valued logic to resolve inter-rule dependencies. The engine operates on a custom-built vector representation of call transcription data covering all possible n-gram combinations; batch processing and parallelization were used to compute large data volumes within limited timeframes. It generates predictions for the defined parameter rules along with evidence (matched keywords/phrases) and their time locations in the transcription, which then become visible on the audit page. Recent work includes creating Dify pipelines for client-specific rules with prompt engineering on the Llama 3.2 8B model for the SensAI 2.0 microservice architecture, achieving stable accuracy above 80% for most client rules.

CRM IDB Information Validation through NER, RE, POS (6 Months, At Sagility)

  • Domain: NLP, Web App Development
  • Client: Talispoint
  • Tools & Tech: spaCy NER, usaddress, re, spaCy PoS, MySQL, MSSQL Connector
  • Description: Client-specific details such as provider names, group names, customer names, addresses, phone numbers, zip codes, and cities were extracted from call transcription data using a combination of regular expressions, spaCy PoS tagging, and NER, then validated against data queried from the client's remote CRM (Microsoft Dynamics 365) database. Achieved 70% accuracy for NER IDB extraction.

Dynamic Word Cloud Generation (6 Months, At Sagility)

  • Domain: NLP
  • Client: Humana, LabCorp, Blue Shield California
  • Tools & Tech: spaCy, Pandas, NLTK, TF-IDF, Clustering, Sentiment Analysis
  • Description: Trained on a rolling one-month window of daily client call transcription data, discarding the oldest day's data each day. Above a certain threshold, the 30 most important words were extracted for each sentiment (positive, neutral, negative) using TF-IDF vectorization scores.

Automated Keyword Generation (3 Months, At Sagility)

  • Domain: NLP
  • Tools & Tech: spaCy, Pandas, NLTK, TF-IDF, Clustering, BERT topic modelling, LDA, elbow method
  • Description: A self-initiated project to reduce contextual dependency on SMEs for obtaining (churning) keywords for client-specific AQEP audit parameters. Implemented on one month of call transcription data (~90,000 calls), beginning with statistical analysis of average keyword n-gram length, min-max keyword ranges for agents and customers, and their correlation with call length and number of segments. GloVe and word2vec embedding/vectorization techniques were then used for K-means clustering, with DBSCAN used on the stored vectorized data for processing. The elbow method yielded the optimal number of clusters, mainly six keyword groups; the objective was to obtain as many non-redundant keyword clusters as possible and tag each with the closest AQEP parameter. Finally, TF-IDF was used to extract the top-N meaningful keywords per cluster. LDA and BERT topic-modelling approaches were also tried as alternatives.

Multi Level Multi Class Classification of Survey Verbatim Data (3 Months)

  • Domain: Machine Learning
  • Client: Humana (PPI)
  • Tools & Tech: GridSearchCV, Decision Tree, Logistic Regression, SVC, Clustering, BERT, RoBERTa, DistilBERT, Llama
  • Description: The objective was to automate the manual process of classifying Humana PPI customer survey verbatim data into 5 hierarchical target variables. The data consisted of attributes such as date, call ID, UUI, agent ID, and survey questions Q1 to Q5, where Q5 was a free-text field containing customer comments. The 5 target classes were perspective (mainly text sentiment) and NPS reasons 1 to 4 as a hierarchy. Because the data was imbalanced, with limited data points especially for the last two hierarchical targets (NPS reasons 3 and 4), the solution covered only the first 3 target variables. Using GridSearchCV, various classification models (logistic regression, decision tree, SVC) were tried; logistic regression was the best fit, with 75% accuracy for the perspective class, 69% for NPS reason 1, and 55% for NPS reason 2. A DL approach was then implemented by fine-tuning BERT, its variants, and Llama; with BERT, accuracy improved to 89% for perspective, 77% for NPS reason 1, and 75% for NPS reason 2.

References

References available upon request.
