Jyoti SHARMA

Faridabad

Summary

Data Scientist with 4 years of hands-on industry experience delivering scalable machine learning, deep learning, and Generative AI solutions that drive real business value.

Overview

5 years of professional experience
1 Certification

Work History

Data Scientist

OrBEX Technologies PVT LTD
07.2021 - Current
  • Designed and developed an enterprise-scale intelligent knowledge assistant using a hybrid RAG architecture to deliver accurate, context-aware question answering over large organizational document repositories.
  • Architected a complete RAG pipeline covering document ingestion, intelligent chunking, embedding generation, hybrid retrieval (dense vector + sparse keyword/BM25), re-ranking, and LLM-driven response generation.
  • Implemented hybrid search techniques to balance semantic relevance with exact-match accuracy for domain-specific terms, identifiers, and policy references.
  • Built metadata-aware retrieval mechanisms with role-based and contextual filters (department, document type, time range) to ensure secure and relevant responses.
  • Integrated LLMs using structured prompts, context windows, and citation logic to generate grounded, explainable answers with source attribution.
  • Developed scalable REST APIs using FastAPI with asynchronous processing to support low-latency query handling.
  • Delivered a reliable and scalable enterprise knowledge platform that significantly improved information accessibility and reduced reliance on manual document searches.
  • Improved answer relevance and factual accuracy through the combination of semantic and keyword-based retrieval techniques.
  • Developed an enterprise-scale analytics solution to predict disease progression risk for chronic patients using machine learning, combined with a retrieval-augmented generation (RAG) system to provide clinical explanations and guideline-based recommendations.
  • Collected and analyzed patient-level clinical data including demographics, diagnoses, lab results, vitals, and historical medical records.
  • Conducted exploratory data analysis (EDA) to identify trends, disease patterns, and key predictors of disease progression.
  • Built and evaluated machine learning models (Random Forest, XGBoost) to predict disease progression risk scores.
  • Optimized model performance using hyperparameter tuning and evaluated results using metrics such as ROC-AUC, precision, recall, and F1-score.
  • Implemented a RAG-based clinical knowledge support system to retrieve relevant treatment guidelines, clinical protocols, and research summaries from internal medical documents.
  • Integrated ML predictions with RAG outputs to provide contextual explanations such as contributing risk factors and recommended monitoring actions.
  • Developed APIs using FastAPI to expose prediction and clinical explanation services.
  • Collaborated with healthcare stakeholders to translate clinical requirements into data-driven solutions.
  • Created dashboards and reports to visualize risk distribution, model performance, and patient risk trends.
  • Enabled early identification of high-risk patients using machine learning–based disease progression risk scoring.
  • Improved clinical decision-making by providing guideline-backed explanations through a RAG-based clinical knowledge system.
  • Developed an end-to-end machine learning–based credit risk scoring system to assess the probability of loan default and support data-driven lending decisions.
  • Worked closely with business stakeholders to understand lending workflows, credit approval criteria, and regulatory constraints.
  • Defined the target variable as loan default (binary classification) and framed the problem as a probability-based risk scoring task rather than a hard accept/reject decision.
  • Identified key business objectives such as minimizing defaults, controlling false approvals, and maintaining regulatory compliance.
  • Collected and analyzed historical loan application data including customer demographics, income, employment details, credit history, and repayment behavior.
  • Performed extensive Exploratory Data Analysis (EDA) to identify missing values, outliers, skewed distributions, and multicollinearity among features.
  • Analyzed default rate trends across customer segments to identify high-risk patterns.
  • Built baseline models using Logistic Regression to ensure interpretability and establish benchmark performance.
  • Converted predicted probabilities into interpretable credit risk scores and risk bands (Low / Medium / High).
  • Improved loan default detection by accurately identifying high-risk applicants before approval.
  • Reduced financial risk by limiting approvals of high-risk loans.

Education

Bachelor of Technology (B.Tech)

Siddhi Vinayak College Of Engineering
Alwar, Rajasthan, India

Skills

  • Supervised and Unsupervised Machine Learning techniques
  • Statistical Analysis
  • Modeling
  • Hypothesis Testing
  • Feature Engineering
  • Data Preparation for ML models
  • Predictive Modeling
  • Advanced Analytics
  • Experiment Design
  • A/B Testing
  • Data Visualization
  • Reporting
  • Interactive Dashboards
  • Natural Language Processing
  • Text Classification
  • Sentiment Analysis
  • End-to-End Model Deployment
  • APIs
  • Flask
  • FastAPI
  • Streamlit
  • SQL Query Optimization
  • Data Analysis
  • Big Data Processing
  • Business Problem Framing
  • Requirement Analysis
  • Cloud-based Machine Learning Solutions
  • AWS
  • Azure
  • Convolutional Neural Networks
  • ResNet
  • EfficientNet
  • U-Net
  • YOLO
  • R-CNN
  • Sequence Models
  • RNN
  • LSTM
  • GRU
  • Seq2Seq Architectures
  • Transformer Models
  • BERT
  • GPT
  • ViT
  • T5
  • Attention Mechanisms
  • Sequence Learning
  • Generative Models
  • GANs
  • Diffusion Techniques
  • Model Optimization Techniques
  • Quantization
  • Pruning
  • Knowledge Distillation
  • Large Language Models
  • LLaMA
  • Mistral
  • Prompt Engineering
  • Structured Prompt Design
  • Retrieval-Augmented Generation (RAG)
  • Vector Databases
  • Pinecone
  • FAISS
  • Chroma
  • Programming
  • Data Handling
  • Python
  • SQL
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • TensorFlow
  • Docker
  • LLM-based systems
  • Hugging Face APIs
  • LangChain
  • Parameter-efficient fine-tuning
  • Retrieval-Augmented Generation Pipelines
  • Backend Frameworks
  • Creative thinker
  • Ability to communicate
  • Ability to manage multiple projects
  • Team worker
  • Collaborative
  • Responsible

Certification

  • IBM Certified Machine Learning
  • Google Cloud Professional Data Engineer

Personal Information

  • Date of Birth: 07/20/88
  • Nationality: Indian
  • Marital Status: Married

Languages

  • Hindi
  • English
