Results-driven Data Scientist with 4+ years of experience specializing in AI, deep learning, and machine learning, with a strong background in large language models (LLMs). Committed to applying advanced methodologies to derive actionable insights and support informed decision-making. Skilled at collaborating with diverse teams to solve business challenges.
Overview
4
years of professional experience
1
Certification
Work History
Data Scientist
RM Group of Education
Noida
07.2025 - Current
Developed a versatile search engine capable of efficiently retrieving relevant documents from various formats, including PDFs, Word Documents, and Text Documents.
Implemented OpenSearch, LlamaParse, and the OpenAI text-embedding-3-large model to improve search quality (F1, accuracy, ROC-AUC) and efficiency.
Utilized topic modeling, spaCy tokenization, POS tagging, and LLMs (GPT-4.1, GPT-4o, Claude 3.5 Sonnet, Llama-3.1) for feature extraction from resumes, optimizing search results.
Conducted abstractive and extractive summarization of resumes, helping HR assess skills and match candidates to job opportunities.
Employed Google T5 and Hugging Face BERT models for summarization tasks, ensuring concise and informative resume summaries.
Analyzed and compared results of state-of-the-art Question Answering models, including bert-base-uncased, Cohere Rerank, GPT-4.1, GPT-4o, Claude 3.5 Sonnet, and Llama-3.1, using LlamaIndex to build the RAG pipelines.
Developed multi-class intent classification models on labeled data, enhancing search engine capabilities.
Engaged in all phases of the data lifecycle, including exploration, processing, cleaning, and model building, employing algorithms such as Random Forest, SVM, and XGBoost/LightGBM.
Evaluated model performance using metrics such as F1 score, accuracy, ROC-AUC, and the confusion matrix.
Created APIs using FastAPI and deployed models for seamless integration into production environments.
Data Scientist
Tata Consultancy Services (TCS)
Delhi
06.2021 - 07.2025
Acquired and preprocessed a diverse dataset of emails from various sources, ensuring data cleanliness and relevance.
Fine-tuned transformer-based models (BERT, GPT, etc.) on the email dataset for tasks such as classification, summarization, and NER.
Implemented email classification using transformer-based classifiers, optimizing performance metrics such as F1 score, accuracy, and ROC-AUC.
Used transformer-based summarization models to generate concise summaries of email content, preserving key information.
Integrated transformer-based NER models to extract important entities from email text, enhancing the platform's understanding.
Developed a responsive web interface for the Email Intelligence Platform using Streamlit, React (Next.js), and FastAPI.
Containerized the entire application using Docker for efficient deployment across different environments, and deployed the platform on AWS SageMaker for scalable and reliable access.
Education
B.Tech - Computer Science Engineering
ITM University
Gwalior, M.P.
Skills
Programming: Python, SQL
Deep learning libraries: PyTorch, Keras, TensorFlow
Deep learning: Text Classification, Text Summarization, Natural Language Processing (NLP), Long Short-Term Memory (LSTM), Recurrent Neural Networks (RNN), LLMs, BERT, T5, Cohere Rerank, GPT-4.1, GPT-4o, Claude 3.5 Sonnet, Llama-3.1, OpenAI
Text processing: Tokenization, Normalization, Word Embedding, Named Entity Recognition
Machine learning: XGBoost/LightGBM, Random Forest, SVM