Data Scientist with 7.8 years of hands-on experience in machine learning, NLP, and deep learning, having contributed to AI initiatives at Accenture, TietoEVRY, and, currently, Hero MotoCorp. Strong foundation in classical ML techniques and production-grade analytics, with a track record of adapting to and leading generative AI transformations. Skilled in fine-tuning LLMs (LLaMA2, Mistral) with QLoRA, building RAG pipelines, and deploying use cases such as contract redlining, clause-level risk analysis, market share analytics, and conversational AI. Experienced in cloud-native AI development on Databricks, Azure OpenAI, and Hugging Face, integrating GenAI into real-world enterprise applications.
Tools & Frameworks: Python, Databricks, PyTorch, QLoRA, Hugging Face Transformers, FAISS, Streamlit, LangChain, Azure, SQL, Power BI.
Tools & Technologies: Python, Databricks, PySpark, Azure, Whisper, LLaMA2, MLflow, Node.js, Microsoft Bot Framework, LUIS, SQL, XGBoost, Spectral Clustering, Adaptive Cards.
Project: OpenAI Call Transcription and Summarization.
Duration: Dec 2023 - Oct 2024.
Duration: July 2023 - Oct 2024.
Duration: July 2023 - Oct 2024.
Duration: Oct 2022 - June 2023.
Duration: July 2022 - Dec 2022.
Duration: Nov 2021 - June 2022.
Project: SAS to PySpark Migration on Databricks.
Duration: Apr 2019 - 2021.
Tools and Technologies Used: Python, PySpark, SAS, Databricks, Azure.
Project Description: Led the migration of legacy SAS macro-based ETL pipelines to PySpark on Databricks for improved scalability, cost reduction, and maintainability.
Project: Sentiment Analysis, Topic Modeling, and Power BI Automation.
Duration: Nov 2017 - March 2019.
Tools and Technologies Used: Python, Flask, Power BI, NLP, DAX, AI/ML libraries.
Project Description: Developed an end-to-end NLP solution that extracts sentiment and topics from operational data to improve decision-making, paired with automated Power BI dashboards.
Technical skills
Machine Learning & NLP
Supervised & Unsupervised Learning, Classification, Clustering, Feature Engineering, Text Preprocessing, Topic Modeling, Model Evaluation (Precision/Recall/F1)
NLP Techniques: Tokenization, Lemmatization, POS Tagging, TF-IDF, Word2Vec, Transformers, Named Entity Recognition (NER), Clause Classification
Generative AI & LLM Ecosystem
Fine-tuning with QLoRA, PEFT, LoRA Adapters
LLMs: OpenAI GPT, LLaMA2, Mistral, Gemini, Falcon
Frameworks: LangChain, LlamaIndex, Hugging Face Transformers
Embeddings: Sentence Transformers, OpenAI Embeddings, Azure Text Embeddings
RAG Pipelines: FAISS, Vector DB Integration, Clause-level Retrieval
Azure AI Services: Azure OpenAI, Azure AI Search, Azure AI Studio, AI Foundry
Conversational AI: Microsoft Bot Framework, LUIS, Prompt Engineering, Multi-turn Dialogs
Deep Learning & CV
Neural Networks: MLP, CNN, RNN, LSTM, Seq2Seq
Applications: Image Classification, Object Detection (YOLOv3), GANs, Autoencoders
Frameworks: TensorFlow, Keras, PyTorch
Tooling & Libraries
Python Ecosystem: NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, SciPy, OpenCV
GenAI Tooling: LangChain, Transformers (HF), Whisper, Streamlit, FAISS
MLOps: MLflow, Azure ML Pipelines, CI/CD for ML, Drift Monitoring
Cloud & Data Engineering
Cloud: Azure (Databricks, Blob, OpenAI), AWS (S3, EC2), GCP
Big Data: PySpark, Databricks Workflows
Databases: MySQL, SQL Server, Azure Data Lake, Cosmos DB
DataOps: Delta Lake, Data Factory, Azure Synapse
Visualization & BI
BI Tools: Power BI, Tableau, Google Data Studio
Dashboarding: DAX, Drill-through, Automated Reporting
Web & Version Control
Web Frameworks: Flask, Django, FastAPI
Versioning & DevOps: Git, GitHub, Azure Repos, JIRA, Azure DevOps