Seasoned Applied Data Scientist with 16 years of experience, including over 6 years of leading high-impact AI initiatives. Skilled in aligning data product strategies with business objectives, managing cross-functional teams, and driving enterprise-wide transformation through Generative AI. Demonstrates strong expertise in project execution, stakeholder engagement, and talent development, with a proven record of scaling data science capabilities across organizations.
Tech Stack: Python, SSMS, Llama 3.1, Azure Fabric, Azure AI Foundry, Azure AI Search, Fabric Data Factory, OneLake, and Model Serving as a REST endpoint in a WebApp.
MedhaK:
Spearheaded the development of MedhaK, a generative AI-powered virtual assistant fine-tuned on proprietary enterprise data to deliver contextual insights and improve employee decision-making efficiency across departments.
Tech Stack: Python, PyMuPDF, ChromaDB, RAG, Llama 2, Prompting Techniques, React, Ubuntu 22.04 LTS, Celery, REST API.
iLMS: Integrated Legal Monitoring System
Tech Stack: Python, LangChain, RAG, UniNER, FastAPI, Ollama, and Vector DBs for retrieval.
e-Despatch 3.0:
Tech Stack: .NET Core, Python, Pytesseract, Poppler, PySimpleGUI, NLP, OpenCV, NER, Selenium for RPA, MySQL.
Resume Parser & Ranker:
Tech Stack: EasyOCR, NLP, SpaCy NER, fuzzy string matching, Label Studio, AWS Elastic Beanstalk, and REST API.
Classification of Mineral Ore (Lumps and Fines):
Tech Stack: Python, Label Studio, OpenCV, YOLOv8, GitHub Actions, ECR, EC2, Docker.
Mineral ASP Forecasting:
Tech Stack: Python, GitHub, EDA, Statistical Analysis, Feature Engineering, Outlier Detection, Parametric/Non-parametric ML Algorithms, MLflow, Container Registry, WebApp.
Client: Okta, Recommendation System
Tech Stack: Python, NLP, TfidfVectorizer, NLTK, SVD.
Client: Northern Power Grid (NPG)
Tech Stack: Oracle 10g, SQL, PL/SQL, UNIX, CosBatch Job Scheduler, EQ design, Application Management, Service Delivery.
Client: National Bank Of Paris (BNPP)
Tech Stack: Oracle 10g, UNIX, HP Quality Center, SQL, PL/SQL, HLD/LLD design, impact assessment, and price estimation.
Client: Hallmark National Insurance Company, ALAMANCE
Tech Stack: Oracle 9i, Forms 9i, SQL, PL/SQL, UNIX.
Programming Languages and Tools: Python (Scikit-learn, Pandas, NumPy, Matplotlib, SciPy, TensorFlow), PL/SQL
Statistical Tests: Statistical Analysis, EDA, Central Limit Theorem, Anomaly Detection, Parameter Estimation, PCA, SVD, Z-statistic, t-statistic, Confusion Matrix, Tolerance, VIF, Z-score, Rank Correlation, AUC-ROC Curve, Hypothesis Testing, A/B Testing, Box-Cox Transformation
Machine Learning Algorithms: Linear and Logistic Regression, SVM, Decision Trees, Random Forest, KNN, Naïve Bayes, K-Means, DBSCAN, Gradient Boosting, XGBoost
Deep Learning: ANN, CNN, RNN, TensorFlow
Generative AI: LangChain, RAG, Prompt Design, Fine-Tuning (LoRA, QLoRA, RAFT), BLEU
Image Processing: Pillow (PIL), YOLO
Database: Oracle, MySQL, MongoDB, ChromaDB, Weaviate, Pinecone, Neo4j
Text Analytics: NLP, NLTK, SpaCy
Tools: GitHub, GitLab, Docker Hub, DVC, and MLflow
Cloud Platform: Azure, AWS, EBS, ECR, Container Registry, Azure Data Fabric