Data Scientist
Data Science and Machine Learning:
- Conducted comprehensive Exploratory Data Analysis (EDA), employing advanced visualization techniques to unveil patterns crucial for informed decision-making.
- Spearheaded initiatives in descriptive analysis and data cleaning, ensuring data quality to achieve optimal model performance.
- Developed and executed A/B testing strategies, contributing to evidence-based decision-making and streamlining business processes.
- Applied a range of machine learning (ML) algorithms, including XGBoost, Linear Regression, KNN, and SVMs, to real-world problems.
- Applied advanced ML techniques, including feature selection (filter and wrapper methods), hyperparameter tuning, regularization (L1, L2), and ensemble methods, to improve model robustness and performance.
Natural Language Processing (NLP) and Deep Learning:
- Leveraged NLP techniques, including Named Entity Recognition and TF-IDF, and implemented word and sentence embeddings using Word2Vec, GloVe, and Universal Sentence Encoder (USE) for similarity analysis.
- Applied deep learning architectures (RNNs, LSTMs, and GRUs) with both word and sentence embeddings for sentiment analysis, document clustering, and similarity analysis across diverse text datasets.
- Used Large Language Models (LLMs) such as DistilBERT for text classification. Implemented Whisper AI for audio-to-text transcription.
Product work:
- Assisted in constructing an automated ML pipeline for the 'AI diet assistant' product, spanning ETL processes, data cleaning, train-validation-test split, EDA, feature engineering, model selection and implementation, model evaluation, hyperparameter tuning, and model validation.
Technologies used: Python, scikit-learn, XGBoost, TensorFlow, Keras, NLTK, spaCy, Gensim, Word2Vec, GloVe, Universal Sentence Encoder (USE), Whisper AI, Hugging Face Transformers, Pandas, NumPy, Matplotlib, Seaborn