
Full Stack Data Scientist with a strong computer science and machine learning background and a special focus on Natural Language Processing (NLP), Predictive Analysis & Data Modeling. Involved in the Python open source community, with a strong passion for NLP, Deep Learning and Transfer Learning. Data handler with an excellent command of batch and stream processing. Security-minded, with an understanding of the key concepts of cryptography and ethical hacking. Creative thinker with strong storytelling and data visualization skills. Natural team player and mentor. A Data Scientist with a firm command of Data Engineering tools and Model Architecture. Veteran data science professional experienced in identifying opportunities and strategizing methods for improvement. Detail-oriented, methodical and enterprising, with a strong focus on devising and running effective processes.
Projects
Creating Skills-to-Role Architecture for Identifying the Skill Gap in the Job Market [AWS, Redshift, Scraping, TensorFlow, spaCy, Flask, Informatica]
Text-Analytics for a Live User Engagement Platform [Python, Flask, BERT, Google Cloud NLP]
GUI Tool for Business Stakeholders And Skilling Sciences Team [PyQt, Qt5 Framework, Python, SQL]
Data And AI Pipeline [Google Cloud (Composer, Container Registry, Kubernetes, Airflow, Cloud Storage, SQL Server), Shell Scripting, Docker, Python]
MLOps Pipeline For Classification Model [GCP (Cloud Build, Cloud Functions, Pub/Sub, GKE), Git, Python, Docker, Flask]
Job Demand Report [GCP (Cloud Build, Container Registry), Git, Python, Docker, FastAPI, Alteryx]
PROJECTS [Part of M.Tech & UG]
Invoice Categorization
Multi Label Text Classification
Banking behavioral scorecard for Internal Liability customers
Projects
SAS Code Recommendation Engine [Python, CoreNLP]
Cloud-based Asset Tracker for an Autonomous Vehicle Data Management Company [AWS Stack, Python, Ionic, Angular]
Entity Extraction on Operating System Registry for Automated Software Update and Deletion in an IT Infrastructure [Python, CoreNLP]
Question Answering Model Using Bi-Directional Attention Flow Network [TensorFlow, Python, GloVe Embeddings]
Real Time Data Analytics and Visualization [Tableau, SQL, Snowflake]
A Control Hub for Real-Time Data-Intensive Applications [Docker, Python, HashiCorp Vault, Docker Compose]
Oracle 11g, MySQL, MongoDB, SQL
Python, R, JavaScript
Bootstrap, Angular, Flask, Swagger, Streamlit, FastAPI, Figma, nginx, Apache
Ubuntu, Windows, CentOS, Fedora
GCP, AWS, Azure, Heroku, Neptune, Paperspace, Informatica, Talend, Ab Initio, Alteryx, Snowflake
NLP, Full Stack, Large Language Models (LLM), Data Engineering
Predictive Analytics, CHAID, CART, RFM Analysis & Funnel Analysis
PCA/LSI/LDA, Clustering, Semantic Similarity, Transfer Learning
Neural Nets, CNN, Factor Analysis, Linear Modeling
Dynamic Programming, Shell Scripting
TensorFlow, PyTorch & PyTorch Lightning, Keras, Hugging Face Transformers
Plotly, Redshift, Docker, Vowpal Wabbit
Docker Compose, Ansible, Git
Kafka, Flume, Spark, Sqoop
Statistical analysis, Machine learning
Intelligence gathering, Agile framework understanding
Attention to Detail, Critical Thinking, Teamwork and Collaboration, Decision-Making
Cultural Awareness, Stakeholder Management, G-Suite
Power BI, QlikView, Tableau
Ambari, ZooKeeper, Pig, Hive
HBase, Tez, YARN
Publications & Research Work
Seminars
Machine Learning course by Andrew Ng on Coursera.