Meticulous Data Scientist with 5 years of experience who is accomplished in compiling, transforming and analyzing complex information through software. Expert in machine learning and large dataset management. Demonstrated success in identifying relationships and building solutions to business problems.
•Performed data preprocessing and built ML models to group customers by trading behavior using the K-means algorithm, with an optimal number of 4 groups, and evaluated the models using the Elbow method
•Built analytical reporting pipelines to feed data to Tableau for data visualization
•Calculated Key Performance Indicators (KPIs) and metrics against industry standards, condensed large amounts of data into concise, targeted summaries, and built reports using statistical methods (such as hypothesis testing), Python (PySpark, Pandas), and SQL for senior management to support business decisions
•Managed and executed data warehouse (Snowflake) plans for a group of products to solve well-scoped problems by building ETL pipelines using Databricks and Airflow, which reduced query time by almost 90%
•Collected IoT data related to natural gas and performed data preprocessing using Python
•Built a prediction model using XGBoost and linear regression with 81% accuracy for various natural gases such as CO2 and nitrogen, which can support decisions on environmental sustainability
•Helped the CTO design deployment strategies for the company's data architecture
•Built ETL pipelines to retrieve data from AWS (DynamoDB) and ingest it into the data warehouse (Snowflake)
•Collected data from various crypto platform APIs and preprocessed it for exploratory data analysis (EDA)
•Built prediction models for various cryptocurrencies using a time-series model (ARIMA) and machine learning models such as Random Forest, with a maximum accuracy of 87%
•Built a tax calculation model using accounting methods for one of the platform's features, which helps users calculate taxes on their crypto earnings
•Worked on educational projects to bring change to a university curriculum by analyzing student data
•Scraped data on 4000 alumni from LinkedIn using Python (BeautifulSoup, Selenium)
•Performed data transformation to convert it into a structured format
•Built ML models (DBSCAN) to group alumni by career path
•Automated ETL processes for data transformation and cleaning using SQL, Python, and R
•Ensured data validation and checked for data quality issues using SQL, Excel, and internal and external data sources
•Maintained complex SQL queries and views in a multi-database environment with minimal supervision
•Identified and measured KPIs across all retail and insurance business areas to adjust business strategies accordingly
1.Analysis of Amazon product reviews and prediction of recommendation status.
4.Determination of Cryptocurrency Market Volatility to Predict Future Performances & Optimize Decision Making:
https://drive.google.com/file/d/1wBAtpHnfCRhKDXCIlqm5ONltghCQfSdo/view?usp=sharing
• Technology used: NLP and Time Series Model
5.Determination of stock market volatility to predict future performances based on past indicators using 6 selected sectors:
https://drive.google.com/file/d/1tSRGgC8sE-QluokRCTffgoqFtN4rZBll/view?usp=sharing
• Technology used: Time Series, ARCH, and GARCH models