Data Analyst with 2+ years of experience specializing in ETL processes, BI tools, and Snowflake. Proficient in Python, PySpark, and SQL. Proven track record in automating data pipelines, building interactive dashboards, and quickly picking up new areas such as NLP and clinical data summarization. Seeking data delivery roles that draw on AWS and Snowflake expertise for efficient data management.
Global Project Management - AWS, ETL, Snowflake
- Led extraction of data from Workfront API.
- Architected ETL pipeline with AWS S3 and Python for integration into Snowflake.
- Utilized ARIMA and FBProphet for forecasting models.
- Optimized resource allocation for a 200+ member project team using Tableau.
Clinical Document Classification and Data Extraction - NLP, PySpark, AWS RDS
- Directed processing of 20,000 informed consent documents monthly.
- Enhanced the tool with SpaCy NLP and fuzzy-matching techniques for faster data extraction.
- Saved 500 hours per month.
- Integrated PySpark for efficient processing.
Protocol Management for Diverse Clients - Python, ETL, AWS Data Stack
- Managed and optimized 150+ clinical trial protocols.
- Utilized in-house ETL tool for processing EDC and SaaS data.
- Leveraged AWS technologies for efficient data processing.
- Ensured compliance with regulations through Python loader classes and SQL queries.
Ad-hoc User Access Provisioning Data Pipeline - ETL, Data Pipeline, AWS Data Stack
- Automated user access provisioning via email, Excel, and FTP.
- Developed ETL with AWS DataOps tools such as Redshift and Athena.
- Cleared a 150,000-task backlog in two months.
- Received the Ace Level 4 award.
- Achieved savings of over $1 million.
Cloud Technologies:
- AWS (S3, Redshift, Athena, CloudWatch, Simple Queue Service, EC2, RDS)
- Google BigQuery
Data Processing and Analysis:
- Python
- ETL
- Tableau
- Snowflake
- PySpark
- Hive
- SQL
- Data Modeling
- Data Warehousing
Operating Systems and Scripting:
- Linux
- Windows
- Shell Scripting
1. Reliance Stock Forecasting Project: Used the FBProphet ML model for stock forecasting.
2. Global Terrorism EDA: Conducted exploratory data analysis on a global terrorism dataset using Python data analysis packages.
3. ETL Tool for Snowflake Integration: Developed an ETL tool to load AWS S3 data into a Snowflake warehouse, including a customized dashboard.