Passionate data enthusiast with expertise in Python, PySpark, and SQL for data analysis and manipulation. Strong foundation in machine learning and statistics. Proficient in Azure Data Factory and Azure Databricks, with experience working in Azure cloud data environments. Seeking opportunities to contribute to data-driven decision-making and drive efficiency and success in global businesses.
1. Responsible for implementing dynamic data pipelines.
2. Understand business requirements and develop data models accordingly, implementing them as tables or views.
3. Helped design the data landscape architecture for multiple projects.
4. Worked on the Azure Synapse data warehouse. Optimized queries written for an on-premises system to run on the distributed data warehouse, speeding them up by 93 percent on average.
5. Implemented a solution that traverses ADLS using the Python SDK and automatically creates the entire metadata table, which was the foundation of the data pipeline, and completed the load of over 100 tables in just two days.
6. Enabled the client before project completion by giving a demo and tutoring them on writing optimized queries for their future use cases on Synapse.
7. Rated exceptional in two consecutive performance reviews.
8. Helped IBM win a client by building a solo proof of concept (POC) in Python on a local machine for a data science and data analysis project. Demonstrated it to the client in person with great success.
9. Certified as an Azure Data Engineer Associate and a Databricks Data Engineer Associate.
Strong Fundamentals of Statistics
AZ-900
I enjoy reading about emerging technologies and new algorithms in data science and machine learning. I also enjoy solving puzzles using Python. I aim to keep myself updated with the latest developments.