With over 8.1 years of experience in Microsoft Azure cloud computing and ETL development, I have extensive expertise across Azure components including Azure Data Factory, Azure Databricks, Azure Data Lake Storage, Azure Blob Storage, Azure SQL Database, Logic Apps, Key Vault, PySpark, and Delta Lake. This comprehensive understanding enables me to excel at building pipelines, datasets, and linked services for seamless data integration and transformation in the cloud. I have a proven track record of creating efficient pipelines for data processing, transformation, management, and computation within Azure Data Factory, and I am skilled in scheduling and monitoring pipelines to ensure smooth data flow. I am also proficient in using Azure Databricks and PySpark for data cleaning, transformation, and analysis of large datasets. I hold the AZ-900 and DP-203 certifications and have working knowledge of Azure Synapse, Python, and Microsoft Fabric. Furthermore, I am experienced in data warehousing, using ETL techniques for data extraction, transformation, and loading.
● Designed and implemented ETL pipelines using ADF, Azure Databricks, and PySpark to process large-scale structured and unstructured data into ADLS.
● Built a Delta Lake architecture for batch and streaming data processing, using PySpark and SQL for advanced transformations and aggregations (see the ingestion sketch below).
● Developed and optimized PySpark SQL queries for data validation, transformations, and integration with downstream systems.
● Optimized pipeline performance by implementing partitioning, caching, and parallelism in PySpark jobs (see the tuning sketch below).
● Ensured data security with Azure Key Vault for secrets management and with encryption for data at rest and in transit (see the security sketch below).
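Ingestion sketch (ETL and Delta Lake bullets): a minimal, illustrative PySpark job for a Databricks environment showing batch and streaming loads into Delta Lake on ADLS. The storage account, container names, paths, and the order_id key column are assumptions, not the actual project values.

```python
# Minimal sketch: batch and streaming ingestion into Delta Lake on ADLS.
# Storage account, container names, paths, and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/sales/"              # assumed
delta_path = "abfss://curated@examplestorage.dfs.core.windows.net/sales_delta/"  # assumed

# Batch: read raw JSON, deduplicate on a hypothetical key, stamp the load date,
# and append to the Delta table.
batch_df = (
    spark.read.json(raw_path)
    .dropDuplicates(["order_id"])
    .withColumn("ingest_date", F.current_date())
)
batch_df.write.format("delta").mode("append").save(delta_path)

# Streaming: incrementally pick up new files with Databricks Auto Loader and
# write them to the same Delta path with a checkpoint for reliable progress tracking.
stream_df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", delta_path + "_schema")
    .load(raw_path)
)
(
    stream_df.writeStream.format("delta")
    .option("checkpointLocation", delta_path + "_checkpoints")
    .outputMode("append")
    .start(delta_path)
)
```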
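Tuning sketch (Spark SQL and performance bullets): a hedged example of validation and aggregation queries alongside partitioning, caching, and shuffle-parallelism tuning. Table, column, and path names are placeholders, and the configuration value is illustrative only.

```python
# Sketch: Spark SQL validation and aggregation plus basic performance tuning.
# Table, column, and path names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
delta_path = "abfss://curated@examplestorage.dfs.core.windows.net/sales_delta/"  # assumed

orders = spark.read.format("delta").load(delta_path)

# Parallelism and partitioning: raise shuffle parallelism for wide aggregations,
# repartition on the grouping key, and cache because the DataFrame is reused.
spark.conf.set("spark.sql.shuffle.partitions", "400")  # illustrative value
orders = orders.repartition("ingest_date").cache()
orders.createOrReplaceTempView("orders")

# Validation: surface rows that would break downstream loads.
invalid_rows = spark.sql("""
    SELECT order_id, amount
    FROM orders
    WHERE order_id IS NULL OR amount < 0
""")
assert invalid_rows.count() == 0, "validation failed: null keys or negative amounts"

# Aggregation handed to a downstream system, written partitioned by date.
daily_revenue = spark.sql("""
    SELECT ingest_date, SUM(amount) AS revenue
    FROM orders
    GROUP BY ingest_date
""")
(
    daily_revenue.write.format("delta")
    .mode("overwrite")
    .partitionBy("ingest_date")
    .save(delta_path + "_daily_revenue")
)
```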
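Security sketch (Key Vault and encryption bullet): one common pattern, shown under assumptions of a Databricks secret scope backed by Azure Key Vault and a service principal with access to the storage account; the scope, secret, and account names are placeholders. Data at rest is covered by Azure Storage service-side encryption and abfss traffic runs over TLS, so the code's role is keeping credentials out of source.

```python
# Sketch (Databricks notebook context, where `spark` and `dbutils` are predefined):
# pull a service-principal secret from a Key Vault-backed secret scope and
# configure OAuth access to ADLS over abfss/TLS. The angle-bracket values and
# the scope/secret/storage-account names are placeholders.
client_secret = dbutils.secrets.get(scope="kv-scope", key="sp-client-secret")

account = "examplestorage.dfs.core.windows.net"  # assumed storage account
spark.conf.set(f"fs.azure.account.auth.type.{account}", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{account}",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
spark.conf.set(f"fs.azure.account.oauth2.client.id.{account}", "<application-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{account}", client_secret)
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{account}",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
)
```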