Results-driven Data Engineer with 3.5 years of hands-on experience working with Azure Storage, Azure Data Factory (ADF), PySpark, and Databricks. Proficient in designing, implementing, and optimizing cloud-based data pipelines, and in managing and processing large datasets. Adept at transforming raw data into valuable insights for analytics and machine learning applications. Strong problem-solving and team collaboration skills that support successful project delivery in fast-paced environments.
Overview
4 years of professional experience
Work History
Azure Data Engineer
CAPGEMINI TECHNOLOGY SERVICES INDIA LIMITED
Bangalore
05.2021 - Current
Designed and developed scalable ETL pipelines using Azure Data Factory (ADF) to integrate data from multiple sources including on-premises databases, Azure Blob Storage, and SQL Data Warehouse
Built and maintained Azure SQL databases and managed data flow between data warehouses and data lakes using ADF pipelines
Implemented data transformation logic in Azure Databricks using PySpark to process large datasets for real-time analytics
Utilized Azure Data Lake Storage Gen2 for storing raw and processed data, ensuring proper folder structures for efficient querying and performance optimization
Assisted in setting up and monitoring data pipelines in Azure Data Factory, contributing to the ingestion of data from various sources
Set up monitoring and alerts with Azure Monitor and Log Analytics to track the health and performance of data pipelines
Managed data migration and integration processes from on-premises systems to Azure Cloud environments, ensuring seamless data flow and minimal downtime
Troubleshot and optimized data workflows, improving overall performance and reducing job failures by 25%
Wrote custom PySpark code for data cleaning and transformation tasks, helping to prepare data for analysis and machine learning models
Automated data pipelines and data flow processes with Azure Data Factory, ensuring data is processed with minimal latency
Azure Data Services: Azure Data Factory (ADF), Azure Databricks, Azure SQL Database, Azure Blob Storage, Azure Data Lake Storage Gen2
Design, Development, and Implementation at Capgemini Technology Services India Limited