Dynamic Senior Data Engineer with a proven track record at Deloitte, specializing in PySpark and Azure. Expertly designed scalable ETL pipelines, enhancing data quality and availability. Passionate about leveraging innovative solutions and collaboration to drive impactful business insights and optimize performance, ensuring seamless data integration and reporting.
Overview
9 years of professional experience
Work History
Senior Consultant
Deloitte
Chennai
12.2024 - Current
Developed DLT pipelines to read streaming data every 15 minutes from Amazon S3 using Direct Data API and Auto Loader.
Built pipelines through the Straight Conformance layer and enabled transformations in downstream CDM layers.
Integrated streaming and Delta tables to perform complex joins and aggregations, delivering high-value business insights.
Leveraged AWS services such as Glue Crawlers and Glue Catalog for data cataloging and made the CDM data queryable in Amazon Redshift.
Senior Data Engineer
Deloitte
Chennai
03.2021 - 12.2024
Designed and implemented scalable ETL pipelines using Azure Databricks and PySpark to process large datasets for Factory Planning, including capacity planning, demand forecasting, and resource allocation.
Developed data integration solutions by pulling data from Azure Data Lake Storage, Blob Storage, and external APIs like Kinaxis, SAP Turbo, and SAP Sunrise, integrating them into a centralized data warehouse.
Engineered automated workflows using Databricks for data ingestion and transformation, ensuring reliable and consistent data availability for business reporting.
Performed data enrichment tasks (filtering, pivoting, aggregation, etc.) using PySpark and SparkSQL, improving the quality and usability of business data.
Worked on POCs and optimized solutions for performance tuning and cost reduction, while staying updated with new technology trends.
Data Engineer
Tata Consultancy Services
Chennai
10.2019 - 03.2021
Built data platform solutions using Azure Cloud technologies, developing pipelines with Azure Data Factory and Azure Databricks to transform and process large datasets.
Migrated on-premises data solutions to Azure Cloud, ensuring seamless data processing and aggregation from sources to reporting tools.
Developed dashboards using Azure Log Analytics to monitor pipeline performance, ensuring data quality with custom-built validation checks.
Worked extensively on Spark and PySpark for handling large datasets, performing data enrichment tasks, and improving processing speeds.
Used Sqoop to load processed reports into relational databases, automating data loads from retail datasets.
IT Analyst
Tata Consultancy Services
Chennai
07.2016 - 10.2019
Developed backend solutions using Unix Shell Scripting and SQL for managing and processing data related to market sales and product performance.
Automated processes for data ingestion and transformation, reducing manual intervention and optimizing workflow.
Participated in the full software development lifecycle, troubleshooting and supporting backend data jobs in an Agile environment.
Led POCs for real-time data processing using Spark Streaming and evaluated ELT tools like Fivetran.
Education
Bachelor of Engineering - Electronics and Instrumentation Engineering
Sastra University
Tanjore, India
04-2016
Skills
PySpark
SQL
Azure
Databricks
Python
Redshift
POCs
Worked on a Spark Streaming POC for processing real-time data from digital electrical meters, using Azure Event Hubs for ingestion and Delta Lake for storage.
Evaluated Azure Data Factory against Fivetran for ELT processes, comparing the tools' effectiveness for automated data ingestion.