Bhakti Padwal

Pune

Summary

Accomplished Azure Data Engineer with 7.5 years of experience in designing and optimizing scalable Big Data solutions. Expertise in Azure Data Factory, Azure Databricks, and Azure Data Lake Gen2, along with programming skills in Python, PySpark, and MySQL. Developed ETL pipelines and enterprise-grade data warehousing solutions that enhance data-driven decision-making and operational efficiency.

Work History

Senior Software Engineer

Tech Mahindra

Pune

07.2023 - Current

Designed and developed ETL pipelines in Azure Data Factory (ADF) for PII data masking, reducing manual effort by 90%.
Optimized ETL pipelines in Azure Databricks to enhance data transformation efficiency.
Developed PySpark notebooks for streamlined data transformation and analysis, improving processing time.
Processed large datasets efficiently using PySpark, Delta Lake, and SQL to ensure high performance and data accuracy.
Integrated Databricks with a PIM (Product Information Management) system using MuleSoft APIs to update and synchronize product data.
Monitored and troubleshot API requests and responses, reducing error rates and improving data synchronization speed.
Ensured data quality and validation before sending data to PIM, reducing failures in downstream systems.
Collaborated with cross-functional teams, including PIM specialists, API developers, and business analysts, to ensure seamless data flow.
Implemented password management using Azure Key Vault.
Automated data handling workflows to continuously mask new records and maintain compliance.
Successfully secured and anonymized sensitive customer data, ensuring full GDPR compliance.

Software Engineer

Cybage Software

Pune

07.2022 - 07.2023

Enhanced a data pipeline for analyzing user engagement and content performance on a media streaming platform.
Executed data ingestion and transformation with Azure Data Factory and PySpark on Databricks, streamlining daily log and metadata processing.
Cleaned, deduplicated, and joined datasets to derive meaningful insights such as watch time and content completion rate.
Validated data accuracy and enhanced pipeline efficiency through collaboration with senior data engineers and analysts.
Developed and maintained technical documentation for data workflows, table structures, and business logic to facilitate project clarity and handover readiness.

Lead Software Engineer

Persistent Systems

Pune

04.2022 - 07.2022

Completed internal trainings on Databricks, Python, and Delta Lake, enhancing readiness for client projects.
Completed training on Azure Data Factory, Azure Synapse, Data Warehouse, and GitHub, strengthening cloud data pipeline skills.
Led software development projects using Agile methodologies and best practices.
Collaborated with cross-functional teams to define project requirements and deliver effective solutions.
Mentored junior engineers, enhancing their technical skills and knowledge sharing.

Big Data Developer

Infosys Limited

Hyderabad

03.2020 - 03.2022

Led migration of legacy mainframe batch processing to scalable big data solutions using Apache Spark and Hadoop HDFS, enhancing processing capabilities.
Reduced batch processing duration from over 6 hours on mainframe to under 1 hour, significantly improving data availability.
Ingested mainframe output files into HDFS, enabling parallel data processing in Spark.
Migrated and validated data schemas using Avro/Parquet formats, optimizing storage and ensuring compatibility across Spark jobs.
Used Hive external tables to provide a SQL interface for legacy teams while the data remained on HDFS.
Ensured data consistency and lineage tracking across mainframe files and Spark output using hashing and audit columns, enhancing data integrity.

Testing Executive

Infosys Limited

Hyderabad

06.2018 - 02.2020

Executed database testing and schema verification using pgAdmin, ensuring data integrity and consistency across tables, which supported reliable data operations.
Created and executed SQL queries to verify test data, check joins, and perform validations on the relational schema.
Validated frontend data changes triggered by back-end actions, confirming alignment between user interface and underlying database.
Wrote DML queries for test data creation and cleanup during functional testing cycles.
Partnered with developers to identify and report database-related defects through JIRA, improving defect tracking and facilitating faster resolutions.

Education

B.Sc. Computer Science

Fergusson College

Pune

12-2018

Skills

Python
PySpark programming
Azure Databricks
Azure Data Factory (ADF)
Azure Data Lake Storage
Delta Lake
Azure SQL Database
MySQL
Data Warehousing
Azure DevOps
Version Control (Git)
Azure Key Vault

Technical Experience

Experienced in working with PySpark using the Spark Structured APIs (DataFrames and Spark SQL).
Good understanding of Spark architecture, including Spark Core, Spark SQL, Spark DataFrames, driver and worker nodes, executors, jobs, stages, and tasks.
Understanding of the Hadoop ecosystem, including HDFS, NameNode, DataNode, and the MapReduce programming paradigm.
Good understanding of various Spark optimization techniques.
Experienced in working with Azure Data Factory pipelines, monitoring and triaging the failures and configuring triggers.
Experienced in developing PySpark notebooks for data transformation and analysis.
Experienced in scheduling and monitoring Databricks workflows for batch data.
Experienced in working with Delta Lake, Delta Live Tables, and Unity Catalog in Azure Databricks.
Experienced in working with Azure Databricks using PySpark with different Databricks utilities (File system, Notebook, Widget etc.).
Understanding of Databricks Unity Catalog.
Hands-on experience sending data from Databricks to target systems via MuleSoft REST APIs.
Experienced in working with Azure Data Lake Storage Gen 2 and Azure Blob Storage.
Good understanding of Data Warehousing concepts.
Hands-on experience in writing optimized SQL queries and Stored Procedures for retrieving and analyzing the data.
Hands-on experience with Hive for creating and managing Hive tables in Hadoop and writing Hive queries for ad hoc data analysis.
Good understanding of Hive optimization techniques such as partitioning, bucketing, and query-level optimization.
Hands-on experience with data transformations and end-to-end data validation for ETL using complex SQL.
Worked with a variety of Big Data file formats such as CSV, JSON, XML, and Parquet.
Good understanding of CI/CD process.

Certification

Databricks Certified Data Engineer Associate
Microsoft DP-203 – Data Engineer Associate
Microsoft DP-900 – Data Fundamentals
Microsoft AZ-900 – Azure Fundamentals
Microsoft AI-900 – AI Fundamentals
AWS Cloud Practitioner
