Big Data Engineer with over 3.5 years of experience in building scalable, cloud-native data solutions using Azure Databricks, Azure Data Factory, and PySpark. Skilled in processing high-volume datasets with tools like ELK Stack, OpenSearch, and SQL Server. Proficient in Python for data engineering, automation, and transformation tasks. Hands-on expertise in designing ETL pipelines, implementing data models, and optimizing analytics-ready datasets across Azure and AWS environments. Strong collaborator with a focus on data quality, performance optimization, and delivering business-impactful insights.
Azure Data Factory
Azure Databricks
MySQL, SQL, PostgreSQL
Python, PySpark
SSMS, SSRS, SSIS
Power BI
Kafka
Elasticsearch
VC DataMart · Frida Baby · Data Insights · Data Drill · Food Chain ID
Developed and maintained an end-to-end data pipeline to transform raw data from multiple sources into business-ready datasets. Applied ELT principles to load, optimize, and validate data in SQL Server, enabling reporting and analytics through Power BI.
Key Skills:
MSSQL · Azure · Python · PySpark · Azure Data Factory (ADF)
Key Responsibilities:
• Designed and optimized ELT pipelines for large-scale data migration and transformation in Azure.
• Utilized PySpark for efficient processing of large datasets.
• Ensured data integrity and accuracy using validation logic during migration.
• Applied column back-tracing techniques for reliable data mapping across systems.
• Collaborated with cross-functional teams to deliver business-aligned, analytics-ready data.
Managed end-to-end data processing for customer and order records with a focus on data quality, ETL automation, and business reporting. Enabled sales insights through data integration and transformation using ADF and PostgreSQL.
Key Skills:
PostgreSQL · Azure Data Factory (ADF) · Azure Blob Storage · NetSuite Saved Searches
Key Responsibilities:
• Designed and optimized PostgreSQL schemas for scalable customer and order data management.
• Created ADF pipelines to extract and transform data from NetSuite using Saved Searches.
• Prepared reporting datasets for Power BI dashboards to support business decision-making.
• Applied data cleansing and deduplication to ensure high-quality analytics.
Led a global data integration initiative to unify and analyze high-volume, high-frequency datasets from multiple internal and external sources, enabling cross-domain insights across partners, third parties, and subsidiaries.
Key Skills:
Python · SQL Server · Azure Data Factory (ADF) · Azure Databricks · Azure Services
Key Responsibilities:
• Built scalable ETL workflows using Python and ADF for ingesting diverse datasets.
• Improved data retrieval speed by 30% through pipeline optimization.
• Integrated multi-source datasets to support cross-domain analytics.
• Modeled and transformed data for BI and analytical consumption.
• Collaborated with stakeholders to deliver reliable, production-ready data solutions.
• Participated in Agile ceremonies, including sprint planning and reviews.
Led the Data Drill project to build scalable ETL pipelines for processing structured and semi-structured retail data from Amazon, Flipkart, and other e-commerce platforms. Focused on batch processing, data ingestion, and forecasting.
Key Skills:
Python · MySQL · HBase · CSV/JSON/Parquet Ingestion · Power BI · Git
Key Responsibilities:
• Developed Python-based ETL workflows for ingesting data from multiple retail sources.
• Created batch processing scripts using Python and MySQL for high-volume data loads.
• Loaded multi-format data (CSV, JSON, Parquet) into HBase and MySQL for scalable access.
• Applied forecasting techniques to retail data for demand prediction.
• Prepared data models for reporting and dashboards in Power BI.
Designed scalable search and ingestion pipelines to enhance multilingual, regulatory-focused data search. Improved search performance, deduplication, and cross-identifier querying to drive customer satisfaction.
Key Skills:
Elasticsearch · ELK Stack · Kibana · Azure Databricks · Python · Azure Services
Key Responsibilities:
• Built custom Elasticsearch analyzers for synonyms, regex, and case-insensitive search.
• Developed a multilingual ranking model for improved global search relevance.
• Created ingestion pipelines with deduplication using unique identifiers.
• Built dynamic search templates supporting pagination, filtering, and multi-identifier queries.
• Enabled distinct search on nested fields using predefined logic.
• Collaborated with teams to deliver a 20% improvement in customer satisfaction.
• Maintained production search pipelines ensuring high availability and reliability.
Microsoft Certified: Fabric Data Engineer Associate
Data Analysis with Databricks SQL
Fundamentals of the Databricks Lakehouse Platform Accreditation
Microsoft Azure Essentials by Great Learning
Microsoft Certified: Azure Fundamentals (AZ-900)
Microsoft Certified: Azure Data Fundamentals (DP-900)