Burhanuddin Pipaliyawala

Gurgaon

Summary

Results-driven Data Engineer with 6+ years of experience delivering enterprise-scale data and analytics solutions across cloud and big data platforms. Expertise in designing, developing, testing, and supporting robust batch and streaming data pipelines that power analytics and reporting for critical business applications. Strong foundation in data analysis, data modelling, metadata management, and production support, complemented by hands-on proficiency with Spark, SQL, Airflow, Kafka, and the AWS and Azure ecosystems. Recognized for collaborating effectively with cross-functional teams to deliver innovative solutions and improve data-driven decision-making.

Overview

7 years of professional experience
1 Certification

Work History

Data Engineer

GROUNDTRUTH INDIA PVT. LTD.
11.2023 - Current
  • Implemented, fine-tuned, and supported batch and streaming data pipelines with PySpark, Kafka, Airflow, dbt, and AWS.
  • Engineered and maintained Spark pipelines processing 1 billion+ daily events to support enterprise analytics and attribution reporting.
  • Performed source system analysis, data profiling, data dictionary creation, and source-to-target mappings to meet business requirements.
  • Collaborated with senior engineers during technical design for ingestion, transformation, and large-scale data storage layers.
  • Implemented monitoring, alerting, and defect resolution processes to improve production reliability and recovery SLAs.
  • Maintained metadata and data lineage documentation in alignment with data governance standards.

Lead Engineer

JOHN DEERE INDIA (Contract)
07.2023 - 10.2023
  • Developed and supported enterprise ETL pipelines on AWS using Spark, SQL, and Airflow for global analytics workloads.
  • Optimized Spark transformations through partitioning and tuning, reducing batch processing runtime by approximately 40%.
  • Designed analytics-ready data marts in Snowflake and Amazon Redshift for cross-functional reporting.

Data Engineer

NCS – SINGTEL GROUP
04.2022 - 07.2023
  • Engineered scalable data pipelines integrating AWS EMR, Kafka, Airflow, dbt, and Redshift for enterprise analytics platforms.
  • Implemented dimensional models and governance-ready datasets to support audit, compliance, and reporting needs.
  • Improved job reliability using incremental loading, checkpointing, and structured error handling.

Data Solution Engineer

NUMERATOR
06.2021 - 05.2022
  • Owned distributed Spark pipelines and Snowflake data warehouses managing 100TB+ datasets.
  • Tuned Spark jobs by optimizing memory, executors, and shuffle configurations to improve performance.
  • Partnered with analytics teams to deliver standardized and reusable reporting datasets.

Associate Data Engineer

IDEAS (A SAS COMPANY)
08.2020 - 05.2021
  • Developed ETL workflows on Azure Databricks using PySpark and Airflow for marketing analytics use cases.
  • Built reusable data models and analytics layers, improving delivery speed and consistency.

Data Engineer Intern

TECHECHELONS PVT. LTD.
06.2019 - 07.2020
  • Supported Hadoop- and Airflow-based ingestion systems for batch analytics pipelines.
  • Automated ingestion scheduling and implemented schema validation and deduplication checks using Python.
  • Documented data flows and maintained QA-ready datasets for validation and testing.

Education

Bachelor of Technology (B.Tech) - Computer Science & Engineering

Parul Institute of Technology
India
01.2019

Skills

  • Big Data & Processing: Apache Spark (PySpark, Spark SQL), Kafka, Spark Structured Streaming, Hadoop, Hive
  • Cloud & Platforms: AWS (S3, EMR, Glue, Athena, ECS, ECR, Lambda, IAM, CloudWatch), Azure Databricks, ADLS Gen2, Azure Synapse Analytics, Azure Data Factory
  • Data Warehousing & Databases: Snowflake, Amazon Redshift, PostgreSQL, MySQL, Oracle
  • Data Modeling & Governance: Dimensional Modeling (Kimball), Data Vault 2.0, Metadata, Data Lineage
  • Orchestration & CI/CD: Apache Airflow, dbt, Jenkins, Git, GitHub Actions, Azure DevOps, JIRA, Confluence
  • Programming: Python, SQL
  • Experience with various data file formats
  • Data pipeline control
  • Data migration

Certification

  • AWS Certified Data Engineer (2024)
  • Snowflake Masterclass (2024)
