Hritik Maheshwari

Data Engineer
Bhilwara

Summary

Results-driven Data Engineer with nearly three years of experience in designing, building, and optimizing scalable data pipelines on Azure Databricks. Strong expertise in PySpark, Delta Lake, Unity Catalog migrations, and Azure Data Factory, consistently achieving pipeline performance improvements and reducing execution time by up to 30%. Proficient in implementing Medallion Architecture (Bronze–Silver–Gold) to deliver reliable, production-grade datasets that support analytics, reporting, and BI workloads. Skilled in managing large-scale data processing, ensuring data quality validations, and conducting performance tuning within enterprise environments.

Overview

3 years of professional experience
2 Certifications

Work History

Associate Data Engineer

Celebal Technologies
03.2024 - Current
  • Migrated 200+ Databricks jobs, notebooks, views, and tables from Hive Metastore to Unity Catalog, reducing governance gaps and standardizing access across environments.
  • Processed and optimized data pipelines handling 1-2 TB/day using PySpark and Delta Lake, improving overall pipeline performance by 30%.
  • Applied Spark optimization techniques (partitioning, caching, optimized joins) to reduce job execution time by 25-35%.

Junior Data Engineer

Celebal Technologies
07.2023 - 03.2024
  • Implemented SYNC, Deep Clone, and CTAS migration strategies with zero data loss and minimal downtime during production cutover.
  • Built scalable ingestion pipelines using Azure Data Factory and Databricks Auto Loader, enabling near-real-time processing of millions of records per day.
  • Implemented Medallion Architecture delivering Bronze, Silver, and Gold datasets consumed by BI and analytics teams.

Education

Bachelor of Technology - Computer Science

Arya College of Engineering and I.T.
Jaipur, Rajasthan
05.2023

Skills

  • PySpark
  • Spark SQL
  • Git
  • CI/CD
  • Databricks (Unity Catalog)
  • ETL development
  • SQL programming
  • Python coding
  • Real-time processing
  • Data migration
  • Data quality assurance
  • Data warehousing
  • Data pipeline design
  • Data modeling

Certification

Databricks Certified Data Engineer Professional

Projects

Marketing Analytics Data Platform (End-to-End Development)

  • Designed and implemented scalable batch ingestion pipelines using Azure Data Factory and Azure Databricks to process multi-source marketing data.
  • Developed PySpark-based transformation logic with embedded data quality validations (null checks, schema enforcement, deduplication).
  • Applied Databricks and Spark performance optimizations (partitioning, caching, file compaction), reducing pipeline execution time by approximately 30%.
  • Delivered analysis-ready datasets to downstream BI and reporting teams, consistently meeting SLAs.
  • Skills: ADF, PySpark, Delta Lake, Performance Tuning, Data Quality


Financial Services Platform - Unity Catalog Migration

  • Led the migration from Hive Metastore to Unity Catalog for the Enterprise Databricks workspace, covering jobs, notebooks, views, and Delta tables.
  • Configured storage credentials and external locations to securely access S3-based data lakes.
  • Executed multiple migration strategies including SYNC, Deep Clone, and CTAS, selecting approaches based on data size and dependency complexity.
  • Ensured data access governance, lineage, and fine-grained permissions using Unity Catalog policies.
  • Skills: Unity Catalog, Hive Metastore Migration, S3, Databricks Security


Healthcare Data Processing System (Orchestration & Incremental Loads)

  • Built PySpark DataFrame-based transformation pipelines on Azure Databricks to process healthcare transactional datasets.
  • Developed ADF-triggered Databricks notebooks for multiple interfaces, enabling automated execution across environments.
  • Designed a master orchestration notebook to execute dependent interface notebooks in parallel, improving throughput.
  • Implemented custom incremental load strategies per interface before publishing curated data to the delivery zone.
  • Skills: Incremental Processing, Orchestration, PySpark, Azure Databricks, ADF


Infrastructure & Operations Analytics (Medallion Architecture)

  • Architected and implemented Medallion Architecture (Bronze-Silver-Gold) on Azure Data Lake for infrastructure datasets.
  • Configured Databricks Auto Loader to ingest high-volume files into Delta tables with schema evolution handling.
  • Developed business-driven aggregations and transformations to produce Gold-layer datasets for BI consumption.
  • Optimized Databricks workflows and notebook execution to reduce cluster load and overall processing time.
  • Skills: Medallion Architecture, Auto Loader, Delta Lake, Databricks Workflows