Manas Chauhan

Summary

Azure Data Engineer with over three years of experience in designing, developing, and implementing scalable data solutions on Azure cloud platforms. Proficient in building end-to-end ETL/ELT pipelines for both real-time and batch data processing. Experienced in handling large-scale, high-volume datasets to support business-critical applications across various domains, including hospitality and insurance. Strong expertise in performance optimization, data governance, and delivering secure, reliable, and high-quality data solutions.

Overview

3 years of professional experience
1 Certification

Work History

Data Engineer

Applied Information Sciences | Windcreek Hospitality
Hyderabad
06.2023 - Current
  • Accelerated Spark job performance for player activity data by 30% through advanced partitioning strategies, efficient caching mechanisms, and optimized broadcast joins in Azure Databricks, reducing overall data processing time.
  • Engineered dynamic and reusable Azure Data Factory (ADF) pipelines by leveraging parameterization techniques, cutting deployment efforts by 40%, and streamlining pipeline management and maintenance activities.
  • Developed automated data quality validation frameworks in Azure Databricks, incorporating rule-based data checks and anomaly detection algorithms that increased data reliability and improved reporting accuracy by 95%.
  • Orchestrated the automated ingestion of over 5 million daily player activity records into Azure Data Lake Gen2 using scalable ADF pipelines integrated with both scheduled and event-driven triggers, enhancing data availability for analytics teams.
  • Designed and deployed robust data transformation and enrichment workflows in Azure Databricks (PySpark), calculating and aggregating player points from Quest Loyalty datasets, contributing to an improved customer loyalty experience for over 100,000 users.
  • Developed and implemented Change Data Capture (CDC) mechanisms for incremental data processing of jackpot transactions, optimizing ETL processes, and reducing daily data load volumes by over 70%.

Data Engineer

Applied Information Sciences | Geico
Hyderabad
06.2022 - 05.2023
  • Designed and implemented data ingestion pipelines in Azure Data Factory (ADF) to consolidate policy, claims, and customer data from multiple source systems, including SQL Server, REST APIs, and flat files, into Azure Data Lake Gen2, improving data availability and accessibility for downstream analytics by 40%.
  • Developed scalable data transformation pipelines in Azure Databricks (PySpark) to clean, standardize, and integrate complex insurance datasets, delivering a unified 360-degree customer view that enhanced underwriting accuracy and risk assessment capabilities.
  • Implemented Slowly Changing Dimensions (SCD Type 2) frameworks in Azure Data Factory and Databricks to track historical changes in policyholder data, ensuring regulatory compliance, and enabling accurate longitudinal data analysis across 2 million+ records.
  • Optimized Azure Databricks Spark jobs by migrating high-volume claims datasets to Delta Lake format and applying Z-Order clustering, reducing query execution times by 60%, and accelerating actuarial model processing for risk analytics teams.

Education

B.Tech - Specialization in Big Data

University of Petroleum And Energy Studies
Dehradun
05-2022

Skills

Programming languages: Python, SQL, and Java

ETL processing: Azure Databricks, Azure Data Factory, PySpark

Databases: Azure Data Lake, Hive, MSSQL, MySQL

Other Skills: Git, Big Data, NLP, and Machine Learning

Certification

Microsoft Azure Fundamentals (AZ-900)

Microsoft Azure Data Fundamentals (DP-900)

Azure AI Engineer Associate (AI-102)

Accomplishments

On The Spot Award 2024

On The Spot Award 2023

On The Spot Award 2022
