Mustafa Sadriwala

Mumbai

Summary

Data Engineer with 6+ years of experience in big data and analytics, specializing in advanced SQL, Spark, and Azure Analytics. Successfully architected and optimized data solutions at Celebal Technologies, enhancing decision-making processes for Tata Motors. Recognized for exceptional problem-solving skills, including a patented project that transformed data into actionable insights. Dedicated to driving innovation through effective data strategies.

Overview

6 years of professional experience
1 Certification

Work History

Senior Data Engineer

Celebal Technologies - Tata Motors
Pune
10.2022 - Current
  • The project involved developing a centralized data analytics and reporting solution using Azure technologies. Automated pipelines were built in Azure Data Factory to extract data from various sources (on-prem SQL, MySQL, PostgreSQL, APIs, and flat files) into Azure Synapse.
  • Data cleansing and transformation were handled with Azure Databricks, optimizing processes for better performance. The solution included the creation of fact and dimension tables, complex queries, and mapping tables for ETL. Reports were visualized in Power BI.
  • Data architecture was designed for Tata Motors, with data flowing from S3 to ADLS Gen2 and organized into Bronze, Silver, and Gold layers using Delta Lake. Spark (including RDDs) and Python were used for data processing, and K-means clustering was applied for vehicle analysis. The project was patented by the client.
  • Used statistical software to analyze and process large data sets.
  • Tested, validated and reformulated models to foster accurate prediction of outcomes.
  • Cleaned and manipulated raw data.

Data Engineer

Celebal Technologies - Wipro
04.2022 - 09.2022
  • In this project, I designed and implemented a robust data pipeline, utilizing Azure Data Factory (ADF) and Databricks for end-to-end data processing. The project began with the extraction of data from S3, where raw JSON data was stored.
  • Using ADF, I orchestrated the movement of this data into Azure Data Lake Storage Gen2 (ADLS Gen2) for efficient storage and management. Once the data was ingested, I applied PySpark in Databricks for feature engineering, transforming raw data into structured, analytical features suitable for further processing.
  • This transformation involved cleaning, aggregating, and enriching the data to derive meaningful insights. After completing the feature engineering phase, I performed benchmarking on the vehicle data, analyzing key performance metrics to identify trends and insights.
  • Throughout the project, I optimized the workflows using SQL to ensure efficient data handling and processing, while also ensuring that the pipeline was scalable and maintainable.
  • This project enhanced my expertise in PySpark, ADF, SQL, and Databricks, and provided hands-on experience in handling large-scale data processing and feature engineering in a cloud environment.
  • Established and enforced data governance policies and procedures to comply with regulatory requirements and ensure data privacy.
  • Automated data quality checks and error handling processes to ensure the integrity and reliability of datasets.

Data Engineer

Infrabeat - Emcure
08.2021 - 03.2022
  • Developed data and business understanding; created BRDs and technical documentation
  • Hands-on experience with data ingestion from multiple sources and creating temporary staging layers on-premises and in the cloud
  • Created data engineering pipelines following ETL/ELT practices with tools such as Azure Data Factory
  • Wrote complex SQL queries to solve business problems, created flat reports, and built data marts for different business verticals on top of the data warehouse
  • Performed data transformation using Power Query Editor and Mapping & Control Flow in ADF
  • Built analytics dashboards in Power BI; created datasets and dataflows for SSBI
  • Created a REST API to fully automate pipeline execution
  • Wrote Python scripts to move files from on-premises storage to Azure Blob Storage
  • Created a Power BI dashboard for ITSM tool data of Emcure Pharma
  • Migrated the complete HR module from cloud to Azure Database for PostgreSQL (Warehouse)

Data Engineer

Cognizant - Sanofi
02.2019 - 08.2021
  • Designed the data warehouse (DW) following data modeling best practices
  • Managed VMs and Integration Runtime
  • Managed SAML authentication on Qlik Sense
  • Created Power BI dashboards for the ITSM tool and manufacturing units within Sanofi
  • Worked on enhancements to custom applications within Sanofi using JavaScript
  • Worked on the ITSM tool, with complete knowledge and experience of ITSM processes

Education

Bachelor of Engineering - Electronics and Telecommunications

MH Saboo Siddik College of Engineering
01.2018

Skills

  • Advanced SQL
  • Spark development
  • Big data processing
  • Data warehousing
  • ETL development
  • Data modeling
  • Azure Analytics Expertise
  • Azure Data Factory
  • Azure Databricks
  • Python
  • JavaScript

Relevant Domain Experience

  • Pharmaceuticals
  • Manufacturing
  • Retail

Certification

  • Microsoft Azure DP-900
  • Microsoft - Data Analyst Associate (PL-300)
  • Microsoft Certified: Azure Data Engineer Associate (DP-203)
  • Databricks Certified Data Engineer Associate
  • Databricks Certified Data Engineer Professional

Data Engineering Experience

  • Developed data and business understanding; created BRDs and technical documentation
  • Hands-on experience with data ingestion from multiple sources and creating temporary staging layers on-premises and in the cloud
  • Created data engineering pipelines following ETL/ELT practices with tools such as Azure Data Factory and Azure Synapse Analytics
  • Wrote complex SQL queries to solve business problems, created flat reports, and built data marts for different business verticals on top of the data warehouse
  • Performed data transformation using Power Query Editor and Mapping & Control Flow in ADF
  • Built analytics dashboards in Power BI; created datasets and dataflows for SSBI
  • Implemented Spark jobs in Python using RDDs, DataFrames, and Spark SQL for data cleansing, transformation, and referential integrity checks, based on requirements
  • Implemented SCD Type 2 and applied master data management to the source data using priority logic on Databricks

Summary Of Experience

4, Software Engineering, Database Management, Data Engineering, Analytics, MS-SQL Server, Azure Databricks, Azure Synapse, Azure Data Factory

Personal Information

  • Father's Name: Tafazzul Sadriwala
  • Date of Birth: 06/15/96
  • Nationality: Indian
  • Marital Status: Single

Timeline

Senior Data Engineer

Celebal Technologies - Tata Motors
10.2022 - Current

Data Engineer

Celebal Technologies - Wipro
04.2022 - 09.2022

Data Engineer

Infrabeat - Emcure
08.2021 - 03.2022

Data Engineer

Cognizant - Sanofi
02.2019 - 08.2021

Bachelor of Engineering - Electronics and Telecommunications

MH Saboo Siddik College of Engineering