Anurag Mishra

Pune

Summary

Senior Data Engineer, currently at ProArch, specializing in ETL framework design and data governance, with a background in developing, testing, and maintaining data architectures. Proficient in Azure Databricks and SQL, with strong skills in database management systems, Big Data processing frameworks, data modeling, and data warehousing. Experienced in optimizing data pipelines, mentoring teams, and building forecasting models that support decision-making and operational efficiency. Led teams in creating data solutions that improved system efficiency and business decision-making, with demonstrated gains in data availability and accuracy in previous roles.

Overview

9 years of professional experience
1 Certification

Work History

Senior Data Engineer

ProArch
Pune
08.2024 - Current
  • Designed and implemented a generic ETL framework for managing the company roster data.
  • Developed data models and stored data in external Delta tables (Parquet format) using Azure Databricks and ADLS Gen2.
  • Followed the Medallion Architecture for efficient data processing.
  • Ensured data governance and quality using Unity Catalog.
  • Set up CI/CD pipelines using GitHub for seamless deployment and automation.
  • Built an end-to-end ETL framework for hospital data using Azure Data Factory (ADF) and Databricks.
  • Designed and implemented fact and dimension tables in Azure SQL for data warehousing.
  • Utilized Purview for data governance and quality control.
  • Developed forecasting models for key hospital KPIs using Prophet and XGBoost.
  • Enabled data visualization through Power BI dashboards.
  • Mentored junior engineers and provided technical guidance on ETL and forecasting processes.
  • Migrated the ETL pipeline to Microsoft Fabric.

Senior Data Engineer

Koantek
Pune
01.2024 - 08.2024
  • Led ETL framework development using Databricks and Spark.
  • Built scalable data pipelines, integrating ADLS, Azure Synapse, and ADF.
  • Optimized SQL queries for high-performance analytical reporting.
  • Worked closely with stakeholders to refine business requirements into data solutions.
  • Ensured data accuracy through regular testing and validation procedures prior to deployment in production environments.
  • Implemented best practices around data governance and compliance standards such as GDPR.

Data Engineer

Indium Software
Bangalore
06.2021 - 01.2024
  • Designed and developed data ingestion pipelines using Azure Data Factory and Databricks.
  • Implemented Medallion Architecture for data transformation and governance.
  • Integrated Azure Purview for metadata management and data lineage tracking.
  • Collaborated with data scientists to implement forecasting models with XGBoost and Prophet.
  • Created stored procedures for automating periodic tasks in SQL Server.
  • Analyzed user requirements, then designed and developed ETL processes to load enterprise data into the data warehouse.
  • Developed Python scripts to extract data from web service APIs and load it into databases.
  • Optimized SQL queries and database schemas for performance improvements in data retrieval operations.
  • Managed version control and deployment of data applications using Git, Docker, and Jenkins.
  • Automated data quality checks and error handling processes to ensure the integrity and reliability of datasets.

Data Quality Analyst

Ubisoft
04.2016 - 07.2019
  • Performed data validation and cleansing for large-scale gaming datasets.
  • Built interactive dashboards in R Shiny and Tableau to monitor player behavior.
  • Conducted data mining and statistical analysis to improve game performance metrics.

Education

Master of Science - Data Analytics

Dublin Business School
Dublin
08.2020

B.E. - Mechanical Engineering

MIT AOE
Pune
05.2014

Skills

  • Python and R programming
  • Data integration tools
  • SQL and PostgreSQL databases
  • Cloud platforms: Azure and AWS
  • Databricks and Azure Synapse Analytics
  • Apache Spark and PySpark frameworks
  • Data visualization tools
  • Machine learning frameworks
  • Version control with GitHub
  • Containerization with Docker
  • Unity Catalog and Azure Purview
  • Data modeling techniques
  • Data governance practices
  • Data warehousing solutions
  • ETL development processes
  • CI/CD implementation strategies

Certification

  • Microsoft Certified: Fabric Data Engineer Associate (DP-700)
  • Databricks Certified Data Engineer Professional
  • Codility Golden Award for the May the 4th Challenge
  • HackerRank Python (Basic) Certificate
