
M Balaji

Bengaluru

Summary

Azure Data Engineer with 4+ years of experience in designing, developing, and maintaining scalable data pipelines on Microsoft Azure. Hands-on expertise in Azure Data Factory, Azure Data Lake Gen2, Azure Databricks, Azure Synapse Analytics, and SQL. Proven ability to build reliable ETL/ELT solutions, implement incremental loads, and ensure high data quality for analytics and reporting in financial and enterprise domains.

Overview

3 years of professional experience

Work History

Data Engineer

Vzure Software Network Solutions PVT LTD
Bengaluru
02.2023 - Current

Project 3 (Vzure Software Network Solutions PVT LTD):

Domain: Healthcare Services.

Project name: Healthcare Services (January 2022 - present).

Project Description:

The Healthcare Services project supports centralized processing of patient records, clinical events, billing transactions, and insurance claims for multiple healthcare providers. The platform enables reliable data movement from operational healthcare systems to analytical repositories, ensuring timely availability of accurate data for reporting, compliance, and operational decision-making. The solution was designed to handle high data volumes, frequent updates, and strict data quality requirements.

Responsibilities:

  • Gathered business and technical requirements to design the end-to-end healthcare data pipeline architecture, covering patient, clinical, and billing data.
  • Ingested raw data from REST APIs (Parquet/JSON) and on-premises healthcare systems (CSV) into Azure Data Lake Gen2 (Bronze Layer) using ADF Copy Activity.
  • Performed data cleansing, validation, and transformation in Azure Databricks using PySpark, implementing healthcare business rules for patient records, encounters, and claims.
  • Improved data quality and accuracy by identifying duplicate patient records, inconsistent treatment codes, missing clinical attributes, and abnormal billing amounts.
  • Stored, processed, and standardized data in Delta Lake (Silver Layer), and published aggregated datasets to Azure Synapse Analytics (Gold Layer) for analytics and reporting.
  • Secured data access using Microsoft Entra ID (Managed Identity), and managed secrets and credentials through Azure Key Vault.
  • Configured pipeline monitoring, logging, and alerts using Azure Monitor and Log Analytics to ensure operational reliability and SLA adherence.
  • Optimized data processing and query performance to support downstream dashboards and healthcare analytics for business and clinical users.

Technologies: Azure Data Factory, Azure Databricks (PySpark), Azure Data Lake Gen2, Delta Lake, Azure Synapse Analytics, Azure Monitor, Azure Key Vault.

Project 2 (Empower Retirements - Great West Global Financial Solutions, Bangalore):

Domain: Banking / Financial Services / Retirement & Pension.

Project name: Retirement Record Keeping (Dec 2021 - Jan 2023).

Project Description:

The Retirement Record Keeping system manages and maintains retirement, pension, and contribution data for employees and plan participants. The system handles large volumes of historical and daily transactional data including employee details, employer contributions, investment allocations, fund balances, withdrawals, and compliance reporting. The project focused on building scalable, secure, and automated data pipelines on Azure to support analytics, reporting, and regulatory needs.

Responsibilities:

  • Designed and developed end-to-end ETL pipelines using Azure Data Factory (ADF) to ingest data from multiple source systems such as SQL Server, flat files, and third-party feeds.
  • Implemented file-based and conditional logic using ADF activities like If Condition, ForEach, Lookup, and Stored Procedure.
  • Ingested raw data into Azure Data Lake Gen2 following bronze, silver, and gold layer architecture.
  • Performed complex data transformations using Azure Databricks (PySpark) to cleanse, validate, and enrich retirement and pension data.
  • Built incremental and delta load logic to efficiently process daily contributions and transaction updates.
  • Loaded curated data into Azure SQL Database / Azure Synapse Analytics for reporting and downstream consumption.
  • Implemented data quality checks such as record counts, reconciliation, and null validations to ensure financial accuracy.
  • Monitored and troubleshot pipeline failures, and implemented logging and alerting mechanisms.
  • Supported Power BI reports by providing optimized datasets for retirement balances and contribution trends.

Technologies: Azure Data Factory, Azure Data Lake Gen2, Azure SQL, Azure Databricks, Azure Synapse, SQL, PySpark.

Project 1 (Broadridge Financial Solutions Pvt Limited, Bangalore):

Domain: Capital Markets / Corporate Governance

Project name: Global Proxy Services (Jul 2018 - Sep 2021).

Project Description:

Global Proxy Services is a Broadridge financial platform that supports end-to-end proxy voting and shareholder communications for corporate actions. The system processes large volumes of shareholder, issuer, and voting data across global markets. The project involved validating complex ETL workflows that ingest, transform, and load proxy-related data to ensure accuracy, compliance, and timely delivery for regulatory and client reporting.

Responsibilities:

  • Performed end-to-end ETL testing for Global Proxy Services, validating shareholder, issuer, and proxy voting data across multiple source and target systems.
  • Executed SQL queries to validate data extraction, transformation rules, and load accuracy in data warehouses.
  • Verified source-to-target mappings and business rules for proxy voting, meeting, and entitlement data.
  • Conducted data reconciliation testing including record counts, aggregations, and vote totals to ensure financial and regulatory compliance.
  • Tested incremental, delta, and full loads, ensuring accurate processing of daily and event-driven proxy data.
  • Validated file-based ETL processes (CSV, fixed-width files) for inbound and outbound proxy communications.
  • Logged, tracked, and retested defects using JIRA, collaborating with developers and business analysts for resolution.

Technologies: SQL, ETL Tools, Data Warehouse, Flat Files, JIRA.

Education:

  • MBA (IT and Finance) from MITS Madanapalli in the year 2018.
  • B.COM (Computer Applications) from SVU Tirupathi in the year 2016.

I hereby declare that the above information is true to the best of my knowledge.

Signature: M Balaji

Education

MBA - IT And Finance

MITS College
Madanapalli
06-2018

B.COM - Computer Applications

SV University
Tirupathi
06-2016

Skills

  • Azure Data Factory, Azure Databricks (PySpark), Azure Data Lake Gen2, Delta Lake, Azure Synapse Analytics, Azure Monitor, Azure Key Vault
