Rubavathi C

Summary

Results-oriented Azure Data Engineer with expertise in Azure Databricks, Delta Lake, and SQL, and a proven track record of optimizing pipeline performance and cost efficiency. Designed comprehensive data pipelines, realizing up to 70% cost savings. Skilled in data modeling and in cross-team collaboration that supports enterprise analytics initiatives.

Overview

8
years of professional experience

Work History

Data Engineer

IBM
Bangalore
04.2024 - Current
  • Migrated batch data processing to near real-time streaming using Auto Loader and Change Data Capture (CDC) techniques.
  • Enhanced pipeline efficiency through Adaptive Query Execution (AQE) tuning, Auto Compaction, Optimized Writes, and serialization improvements using Apache Arrow, vectorization, and PySpark native functions.
  • Resolved shuffle bottlenecks via Broadcast Joins, Bucketing, and intermediate caching strategies.
  • Tuned data storage and retrieval performance through column and row pruning, partition optimization, Z-Ordering, and small-file handling.
  • Recommended and implemented table update triggers in place of ADF, so that the gold layer is updated only when data changes; this cut costs by 70-80% and eliminated the use of ADF entirely.
  • Consolidated redundant tables across different gold catalogs into a single unified table, reducing compute and storage costs by approximately 25%.
  • Implemented multiple Slowly Changing Dimensions (SCD) models as per business needs.
  • Designed Star and Snowflake schemas for analytical workloads.
  • Worked extensively with Databricks Unity Catalog, covering data governance, access control, lineage tracking, and migration of existing assets to Unity Catalog for centralized management.
  • Implemented data quality checks to enhance accuracy and reliability of datasets.
  • Collaborated with analysts to gather requirements for data integration projects.

Azure Data Engineer

Tata Consultancy Services
Bangalore
01.2020 - 04.2024
  • Developed ETL/ELT pipelines using Azure Databricks and Spark Structured Streaming.
  • Integrated data from Kafka, Event Hub, Azure SQL, and ADLS Gen2.
  • Built analytical data models in Azure Synapse.

Azure Administrator

Tata Consultancy Services
Bangalore
02.2019 - 12.2019
  • Managed Azure resources, deployments, and monitoring.
  • Configured self-hosted integration runtimes and Databricks mount points.

Desktop Support Engineer

Tata Consultancy Services
Chennai
07.2017 - 01.2019
  • Provided enterprise desktop support, OS installations, and onboarding assistance.

Education

Bachelor of Engineering - Electronics and Communication Engineering

Sri Sairam Engineering College
Chennai
04.2017

Skills

  • Azure Databricks
  • Azure Data Factory
  • Azure Synapse Analytics
  • ADLS Gen2
  • Delta Lake
  • PySpark
  • Python
  • SQL
  • Event Hub
  • Service Bus
  • Kafka
  • Unity Catalog
  • Data Modeling
  • Performance Tuning
  • Cost Optimization
