Arun Thippareddy

Hyderabad

Summary

Experienced Azure Data Engineer with 8+ years of expertise in building scalable data platforms and processing large-scale datasets. Proficient in designing and orchestrating data pipelines using Azure Data Factory, managing data lakes in Azure Data Lake Storage Gen2, and developing analytics solutions with Azure Synapse Analytics.

Hands-on experience with Databricks and Big Data technologies including Hadoop, Hive, and Spark for distributed data processing and transformation. Skilled in optimizing ETL workflows, improving performance, and ensuring data quality across enterprise data pipelines.

Strong ability to design end-to-end data solutions that enable data-driven decision-making and deliver high-performance, reliable analytics systems.

Overview

1 Certification
9 years of professional experience

Work History

Azure Data Engineer

AWONE AI PVT LTD
Hyderabad, India
04.2023 - 03.2024
  • Designed and implemented scalable ETL pipelines using Azure Data Factory to ingest, transform, and load data from diverse sources into Azure Data Storage services (Azure Data Lake, Azure Blob Storage, Azure SQL DB), with advanced data processing and analytics performed in Azure Databricks for optimized performance and efficiency.
  • Developed and managed end-to-end data integration pipelines in Azure Data Factory by designing Linked Services, Datasets, and Pipelines to seamlessly extract and consolidate data from multiple sources (Azure SQL, Blob Storage, Azure Data Lake), while implementing automated workflow scheduling through triggers to ensure efficient and reliable data processing.
  • Designed and optimized high-performance Spark applications in Databricks using PySpark and Spark SQL to process, transform, and analyze large-scale datasets. Leveraged the Medallion Architecture (Bronze, Silver, Gold layers) to structure data pipelines for incremental refinement, enabling scalable, modular workflows and data-driven decision-making.
  • Designed and fine-tuned high-performance SQL queries to build scalable data solutions, ensuring optimal efficiency while meeting both technical requirements and business goals.
  • Partnered with cross-functional stakeholders to elicit business requirements, analyze complex data challenges, and escalate mission-critical issues to leadership for strategic resolution.
  • Collaborated with reporting teams to deliver customized, high-quality datasets optimized for analytics, empowering data-driven decision-making and actionable business insights.
  • Engineered robust CI/CD pipelines using Azure DevOps to streamline version control, automate deployments, and ensure seamless integration of data solutions.
  • Designed and integrated unit testing frameworks within Azure Data Factory pipelines to validate data transformations, guaranteeing accuracy and reliability across ETL workflows.

Azure Data Engineer

TIGER ANALYTICS PVT LTD
Hyderabad, India
06.2022 - 02.2023
  • Built and orchestrated data pipelines using Azure Data Factory, enabling seamless ingestion from on-prem and cloud sources into Azure Data Lake Storage Gen2.
  • Designed and managed data lake architecture (Raw, Curated, Gold layers) in ADLS, ensuring secure and optimized access using RBAC.
  • Developed analytical solutions and optimized queries using Azure Synapse Analytics for reporting and business insights.
  • Built ETL workflows using ADF Data Flows to cleanse, transform, and aggregate raw data into analytics-ready formats aligned with business logic.
  • Structured and optimized data storage in ADLS Gen2 and Azure SQL DB using partitioning, compression, and Delta Lake formats to improve query performance.
  • Integrated Azure Key Vault to securely manage credentials for APIs, databases, and linked services.
  • Implemented CI/CD pipelines using Azure DevOps, automating deployments across Dev, Test, UAT, and Prod environments using ARM templates and Git.
  • Improved pipeline reliability by implementing data validation checks, schema enforcement, and monitoring alerts, reducing data quality issues by 30%.
  • Optimized storage costs by configuring Blob storage tiers (Hot/Cool) and lifecycle management policies, reducing costs by 40%.

Data Engineer

Vengai Software Solutions Pvt Ltd
Hyderabad
04.2019 - 05.2022
  • Built large-scale data pipelines using Apache Spark (PySpark) to process 100M+ records daily from multiple data sources.
  • Developed ETL workflows using Apache Hive for data transformation and aggregation.
  • Managed distributed storage using Apache Hadoop (HDFS) for reliable data ingestion.
  • Designed optimized SQL queries for reporting and analytics.
  • Automated data processing jobs using Python, improving pipeline efficiency by 30%.
  • Implemented data partitioning and bucketing in Hive to improve query performance by 40%.
  • Developed incremental data loads and handled Slowly Changing Dimensions (SCD Type 1 & 2).
  • Performed data validation, cleansing, and transformation to ensure data quality and consistency.
  • Integrated data from multiple sources such as RDBMS, flat files, and APIs into the Hadoop ecosystem.
  • Tuned the performance of Spark jobs (caching, partitioning, join optimization).
  • Collaborated with cross-functional teams on business requirements and data modeling.

SAP BW & HANA Consultant

Accenture Solutions Pvt Ltd
Hyderabad
04.2015 - 01.2018
  • Monitored daily and weekly data loads using Process Chains.
  • Monitored InfoPackages, analyzed failure reasons, and handled error records in PSA, including modification and reloading.
  • Uploaded master data and transactional data from flat files and SAP R/3.
  • Created Process Chains and Meta Chains for Transaction Data and Master Data uploads.
  • Worked on data loading using delta and full update methods.
  • Managed Ticket Status reports and maintained work history for all tickets until closure.
  • Handled data mismatch tickets and performed Data Reconciliation.
  • Provided feedback to team leads on data load issues and worked towards resolving them.

Education

Master of Science - Robotics And Manufacturing Engineering

University of Greenwich
London, UK
01-2010

Bachelor of Technology - Telangana

SVIT, JNTU
01-2007

Board of Intermediate Education - MPC

Narayana Junior College
Telangana
01-2002

High School Diploma

Good Samaritan High School
TG
01-2000

Skills

  • Data/Cloud Data Engineering
  • Azure Data Factory (ADF)
  • Azure Databricks
  • Azure Data Lake Gen2 (ADLS)
  • Dimensional Modeling
  • Data Lake Optimizations
  • SQL
  • PySpark
  • Python Programming
  • ETL Processes
  • Azure Storage Services
  • Cloud-based Data Solutions
  • Microsoft Fabric
  • Azure DevOps
  • Gen AI

Certification

  • DP-700 (Microsoft Certified: Fabric Data Engineer Associate) Cert ID: 4556DBABAF58691B
  • DP-900 (Microsoft Azure Data Fundamentals) Cert ID: ACBCF23B95E9A2FA
  • Databricks Certified Data Engineer Associate Cert ID:
  • Databricks Lakehouse Fundamentals
  • Databricks Generative AI Fundamentals

Languages

English
Telugu
Hindi
