PAVAN BEESHETTI

Hyderabad, TG

Summary

Azure Data Engineer with 3 years of experience across ADF, Databricks, PySpark, Python, SQL, and Delta Lake. I focus on building reliable data pipelines, improving performance, and handling API-based ingestion. I’ve contributed to multiple enterprise projects, improving pipeline speed, optimizing cloud costs, and implementing data quality checks and CI/CD deployments. Comfortable taking end-to-end ownership, from ingestion through modelling to production support.

Overview

3 years of professional experience
1 Certification

Work History

Data Engineer

Wissen Infotech
03.2025 - Current
  • Client: TRONOX
  • Project: TOIS (IDP)
  • Built and maintained ADF + Databricks pipelines for ingesting and transforming structured and unstructured data.
  • Created KPI fact and dimension tables using PySpark and Spark SQL to support analytics and reporting teams.
  • Improved pipeline performance by around 40% using Delta Lake OPTIMIZE, caching, and better partitioning.
  • Added a metadata-driven validation layer to handle schema checks, null rules, and other rule-based data quality (DQ) logic (a minimal sketch of the pattern follows this list).
  • Helped reduce Databricks compute costs by ~20% through cluster tuning and autoscaling adjustments.
  • Used Azure DevOps CI/CD to automate notebook and pipeline deployments.
  • Worked with Unity Catalog to organize data access and maintain lineage visibility.
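
A minimal sketch of the metadata-driven validation idea above, assuming a hypothetical rules table with columns table_name, column_name, and rule_type; the table names and rule types are illustrative, not the project's actual implementation.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical rules table: one row per (table_name, column_name, rule_type).
    rules = spark.table("dq.validation_rules").collect()

    results = []
    for rule in rules:
        df = spark.table(rule["table_name"])
        if rule["rule_type"] == "not_null":
            # Count rows violating the null rule for this column.
            failed = df.filter(F.col(rule["column_name"]).isNull()).count()
        elif rule["rule_type"] == "column_exists":
            # Simple schema check: does the expected column exist?
            failed = 0 if rule["column_name"] in df.columns else 1
        else:
            continue  # unknown rule types are skipped in this sketch
        results.append((rule["table_name"], rule["column_name"], rule["rule_type"], failed))

    # Persist an execution log so failures can be reported on.
    log_df = spark.createDataFrame(results, ["table_name", "column_name", "rule_type", "failed_count"])
    log_df.write.mode("append").saveAsTable("dq.validation_log")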

Data Engineer

Wissen Infotech
09.2024 - 03.2025
  • Client: GE Healthcare
  • Project: TAA
  • Built Python-based ETL pipelines to ingest data from SharePoint, ServiceNow, and custom REST APIs.
  • Loaded curated data into Azure SQL with SCD logic and table-level validation checks.
  • Added alerting for ETL success/failure and logging for better monitoring.
  • Implemented reconciliation checks for row counts, schema mismatches, and rule-based data quality.
  • Scheduled workflows in Airflow, improving retry behaviour and run visibility (a brief DAG sketch follows this list).
  • Added exception handling and reprocessing logic to reduce manual intervention.
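
A brief sketch, assuming Apache Airflow 2.x, of the retry and failure-alerting setup referenced in the scheduling bullet above; the DAG id, schedule, and alert address are hypothetical placeholders, not the project's real configuration.

    from datetime import datetime, timedelta
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_and_load():
        # Placeholder for the real ingestion step: pull from the REST APIs,
        # run row-count reconciliation, and load curated rows into Azure SQL.
        pass

    default_args = {
        "retries": 3,                           # automatic retry on failure
        "retry_delay": timedelta(minutes=5),
        "email_on_failure": True,               # failure alerting
        "email": ["data-alerts@example.com"],   # hypothetical address
    }

    with DAG(
        dag_id="taa_api_ingestion",             # hypothetical DAG id
        start_date=datetime(2024, 9, 1),
        schedule_interval="@daily",
        default_args=default_args,
        catchup=False,
    ) as dag:
        PythonOperator(task_id="ingest_api_data", python_callable=extract_and_load)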

Data Engineer

Wissen Infotech
02.2023 - 08.2024
  • Client: GE Healthcare
  • Project: Spin Off Finance
  • Designed ingestion frameworks using ADF, Databricks, Python, and PySpark for API, SharePoint, and file-based data sources.
  • Created reusable patterns for incremental and full loads, reducing development time for new pipelines.
  • Developed connectors for SharePoint, Smartsheets, and Box with cleaning and standardization logic.
  • Used Delta Lake features (OPTIMIZE, ZORDER, partitioning) to improve read performance and reduce storage costs (a short sketch follows this list).
  • Built star schema models, SCD1 logic, and automated DDL scripts for Azure SQL.
  • Reduced cloud and compute costs by ~20% through optimization and scheduling improvements.
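
A short sketch of the Delta Lake tuning referenced above (partitioned writes plus OPTIMIZE with ZORDER on Databricks); the table, partition, and column names are illustrative only.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Write the curated table partitioned by a frequently filtered column
    # (table and column names here are placeholders, not the project's).
    (spark.table("bronze.finance_transactions")
          .write.format("delta")
          .mode("overwrite")
          .partitionBy("posting_date")
          .saveAsTable("silver.finance_transactions"))

    # Compact small files and co-locate rows on a common filter key.
    spark.sql("OPTIMIZE silver.finance_transactions ZORDER BY (account_id)")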

Education

Bachelor of Technology - Mechanical Engineering

Raghu Engineering College
Visakhapatnam
07.2022

Skills

  • Programming: Python, PySpark, SQL
  • Azure Services: ADF, Data Lake Gen2, Azure Databricks, Azure SQL, Key Vault, Logic Apps
  • Big Data: Apache Spark, Delta Lake, Spark SQL, Databricks Lakehouse
  • Version Control: Git, Azure DevOps
  • Orchestration: ADF, Airflow, Cron Jobs
  • Other Tools (Basic Familiarity): Linux, Power BI, Kafka, Docker

Accomplishments

  • Delivered 15+ production-grade pipelines across three enterprise projects.
  • Improved overall processing performance by 40% using PySpark and Delta optimizations.
  • Reduced Databricks + ADF cloud costs by 20% with better cluster configurations and job handling.
  • Implemented reusable frameworks for data ingestion and data quality checks.

Projects

1. Metadata-Driven Data Quality Framework
  • Created a simple rule-driven framework supporting schema checks, null validations, duplicate detection, and DQ reporting.
  • Used Delta tables to store DQ rules and execution logs.

2. Real-Time API Data Pipeline with Kafka and Airflow
  • Tech Stack: Docker, Kafka, Airflow, Python, PostgreSQL, Power BI
  • Designed a small prototype using a Kafka Producer/Consumer to stream API data into PostgreSQL (a minimal producer sketch follows below).
  • Built a basic Power BI dashboard on top of the processed data.
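
A minimal sketch of the producer side of the Kafka prototype in project 2, assuming the kafka-python client; the API URL, topic name, and broker address are placeholders, not the prototype's real values.

    import json
    import requests
    from kafka import KafkaProducer  # kafka-python client (assumed)

    # Placeholder endpoint and broker.
    API_URL = "https://api.example.com/events"
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    # Poll the REST API once and publish each record to the topic; the Airflow
    # DAG schedules this call, and a consumer writes messages into PostgreSQL.
    for record in requests.get(API_URL, timeout=30).json():
        producer.send("api_events", record)
    producer.flush()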

Certification

Databricks Certified Data Engineer Associate (in progress)
