Data Engineer with 6 years of experience designing, developing, and optimizing scalable data pipelines using AWS Glue, Databricks, PySpark, Python, and Spark SQL.
Proven ability to process and transform large, complex datasets by applying robust business logic to deliver analytics‑ready data.
Strong expertise in data cleaning, transformation, and analysis, with a focus on data quality, performance, and reliability.
A quality‑driven, collaborative team player with strong communication skills and a passion for building data solutions that enable data‑driven decision‑making.
Overview
6 years of professional experience
3 certifications
Work History
Data Engineer
Cognizant Technology Solutions
Hyderabad, India
07.2023 - Current
Focused on hands-on development, ETL pipelines, transformations, and reliability.
Worked on a healthcare data engineering project handling sensitive person and organization data sourced from EDF.
Designed and developed an incremental data ingestion pipeline in Databricks to extract data from EDF, reducing processing overhead through timestamp-based incremental extraction and enabling faster, more efficient data loads.
Architected a multi-layer Delta Lake design (Landing, Working, Publish) to support clean data separation and scalable data processing.
Implemented Landing layer Delta tables to store raw incremental data efficiently and reliably.
Performed data transformations using PySpark by applying business rules in the Working layer Delta tables.
Loaded curated and transformed datasets into Publish layer Delta tables for downstream consumption.
Enabled seamless data integration with Salesforce by supplying clean, analytics-ready data from the Publish layer.
Optimized pipeline performance through incremental data processing, efficient Spark transformations, partitioned Delta tables, and Spark execution tuning, following Delta Lake best practices to ensure data accuracy, consistency, and scalability across all layers.
Handled millions of records daily and reduced pipeline processing time by approximately 40% by implementing incremental loads and optimizing Spark transformations.
Data Engineer
Tata Consultancy Services
Chennai, India
11.2022 - 06.2023
Designed and developed Spark-based ETL jobs in AWS Glue to perform large‑scale data cleaning, validation, and transformation.
Optimized AWS Glue Spark jobs using performance tuning techniques, significantly reducing processing time and operational costs for high‑volume datasets.
Built and maintained ETL data mappings using PySpark and Spark SQL, ensuring efficient and accurate data transformations.
Collaborated with stakeholders in an Agile Scrum framework to deliver reliable, production‑ready data engineering solutions.
Data Engineer
DXC Technology
Hyderabad, India
06.2020 - 10.2022
Designed and implemented scalable data pipelines using AWS S3 and AWS Glue to support reliable ETL processing.
Performed data cleaning and validation using AWS Glue DataBrew, improving data quality and readiness for analytics.
Implemented AWS Glue Data Catalog and Crawlers to automatically catalog S3 data and enable SQL‑based querying through Amazon Athena.
Developed PySpark reconciliation scripts to validate data consistency and ensure accurate data movement from source to target systems.
Optimized Spark jobs by applying performance tuning techniques, reducing processing time from hours to minutes on large‑scale datasets.
Monitored and troubleshot data pipelines using AWS CloudWatch, ensuring pipeline stability and quick issue resolution.
Created source‑to‑target mappings by translating business requirements into transformation logic using Spark SQL.
Leveraged Amazon S3 as a central data storage layer, supporting scalable and cost‑effective data processing.
Education
B.Tech - Computer Science Engineering
Swami Vivekananda Institute of Technology
Hyderabad
2020
Skills
PySpark
AWS Glue
SQL
Python
ETL
Databricks
EDA
Delta Lake
Accomplishments
Received the Q4 FY22 Champs Award for optimizing PySpark ETL jobs and delivering ETL outputs that were well appreciated by the client.