Aman Raj

Summary

Aman is a Data Engineer at PwC India with 4+ years of data engineering experience. He has worked on projects involving AWS, PostgreSQL, Python, and PySpark, and has hands-on experience with AWS services (S3, Redshift, Glue, Lambda). He has strong experience in SQL, PL/SQL, data modeling, data warehousing, and building ETL pipelines, along with working experience in Spark SQL and PySpark optimization.

Overview

4 years of professional experience
1 Certification

Work History

Data Engineer

PwC India
07.2021 - Current

Client Overview: Textile Conglomerate Client

  • Led the design and implementation of resilient ETL pipelines that process large daily data volumes with PySpark, using AWS Glue, AWS Lambda, and Amazon S3 to transform data and load it into Amazon Redshift.
  • Contributed to scheduling these pipelines with Amazon EventBridge Scheduler and implemented Jenkins for CI/CD to support smooth deployments.
  • Migrated pipeline logic from SQL to PySpark, reducing execution time by 30 minutes.
  • Designed and developed database architectures and data models for relational databases and data warehouses, covering both logical and physical design.
  • Designed and implemented a data lake architecture with Amazon Athena and Apache Iceberg to manage large-scale data storage and analytics.

Client Overview: Pharmaceutical Conglomerate Client

  • Developed and maintained data ingestion and transformation pipelines leveraging AWS Glue, S3, DynamoDB, and Lambda for seamless data migration into Amazon Redshift.
  • Designed and implemented Redshift database structures, including tables, stored procedures, and user-defined functions, to support business reporting and analytics.
  • Applied complex SQL transformations and business rules, such as converting epoch timestamps to UTC, to standardize and enrich source data.
  • Optimized query performance and data retrieval by strategically using DIST and SORT keys, partitioning, and restrictive filtering techniques.
  • Tuned T-SQL (DDL/DML) queries to enhance performance and ensure high database availability.
  • Leveraged AWS Glue job parameters within Redshift stored procedures to build dynamic, parameterized ETL workflows.
  • Used the QUALIFY clause and the json_extract_path_text() function extensively to handle complex SQL logic and JSON data extraction.
  • Designed and implemented data models (Fact and Dimension tables) aligned with client business logic to support scalable and efficient reporting.

Education

B.Tech. (Bachelor of Technology) - Mechanical Engineering

BIT Mesra
05.2021

Skills

Languages: Python, SQL, PySpark, Linux

Cloud: AWS (S3, Glue, SNS, Lambda, Athena, Redshift)

Data: MySQL, SparkSQL, Data Modeling, Data Warehousing

Big Data: Apache Hadoop, HDFS, Hive, Spark

Other: Git, GitHub, Jenkins, CI/CD, ETL

Certification

AWS Cloud Practitioner

Languages

English, Hindi

Timeline

Data Engineer

PwC India
07.2021 - Current

B.Tech. (Bachelor of Technology) - Mechanical Engineering

BIT Mesra