
Results-driven Data Engineer with 3+ years of experience designing, building, and optimizing large-scale ETL pipelines using PySpark, AWS Glue, and Snowflake. Adept at ingesting semi-structured data from cloud storage (S3), automating workflows with Airflow, and ensuring high data quality and performance. Proven success in delivering scalable data lake architectures, implementing schema evolution, and supporting advanced analytics. Experience spans the healthcare, workforce analytics, and supply chain domains.
I hereby declare that the information provided above is true and accurate to the best of my knowledge.
Chennai [06/10/2026]
PROJECT EXPERIENCE
Pfizer – Medical Rebates ETL Platform
Client: Pfizer | Tools: AWS Glue, PySpark, Snowflake, Airflow, Excel | Role: Data Engineer
Employee Behavior Analytics Platform
Internal Project | Tools: PySpark, AWS Glue, Snowflake, Lambda, Athena
PROJECT EXPERIENCE – Connected Commerce
Client: Mastercard
Role: Senior Software Engineer
Tools: Python, SQL, PySpark, Cloudera (HDFS, Hive), Hadoop Ecosystem, Git, Bitbucket, GitHub Copilot
Key Achievements
Prevented major data corruption by fixing a silent 3B-row JOIN fan-out issue.
Validated 700M+ transaction records to business-acceptable data quality standards.
Standardized JOIN patterns and improved maintainability across scoring pipelines.
Created reusable support and RCA documentation for future maintenance activities.