Data Engineer with 4.5+ years of hands-on experience designing, building, and optimizing scalable, production-grade data pipelines on AWS. Strong expertise in PySpark, AWS Glue, S3, Redshift, and SQL, with proven success in ETL optimization, cloud migration, cost optimization, and data quality frameworks. Experienced in end-to-end pipeline ownership, performance tuning, and enabling analytics and BI teams with reliable, high-quality data.
Key Contributions and Impact
Key Achievements
Cloud & Big Data: AWS (S3, Glue, Athena, Lambda, IAM), ETL/ELT, Data Warehousing, Data Migration
Big Data & Processing: PySpark, Python, Partitioning, Job Optimization, Incremental Loads
Programming: Python, MySQL, PL/SQL, PySpark, Spark SQL
Data Engineering: ETL Pipelines, Data Warehousing, Data Modeling, Data Lakes, Incremental Loads, Data Quality, Schema Validation
Orchestration & DevOps: AWS Step Functions, CI/CD, Git, GitHub Actions
Analytics & BI: Tableau, Power BI, KPI Modeling, Reporting Enablement