Highly accomplished Data Engineer with 3+ years of experience designing and delivering scalable data pipelines and cloud-native solutions across GCP, AWS, and Azure environments. Proven expertise in PySpark, BigQuery, DataProc, CI/CD, and Python, with a strong track record in full-cycle ETL/ELT pipeline development, cost optimization, and data quality assurance. Key contributor to Vodafone’s Jupiter cloud migration program, transitioning data platforms from AWS to GCP and delivering critical pipelines such as Critical Wave, Consumer Gold, Streaming Batch, and Energy (Beacon/Non-Beacon). Developed and maintained curation pipelines under the Churchill program (Vodafone–Three UK merger), building the Virgo and Dragon pipelines and a unified customer data product. Played a pivotal role in the Mobile Data Integrity (MDI) projects, Phase 1 (historical and delta loads) and Phase 2 (full pipeline development with DBT and Qlik Sense integration), with ownership of end-to-end design, validation, and delivery. Google Cloud Certified – Professional Data Engineer with a passion for building reliable, efficient, and secure data solutions. Adept at cross-functional collaboration, troubleshooting production issues, and mentoring junior engineers. Seeking a challenging role with strong growth opportunities and a competitive package.
1. Jupiter – AWS to GCP Migration Program
Role: Data Engineer
Tech Stack: PySpark, GCP (BigQuery, DataProc, GCS, Workflows), AWS (Athena, CodeCommit), Docker, CI/CD (Azure DevOps), GitHub
Key Responsibilities:
Migrated large-scale data pipelines from AWS to GCP with minimal downtime.
Developed and deployed pipelines including Critical Wave, Streaming Batch, Consumer Gold, Priority Wave, DataProducts, Models, and Beacon/Non-Beacon Energy.
Integrated CI/CD using Azure DevOps for automated deployment and Docker image artifact management.
Executed unit testing across both AWS and GCP environments (see the pytest sketch after this project's achievements).
Designed reusable pipeline components and optimized Spark jobs (an illustrative load sketch follows below).
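For illustration, a minimal sketch of the kind of DataProc job these pipelines ran, reading curated Parquet from GCS and loading it into BigQuery via the Spark BigQuery connector; the bucket, project, dataset, and table names are hypothetical placeholders, not the actual Jupiter resources:

```python
from pyspark.sql import SparkSession

# Illustrative only: bucket, project, dataset, and table names are placeholders.
spark = (
    SparkSession.builder
    .appName("jupiter-curation-sketch")
    .getOrCreate()
)

# Read Parquet data landed in GCS (previously sourced from S3 during the migration).
df = spark.read.parquet("gs://example-landing-bucket/consumer_gold/load_date=2024-01-01/")

# Example optimisation: repartition on the write key to avoid small-file and skew issues.
df = df.repartition(200, "customer_id")

# Load into BigQuery with the spark-bigquery connector; the temporary GCS bucket
# is needed for the connector's indirect write path.
(
    df.write.format("bigquery")
    .option("table", "example-project.curated.consumer_gold")
    .option("temporaryGcsBucket", "example-temp-bucket")
    .mode("overwrite")
    .save()
)
```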
Achievements:
Migrated 700+ TB of data efficiently.
Reduced deployment cycle time with automated CI/CD.
Received team-wide recognition and Sprint Star award for outstanding contributions.
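As a minimal sketch of the cross-environment unit testing approach, the same pytest assertions can exercise a shared transformation on a local SparkSession on either side of the migration; the function, columns, and data below are hypothetical examples rather than production code:

```python
import pytest
from pyspark.sql import SparkSession, Window, functions as F

def dedupe_latest(df, key_col, ts_col):
    """Keep the most recent record per key (hypothetical reusable transformation)."""
    w = Window.partitionBy(key_col).orderBy(F.col(ts_col).desc())
    return (
        df.withColumn("_rn", F.row_number().over(w))
        .filter(F.col("_rn") == 1)
        .drop("_rn")
    )

@pytest.fixture(scope="module")
def spark():
    # A local session means the same test runs identically in AWS- and GCP-side CI.
    return SparkSession.builder.master("local[2]").appName("unit-tests").getOrCreate()

def test_dedupe_latest_keeps_newest_row(spark):
    df = spark.createDataFrame(
        [("c1", "2024-01-01", 10), ("c1", "2024-01-02", 20), ("c2", "2024-01-01", 5)],
        ["customer_id", "event_date", "value"],
    )
    rows = {r["customer_id"]: r["value"]
            for r in dedupe_latest(df, "customer_id", "event_date").collect()}
    assert rows == {"c1": 20, "c2": 5}
```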
2. Churchill Program – Customer Data Unification (Vodafone UK & Three UK Merger)
Role: Data Engineer
Tech Stack: GCP (BigQuery, GCS, Workflows), PySpark, GitHub, Docker
Key Responsibilities:
Developed Virgo pipeline (Vodafone side), Dragon pipeline (Three UK side), and Churchill unified product pipeline.
Ingested daily customer snapshots from Teradata, handling schema evolution and duplicate management.
Curated, transformed, and merged data into unified BigQuery data products.
Implemented rigorous schema validation, file integrity checks, and corruption-handling logic (sketched below).
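A hedged sketch of the validation and duplicate-handling pattern described above, using Spark's permissive CSV parsing to quarantine corrupt rows; the schema, paths, and column names are illustrative placeholders, not the real Virgo/Dragon definitions:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DateType

# Illustrative only: the schema, paths, and column names are placeholders.
spark = SparkSession.builder.appName("churchill-curation-sketch").getOrCreate()

# Expected contract for the daily customer snapshot exported from Teradata.
expected_schema = StructType([
    StructField("customer_id", StringType(), False),
    StructField("segment", StringType(), True),
    StructField("snapshot_date", DateType(), False),
    StructField("_corrupt_record", StringType(), True),
])

# PERMISSIVE mode routes malformed rows into _corrupt_record instead of failing
# the whole load, so they can be quarantined and reported.
raw = (
    spark.read
    .schema(expected_schema)
    .option("mode", "PERMISSIVE")
    .option("columnNameOfCorruptRecord", "_corrupt_record")
    .csv("gs://example-landing-bucket/virgo/customer_all/2024-01-01/", header=True)
    .cache()  # caching is required before filtering on the corrupt-record column
)

corrupt = raw.filter(F.col("_corrupt_record").isNotNull())
clean = raw.filter(F.col("_corrupt_record").isNull()).drop("_corrupt_record")

# Duplicate management: keep one row per customer per snapshot date.
deduped = clean.dropDuplicates(["customer_id", "snapshot_date"])

print(f"quarantined {corrupt.count()} corrupt rows, kept {deduped.count()} clean rows")
```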
Achievements:
Built the first production-ready Customer_All curation pipeline under tight timelines.
Recognized for building high-quality, scalable pipelines with minimal defects.
Received appreciation from Project Manager, Scrum Master, and team leads.
3. Mobile Data Integrity (MDI) – Phase 1 & 2
Role: Lead Data Engineer
Tech Stack: GCP (BigQuery, GCS, DataProc, Workflows), PySpark, GitHub, Azure DevOps, YAML, Docker, Qlik Sense, DBT
Key Responsibilities:
Phase 1: Developed historical and delta load pipelines.
Phase 2: Delivered fully automated, append-mode pipelines with decryption, validation, transformation, and archival logic.
Implemented reusable Spark views for corrupt data handling and DQ metrics (see the sketch after this list).
Developed dashboards using Qlik Sense and modeled tables using DBT.
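A minimal sketch of the reusable DQ-view pattern referenced above; the view names, required columns, and sample batch are hypothetical, not the actual MDI definitions:

```python
from pyspark.sql import SparkSession, functions as F

# Illustrative only: column names and the sample batch are placeholders.
spark = SparkSession.builder.appName("mdi-dq-sketch").getOrCreate()

def register_dq_views(df, name, required_cols):
    """Register reusable views splitting a batch into valid/invalid rows and
    exposing simple data-quality metrics for downstream reporting."""
    is_valid = F.lit(True)
    for col in required_cols:
        is_valid = is_valid & F.col(col).isNotNull()

    flagged = df.withColumn("is_valid", is_valid)
    flagged.filter("is_valid").createOrReplaceTempView(f"{name}_valid")
    flagged.filter("NOT is_valid").createOrReplaceTempView(f"{name}_invalid")
    (
        flagged.agg(
            F.count("*").alias("total_rows"),
            F.sum(F.col("is_valid").cast("int")).alias("valid_rows"),
        )
        .withColumn("valid_pct",
                    F.round(F.col("valid_rows") / F.col("total_rows") * 100, 2))
        .createOrReplaceTempView(f"{name}_dq_metrics")
    )

# In the real pipeline the batch would come from the decrypted delta files;
# an inline DataFrame keeps the sketch self-contained.
batch = spark.createDataFrame(
    [("447000000001", "2024-01-01 10:00:00"), (None, "2024-01-01 10:05:00")],
    ["msisdn", "event_ts"],
)
register_dq_views(batch, "mdi_delta", required_cols=["msisdn", "event_ts"])
spark.sql("SELECT * FROM mdi_delta_dq_metrics").show()
```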
Achievements:
Reduced pipeline runtime from 16 hours to 6 hours.
Optimized Spark configuration and introduced parallel validation logic (illustrated in the sketch below).
Achieved a 25% cost reduction and created a template for future pipeline development.
Publicly appreciated in team announcements and the engineering bulletin.
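For illustration, the kind of Spark tuning and threaded parallel validation behind these gains might look like the sketch below; all configuration values, paths, and dates are placeholders rather than the production settings:

```python
from concurrent.futures import ThreadPoolExecutor
from pyspark.sql import SparkSession

# Illustrative values only; production settings were sized for the actual
# DataProc cluster and data volumes.
spark = (
    SparkSession.builder
    .appName("mdi-tuning-sketch")
    .config("spark.sql.shuffle.partitions", "400")
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

def validate_partition(load_date):
    """Hypothetical per-partition validation: count the rows landed for one date."""
    path = f"gs://example-landing-bucket/mdi/load_date={load_date}/"
    return load_date, spark.read.parquet(path).count()

# Spark supports submitting jobs from multiple threads on one SparkSession,
# so independent partition validations can run concurrently and keep the
# cluster busy instead of checking one date at a time.
dates = ["2024-01-01", "2024-01-02", "2024-01-03"]
with ThreadPoolExecutor(max_workers=4) as pool:
    for load_date, row_count in pool.map(validate_partition, dates):
        print(load_date, row_count)
```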