Data Processing Proficiency: Experienced in building and maintaining large-scale data pipelines, including petabyte-scale batch and streaming workloads with Apache Spark and Apache Pulsar.
Cloud Platforms Mastery: Skilled in Google Cloud Platform (GCP) services including BigQuery, Dataflow, Pub/Sub, and Cloud Storage, with proficiency in ETL/ELT processes, data modeling, and real-time data streaming.
Data Pipeline Architecture: Proven track record in designing and building data pipelines that clean, transform, and manage data across cloud platforms and sources such as AWS, GCP, Redshift, and Azure.
Data Storage Solutions: Proficient in designing and implementing Data Warehouses, Data Lakes, and Data Lakehouses in line with data warehousing principles.
Comprehensive Core Competencies: Well-versed in Data Engineering and SQL Development, with particular emphasis on ELT/ETL processes and Apache Spark.
Project Execution and Management: Built end-to-end pipelines integrating sources such as Teradata and Oracle, working in Scrum for agile delivery.
Data Quality and Performance Optimization: Introduced data quality checks at the earliest pipeline stages, implemented data consistency validations across systems, and fine-tuned dataset performance.
Transformation and Flow Management: Designed and ran ELT pipelines, overseeing 10+ pipelines and 20+ workflows that streamlined cleaning and transformation and improved data accessibility.
Continuous Integration and Deployment: Managed release cycles using Jenkins, resolving build issues promptly and delivering commitments on time.
Technological Agility: Adept in programming languages and cloud technologies for data processing, analytics, orchestration, and quality assurance, including Python, PySpark, ELT/ETL tooling, and Hadoop.
Title: Data Engineer
Received Star Award at Modern Data
Experience: 6 Years 9 Months