Cloud Composer
Transunion: Event-Driven Data Pipeline integrating Cloud Pub/Sub with On-Premise SQL Server at Transunion, 10/01/23, Present
This data pipeline is triggered by events on a Pub/Sub topic. Cloud Functions read Parquet files, transform them, and land the data as CSVs in a GCS bucket. A Composer pipeline then processes these CSVs in chunks and loads them in real time into an on-premise MS SQL Server. Responsibilities included collaborating on the architecture, working with Google Cloud services, and integrating the cloud and on-premise components.
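The chunked CSV processing step can be illustrated with a minimal sketch. It is not the production code: the function name `iter_csv_chunks`, the chunk size, and the sample columns are hypothetical, and an in-memory string stands in for a CSV file landed in GCS. Each yielded chunk would then be handed to a separate database insert step, which is omitted here.

```python
import csv
import io
from itertools import islice
from typing import Iterator, List, TextIO


def iter_csv_chunks(handle: TextIO, chunk_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most chunk_size rows, each row a dict keyed by header."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    reader = csv.DictReader(handle)
    while True:
        # islice pulls the next chunk_size rows without loading the whole file.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical sample data standing in for a CSV file in a GCS bucket.
sample = io.StringIO("id,name\n1,a\n2,b\n3,c\n")
chunk_sizes = [len(chunk) for chunk in iter_csv_chunks(sample, chunk_size=2)]
```

Reading in fixed-size chunks keeps memory use bounded regardless of file size, which is the usual reason for this design when the target is a transactional database.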
Transunion: Cloud Function & Jenkins Integration for Seamless Data Uploads at Transunion, 05/01/23, 07/01/23
Implemented a Cloud Function that integrates with Jenkins to automate the data upload process. The function triggers Jenkins pipelines whenever new data is uploaded to GCS buckets, which reduces manual intervention and speeds up data processing workflows.
Transunion: Across Google Cloud Projects using Cloud Composer, 01/01/23, 05/01/23
Built a data pipeline that transfers data from GCS buckets in one Google Cloud project to BigQuery tables in another project. The implementation uses Cloud Composer to provide a scalable, orchestrated, and automated workflow.
Google: New York State DEC + Google Data Pipeline (Data Warehouse to BigQuery), 02/01/21, 08/01/22
Implemented a data pipeline using Google Cloud services for the New York State Department of Environmental Conservation (DEC). The pipeline used Cloud Run, Google Cloud Storage (GCS), Artifact Registry, and Cloud Composer. Scheduled Cloud Composer to trigger Cloud Run for daily data extraction; Cloud Run sent GET requests to the client API and transformed the raw JSON data into newline-delimited JSON format. Used Cloud Composer to load the processed data into BigQuery, improving the accessibility of environmental data for DEC.
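The JSON to newline-delimited JSON transformation can be sketched as follows. This is an illustration under assumptions, not the project's code: it assumes the API response is a JSON array of records, and the function name `to_ndjson` and the sample field names are hypothetical.

```python
import json
from typing import Any, Iterable


def to_ndjson(records: Iterable[Any]) -> str:
    """Serialize each record as one compact JSON document per line."""
    # json.dumps escapes newlines inside strings, so each record stays on one line.
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)


# Hypothetical API response body: a JSON array of records.
raw = '[{"site": "A", "value": 1}, {"site": "B", "value": 2}]'
ndjson = to_ndjson(json.loads(raw))
```

BigQuery load jobs accept newline-delimited JSON, where each line is one row, but not a single top-level JSON array, which is why a conversion step of this kind sits between extraction and load.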