Data engineer with 5 years of experience in designing, developing and implementing data pipeline solutions. Proficient in data collection, integration and data warehousing using AWS and GCP cloud technologies.
Overview
4
years of professional experience
Work History
Data Engineer
Alshaya Group
05.2023 - Current
Utilized Apache Airflow to create ETL data workflows within Google BigQuery.
Leveraged Google Data Transfer service to implement an efficient and incremental data transfer process, pulling data from a vendor's S3 bucket to a Google Cloud Storage (GCS) bucket.
Developed and maintained ETL pipelines using Apache Spark on Databricks, enabling the efficient extraction, transformation, and loading of diverse retail datasets.
Utilized Google Cloud Composer to orchestrate Databricks data workflows.
Software Engineer
Clarivate Analytics
02.2021 - 05.2023
Designed and developed near real-time data solutions utilizing AWS Kinesis and Python, enabling timely and relevant insights.
Designed and automated ETL workflows using AWS Glue, CloudFormation, and Step Functions to seamlessly integrate data from RDBMS into AWS Redshift, ensuring data accuracy and consistency.
Collaborated on SQL query optimization in Redshift, leading to a significant reduction in overall data processing time.
Developed a robust data lake using AWS Lake Formation, enabling efficient data ingestion and management.
Utilized AWS Athena to build and deploy tables for efficient data querying with SQL.
Achieved a 30% reduction in data processing time by optimizing PySpark jobs on Apache Spark.
Designed and executed database schema changes using Liquibase and Java, enabling seamless data integration and management.
Successfully migrated Spark jobs to Redshift, resulting in a 30% reduction in cloud costs.
Associate - Data Engineer
Bepec Solutions
06.2019 - 01.2021
Collaborated on the development of ETL pipelines using Python and Spark to extract, transform, and load data.
Orchestrated Spark jobs using Apache Airflow for streamlined workflow management.
Successfully migrated HQL scripts to Spark SQL jobs for improved performance and scalability.
Utilized Tableau to design and create data sources for effective data visualization and analysis.
Education
Bachelor of Engineering - Electronics and Communications Engineering
Dayananda Sagar College of Engineering
Bangalore
07.2019
Skills
Apache Spark
AWS Redshift
AWS EMR
AWS Kinesis
Python
Java
SQL
Airflow
Databricks
AWS Glue
AWS Lambda
Google BigQuery
Linux
AWS Athena
Git
AWS CloudFormation
Secondary Skills
Apache Hive
Shell Scripting
Scala
Hadoop
HDFS