Hi, I’m

Rohan Vijay Wargia

Bangalore
Summary

Experienced Data Engineer with strong expertise in designing and building scalable data pipelines using AWS services such as Glue, Kinesis, and Redshift. Proficient in Python, PySpark, and ETL development with a solid understanding of data engineering principles, including data modeling, orchestration, and performance optimization. Adept at implementing secure, reliable, and efficient data solutions to support analytics and real-time processing.

Overview

6 years of professional experience

Work History

Koch Business Solutions

Data Engineer
09.2023 - Current

Job overview

  • Built scalable data pipelines using Python, S3, Glue, Redshift and PySpark to process partitioned datasets for analytics and reporting.
  • Integrated Master Data Services (MDS) into the ETL pipeline through API Gateway, ensuring real-time data updates from the planners.
  • Created a real-time ETL pipeline using Kinesis that pulls millions of industrial sensor readings from the IP21 database and pushes them to TimescaleDB.
  • Built ETL pipelines from scratch that pulled data from various sources, applied business logic with MDS integration, and pushed the results to Redshift. This reduced supply planners' work from days to hours.
  • Wrote stored procedures and SQL queries, and created hypertables, materialized views, and triggers as per business requirements.
  • Created pipelines and wrote Python scripts for industrial maintenance, furnace, orders, stocked product, etc. using Lambda, DynamoDB, and Athena.

TEKSYSTEMS Global Services

Software Engineer (Project - 2/Ticketmaster)
09.2022 - 08.2023

Job overview

  • Built an end-to-end pipeline for ticketing data from Salesforce to the data lake using Databricks, Python, Spark, and SQL.
  • Created a pipeline to fetch batch and streaming data from different sources and perform transformations as per requirements.
  • Created pipelines for bringing incremental data from Oracle ADW to the data lake.

TEKSYSTEMS Global Services

Software Engineer (Project - 1/Nike)
09.2021 - 08.2022

Job overview

  • Developed PySpark and SparkSQL code to process the data in Apache Spark running on Amazon EMR clusters as per requirements.
  • Developed Python scripts to create Airflow DAGs that schedule the PySpark and SparkSQL scripts.
  • Created Hive tables on AWS S3 file directories to access the data processed by Apache Spark.
  • Created and loaded database objects in the Snowflake warehouse, leveraging AWS S3 as the storage layer.
  • Performed unit testing using Pytest and fixed bugs.

TATA Consultancy Services

System Engineer
07.2019 - 08.2021

Job overview

  • Performed analysis on large datasets using NumPy, Pandas, Matplotlib, and Seaborn.
  • Wrote an ETL job in AWS Glue using PySpark/Python that loads incremental data from one bucket into many other buckets with various transformations.
  • Wrote an ETL job in AWS Glue using PySpark/Python that pushes data incrementally into Redshift.

Education

REVA University

B. Tech. in Computer Science and Engineering
05.2019

University Overview

GPA: 6.74

Skills

  • Python
  • SQL
  • AWS (Glue, EMR, DynamoDB, API Gateway, IoT Core, Athena, Lambda, Redshift)
  • Spark/PySpark
  • Pandas
  • Hive
  • Snowflake
  • Airflow
  • Data Structures
  • Hadoop
  • Java
  • TimescaleDB, SQL Server

Accomplishments

  • Special Initiative Award (SIA): Won for my valuable contribution to the project at TCS.
  • High Five: Received appreciation for my work on the APCBD 2.0 project at Nike and for getting it released ahead of schedule.
