GAURAV SINGH

GURGAON

Summary

Accomplished Data Engineer with a proven track record at Uber Ext, specializing in designing scalable data pipelines and automating workflows with Apache Airflow. Skilled in data modeling, database design, and AWS technologies, with experience designing, testing, and maintaining data management systems and integrating new software packages and products into existing systems. Comfortable with data mining and with applying machine learning to improve business decision-making. Previous work optimized data retrieval processes and improved system efficiency. Brings several years of experience across the development, design, and delivery of database solutions, along with strong communication, organizational, and cross-team collaboration skills.

Overview

4 years of professional experience
1 Certification

Work History

Data Engineer

Uber Ext
Hyderabad
07.2024 - Current
  • Designed and developed scalable data pipelines using Apache Spark and Hive, optimizing batch processing workflows, and improving data reliability and performance across high-volume datasets.
  • Automated end-to-end data workflows with Apache Airflow, implementing DAGs for scheduling, monitoring, and orchestrating pipelines, resulting in reduced manual intervention and improved SLA adherence.
  • Developed interactive Tableau dashboards to track key business metrics and support data-driven decision-making, enabling stakeholders to access real-time insights and improve operational efficiency.

DATA ENGINEER

The Modern Data Company
Hyderabad
01.2024 - 06.2024
  • Worked as a data engineer on the Analytics team, serving a client from the offshore office in Hyderabad, India.
  • Designed end-to-end data pipelines that load and transform data into Apache Icebase daily using Spark.
  • Implemented data models with relationships and calculated metrics in Lens, which feed Superset dashboards.
  • Ingested and transformed data with AWS Glue and loaded it into AWS Redshift.
  • Pulled data from S3, transformed it using PySpark and PyFlare, and ingested the transformed data into a PostgreSQL database for reporting and analysis.
  • Applied advanced data analysis techniques using Trino SQL to extract insights from a large dataset in support of decision-making.

DATA ENGINEER

Xceedance Consulting
Gurgaon Sector 19
11.2021 - 11.2023
  • Designed and implemented AWS data lakes using Amazon S3 and AWS Glue for ingesting, processing, and storing massive volumes of data.
  • Built a big-data solution using Apache Icebase and Spark to organize massive volumes of data.
  • Developed data pipelines from scratch, optimised data aggregation from 10+ independent sources and automated the ETL process to roll out the solution.
  • Collaborated with data scientists and business analysts to understand data requirements and provide timely, accurate data solutions.
  • Designed data models and data warehouses that supply meaningful data to dashboarding tools.

Education

B-TECH - ENGINEERING IN ECE

NIT ALLAHABAD
ALLAHABAD
06.2021

Skills

  • Apache Icebase
  • Data Lakehouse
  • Kafka
  • PySpark
  • Airflow
  • MongoDB
  • Redshift
  • Trino
  • SQL
  • AWS Athena
  • Cassandra
  • Docker
  • Data Modelling
  • Data Warehouse
  • Kubernetes
  • ADF
  • SQL Server
  • Git & GitHub
  • Databricks

Certification

  • CASSANDRA AND CONFLUENT KAFKA: Loaded data from MySQL into a Cassandra database and used Apache Kafka and AWS SNS for messaging.
  • DATA WAREHOUSE ARCHITECTURE, UDEMY, 10/22: Implemented data models for reporting and dashboarding, studied DWH structure using snowflake and star schemas, and prepared a data dictionary for the staging area.

Accomplishments

  • Individually contributed 150+ articles on Data Structures (Arrays, Stack, Queue, Linked List, Tree) and Algorithms (Dynamic Programming, Optimised Searching).
  • Conducted a firm-wide, global session for 200+ employees on SHM, covering the design and execution of workflows and related best practices.
  • Organised a Big Data session for India's Analytics team of 50+, demonstrated usage of the Hive platform, and personally conducted 10+ workshops.

Projects

NEAR REAL-TIME DATA PIPELINE, 07/23 - 10/23: Built a Cassandra-based real-time ingestion pipeline for marketplace data to help the DWH team reduce request load on the production MySQL database. The objective was to move business users off production to address data leaks and security issues. Set up the DataStax Studio web interface so that users can query real-time data from Cassandra using LDAP authentication.

Timeline

Data Engineer

Uber Ext
07.2024 - Current

DATA ENGINEER

The Modern Data Company
01.2024 - 06.2024

DATA ENGINEER

Xceedance Consulting
11.2021 - 11.2023

B-TECH - ENGINEERING IN ECE

NIT ALLAHABAD