Anuj Agarwal

Summary

Detail-oriented Data Engineer with more than seven years of experience designing and implementing robust data architectures. Proven expertise in building efficient ETL pipelines, optimizing data workflows, and implementing scalable solutions for enhanced business intelligence. Adept at leveraging current technologies to support data-driven decision-making. Collaborative team player with a track record of delivering high-quality, performance-driven data solutions.

Overview

8 years of professional experience

Work History

Senior Data Engineer

GoDaddy.com LLC
Remote
10.2024 - Current
  • Optimized data pipelines by implementing advanced ETL processes and streamlining data flow.
  • Enhanced system performance by designing and implementing scalable data solutions for high-traffic applications.
  • Designed robust database architecture that supported seamless integration of new datasets and facilitated rapid analysis capabilities.
  • Championed the adoption of agile methodologies within the team, resulting in faster delivery times and increased collaboration among team members.

Data Engineer

Adidas
01.2022 - 09.2024
  • Onboarded Direct-to-Consumer (DTC) e-commerce sales data onto the Databricks platform for multiple regions, including China, Europe, North America, Emerging Markets, and Southeast Asian countries.
  • Loaded the demand, delivery, and invoice datasets from legacy systems such as SAP HANA and Exasol.
  • Led the product development process: ran daily scrum meetings, facilitated sprint retrospectives using JIRA retrospective boards, estimated delivery timelines, coordinated sprint planning, and engaged with market teams.
  • Provided User Acceptance Testing (UAT) support throughout the development cycle.
  • Optimized Databricks workloads extensively, saving $1.5 million annually for the Europe data product.
  • Tools included: Databricks, AWS S3, Kafka, SAP, FTP, Azure, Airflow, and PySpark.

Senior Data Engineer

ZS Associates
09.2021 - 01.2022
  • Developed and led a team of 6 data engineers migrating on-premises cluster projects to AWS EMR clusters across the entire client space.
  • Tools included: Azure DevOps pipelines, AWS EMR, S3, and Autosys.

Lead Data Engineer - Big Data & Analytics

Airtel Africa Digital Labs
11.2020 - 09.2021
  • Managed the project in an Agile environment by hosting daily scrums, running sprint planning and goal setting, and maintaining documentation on Confluence.
  • Developed ETL flows in PySpark for data transformation, reading ASN files and converting the data from binary to hex and then to the required human-readable format.
  • Managed lifecycle elements of ETL development, from robust testing to final deployment.
  • Led and managed 5 data engineers as direct reports.
  • Tools included: Spark, Presto, Hive, MongoDB, Postgres, Grafana, Apache NiFi, and Airflow.

Data Engineer

ZS Associates
01.2019 - 11.2020
  • Developed PySpark and SQL code to create big data ETL workflows.
  • Implemented metric calculations.
  • Created data pipelines to move data from FTP to HDFS.
  • Worked with the Autosys job scheduler.
  • Tools included: PySpark, S3, and Autosys.

Data Engineer

NTT Data
02.2017 - 12.2018
  • Wrote shell scripts for daily maintenance activities, including index and table analysis.
  • Developed dataflows in Apache NiFi to handle large volumes of transactional data.
  • Created Hive and HBase tables to maintain the data.
  • Maintained Hadoop clusters for multiple environments in the project.
  • Tools included: PySpark, S3, Apache NiFi, Sqoop, Hive, Impala, Flume, Apache Storm, Kafka, HBase, SVN, Jenkins, and Autosys.

Education

B.Tech - Information Technology

Hindustan College Of Science And Technology - Mathura
07.2016

Skills

  • Databricks
  • PySpark
  • AWS
  • GCP
  • Azure
  • Python
  • Spark SQL
  • Hadoop
  • HDFS
  • Hive
  • Impala
  • Apache Sqoop
  • Apache NiFi
  • Apache Kafka
  • Airflow
  • Autosys
  • Oozie
  • Grafana
  • Data Modeling
  • ETL development
  • SQL Expertise
  • Data Warehousing
