Vishal Agrawal

Noida

Summary

Data Engineer with 6+ years of experience building large-scale analytical big data products for Finance, Retail/CPG, Asset Management, and Shipping, covering end-to-end design and development with Python, PySpark, Spark, SQL, Databricks, Azure, and more.

Overview

7 years of professional experience

Work History

Data Engineer

Rockwell Automation
06.2023 - Current
  • Designed and developed Finance modules such as Sales Orders and General Ledger end to end for the business
  • Developed and managed a complete, scalable data framework serving all teams under data analytics, including Finance and Supply Chain
  • Refactored curated, processed, and validation notebooks
  • Learned and implemented doctests for reusable functions across all finance modules
  • Collaborated with cross-functional teams to ensure seamless integration of data solutions with existing systems
  • Created High Level Design documents for new modules

Senior Data Engineer

Quantiphi | Client: CocaCola Bottlers
11.2022 - 05.2023
  • Created a point-of-sale data-based forecasting solution for a prominent US beverage conglomerate to predict daily product availability, increasing revenue by over $20 million
  • The solution was successfully implemented and adopted by 11 US-based beverage bottlers
  • Analyzed and optimized distributed machine learning pipelines by re-engineering the Data Fetch module, cutting prediction generation time by 45% and cloud computing costs by 50%
  • Designed and created a data monitoring pipeline that improved data quality by notifying vendor-side data streams, resulting in more accurate model predictions
  • Led and developed intern onboarding, integrating 11 newcomers into the project and familiarizing them with the complex technologies used in day-to-day work

Senior Data Engineer

Quantiphi | Client: AP Moller Maersk
06.2021 - 11.2022
  • Worked on 5 different data products under Asset Management and Daily Operations Performance Management
  • Designed and developed applications using Python, PySpark, and Spark
  • Wrote and tuned complex SQL queries with joins and subqueries, including query inspection and exception handling
  • Implemented optimizations such as incremental processing of data in Delta tables, Synapse tables, and AAS cubes
  • Reduced the execution time of each pipeline layer, cutting overall pipeline run time by 50% in dev and prod environments and lowering Azure resource usage and cost
  • Automated the code so it could be reused across all terminals
  • Played a key role in architectural discussions
  • Implemented role-level security for users from different regions
  • Created Logic Apps for parallel pipeline runs using an artificial resource-locking mechanism
  • Delivered an end-to-end data product from ingestion to reporting, including a Python framework supporting 100% of KPIs

Senior Analyst

Capgemini | Client: Unilever
Mumbai
10.2018 - 06.2021
  • Built Databricks notebooks hands-on using Spark, PySpark, and Spark SQL, leveraging Spark functionality to deliver seamless client solutions
  • Transformed and loaded data into the Business Data Lake in Azure for Supply Chain, Finance, and Marketing
  • Communicated with external clients and end users to determine specific requirements
  • Followed Agile and Scrum project life cycle methodologies, tracked in Jira, to ensure customer deadlines were met
  • Represented the team in many production releases
  • Ran knowledge-sharing sessions for new team members
  • Worked on complex stories with multiple dependents and dependencies, maintaining the correct sequence flow

Education

Technology

Feroze Gandhi Institute Of Engineering
06.2018

Skills

  • Python: Developed an end-to-end data engineering framework (batch/incremental/streaming) serving 45 KPIs
  • PySpark: Developed cross-functional KPIs with complex business logic
  • SQL: Implemented base logic for KPIs and ad hoc scripts for the framework
  • Apache Spark: Extremely hands-on Spark developer with in-depth understanding of architecture concepts and optimizations
  • Spark Streaming: Implemented it in 2 projects from scratch
  • Delta Lake: Used for data lake storage with ACID support
  • Databricks: Hands-on developer; created raw-to-processed notebooks, dims/facts for KPIs, and many ad hoc ones
  • Data Factory: Designed multiple master and child pipelines, set complex dependencies, and used Lookup, Notebook, and Webhook activities
  • Logic Apps: Triggered ADF pipelines on email, supported parallel runs of the same pipeline, and built cube refresh Logic Apps using webhooks
  • Power BI: Developed multi-page dashboards for 12 KPIs and created complex measures
  • SQL Server: Created ad hoc tables, dims/facts, and metadata tables feeding the framework
  • Change Data Capture: SCD2 with support for records deleted from the source
  • Data Warehousing: Worked on SCD2 setup tables
  • AAS Cube: Designed and developed an AAS tabular model for a project from scratch
  • Data Modeling: Created complex relationships for many KPIs
  • Data Structures: Used Python data structures and abstract data types in development projects
  • Azure Data Lake: Created a data warehouse on ADLS Gen2 using Delta Lake
  • Data Ingestion: Ingested data from OLTP databases, flat files, data lakes, and Kafka
  • Big Data Technologies: Understanding of the Hadoop ecosystem and Spark's massively parallel processing framework
  • ETL Development: Built many ETL pipelines from scratch
  • Query Optimization: Performed performance optimization using materialized views, data skipping, and partitioning
  • System Design: Worked on designing systems based on the CAP theorem for batch and streaming solutions
  • Low-Level Design: Created object-level designs for use cases in shipping and fleet management
  • Data Analytics: Worked on descriptive and diagnostic analytics for shipping, fleet, and asset management solutions
  • Data Science: Worked on predictive analytics for retail and Consumer Packaged Goods solutions
  • Functional Programming: Developed the framework code in a functional style
  • Object Oriented Programming: Created a few side projects using object

Accomplishments

Project: Finance Analytics

Problem Statement: Design and develop Finance products for the business, such as General Ledger, Material Movement, and Sales Orders.

Solution Implementation: Developed an end-to-end pipeline from ingestion to reporting.

Impact: Finance teams can use the reports to track their sales, value, expenses, and earnings.

Project: Product Availability / On-Shelf Availability (OSA)

Problem Statement: Predict product availability on retail store shelves.

Solution Implementation: Developed a product driven by bottlers' scan and invoice data, with which we identified store out-of-stock (OOS) items and extras.

Impact: Store owners now know how much product is available.

Project: Asset Management

Problem Statement: Understand the health of assets on the terminal: how available they are, the number of breakdowns, hours spent on breakdowns, their running hours, etc.

Solution Implementation: Designed a data pipeline to ingest data from two different databases. Implemented the logic for each KPI. Built facts and dimensions and modeled them on an AAS cube. Built a dashboard showing all indicators, segmented in different ways.

Impact: Users can now see the overall health of their terminal’s assets and know which assets are under-utilized and which are over-utilized. They can take preventive maintenance actions rather than waiting to do corrective maintenance later, and compare actual man-hours spent against planned hours to see the backlog. This gives them a fair idea of which assets need attention.

Project: Daily Terminal Operations Performance Management Phase 2.0

Problem Statement: Get insights from all the Phase 1 indicators in real time, along with additional insight requirements.

Solution Implementation: Rebuilt the product from scratch as a real-time framework, ingesting data from Kafka topics and building data pipelines and KPIs in real time and in a more optimized format.

Impact: Users can now see every movement happening on the port in real time, which helps them assist terminal staff in improving specific areas and enables quick, data-based actions and decisions.

Project: Daily Terminal Operations Performance Management Phase 1.0

Problem Statement: Organize terminal operations strategically to reduce asset idle time and speed up product delivery through seaports. In essence, optimize the container lifecycle on the port so that there is no delay in delivery.

Solution Implementation: Created a data pipeline ingesting data on terminal vessels, cranes, equipment, and their movement on the port. Connected these data points to form a container lifecycle model that determines the time and place of a container on the port, for example whether it has just been discharged from a ship, is on a berth, is traveling on a truck, or is stacked.

Impact: KPIs such as crane moves per hour, berth moves per hour, Dual Cycle, Yard Capacity, and Housekeeping Moves now tell Shift Managers, Line Managers, and Supervisors how their terminal is performing.

Project: Business Data Lake Implementation

Problem Statement: Provide a scalable data solution for different client data domains.

Solution Implementation: Built a unified data engineering platform for Supply Chain, Marketing, and Finance big data using Databricks, Data Factory, Python, Spark, and SQL.

Impact: The business now has its data from different domains consolidated from the universal data lake into a single Business Data Lake.

Timeline

Data Engineer

Rockwell Automation
06.2023 - Current

Senior Data Engineer

Quantiphi | Client: CocaCola Bottlers
11.2022 - 05.2023

Senior Data Engineer

Quantiphi | Client: AP Moller Maersk
06.2021 - 11.2022

Senior Analyst

Capgemini | Client: Unilever
10.2018 - 06.2021

Technology

Feroze Gandhi Institute Of Engineering