Data Engineer with 6+ years of experience building large-scale analytical big data products across Finance, Retail/CPG, Asset Management & Shipping, with end-to-end design and development using Python, PySpark, Spark, SQL, Databricks, Azure & more.
Project: Finance Analytics
● Problem Statement: To design and develop finance products for the business, such as General Ledger, Material Movement, Sales Orders, etc.
● Solution Implementation: Developed an end-to-end data pipeline from ingestion through to reporting (a minimal sketch follows this project).
● Impact: Finance users can now see their sales, value, expenses & earnings through reports.
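A minimal PySpark sketch of the kind of ingestion-to-reporting step such a pipeline performs; the paths, table and column names (general_ledger, fiscal_period, account_type, amount) are illustrative assumptions, not the actual product schema.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("finance_analytics").getOrCreate()

# Ingest raw general-ledger extracts (path and format are assumptions)
ledger = spark.read.parquet("/mnt/raw/general_ledger/")

# Aggregate postings into a reporting-ready summary by period and account type
report = (
    ledger.groupBy("fiscal_period", "account_type")
          .agg(F.sum("amount").alias("total_amount"))
)

# Persist a curated table that the downstream reports read from
report.write.mode("overwrite").parquet("/mnt/curated/finance_summary/")
```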
Project: Product Availability / On-Shelf Availability (OSA)
● Problem Statement: To predict product availability on retail store shelves.
● Solution Implementation: Developed a product driven by bottlers' scan & invoice data to identify store out-of-stock (OOS) items and extras (see the sketch after this project).
● Impact: Store owners now know how much product is available on the shelf.
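An illustrative PySpark sketch of the scan-vs-invoice comparison idea: a store/product that was invoiced (delivered) but shows no scan activity is flagged as likely OOS, while scans exceeding deliveries are flagged as extras. All table and column names here are assumptions, not the actual product schema.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("osa").getOrCreate()

invoices = spark.table("bronze.bottler_invoices")   # deliveries to the store
scans = spark.table("bronze.store_scans")           # point-of-sale scans

keys = ["store_id", "product_id", "week"]
osa = (
    invoices.groupBy(keys).agg(F.sum("delivered_qty").alias("delivered"))
    .join(scans.groupBy(keys).agg(F.sum("scanned_qty").alias("scanned")), keys, "left")
    .na.fill({"scanned": 0})
    # Delivered but nothing scanned -> likely out of stock on the shelf;
    # scanned more than delivered -> "extras" (unexpected stock).
    .withColumn("flag", F.when(F.col("scanned") == 0, "OOS")
                         .when(F.col("scanned") > F.col("delivered"), "EXTRA")
                         .otherwise("OK"))
)
```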
Project: Asset Management
● Problem Statement: To understand the health of assets on the terminal: their availability, number of breakdowns, hours spent on breakdowns, running hours, etc.
● Solution Implementation: Designed a data pipeline to ingest data from two different databases, implemented the logic for each KPI, built facts & dimensions and modeled them on an AAS cube, and built a dashboard showing all indicators with different breakdowns (a KPI sketch follows this project).
● Impact: Users can now see the overall health of their terminal's assets: which assets are under-utilized and which are over-utilized, prompting more preventive maintenance instead of waiting for corrective maintenance later. Actual man-hours spent are compared with planned hours and the backlog, giving a fair idea of which assets need attention.
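A hedged sketch of one asset-health KPI, availability, computed from running and breakdown hours; the fact table and column names are hypothetical and do not reflect the actual AAS model.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("asset_kpis").getOrCreate()

fact_asset = spark.table("gold.fact_asset_usage")  # assumed daily asset fact

availability = (
    fact_asset.groupBy("terminal_id", "asset_id")
    .agg(
        F.sum("running_hours").alias("running_hours"),
        F.sum("breakdown_hours").alias("breakdown_hours"),
        F.sum(F.when(F.col("breakdown_hours") > 0, 1).otherwise(0)).alias("breakdowns"),
    )
    # Availability = running time / (running time + downtime)
    .withColumn(
        "availability_pct",
        F.round(100 * F.col("running_hours")
                / (F.col("running_hours") + F.col("breakdown_hours")), 2),
    )
)
```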
Project: Daily Terminal Operations Performance Management Phase 2.0
● Problem Statement: To get insights from all the Phase 1 indicators in real time, along with additional insight requirements.
● Solution Implementation: Rebuilt the product from scratch as a real-time framework, ingesting data from Kafka topics and building data pipelines and KPIs in real time in a more optimized format (a streaming sketch follows this project).
● Impact: Users can now see every movement happening on the port in real time, helping terminal staff improve in specific areas and enabling quick, data-driven decisions and actions.
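A minimal Spark Structured Streaming sketch of the Kafka ingestion pattern described above; the broker address, topic name, event schema and Delta sink path are placeholders, not the actual Phase 2.0 configuration.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("terminal_streaming").getOrCreate()

# Assumed shape of a terminal movement event
event_schema = StructType([
    StructField("container_id", StringType()),
    StructField("move_type", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "terminal-moves")             # placeholder topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Write the parsed stream to a table that real-time KPIs read from
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/chk/terminal_moves")
    .outputMode("append")
    .start("/mnt/silver/terminal_moves")
)
```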
Project: Daily Terminal Operations Performance Management Phase 1.0
● Problem Statement: How to strategically organize terminal operations to reduce asset idle time and speed up product delivery through seaports; essentially, optimizing the container lifecycle on the port so that there is no delay in delivery.
● Solution Implementation: Created a data pipeline ingesting terminal vessel, crane and equipment data and their movements on the port. Connected all these data points to form a container lifecycle model and determine the time and place of each container on the port, e.g. whether it has just been discharged from a ship, is on a berth, is travelling on a truck, or has been stacked.
● Impact: KPIs like crane moves per hour, berth moves per hour, dual cycle, yard capacity and housekeeping moves now tell the terminal's Shift Managers, Line Managers and Supervisors how their terminal is performing (a KPI sketch follows this project).
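An illustrative calculation of one of these KPIs, crane moves per hour, from a container-moves table; the table and column names are assumptions for the sketch only.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("terminal_kpis").getOrCreate()

moves = spark.table("silver.container_moves")  # assumed lifecycle move events

# Count crane moves per crane per clock hour
crane_moves_per_hour = (
    moves.withColumn("hour", F.date_trunc("hour", F.col("move_time")))
         .groupBy("terminal_id", "crane_id", "hour")
         .agg(F.count("*").alias("moves_per_hour"))
)
```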
Project: Business Data Lake Implementation
● Problem Statement: To have a scalable data solution for different client data domains.
● Solution Implementation: Built a unified data engineering platform for Supply Chain, Marketing and Finance big data using Databricks, Data Factory, Python, Spark and SQL (a layering sketch follows this project).
● Impact: The business now has data from different domains consolidated from the universal data lake into a single business data lake.
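A sketch of the raw-to-business-lake layering pattern such a platform typically uses on Databricks; the domain, paths and the cleansing rule shown are assumptions rather than the actual implementation.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("business_data_lake").getOrCreate()

# Land raw supply-chain files from the universal data lake into a bronze layer
raw = spark.read.json("/mnt/universal/supply_chain/")
raw.write.format("delta").mode("append").save("/mnt/bronze/supply_chain")

# Standardize and deduplicate into the business data lake layer
clean = (
    spark.read.format("delta").load("/mnt/bronze/supply_chain")
    .dropDuplicates(["order_id"])               # assumed business key
    .withColumn("ingest_date", F.current_date())
)
clean.write.format("delta").mode("overwrite").save("/mnt/business/supply_chain")
```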