DEEPAK KUMAR

Data Engineer
Bengaluru, KA

Summary

Results-driven data engineer with over 10 years of experience designing, building, and optimizing data pipelines across various industries including BFSI and supply chain management. Proficient in Databricks Medallion architecture, Kubernetes, and Docker. Deep expertise in data warehousing, data migration, data mining, and data modeling. Skilled in utilizing data engineering tools, machine learning, and advanced analytics to deliver scalable and efficient solutions that drive business insights and enable strategic decision-making. Seeking an opportunity to contribute skills in data integration, pipeline development, and cloud-based platforms to a dynamic and innovative team.

Overview

10 years of professional experience
6 years of post-secondary education

Work History

Lead Data Engineer

Arrow Electronics
05.2022 - 09.2024
  • Architected and implemented Medallion architecture on Databricks, integrating heterogeneous data from various sources and creating refined datasets across bronze, silver, and gold layers
  • Implemented and optimized data import logic for the AMSS Scheduling tool by enhancing code, refining SQL queries, and improving the underlying logic, leading to more efficient runtimes
  • Developed and automated a Teradata-to-Databricks data pipeline for the Sales Strategy report, enabling sales and production comparisons for Total and VAS versus pick-ship data
  • Introduced Margin Differential and Margin Lift parameters for enhanced financial insights and branch-level analysis
  • Led the VAS Inventory Strategy project for the AMSS team, enhancing insights into lead time, reorder points, safety stock, and price grouping for various CPNs
  • Developed new metrics to provide valuable data, demonstrating strategic planning and execution
  • Developed OBIEE reports on supply chain and management data, which served as a source for ETL pipelines tracking inventory and sales metrics
  • Spearheaded the SQL Server upgrade from 2008 to 2016, optimizing server jobs to significantly improve runtimes
  • Overhauled and refined SQL code while redesigning tables to better reflect data types and optimize space usage, enhancing overall operational efficiency
  • Developed SSIS jobs to ingest data into the SQL data warehouse, feeding Power BI reports on efficiency, manufacturing cycle times, warehouse management, sales, and quality
  • Designed the architecture for migrating databases from Access to a SQL database used for reporting

Senior Data Engineer

Citi Bank (TCS)
10.2018 - 05.2022
  • Designed and implemented data pipelines to ingest and store CSV, JSON, and XML files, utilizing control and macro files to automate transformations and load data into MSSQL Server
  • Provided production support and chaos management for UCD, GPU, and recovery collections applications, utilizing in-depth domain knowledge
  • Led COB testing to perform stress tests and evaluate system resiliency
  • Developed Splunk dashboards for real-time analysis of server log data
  • Used the Splunk REST API to create and update dashboards and change their permissions
  • Worked with the development team to build a decentralized database architecture that remains in sync for failover during any contingency
  • Led a team to monitor and analyze production data and identify pain points
  • Set up processes to prevent and detect real-time outages in production operations

Research Associate

University of Arizona ECE
01.2017 - 10.2018
  • Conducted research in information theory, focusing on data retrieval and privacy concepts
  • Developed methodologies to enhance private information retrieval (PIR)
  • Designed and implemented innovative data storage strategies that optimized storage space while maintaining data privacy
  • Co-authored multiple papers presented at international conferences, including research on the capacity of uncoded storage constrained PIR

Senior System Engineer - ETL Engineer

Infosys Ltd (Cisco)
09.2012 - 03.2016
  • Used tools such as Informatica Data Quality for data profiling, data mapping, and metadata management in support of data governance
  • Solved complex scenarios and coordinated with source system owners on day-to-day ETL progress monitoring
  • Modified shell scripts to schedule jobs on Unix and Windows servers, process data, and control job execution

Education

Master of Science - Computer Engineering

University of Arizona - College of Engineering
Tucson, Arizona
05-2018

Bachelor of Technology - Electrical and Electronics Engineering

M.S. Ramaiah Institute of Technology
Bangalore, India
05-2012

Skills

Python

Additional Training and Certifications

  • Coursera: Microsoft Azure for Data Engineering (Certificate)
  • Coursera: Python Data Structures (Certificate)
  • Coursera: Machine Learning, offered by Stanford University

Publications

  • MIMO Wiretap Channel with ISI Heterogeneity: Achieving Positive Secure DoF with no CSI, 51st Asilomar Conference on Signals, Systems and Control, Pacific Grove, CA, 10/01/17
  • Private Information Retrieval from Storage Constrained Databases - Coded Caching meets PIR, IEEE ICC 2018 Communication Theory Symposium
  • The Capacity of Uncoded Storage Constrained PIR, IEEE ISIT 2018 Information Theory Symposium

Selected College Projects

  • Study of Internet Topology: Built code to study Internet topology at the autonomous system (AS) level using real topological data from the Center for Applied Internet Data Analysis (CAIDA).
  • Private Information Retrieval: Explored a research problem in which information is retrieved from servers anonymously. Developed a storage and retrieval algorithm in MATLAB to simulate the theoretical model.
