DILIP KUMAR

Summary

With a proven track record at Deutsche Bank, I excel in architecting and optimizing data solutions using Azure Databricks and PySpark, and in guiding teams toward technical excellence. My expertise in Azure Synapse Analytics and innovative problem-solving has significantly enhanced data processing capabilities.

Overview

8 years of professional experience
2 Certifications

Work History

Senior Data Engineer

Deutsche Bank
07.2022 - Current

Customer Banking Data Platform (CBDP):

  • Company Overview: Enterprise Analytical Platform (EAP)
  • Designed and implemented end-to-end data pipelines using Azure Data Factory (ADF) and Azure Databricks to transform, process, and integrate structured and unstructured data from various sources into Azure Data Lake Storage (ADLS)
  • Developed and optimized large-scale data processing workflows in Azure Databricks using PySpark to support advanced analytics, reporting, and machine learning use cases
  • Configured and managed Azure Synapse Analytics for data warehousing, enabling seamless integration with Power BI for real-time reporting and interactive dashboards
  • Implemented scalable storage solutions using Azure Blob Storage and ADLS, ensuring data security through role-based access control (RBAC) and encryption, while optimizing performance and cost for big data applications.

OneBaufi:

  • Developed and managed a Terraform framework; set up and maintained Dev, UAT, and Prod environments, including the creation and management of service accounts
  • Developed BigQuery solutions with Dataform, leveraging JavaScript to define and automate the creation of BigQuery tables and to implement business logic for complex data transformations
  • Automated workflows using Cloud Composer and Apache Airflow, deploying and managing Python-based DAGs and integrating shell scripts to streamline end-to-end ETL processes
  • Worked with Pub/Sub topics and subscriptions to consume messages sent by Cloud Scheduler, automating the workload to run and process the data
  • Implemented the framework's validation logic on both the raw and target sides, and scheduled it alongside the workload

iPolice:

  • Created a POC for the team, traveled to Paris for initial discussions, and secured the project for the offshore India team
  • Designed the project flow with the senior-level architect
  • Created a flexible, modular PySpark framework based on the medallion architecture
  • Implemented PySpark transformations and actions per requirements, and optimized Spark jobs via parameterization to set memory, cores, and executors on the fly
  • Created Python pre-check scripts to run before the main job
  • Developed junior staff through targeted coaching and mentoring, improving capabilities and competencies of technical teams

Senior Data Engineer

BDH, Société Générale
11.2021 - 07.2022
  • Company Overview: Big Data Hub (BDH), the new data platform for the Commercial Banking Tribe domain, fulfilling multiple business requirements related to data
  • Established a Cloudera-based Hadoop ecosystem to offer a versatile data integration solution for various source and target systems
  • Developed a Spark-based framework to read data from HIVE tables and produce files in multiple pre-defined formats, including creating and loading Hive tables for testing
  • Implemented data pipelines using Control-M and Shell Scripts, optimized Spark jobs, and configured Control-M for job scheduling with email notifications for failures
  • Developed shell scripts, automated them, and performed housekeeping

Data Engineer

OpenText Technologies
02.2019 - 04.2021
  • Imported and exported data between HDFS and Hive using Sqoop
  • In-depth knowledge of writing Hive queries, including partitioning and bucketing
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs
  • Knowledge of Hive data file formats such as ORC and Parquet
  • Worked with the Spark ecosystem using Spark SQL and Scala
  • Applied the Data Vault 2.0 methodology for data modelling in an enterprise data warehouse, handling stages such as Raw Data Vault, Business Vault, and Information Vault
  • Configured Jenkins Pipelines for code deployment on clusters
  • Developed shell scripts to run Spark jobs, including handling holiday and weekend scenarios
  • Utilized Control-M for scheduling Spark jobs and implemented Oozie jobs for email notifications upon job success or failure
  • Optimized Spark Scala code and Hive query processing for improved performance

Technical Specialist

IBM
08.2016 - 02.2019
  • Imported and exported data between HDFS and Hive using Sqoop
  • In-depth knowledge of writing Hive queries, including partitioning and bucketing
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs
  • Knowledge of Hive data file formats such as ORC and Parquet
  • Worked with the Spark ecosystem using Spark SQL and Scala

Education

B.Tech. - Electrical & Electronics Engineering

Galgotias College of Engineering and Technology
01.2016

Skills

  • Azure Data Factory
  • Azure Synapse Analytics
  • Azure Databricks
  • ADLS
  • Azure Delta Table
  • BigQuery
  • Dataform
  • Pub/Sub
  • Cloud Composer
  • Cloud Scheduler
  • Terraform
  • Apache Spark
  • Java
  • PySpark
  • Hive
  • Sqoop
  • Cloudera Hadoop
  • Delta Lake
  • Oozie
  • Control-M
  • Airflow
  • PL/SQL
  • SQL
  • Shell Scripting
  • Azure DevOps
  • ADLS Gen 2
  • Cloudera Hadoop (HDFS)
  • Parquet
  • Avro
  • Python
  • GitHub
  • UNIX
  • Confluence
  • Bitbucket
  • JIRA
  • ServiceNow
  • Spark development
  • Real-time analytics
  • Big data processing
  • Data pipeline design
  • Git version control
  • Python programming

Certifications

  • Snowflake SnowPro Core
  • Azure Data Engineer Associate (DP-203)

Timeline

Senior Data Engineer

Deutsche Bank
07.2022 - Current

Senior Data Engineer

BDH, Société Générale
11.2021 - 07.2022

Data Engineer

OpenText Technologies
02.2019 - 04.2021

Technical Specialist

IBM
08.2016 - 02.2019

B.Tech. - Electrical & Electronics Engineering

Galgotias College of Engineering and Technology