
Madhukar Shyam

Data Engineer
Pune

Summary

Data Engineer with 4 years of professional experience and strong technical knowledge of ETL, Python, PySpark, Databricks, and BigQuery. Experienced in development projects and actively involved in all phases of the project life cycle, including data acquisition, data cleaning, and data processing. Possess good interpersonal and relationship-building skills, with the flexibility to work with new technologies and take on new challenges.

Overview

4 years of professional experience

Work History

Data Engineer

Accenture India Private Limited
Pune
11.2021 - Current

PROJECT I:

OBU Client: Takeda Pharmaceutical

Summary: The Data Foundation Program (DFP) is intended to provide a cloud-based solution. This required migrating data from on-premise systems to the new core solution and setting up new batch-based data ingestion for the new sources. The cloud platform is built to serve the analytics needs of business users and the data needs of downstream applications.


Roles & Responsibilities:


  • Developed and maintained data pipelines and ETL processes using PySpark, Python, AWS Glue, and other AWS services.
  • Engineered data pipelines using internal ETL tools, native Python scripts, Spark SQL, and AWS services such as Glue (PySpark).
  • Scheduled Glue jobs in production using the Tidal application as per business requirements.
  • Handled data from multiple sources across multiple dimensions and gained an in-depth understanding of the data and its business context.


PROJECT II:

Phoenix Client: Takeda Pharmaceutical

Summary: This project involves transferring lake table parquet files from the lake-com EDB bucket to the inbound folder of the exchange account bucket (ADX).


Roles & Responsibilities:


  • Developed a PySpark script to transfer lake files from the EDB bucket to the ADX bucket.
  • Created a configuration file to customize the details of the lake tables to be transferred.
  • Developed the Big File framework for executing PySpark scripts.
  • Developed Databricks notebooks using PySpark to create the hub and mart Delta tables.
  • Automated the creation of the JSON input file using Python to eliminate manual effort.
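The JSON input-file automation mentioned above can be sketched roughly as follows. This is a minimal illustration using only the standard library; the field names (`table`, `source_bucket`, `target_bucket`) and the file layout are hypothetical placeholders, since the actual project schema is not described here:

```python
import json
from pathlib import Path

def build_json_input(tables, out_path):
    """Build the JSON input file that drives the transfer job.

    `tables` is a list of dicts describing each lake table to move;
    the keys used here are illustrative, not the real project schema.
    """
    payload = {
        "version": 1,
        "transfers": [
            {
                "table": t["table"],
                # Source path in the EDB lake bucket (hypothetical layout)
                "source": f"s3://{t['source_bucket']}/{t['table']}/",
                # Target path in the ADX inbound folder (hypothetical layout)
                "target": f"s3://{t['target_bucket']}/inbound/{t['table']}/",
            }
            for t in tables
        ],
    }
    Path(out_path).write_text(json.dumps(payload, indent=2))
    return payload

# Example usage
cfg = [{"table": "sales", "source_bucket": "edb-lake", "target_bucket": "adx-exchange"}]
result = build_json_input(cfg, "input.json")
```

Generating the file from a small table list like this removes the manual editing step each time a new lake table is added to the transfer.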

Jr. Data Engineer

COGNIZANT TECHNOLOGY SOLUTIONS
Pune
02.2019 - 10.2021

PROJECT: SAMLINK ACCOUNT (Domain: BFS)

Summary: In this project, an EDL (Enterprise Data Lake) was created to store data and run different types of analytics. As an ETL developer, my role was to move data from the staging layer to the data mart layer using the ETL tool Informatica, applying the required business rules and cleansing to deliver the final data for reporting as per business requirements.


ROLES/RESPONSIBILITIES:


Understood the business requirements and developed various ETL transformations to load data from source to target.

  • Extensively used Informatica BDM to load data from sources to targets, moving source data (flat files, mainframe files, etc.) from the staging layer to the data mart layer in a Hadoop cluster.
  • Developed code as per Informatica, Unix, and Python norms to ensure defect-free delivery.
  • Developed dynamic mappings to enable changes to sources, targets, and transformation logic at runtime.
  • Analyzed and validated mappings, workflows, and output data to verify that data was loaded into targets as per the business logic.


UTILITIES DEVELOPED USING PYTHON:


  • Developed a Python utility to count source and target records across different source systems.
  • Developed a Python utility to deploy Informatica applications.
  • Developed a Python utility to create the parameter file (.xml) based on the given parameters, which helps run dynamic mappings.
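The parameter-file utility described above can be sketched with the standard library alone. The element and parameter names below (`parameterFile`, `$$SRC_TABLE`, `$$LOAD_DATE`) are assumptions for illustration; the exact XML structure an Informatica mapping expects is project-specific:

```python
import xml.etree.ElementTree as ET

def build_param_file(params, out_path):
    """Write an Informatica-style XML parameter file.

    `params` maps parameter names to values; the element names used
    here are illustrative, not the exact Informatica schema.
    """
    root = ET.Element("parameterFile")
    for name, value in params.items():
        # One <parameter name="..."> element per mapping parameter
        p = ET.SubElement(root, "parameter", {"name": name})
        p.text = str(value)
    ET.ElementTree(root).write(out_path, encoding="unicode")

# Example usage: regenerate the file per run instead of editing it by hand
build_param_file({"$$SRC_TABLE": "customers", "$$LOAD_DATE": "2021-10-01"}, "params.xml")
```

Driving the dynamic mapping from a generated file like this keeps the runtime source/target names out of the mapping itself.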


GCP (Google Cloud Platform):

  • Good knowledge of gsutil commands to perform file operations (copy, delete, etc.) in Google Cloud Platform.
  • Performed complex queries using BigQuery for data analytics.
  • Created partitioned tables and views using BigQuery.

Education

B.Tech - Electronics & Telecommunication

C V Raman College of Engineering
Bhubaneswar
08.2013 - 05.2017

Skills

Python


Accomplishments

  • ETL Processing on Google Cloud Using Dataflow and BigQuery (Coursera)
  • Working with BigQuery (Coursera)
  • Databricks Accredited Lakehouse Fundamentals (Databricks)
  • PySpark & AWS: Master Big Data with PySpark and AWS (Udemy)

Timeline

Data Engineer

Accenture India Private Limited
11.2021 - Current

Jr. Data Engineer

COGNIZANT TECHNOLOGY SOLUTIONS
02.2019 - 10.2021

B.Tech - Electronics & Telecommunication

C V Raman College of Engineering
08.2013 - 05.2017