Madhukar Shyam

Bangalore

Summary

More than 5 years of experience as a Data Engineer, with strong technical expertise in ETL processes, Python, PySpark, Databricks, and BigQuery. Proven track record across all phases of the project lifecycle, including data acquisition, cleaning, and processing. Strong interpersonal skills, with the ability to build relationships and adapt quickly to new technologies and challenges.

Overview

6 years of professional experience
1 Certification

Work History

Data Engineer

Accenture India Private Limited
Bangalore
11.2021 - Current

Project 1 : OBU

Client: Takeda Pharmaceutical

Summary:
Led the Data Foundation Program (DFP), which involved migrating data from on-premise systems to a cloud-based solution. This program established a new batch-based data ingestion framework to integrate data from various sources into the core cloud platform. The platform was designed to support the analytics needs of business users and fulfill the data requirements of downstream applications.

Roles & Responsibilities:

  • Developed and maintained data pipelines and ETL processes using PySpark, Python, AWS Glue, and other AWS services.
  • Engineered robust data pipelines utilizing internal ETL tools, native Python scripts, Spark SQL, and AWS Glue (PySpark).
  • Scheduled and managed AWS Glue jobs in production using the Tidal application to meet business requirements.
  • Managed data from multiple sources across various dimensions, acquiring a deep understanding of the data and its business context.

Project 2 : Phoenix

Client : Takeda Pharmaceutical

Summary:
Executed the transfer of Lake Table Parquet files from the Lake-com EDB bucket to the Inbound folder of the Exchange Account bucket (ADX). This project ensured the efficient and secure movement of data between storage environments, facilitating seamless data access and integration within the cloud infrastructure.

Roles & Responsibilities:

  • Configured and customized lake table details for transfer between S3 buckets by creating and managing configuration files.
  • Developed a Big File framework to streamline the execution of PySpark scripts.
  • Designed and implemented Databricks notebooks using PySpark to create Hub and Mart Delta tables.
  • Automated creation of JSON input files with Python, reducing manual effort and increasing process efficiency.
  • Worked extensively with Spark SQL, the DataFrame API, and AWS services including S3, Lambda, and Redshift.
  • Used PySpark in Databricks notebooks extensively for data processing and analytics.

Project: End-to-End Text to SQL LLM Application

Summary:

Developed an application that translates natural language queries into SQL commands using large language models (LLMs), specifically leveraging Gemini Pro for querying SQL databases. This application allows users to interact with databases through conversational interfaces, simplifying data retrieval and analysis.

Key Responsibilities:

  • Integrated Gemini Pro, a state-of-the-art LLM, to parse and understand natural language queries and generate corresponding SQL statements.
  • Implemented robust querying mechanisms to interact with SQL databases, ensuring accurate and efficient data retrieval.
  • Designed and developed a user-friendly interface for seamless interaction between users and the backend database.

Technologies Used:

Programming Languages: Python, SQL

Tools & Frameworks: Gemini Pro, Streamlit

Databases: sqlite3

Jr. Data Engineer

COGNIZANT TECHNOLOGY SOLUTIONS, Pune
Pune, India
02.2019 - 11.2021

PROJECT: SAMLINK ACCOUNT (Domain - BFS)

Summary:
Contributed to the creation of an Enterprise Data Lake (EDL) to store and analyze data for business reporting and analytics. As an ETL Developer, my primary responsibility was to extract, transform, and load data from the staging layer to the data mart layer using Informatica, ensuring the data was cleansed and processed to meet business requirements.

Roles & Responsibilities:

  • Analyzed and understood business requirements to design and develop ETL transformations, loading data from source systems to target data marts.
  • Used Informatica Big Data Management (BDM) extensively to load data from various sources (flat files, mainframe files, etc.) into a Hadoop cluster.
  • Created dynamic mappings in Informatica to accommodate runtime changes in sources, targets, and transformation logic.
  • Built a Python utility to count source and target records across different source systems, enhancing data verification processes.
  • Developed a Python utility for deploying Informatica applications, streamlining the deployment process.
  • Executed complex queries using BigQuery to support data analytics and reporting.
  • Designed and implemented partitioned tables and views in BigQuery to optimize data storage and retrieval.

Education

B.Tech - Electronics And Telecommunication Engineering

C.V Raman College Of Engineering
Bhubaneswar
05.2017

Skills

  • Programming Languages: Python, SQL
  • Big Data & Data Processing: PySpark, Pandas, AWS Glue, Informatica BDM, Databricks
  • Cloud Services: Amazon S3, Google Cloud Storage, AWS Lambda, AWS Athena, BigQuery
  • Data Integration & API: REST API
  • AI & LLM Technologies: OpenAI, GenAI, Gemini Pro, Streamlit
  • Job Scheduler: Tidal, Airflow

Certification

  • PySpark & AWS : Master Big Data with PySpark and AWS (Udemy)
  • Databricks Accredited Lakehouse Fundamentals (Databricks)
  • Working with BigQuery (Coursera Learner)
  • ETL Processing on Google Cloud Using Dataflow and BigQuery (Coursera Learner)

Accomplishments

Top Performer Award

COGNIZANT TECHNOLOGY SOLUTIONS, Pune, May 2020

FY24 ATCI NA Pinnacle Award Winner

Accenture India Private Limited, Bangalore, Jan 2024