A competent professional with 8 years of expertise in Hadoop, Spark, and their ecosystems – mainly Hadoop, Sqoop, Hive, Spark, PySpark, Python, Scala, AWS, and Elasticsearch
Overview
8 years of professional experience
Work History
Data Engineer
IBM
06.2024 - Current
Migrated the India jobs from SG on-prem servers to the GCP India server as per RBI guidelines
Developed new jobs in the UAT environment and performed data enrichments such as filtering and aggregation using Spark, PySpark, and Hive per business requirements in the Sparkola tool (a minimal PySpark sketch follows this role)
Enabled the jobs in Airflow and validated their regular runs in the job server
Deployed the jobs in UAT, QA, and PROD environments
Validated the jobs in the job server and their dependencies in Airflow
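A minimal PySpark sketch of the filtering and aggregation enrichments described above, assuming a hypothetical transactions table and column names; the actual jobs were authored in the Sparkola tool against project-specific schemas.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical table and column names, used only for illustration.
spark = (SparkSession.builder
         .appName("india-jobs-enrichment")
         .enableHiveSupport()
         .getOrCreate())

txns = spark.table("uat_db.transactions")

# Filtering: keep only India-booked, non-reversed transactions.
filtered = txns.filter((F.col("booking_country") == "IN") &
                       (F.col("status") != "REVERSED"))

# Aggregation: daily totals and counts per product.
daily_summary = (filtered
                 .groupBy("txn_date", "product_code")
                 .agg(F.sum("amount").alias("total_amount"),
                      F.count("*").alias("txn_count")))

# Persist the enriched output as a Hive table for downstream jobs.
daily_summary.write.mode("overwrite").saveAsTable("uat_db.daily_txn_summary")
```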
Senior Project Engineer
Wipro Limited
10.2021 - 06.2024
Created RDDs and DataFrames for the required input data and performed data transformations and actions using Spark Core and Spark DataFrames
Built data pipelines that are scalable, repeatable, and secure, and can serve multiple purposes
Constructed a state-of-the-art data lake on AWS using EMR, Spark, Step Functions, and CloudWatch Events
Used Amazon EMR to process big data across a Hadoop cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3)
Experienced with Spark architecture including Spark Core, RDDs, DataFrames, Datasets, Spark SQL, and Spark Streaming; imported data from HDFS into Spark RDDs for in-memory computation to generate output responses
Hands-on experience using Hive tables from Spark, performing transformations and creating DataFrames on Hive tables using Spark SQL
Experience in converting Hive/SQL queries into RDD transformations using Spark and Scala
Worked with Apache Spark components, which provide a fast, general engine for large-scale data processing
Migrated an existing on-premises application to AWS
Designed, built, and deployed multiple applications utilizing the AWS stack (EC2, S3, EMR), focusing on high availability, fault tolerance, and auto-scaling
Designed and built custom ETL processes in AWS using Lambda functions and EMR clusters, reducing cost overhead for the client
Developed and maintained an automated CI/CD pipeline for code deployment
This makes existing infrastructure easier to manage and lets complex change sets be applied with minimal human interaction, avoiding many possible human errors
This was achieved using technologies such as Terraform, Jenkins, GitHub, and AWS CI/CD services
Provided daily monitoring, management, troubleshooting, and issue resolution for systems and services hosted on cloud resources
Developed Spark code to perform data enrichments and calculations per business requirements
Worked on performance optimization of various ecosystems such as Hive, Sqoop, Spark, Elasticsearch, and Kibana
Performed data enrichments such as filtering, sorting, and aggregation using Spark and Hive
Loaded fact tables into Elasticsearch for visualization through Kibana (a minimal end-to-end sketch follows this role)
Created dashboards and visualizations in Kibana as per business requirements to monitor day-to-day changes in data
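A minimal sketch of the Hive-to-Elasticsearch flow described in this role, assuming the elasticsearch-hadoop (elasticsearch-spark) connector is on the classpath; the table, column, index, and host names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("fact-table-to-es")
         .enableHiveSupport()
         .getOrCreate())

# Create a DataFrame on a Hive table using Spark SQL (hypothetical table name).
orders = spark.sql("SELECT * FROM analytics_db.fact_orders")

# Data enrichment: filter, aggregate, and sort per business rules.
enriched = (orders
            .filter(F.col("order_status") == "COMPLETED")
            .groupBy("region", "order_date")
            .agg(F.sum("order_amount").alias("revenue"),
                 F.countDistinct("customer_id").alias("customers"))
            .orderBy("order_date"))

# Load the fact data into Elasticsearch for Kibana dashboards
# (requires the elasticsearch-hadoop connector; host, port, and index are illustrative).
(enriched.write
 .format("org.elasticsearch.spark.sql")
 .option("es.nodes", "es-host.example.com")
 .option("es.port", "9200")
 .mode("append")
 .save("orders_daily"))
```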
Software Engineer
EPAM Systems
05.2021 - 10.2021
Created RDDs and DataFrames for the required input data and performed data transformations and actions using Spark Core
Worked closely with business customers on requirements gathering
Designed a Hive repository with external tables, internal tables, buckets, partitions, and ORC compression for incremental loads of parsed data
Worked on performance optimization of various ecosystems such as Hive, Sqoop, and Spark
Performed data enrichments such as filtering, sorting, and aggregation using Spark
Built scripts for resource creation in the AWS cloud, such as Step Functions, Glue jobs, and Lambda handlers (a hedged boto3 sketch follows this role)
Hands-on experience building pipelines that implement business use-case functionality through data transformations
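A hedged boto3 sketch of the kind of AWS resource scripting listed above; the Glue job name, state machine ARN, and event fields are hypothetical, not the client's actual resources.

```python
import json
import boto3

glue = boto3.client("glue")
sfn = boto3.client("stepfunctions")

def lambda_handler(event, context):
    """Hypothetical Lambda handler: starts a Glue ETL job and then a
    Step Functions execution that tracks the rest of the pipeline."""
    # Start the Glue ETL job (job name and argument are illustrative).
    run = glue.start_job_run(
        JobName="parse-and-load-job",
        Arguments={"--input_path": event.get("input_path", "s3://bucket/in/")},
    )

    # Hand the run id to a Step Functions state machine (ARN is illustrative).
    execution = sfn.start_execution(
        stateMachineArn="arn:aws:states:ap-south-1:123456789012:stateMachine:etl-pipeline",
        input=json.dumps({"glue_job_run_id": run["JobRunId"]}),
    )

    return {"glueJobRunId": run["JobRunId"],
            "executionArn": execution["executionArn"]}
```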
Software Engineer
Optum Global Solutions India PVT Ltd
10.2019 - 05.2021
Experienced in designing Hadoop and Spark applications and recommending the right solutions and technologies for them
Imported and exported RDBMS tables using Sqoop
Used Apache Hive to run MapReduce jobs on top of HDFS data
Built distributed in-memory applications using Spark Core and Spark SQL to run analytics efficiently on huge data sets
Created RDDs and DataFrames for the required input data and performed data transformations and actions using Spark Core
Worked closely with business customers on requirements gathering
Developed Sqoop jobs with incremental loads from a heterogeneous RDBMS (Oracle) using native DB connectors
Designed a Hive repository with external tables, internal tables, buckets, partitions, and ORC compression for incremental loads of parsed data
Experienced in developing Hive queries on different data formats such as text, CSV, and log files
Leveraged time-based partitioning in HiveQL to improve query performance
Created Hive external tables for data in HDFS and moved data from the archive layer to the business layer with Hive transformations
Worked on performance optimization of various ecosystems such as Hive and Sqoop
Improved tuning using Hive features such as partitioning, bucketing, indexes, and the cost-based optimizer (CBO); a minimal sketch follows this role
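A minimal sketch of the time-partitioned, ORC-backed Hive layout and incremental partition load described above, issued here through PySpark's spark.sql for consistency; the database, table, column names, dates, and paths are hypothetical, and the bucketing and index tuning applied on the Hive side is omitted.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-repository-layout")
         .enableHiveSupport()
         .getOrCreate())

# External, time-partitioned, ORC-backed table over parsed data in HDFS
# (database, table, columns, and location are illustrative).
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS claims_db.claims_parsed (
        claim_id     STRING,
        member_id    STRING,
        claim_amount DOUBLE
    )
    PARTITIONED BY (load_date STRING)
    STORED AS ORC
    LOCATION '/data/business/claims_parsed'
""")

# Incremental load: overwrite only the partition for the current run date.
spark.sql("""
    INSERT OVERWRITE TABLE claims_db.claims_parsed
    PARTITION (load_date = '2021-01-15')
    SELECT claim_id, member_id, claim_amount
    FROM claims_db.claims_archive
    WHERE load_date = '2021-01-15'
""")

# Table statistics so the cost-based optimizer can choose better plans.
spark.sql("ANALYZE TABLE claims_db.claims_parsed COMPUTE STATISTICS")
```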
Software Engineer
HCL Technologies
05.2017 - 10.2019
Used Apache Hive to run MapReduce jobs on top of HDFS data
Built distributed in-memory applications using Spark Core and Spark SQL to run analytics efficiently on huge data sets
These applications were built using the Spark Scala API with YARN as the resource manager
Created RDDs and DataFrames for the required input data and performed data transformations and actions using Spark Core
Performed data enrichment, cleansing, and common data aggregations through RDD transformations (sketched after this role)
Performed interactive analysis of Hive tables through various DataFrame operations using Spark SQL
Involved in performance optimization of Spark jobs and designed efficient queries
Imported and exported data into HDFS using Sqoop
Handled heterogeneous data sources such as Oracle and different file formats
Created Sqoop jobs with incremental load to populate Hive External tables
Performed data enrichments such as filtering, sorting, and aggregation using Hive
Worked on performance optimization of various ecosystems such as Hive and Sqoop
Improved tuning using Hive features such as partitioning, bucketing, indexes, and the cost-based optimizer (CBO)
Experienced in developing Hive queries on different data formats such as text, CSV, and ORC files, leveraging time-based partitioning in HiveQL to improve performance
Used the Oozie scheduler to automate pipeline workflows and orchestrate the MapReduce jobs that extract data in a timely manner
Coordinated with the onshore team for code reviews and validation of final results
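A minimal sketch of the RDD-level enrichment, cleansing, and aggregation described in this role; the production code used the Spark Scala API, but the same transformations are shown here in PySpark against a hypothetical pipe-delimited record layout.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-enrichment").getOrCreate()
sc = spark.sparkContext

# Raw pipe-delimited records from HDFS (path and field layout are illustrative).
raw = sc.textFile("hdfs:///data/archive/claims/*.txt")

# Cleansing: split into fields, trim whitespace, drop malformed rows.
records = (raw
           .map(lambda line: [f.strip() for f in line.split("|")])
           .filter(lambda fields: len(fields) == 4 and
                   fields[3].replace(".", "", 1).isdigit()))

# Enrichment: keep approved claims only and key each record by member id.
approved = (records
            .filter(lambda f: f[2] == "APPROVED")
            .map(lambda f: (f[1], float(f[3]))))   # (member_id, claim_amount)

# Common aggregation: total approved amount per member.
totals = approved.reduceByKey(lambda a, b: a + b)

for member_id, amount in totals.take(10):
    print(member_id, amount)
```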