GOWTHAM MAHENDRAN

TAMIL NADU

Summary

  • 6+ years of experience designing and developing Big Data applications using Hadoop ecosystem technologies (HDFS, Hive, Sqoop, Apache Spark) and AWS.
  • Domain experience in Finance, Insurance, and Retail; hands-on experience in Hadoop/Big Data storage, querying, processing, and analysis of data.
  • Understands the complex data processing needs of big data and has experience developing code and modules to address those needs.
  • Capable of processing large sets of Structured and Semi-structured data.
  • Worked on AWS Components like S3, EMR.
  • Worked with different file formats like JSON, XML, AVRO data files and text files.
  • Worked extensively on Hadoop migration project and POCs.
  • Expertise in writing Hadoop Jobs for analyzing data using Hive.
  • Experience in importing and exporting data using Sqoop from HDFS to RDBMS and vice-versa.
  • Knowledge in installing, configuring, and using Hadoop ecosystem components like HDFS, Hive, Sqoop, and Spark.
  • Brought in simplification and optimization initiatives to improve application efficiency.
  • Proficient in optimizing Sqoop imports and exports for performance and scalability.
  • Experienced in designing and implementing complex data integration solutions using Sqoop.
  • Experience in handling Hive schema evolution with the Avro file format.
  • Proficient in handling Hive partitions and buckets with respect to the business requirement.
  • Skilled in handling semi-structured/serialized data processing using Hive (Avro, Parquet, ORC).
  • Experienced in efficiently using Hive managed and external table with respect to the business requirement. Deep knowledge in incremental imports, partitioning and bucketing concepts in Hive and Spark SQL needed for optimization.
  • Proficient in developing and implementing Spark RDD-based data processing workflows using Scala or Python.
  • Experienced in optimizing Spark RDD performance by tuning various configuration settings, such as memory allocation, caching, and serialization.
  • Skilled in using Spark RDD persistency and caching mechanisms to reduce data processing overhead and improve query performance.
  • Familiarity with Spark RDD lineage and fault tolerance mechanisms and their impact on data processing reliability and performance.
  • Expertise in using Spark RDD transformations and actions to process large-scale structured and unstructured data sets, including filtering, mapping, reducing, grouping, and aggregating data.
  • Hands-on experience deploying Spark jobs on EMR clusters as step executions.
  • Used Agile methodology to work with IT and business teams on efficient system development.
  • Database experience in SQL Server and MySQL.
  • Have good problem solving and analytical skills and ready to innovate in order to perform better.
  • Have strong Interpersonal skills and communication skills.
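The summary above mentions deploying Spark jobs on EMR clusters as step executions. A minimal sketch of what such a step definition looks like, shown as the plain structure passed to boto3's `add_job_flow_steps`; the cluster id, S3 paths, job name, and main class are hypothetical:

```python
# Sketch of a Spark job submitted to an EMR cluster as a step.
# All names and paths below are hypothetical placeholders.

spark_step = {
    "Name": "daily-transactions-load",          # hypothetical job name
    "ActionOnFailure": "CONTINUE",              # keep the cluster alive if the step fails
    "HadoopJarStep": {
        "Jar": "command-runner.jar",            # EMR's generic command runner
        "Args": [
            "spark-submit",
            "--deploy-mode", "cluster",
            "--class", "com.example.TxnLoad",   # hypothetical Spark main class
            "s3://my-bucket/jars/txn-load.jar", # hypothetical application jar on S3
            "--date", "2023-01-01",
        ],
    },
}

# With a boto3 EMR client this would be submitted as:
#   emr.add_job_flow_steps(JobFlowId="j-XXXXXXXX", Steps=[spark_step])
```

EMR runs each step sequentially on the cluster; `command-runner.jar` simply executes the `spark-submit` command line given in `Args`.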

Overview

7 years of professional experience

Work History

Big Data Developer

HDFC BANK LIMITED
08.2019 - Current
  • HDFC Bank is a consumer banking services company headquartered in Mumbai, India.
  • The company offers products and services including wholesale banking, retail banking, treasury, auto loans, two-wheeler loans, personal loans, lifestyle loans, consumer durable loans, and credit cards.
  • Its digital products include PayZapp and SmartBuy.
  • Performed Import and Export of data into HDFS and Hive using Sqoop and managed data within the environment
  • Created Hive tables, loaded data, and wrote Hive queries.
  • Managed Hadoop MapReduce jobs for processing large datasets
  • Responsible for optimizing Spark SQL queries, which reduced costs for the project.
  • Consumed data from upstream systems via web APIs, RDBMS, and file systems, applied business logic and transformations, and wrote the data to target Hive/HBase tables used by the business for analytics.
  • Migrated large volumes of data from RDBMS to HDFS using Sqoop jobs.
  • Automated Sqoop jobs using shell scripts to pull data from various databases into Hadoop.
  • Used Spark Streaming to divide streaming data into batches as an input to Spark engine for batch processing jobs.
  • Familiarity with Spark RDD-based data processing libraries and frameworks, such as Apache Spark SQL, MLlib, and GraphX, and their features and limitations.
  • Experienced in optimizing Spark DataFrame performance by tuning various configuration settings, such as memory allocation, caching, and serialization.
  • Expertise in using Spark DataFrame transformations and actions to process large-scale structured and semi-structured data sets, including filtering, mapping, reducing, grouping, and aggregating data.
  • Cleansed and transformed data using Spark, pushing the processed data to Hive tables.
  • Created Glue tables on S3 buckets and loaded the data into the tables.
  • Skilled in using Spark DataFrame persistency and caching mechanisms to reduce data processing overhead and improve query performance.
  • Familiarity with Spark DataFrame-based data processing libraries and frameworks, such as Apache Spark SQL, MLlib, and GraphFrames, and their features and limitations.
  • Exported necessary spark Jars to run in the cluster.
  • Worked on generation of complex data sets that are further used by downstream applications.
  • Involved in data analysis, data quality, and data profiling work that supported the business team.
  • Loaded and transformed large sets of semi-structured data such as XML, JSON, Avro, and Parquet.
  • Performed code and peer reviews of assigned tasks, unit testing, and bug fixing.
  • Technologies: Hadoop, HDFS, Hive, Sqoop, Spark SQL
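Several bullets above describe Sqoop jobs, automated via shell scripts, pulling data incrementally from RDBMS into Hadoop. A minimal sketch of such a job, assembled as an argument list in Python; the JDBC URL, table, column, and directory names are hypothetical:

```python
# Sketch of an append-mode Sqoop incremental import, built as an argument
# list (as a wrapper script might assemble it). All connection details,
# table names, and paths are hypothetical.

def sqoop_incremental_import(jdbc_url, table, check_column, last_value, target_dir):
    """Build the argument list for an append-mode Sqoop incremental import."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,           # source RDBMS connection string
        "--table", table,                # source table to import
        "--target-dir", target_dir,      # HDFS landing directory
        "--incremental", "append",       # import only rows newer than last_value
        "--check-column", check_column,  # monotonically increasing key column
        "--last-value", str(last_value), # high-water mark from the previous run
    ]

args = sqoop_incremental_import(
    "jdbc:mysql://dbhost:3306/sales",    # hypothetical MySQL source
    "transactions", "txn_id", 102345,
    "/data/raw/transactions",
)
```

On each scheduled run the wrapper records the new high-water mark (the largest `txn_id` imported), so the next invocation pulls only rows added since.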


ETL Developer

JANA SMALL FINANCE BANK
07.2016 - 08.2019
  • Jana Small Finance Bank is a consumer banking services company headquartered in Bangalore, India
  • This project defines the mid-level platform design for Core Banking (CBS) with regard to the changes needed to migrate Commercial & Offshore accounts from CAP to CBS.
  • The document is designed to complement the end-to-end project design documents and shows, at a platform level, the evolving design in greater detail.
  • It serves as an overview of the application software documents for CBS and IBM WebSphere DataStage.
  • Responsible for developing, supporting, and maintaining the ETL (Extract, Transform, Load) process using Informatica PowerCenter.
  • Developed mappings and workflows to generate staging files.
  • Developed various transformations such as Source Qualifier, Sorter, Joiner, Update Strategy, Lookup, Expression, and Sequence Generator for loading data into target tables.
  • Created multiple Mapplets, Workflows, Tasks, and database connections using Workflow Manager.
  • Created sessions and batches to move data at specific intervals and on demand using Server Manager.
  • Responsibilities included creating and scheduling sessions.
  • Recovered failed sessions and batches.
  • Extracted data from Oracle, DB2, CSV, and flat files.
  • Implemented performance-tuning techniques by identifying and resolving bottlenecks in sources, targets, transformation mappings, and sessions; understood the functional requirements.
  • Designed the dimension model of the OLAP data marts.
  • Prepared documents for test data loading.
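The Update Strategy transformation mentioned above decides, per row, whether the target receives an insert or an update. A plain-Python sketch of that decision over a toy in-memory "target table" keyed by account id; the field names are hypothetical:

```python
# Plain-Python sketch of the per-row insert-vs-update decision an Informatica
# Update Strategy transformation makes (DD_INSERT vs DD_UPDATE), using a toy
# dict as the target table. Field names are hypothetical.

def apply_update_strategy(target, rows):
    """Upsert each source row into target: update if the key exists, else insert."""
    for row in rows:
        key = row["account_id"]
        if key in target:
            target[key].update(row)   # key already in target -> update (DD_UPDATE)
        else:
            target[key] = dict(row)   # new key -> insert (DD_INSERT)
    return target

target = {1: {"account_id": 1, "balance": 100}}
rows = [
    {"account_id": 1, "balance": 150},   # existing key -> updated in place
    {"account_id": 2, "balance": 75},    # new key -> inserted
]
apply_update_strategy(target, rows)
```

In PowerCenter the same decision is typically driven by a Lookup transformation against the target, with the Update Strategy flagging each row accordingly.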

Education

MBA - Operations Management

SRM UNIVERSITY
2016

B.E - Computer Engineering

SATHAYABAMA UNIVERSITY
2014

12th -

SARASWATHI MATRIC HIGHER SECONDARY SCHOOL
SALEM
2010

10th -

SARASWATHI MATRIC HIGHER SECONDARY SCHOOL
SALEM
2008

Skills

  • Data Ecosystem: Hadoop, Sqoop, Hive, Apache Spark, and AWS
  • Distribution: Cloudera CDH 5.12
  • Databases : SQL Server, MySQL
  • Languages : Scala, Python, SQL
  • Operating Systems : Linux and Windows
