Rajesh Govindan

Hosur, TN

Summary

CAREER OBJECTIVE: Hadoop Developer with solid experience and understanding of parallel processing and distributed file systems in Big Data and its ecosystems. Technical expertise in transforming heterogeneous data into vital information to serve the needs of the clients and customers associated with the organization.

Overview

9
years of professional experience
2
Certifications

Work History

Technical Lead

HCL Technologies Ltd
Bengaluru, KA
07.2021 - Current
  • Developed JSON schema imports to load data into GCP BigQuery and automated them using Airflow (see the sketch after this list).
  • Built and implemented an ETL pipeline on GCP Dataproc Serverless, reducing data processing time by 25%.
  • Reduced redundant activities across 3 departments by building a data warehouse and consolidating data modeling and ingestion.
  • Improved data accuracy by 15% by implementing data quality checks.
  • Conducted performance tuning on SQL queries, improving data retrieval by 20%.
  • Implemented data enrichments such as filtering, pivoting, formatting, sorting, and aggregation using BigQuery tools.
  • Implemented a CI/CD pipeline to automate builds when query changes are committed.
  • Carried out performance optimization across ecosystems such as BigQuery.
  • Used the Airflow scheduler to automate pipeline workflows and orchestrate MapReduce jobs that extract data on a schedule.
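
A minimal sketch of the kind of Airflow DAG described above, loading newline-delimited JSON from Cloud Storage into BigQuery on a daily schedule. The bucket, dataset, and table names are hypothetical, and it assumes the apache-airflow-providers-google package is installed:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)

# All names below are hypothetical, for illustration only.
with DAG(
    dag_id="json_to_bigquery_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Load newline-delimited JSON from GCS into a BigQuery table,
    # letting BigQuery infer the schema from the files.
    load_json = GCSToBigQueryOperator(
        task_id="load_json_to_bq",
        bucket="example-landing-bucket",
        source_objects=["events/{{ ds }}/*.json"],
        destination_project_dataset_table="example_project.analytics.events",
        source_format="NEWLINE_DELIMITED_JSON",
        autodetect=True,                   # infer schema from the JSON
        write_disposition="WRITE_APPEND",  # append each daily load
    )
```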

Senior Software Engineer

YASH Technologies Pvt Ltd
Bengaluru, KA
07.2020 - 06.2021
  • Optimized queries using Spark SQL and Spark DataFrames, reducing processing time by 25%.
  • Implemented caching, broadcast joins, and hash aggregation to cut shuffling, reducing data processing latency by more than 50% (see the PySpark sketch after this list).
  • Tuned compression with ORC and Parquet using the default Snappy codec, achieving roughly 50% higher throughput and lower latency.
  • Built data import and export pipelines between RDBMS sources and HDFS/Hive using Sqoop.
  • Enhanced data enrichments such as filtering, pivoting, formatting, sorting, aggregation, partitioning, and bucketing using Hive. Reduced required cluster worker-node capacity by 90% by applying filters early and avoiding wide transformations.
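
A minimal PySpark sketch of the caching and broadcast-join pattern mentioned above; the paths, table, and column names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("broadcast-join-sketch").getOrCreate()

# Hypothetical inputs: a large fact table and a small dimension table.
orders = spark.read.parquet("/data/orders")        # large
customers = spark.read.parquet("/data/customers")  # small enough to broadcast

# Cache the fact table because it is reused by several downstream jobs.
orders.cache()

# broadcast() ships the small table to every executor, so the join
# avoids shuffling the large table across the cluster.
joined = orders.join(broadcast(customers), on="customer_id")

# A hash aggregate over the joined data.
daily_revenue = joined.groupBy("order_date").sum("amount")
daily_revenue.show()
```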

Senior Software Engineer

YASH Technologies Pvt Ltd
Bengaluru, KA
07.2015 - 06.2020
  • Collaborated on ETL testing and web application testing.
  • Implemented data quality checks covering 90% of the data to meet SLAs (a sketch of such a check follows this list).
  • Achieved a 90% test-case pass rate, with the remaining 10% failing only on minor, low-severity bugs.
  • Delivered exceptional client support by promptly addressing concerns and implementing requested changes or enhancements to software solutions.
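
A minimal sketch of the kind of data quality check described above, expressed in PySpark; the dataset, columns, and threshold are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("dq-check-sketch").getOrCreate()

# Hypothetical staged dataset.
df = spark.read.parquet("/data/staged/transactions")

total = df.count()

# Rule 1: the key column must not be null.
null_ids = df.filter(col("transaction_id").isNull()).count()

# Rule 2: the key column must be unique.
duplicate_ids = total - df.select("transaction_id").distinct().count()

# Fail the pipeline stage if more than 1% of rows violate either rule.
threshold = 0.01
if total and (null_ids + duplicate_ids) / total > threshold:
    raise ValueError(
        f"Data quality check failed: {null_ids} null and "
        f"{duplicate_ids} duplicate transaction_id values out of {total} rows"
    )
```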

Education

Master of Science - Computer Application

Adhiyamaan College of Engineering
Hosur, India
05.2011

Skills

  • TECHNICAL COMPETENCIES
  • BIG DATA ECOSYSTEMS: GCP BigQuery, Dataproc, Hadoop, Hive, Sqoop, YARN, Airflow, Cloud Composer, Dataflow
  • Apache Spark 3.x, Airflow
  • DISTRIBUTION FRAMEWORKS: Hortonworks, Cloudera, Spark
  • PROGRAMMING LANGUAGES: Python
  • DATABASES: IBM DB2, Oracle, SQL Server, MySQL
  • NoSQL DATABASE: HBase
  • OPERATING SYSTEM: Linux
  • PROJECTS SUMMARY
  • Project Name: Marketing Intelligence, Sales Chat, Prepaid and Postpaid
  • Client: Verizon-India
  • Environment: GCP, Airflow, Hive, Python, Spark
  • Data warehouse: BigQuery
  • DevOps tools: Maven, GitLab, Jenkins CI/CD
  • Defect tracking tool: Jira
  • Team size: 5

Accomplishments

  • Overall, 8.6 years of experience in GCP, Big Data development, Big Data testing, and web application testing
  • Worked predominantly with GCP services such as BigQuery, Dataproc, Cloud Composer, and Airflow
  • Deployed and managed Dataproc clusters and BigQuery on GCP, saving $2,000 monthly in infrastructure costs
  • Used Cloud Composer (Airflow) to schedule jobs in the big data pipeline system
  • Automated data ingestion and transformation processes for the data pipeline, processing up to 500 GB of data daily with 99% accuracy
  • Wrote 20+ data transformations in Hive and Spark and improved query performance using partitioning, bucketing, and AQE
  • Designed and implemented ETL pipelines using Hive and Spark to process and analyse large-scale datasets, resulting in a 40% reduction in processing time
  • Optimized Hadoop clusters and Spark jobs to enhance performance, resulting in a 25% improvement in overall efficiency
  • Hadoop framework and PySpark developer with cloud expertise in Big Data technologies, mainly core GCP services: BigQuery, Dataproc, Airflow, Hadoop, Sqoop, Hive, Spark, Spark SQL, Spark DataFrames
  • 4.6 years of experience in manual and ETL testing in the automobile and agriculture domains
  • Applied performance optimization techniques in Spark and BigQuery data processing and job execution
  • Understand data sets; read data from different source file formats such as CSV, Parquet, Avro, JSON, and sequence files and store it in HDFS/Hive tables using Spark
  • Imported and exported RDBMS tables using Sqoop
  • Good exposure to analysing data using Hive queries to meet business requirements
  • Experience in building extract, transform, and load jobs that land data on HDFS for processing and creating Hive tables on top of the output for data analysis
  • Developed and implemented various methods to load Hive tables from HDFS and the local file system
  • Developed Hive queries to parse raw data, populated external and managed tables, and stored the refined data in partitioned external tables (a sketch follows this list)
  • Hands-on experience with FileZilla, WinSCP, Eclipse, etc.
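
A minimal sketch of the parse-then-partition pattern described above, using Spark with Hive support; the paths, database, table, and column names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_date, col

spark = (
    SparkSession.builder.appName("hive-partition-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Allow writing partitions determined by the data itself.
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

# Read raw delimited data from HDFS (hypothetical path and layout).
raw = spark.read.option("header", True).csv("hdfs:///data/raw/sales/")

# Light parsing/refinement before loading the Hive table.
refined = raw.withColumn("sale_date", to_date(col("sale_ts")))

# Declare a partitioned external table over a Parquet location.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS analytics.sales (
        order_id STRING,
        amount   DOUBLE
    )
    PARTITIONED BY (sale_date DATE)
    STORED AS PARQUET
    LOCATION 'hdfs:///data/refined/sales/'
""")

# Write the refined data into the table; insertInto matches columns by
# position, with the partition column (sale_date) last.
(
    refined.select("order_id", "amount", "sale_date")
    .write.mode("overwrite")
    .insertInto("analytics.sales")
)
```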

Certification

GCP Professional Data Engineer

GCP Professional Cloud Architect

Timeline

Technical Lead

HCL Technologies Ltd
07.2021 - Current

Senior Software Engineer

YASH Technologies Pvt Ltd
07.2020 - 06.2021

Senior Software Engineer

YASH Technologies Pvt Ltd
07.2015 - 06.2020

Master of Science - Computer Application

Adhiyamaan College of Engineering
05.2011