Lakshminarasimman S

Chennai

Summary

A passionate and tech-oriented Hadoop Developer/Data Engineer with over 8 years of experience in the Big Data domain, including hands-on expertise in implementing real-time replication and migration solutions using a proprietary ETL tool based on the business requirements. Elasticsearch Certified Engineer with proficiency in core search capabilities and observability features.

Overview

10 years of professional experience
1 Certification

Work History

Senior Tech Lead

Indium Software Private Limited
11.2018 - Current
  • Elasticsearch Certified Engineer with expertise in implementing core search capabilities and observability features using Filebeat, Metricbeat, and Logstash pipelines, including custom code development for extracting metrics via JMX and processing logs.
  • Continuously exploring new product features that support various CDC (Change Data Capture) database replication solutions, enabling seamless integration from diverse source systems to multiple target environments.
  • Installing and configuring the Striim platform in production environments, and delivering client demos tailored to customer-specific use cases.
  • Developed Python utilities to address custom use cases for customers facing challenges in automating replication pipelines.
  • Supporting a variety of customer use cases in implementing CDC solutions for on-premises databases, including Oracle, MySQL, PostgreSQL, and SQL Server, as well as cloud environments like BigQuery, Pub/Sub, Azure ADLS, Azure SQL Server, and Hadoop ecosystems such as HDFS, HBase, and Hive.
  • Set up local environments for databases such as MySQL, PostgreSQL, and Oracle to configure CDC (log-based replication), and developed custom and TPC-C workloads for benchmarking the product and the databases.
  • Developing custom Java and Python components, including User-Defined Functions (UDFs), to address customer-specific use cases that are not supported by the product by default.
  • Designing and building replication pipelines, ranging from simple to complex, with data transformations, and optimizing underperforming pipelines in customer environments.
  • Developing and supporting internal data migration and validation frameworks, leveraging Spark as the core technology to manage large datasets.

Big Data Engineer

Noah Data (Division of Indium Software)
10.2016 - 11.2018
  • Hadoop Cluster Management: Set up and configured a Hadoop cluster using Cloudera distribution, tuned performance for the cluster and HBase, and provided ongoing maintenance support.
  • Node Management: Prepared VMs for adding nodes to the Hadoop cluster, and handled the commissioning and decommissioning of data nodes as required.
  • High-Availability (HA) Setup: Configured high-availability (HA) for HDFS, HBase, Hive, Oozie, Hue, and Cloudera Manager, and externalized the metastore for Cloudera Manager, Hive, and other services.
  • Service Management: Managed and tuned Hadoop services based on resource availability and load to ensure optimal performance for other ETL processes and services.
  • ETL Development: Set up Sqoop import and export jobs using Oozie with Hue as an ETL developer tool, created and managed over 20 ETL flows, and ensured successful job completion on a daily basis.
  • Data Management: Created and tuned Hive internal and external tables for historical data, and optimized YARN for running parallel queries across various workloads in the Hadoop cluster.
  • Real-Time Data Processing: Populated real-time data in Apache Phoenix using Phoenix views to run SQL queries on HBase tables, optimized Phoenix SQL queries for quick response times, and effectively used composite row keys for HBase table modeling.
  • ETL Automation: Automated ETL processes using Oozie workflows in Hue, reducing data wrangling time by up to 300%.
  • Proof of Concepts (POCs): Conducted POCs to transition from a MySQL database source (Licensed Queue -> MySQL -> HDFS) to Kafka as a centralized source, directly landing data into HBase tables as part of Phase-2 enhancements.
  • Failure Management: Handled ETL failures by analyzing Hadoop/MapReduce logs, and resolved service issues by examining resource utilization metrics.
  • Proactive Monitoring: Addressed potential bottlenecks in the ETL process through regular monitoring, enabling seamless workflow operations.

Junior Web Data Analyst

SineQure Software
08.2014 - 07.2016
  • Developed basic Python scripts to scrape web pages for extracting usage history and other product-related information.
  • Executed various pre-built machine learning models on sample data and reported results to the team lead.
  • Created MySQL schemas to store reporting data for further analysis, and ran SQL queries to generate summarized reports.

Education

M.Tech - Computer Science and Engineering

Bharath University

B.Tech - Information Technology

Annai Teresa College of Engineering - Anna University

Skills

  • Big Data Processing using Hadoop - CDH
  • Performance Optimization
  • Cloud Computing Technologies - AWS
  • Teamwork and Collaboration
  • Database : Oracle, PostgreSQL, MySQL
  • ETL using Striim
  • CDC Replication/Migration using real-time ETL
  • Search : Elasticsearch - ELK
  • Programming : Core Java, Core Python
  • File Formats : DSV, Avro, Parquet, JSON
  • Streaming : Apache Kafka
  • Data Processing : Apache Spark

Accomplishments

  • Received client appreciation with a cash reward and was the winner of an internal hackathon.
  • Received client appreciation from top managers and the CEO for completing the project, along with a cash reward.
  • Won the Customer's Pride of the Quarter award twice.
  • Received client appreciation for managing ETL processes and implementing a secure, Kerberized Hadoop cluster.

Certification

Elasticsearch Certified Engineer
