
LAXMAN VAISHNAV

Jodhpur

Summary

Data Engineer with 4+ years of experience building and optimizing scalable ETL pipelines for large-scale financial datasets. Proficient in PySpark, Apache Spark, and AWS EMR, with strong expertise in distributed data processing. Reduced ETL processing time from 9 hours to 1 hour through performance optimization. Skilled in data modeling and delivering high-quality data solutions for business insights.

Overview

4 years of professional experience

Work History

Data Engineer

Anand Rathi Wealth Services
12.2021 - Current

Tech Stack: AWS EMR, S3, Lambda, CloudWatch, Airflow, Hadoop, Spark, PySpark, SQL, Hive (Tez), HDFS, Oozie, QuickSight

  • Designed and implemented a scalable ETL pipeline on AWS EMR transient clusters, transforming data with PySpark and loading the processed output to target storage systems with Spark write operations.
  • Built a Hive-based ETL pipeline on AWS EMR transient clusters, using Tez as the execution engine for optimized query execution.
  • Automated cluster provisioning and job execution, using AWS Lambda to launch EMR clusters and Amazon CloudWatch to schedule runs.
  • Orchestrated workflows and managed dependencies using Apache Oozie, ensuring seamless execution of Hive, Sqoop, and PySpark jobs.
  • Performed additional data processing and optimization with PySpark on data stored in S3 and Hive tables.

Education

Bachelor of Engineering

M.B.M. Engineering College
Jodhpur
2018

Skills

    Programming & Data Processing:
    Python, SQL, PySpark, Shell Scripting, XML

    Big Data Technologies:
    Apache Spark, Apache Hadoop, HDFS, AWS EMR

    Cloud & AWS Services:
    Amazon S3, AWS IAM (Roles & Policies), AWS Lambda, Amazon CloudWatch

    Data Warehousing & Databases:
    Snowflake, Apache Hive, MySQL, SQL Server

    Workflow Orchestration:
    Apache Airflow, Apache Oozie

    Analytics & Visualization:
    Tableau, Power BI, Amazon QuickSight

Accomplishments

  • Cut ETL workflow execution time by ~89%, from 9 hours to 1 hour, through PySpark optimizations, efficient data partitioning, and distributed processing on AWS EMR transient clusters.
  • Awarded for outstanding performance in Q2 of 2022-23 for delivering high-quality ETL solutions and process optimizations.
  • Recognized as ‘Star Performer’ in multiple projects for successfully optimizing data pipelines and reducing processing time.
  • Received Spot Award for implementing an efficient PySpark-based data processing framework, enhancing data accuracy and performance.
  • Optimized ETL Performance: Reduced data processing time by 30% by implementing Hive partitioning and bucketing, improving query efficiency.
  • Automated Workflow Execution: Designed and implemented Apache Airflow DAGs, reducing manual job execution and monitoring 100% of job runs for success.
