RENU YADAV

Summary

Senior Software Engineer with 12 years of experience in big data engineering, specializing in the design of large-scale distributed data solutions. Expert in big data frameworks such as Hadoop, Apache Spark, Kafka, Kafka Connect, Hive, HBase, HDFS, Airflow, DuckDB, and Trino. Led design and development efforts that improved overall query response time by 30%. Passionate about building scalable data solutions.

Overview

14
years of professional experience
1
Certification

Work History

Senior Data Engineer

Uptycs
05.2023 - Current
  • Led the development and deployment of a Presto gateway supporting multiple Trino clusters, which improved query management and Trino resource utilization and reduced infrastructure cost by 15%
  • Architected and implemented a data compaction solution that compacts large volumes of small files across multiple tables and customers within 20-30 minutes of data arrival, improving overall Trino query response time by 30%
  • Implemented a Trino connector that fetches only the latest compacted files, without returning duplicates
  • Unblocked Trino upgrades by resolving a Hudi connector data consistency issue in Apache Hudi
  • Redesigned and implemented a component to improve its result response time by using DuckDB to read the results
  • Led end-to-end design and implementation of horizontal scaling for Postgres through database sharding, removing the bottleneck of a single database instance
  • Technologies Used: Spark, Trino, Postgres, Hudi, DuckDB, HDFS

Senior Big Data Engineer

UBS
03.2021 - 05.2023
  • Led the development of an end-to-end ETL data pipeline for IB reporting that collects data from multiple sources and stores it in a data lake, which analysts use to derive trade insights
  • Worked on an Azure cost optimizer project that surfaces different cost-saving techniques
  • Implemented AD authentication
  • Designed a backloading flow for failed data flows to improve data consistency
  • Built and implemented an inventory control system used to reconcile messages between upstream systems and the data lake
  • Dockerized a Spring Boot application on Azure
  • Improved the performance of the Kafka producer in the data lake application
  • Worked on migrating logs to Splunk
  • Collaborated with DevOps to set up CI/CD
  • Technologies Used: Spark, Kafka, Azure Event Hub, Docker, Databricks, Java, Azure AD, Cassandra

SDE 2

Dream11
07.2020 - 02.2021
  • Added new features to the feature store for machine learning models
  • Developed source and sink connectors for Kafka Connect
  • Built the event repository for daily mobile app feeds
  • Technologies Used: Spark, Kafka, AWS, Kafka Connect, Druid, Java, Python

Sr Software Engineer

Expedia
12.2018 - 07.2020
  • Worked on the design and implementation of projects
  • Designed a generic pipeline for ML data processing
  • Worked with stakeholders on project timelines and requirement gathering
  • Involved in project deployment and issue tracking
  • Involved in optimizing the application to reduce AWS cost
  • Involved in code optimization and code review
  • Collaborated with cross-functional teams to integrate machine learning models into the data pipeline, increasing predictive analytical accuracy by 25 percent
  • Technologies Used: Hadoop, Spark, Hive, HBase, AWS, Kafka, Airflow, Scala, Python

Sr Software Engineer

Symantec
12.2014 - 11.2018
  • Worked on requirement gathering, design, and development of projects
  • Built an ETL pipeline using distributed frameworks such as Spark and Hadoop, reducing the processing time for terabytes of data from 12 hours to 2 hours
  • Worked with stakeholders on project timelines and requirement gathering
  • Involved in project deployment and issue tracking
  • Involved in optimizing the application to reduce AWS cost
  • Involved in code optimization and code review
  • Technologies Used: Hadoop, Spark, Hive, HBase, AWS, Kafka, Airflow, Scala, Python

Software Engineer

Zensar Technologies
09.2011 - 11.2014
  • Developed various components of the application
  • Involved in optimizing applications
  • Involved in the design phase of projects
  • Technologies Used: Java, Oracle

Education

BE - IT

Imperial College of Engineering
Pune
01.2011

12th Science

Kendriya Vidyalaya
Pune
01.2007

Skills

  • Hadoop
  • Spark
  • Spark Streaming
  • Hive
  • HBase
  • Cassandra
  • Kafka
  • Kafka connect
  • KSQL
  • Kafka Streams
  • Trino
  • DuckDB
  • Hudi
  • AWS
  • Azure Fundamentals
  • Java
  • Scala
  • Python fundamentals
  • Airflow
  • Oozie

Certification

  • Azure Fundamentals (AZ-900), 01/01/22
  • OCJP Java Certified, 01/01/11
