Ayush Kumar Jha

Navi Mumbai

Summary

Data & Infra Engineer with a proven track record of leading multiple high-impact projects across data infrastructure, real-time processing, and analytics. Experienced in managing scalable pipelines for 5B+ daily events, with hands-on expertise in Airflow, Kafka, Trino, Spark, and Kubernetes. Achieved 40% faster pipelines and ₹1 crore in cost savings through performance tuning and system optimizations. Built real-time analytics platforms with Apache Pinot and datastores in Scylla for AI personalization. Also developed full-stack internal tools using Django and React, and enabled personalized marketing notifications at scale.

Overview

6
years of professional experience

Work History

Tech Lead

JioSaavn
Navi Mumbai
04.2021 - Current

Infrastructure Management

  • Managed, optimized, and upgraded Apache Airflow and Trino clusters, ensuring high availability and performance.
  • Migrated Kafka, Airflow, and Trino from Azure Singapore to Azure JioIndiaWest.
  • Integrated authentication on Trino and Airflow, ensuring zero downtime and no impact on production jobs.
  • Migrated workloads to AMD-based clusters, reducing pipeline execution time by ~40%.
  • Created new infrastructure using Terraform in compliance with ISO 27001:2013 ISMS guidelines.

Pipeline & Data Warehouse Management

  • Designed and maintained data pipelines processing 5 billion events daily to obtain metrics such as streams, DAU, and MAU.
  • Led the migration of legacy reducer code to Spark.
  • Led the migration of MySQL dump jobs from Sqoop to Spark.
  • Led the project to take daily dumps of MongoDB collections with Spark for disaster recovery and compliance with ISO 27001:2013 guidelines.
  • Built a data warehouse solution using Apache Pinot for real-time analytics and low-latency querying.
  • Achieved cost savings of approximately ₹1 crore annually by using a JSON serializer to extract data in Hive.
  • Automated ETL workflows using Apache Airflow, improving efficiency and reliability.
  • Used data pipelines to power personalised notifications.

Real-Time Processing

  • Developed real-time streaming solutions using Kafka and Storm, enabling near-instantaneous data processing.
  • Currently working on a real-time stream computation project.
  • Created a datastore in Scylla to power the Recently Played and Fast Track modules on the homepage and to support AI-driven entity ranking within the homepage modules.

Tool Development

  • Designed and developed internal and external data tools like DataOne and LabelOne using Django and React.
  • Enforced strict access control on these tools, as they contain business-critical data.
  • Built tools such as the s3-mock service, which uses AWS APIs to put data in Azure Blob Storage.

R&D Engineer

Next Education India Pvt Ltd
Hyderabad
06.2019 - 04.2021

Automatic Subjective Answer Evaluation System.

  • Created a system that automatically rates subjective answers.
  • Used deep learning methods to train the system on answers previously rated by teachers, and used Netflix Sidecar to integrate it into a JVM-based system.

Automatic tagging of questions to concepts.

  • Designed a system that suggests concepts for each question and integrated it into the tagging system.
  • Used natural language processing methods such as the Universal Sentence Encoder and inverse term frequency to score concepts, then suggested the top-scoring concepts to SMEs to streamline their work.

Knowledge space theory-based adaptive tests.

  • Worked on a knowledge space theory-based adaptive test that uses the state map generated from the concept map to predict the student's current state (the concepts the student knows) in a minimum number of questions.
  • Developed the tests end to end and optimised them using Redis, caching, and AWS Lambda functions.
  • Created a simulator that generates student responses to verify the efficacy of the algorithm.

Education

Bachelor of Technology

Indian Institute of Technology
Roorkee
06-2019

Skills

  • Algorithms and data structures
  • Hadoop, Spark, and Hive
  • MySQL and NoSQL databases
  • Python, C, and Java
  • ETL

Accomplishments

  • Secured a rank of 1,687 in JEE Advanced 2015
  • Secured a rank of 175 in the online round of ACM ICPC Chennai

Timeline

Tech Lead

JioSaavn
04.2021 - Current

R&D Engineer

Next Education India Pvt Ltd
06.2019 - 04.2021

Bachelor of Technology

Indian Institute of Technology