Ratnesh Mishra

Software Engineer 2
Flat No. 203, Anugraha, 1st Cross, Munireddy Layout, Kadubeesnahalli, Bengaluru

Summary

Software Engineer with 10+ years of industry experience, currently working as SDE 2 in the EMR Open Data Analytics Hive team at AWS, driving performance and feature enhancements for Apache Hive on EMR. Previously contributed to large-scale data platforms at BookMyShow, Moonfrog Labs, and 1MG. Experienced in designing and optimizing distributed systems, big data pipelines, and cloud infrastructure. Skilled in Java, Golang, and Python; passionate about big data, distributed systems, software architecture, scalability, and open source.

Overview

10 years of professional experience

Work History

Software Development Engineer 2 (AWS EMR Hive)

Amazon Web Services
Bengaluru
06.2021 - Current

Working on the data plane team for Apache Hive in the Open Data Analytics org of AWS EMR.

  • Hive Upgrade: Leading the upgrade of Hive on EMR to version 4, integrating 5,000+ new commits while ensuring compatibility with existing Hive features in EMR. Oversaw a major Hive Metastore upgrade, a critical component for Spark, Flink, Hudi, and other EMR services. Also contributed to Hive 4 support for Hudi, Flink, and other engines, enabling seamless ecosystem interoperability.
  • Hive Performance Improvement: Worked on Hive performance improvements as part of the TPC-DS 3TB performance enhancement initiative, targeting read performance for ORC and Parquet formats on EMR. Delivered 10+ optimizations (including non-ORC file listing, Tez relaxed locality, and ORC split computation enhancements), resulting in a 1.59x total runtime improvement on EMR-7.6.0 vs. EMR-7.0.0, surpassing the original 1.5x goal.
  • Glue Data Catalog Views: Worked with the Athena, Spark, and Glue/Lake Formation teams to add Glue Data Catalog views (a single common view across AWS services such as Amazon Athena and Amazon Redshift). Developed APIs for creating Glue Data Catalog views via Lake Formation, which involved end-to-end development of a system providing on-the-fly query compilation and validation for Spark and Athena.
  • FGAC for Hive using Lake Formation: Added support for fine-grained access control in Hive on EMR with AWS Lake Formation using runtime IAM roles.

Software Engineer 2 (Data Platform)

BookMyShow
09.2020 - 05.2021

User Profiles - Designed and developed scalable gRPC APIs for aggregate profiles such as behavior, transaction, and content profiles, powering features like user personalization and targeted ads. Operated under a load of about 40k rpm.

User Segmentation Framework - Contributed to the development of the in-house user segmentation framework, which supports features like user bucketing and funneling.

Developed a query engine over Elasticsearch that transforms logical combinations of business funnels into Elasticsearch queries.

Software Engineer 2 (Platform)

Moonfrog Labs
02.2018 - 09.2020

Data Pipeline - Key member of the team responsible for adding features to, improving, and maintaining Moonfrog's data pipeline cluster, which handles ~40M unique data events per minute and ~3.2 TB/day of data by volume.
Major features contributed include:

Designed and developed autoscaling for the stateful, distributed data pipeline, which helped handle the traffic surge efficiently when pipeline traffic suddenly grew nearly 4x during lockdown (from 15B events/day to ~58B events/day).

Leagues Service as a Platform - Developed the leagues service (a feature for increasing user engagement) as a platform, then integrated and released it for the TPG game, currently serving ~4M DAU (daily active users).

Migration of Stat Server to Kubernetes - Containerised the centralised stat server cluster (handling ~20k requests per second) and deployed it on Kubernetes using AWS EKS, reducing redundant maintenance effort and cost.

Data Pipeline - Responsible for adding features to, improving, and maintaining Moonfrog's high-scale data pipeline cluster, which handles ~40M unique data events per minute and ~3.2 TB/day of data by volume, maintained in Redshift and backed by S3.

Data Lake Query Capability - Migrated existing CSV data to Parquet using AWS EMR and added direct query capability on the data lake (S3) using AWS Athena.

SDKs and Dashboards - Developed several SDKs, including the stats client SDK, leagues SDK, stat server SDK, and RTS SDK (for tracking game concurrents), as well as in-house dashboards for tracking business metrics.

Software Engineer

1MG Technologies
03.2016 - 01.2018

Worked in the Preorder team (responsible for all backend functionality up to order placement) in a microservice environment, as the sole owner of major business units or services.

Major projects developed or contributed to:

Backend Services for Apps - Worked on services responsible for serving initial configs and articles, handling push notifications, and similar functions for the 1MG app.

Payments - Worked on payments and the 1MG wallet, including third-party wallet integrations and payment handling on the 1MG app and website.

Microservice Framework - Enhanced the in-house microservice framework Vyked by adding features such as graceful service restarts, improved logging, and timeouts.

Catalog - Worked on the 1MG catalog, the service responsible for serving OTC categories and products. Also developed a portal for category managers to add and modify categories and products.

Software Developer Associate

Amdocs
06.2015 - 02.2016
  • Worked in the ENCC team serving T-Mobile US.

Education

BE - Computer Science

BIT Mesra

Class 12 -

Jawahar Vidya Mandir

Class 10 -

Adwaita Mission High School

Skills

Databases - Postgres, SQL, MemSQL, Influx, Couchbase

Queuing & Scheduling - NSQ, Redis, Kafka, AWS SQS

Big Data - Hive, Tez, Trino, Spark, AWS Redshift, ES-Hadoop, YARN

Infra, Build & Deployment Tools - Good exposure to the AWS stack (EC2, EKS, ECS, ELBs, EBS, Route 53, AMIs, Security Groups, IAM, Lambda, etc.), Kubernetes, Docker (responsible for introducing and setting up the cluster from scratch using EKS), Terraform (responsible for introducing it and using it for pipeline autoscaling), Jenkins

Monitoring & Alerting - Familiarity with Grafana, ELK stack, Prometheus, Nagios, Monit, Supervisor, AWS SNS, AWS CloudWatch
