Amit Mishra

Bangalore, Karnataka

Summary

~8 years of experience in Big Data, Data Analytics, and Data Warehousing. Built data pipelines and data warehouses from scratch using technologies such as Spark, Hadoop, Elasticsearch, Kafka, Python, Java, NoSQL, GCP, and AWS, and collaborated with cross-functional teams to deliver search analytics and data pipelines.

Overview

8 years of professional experience

Work History

Data Engineer III

Cleartrip Pvt. Ltd
11.2023 - Current
  • Led requirement gathering, planning, and execution, ensuring smooth production releases and reducing deployment issues by 15%
  • Driving a cost-cutting initiative for data applications on GCP, aiming to achieve a 60% reduction in costs
  • Migrating the Storm application topology to Spark Streaming, which is expected to reduce maintenance effort by 40%
  • Achieved a performance improvement of ~30% by tuning and optimizing the data pipeline
  • Improved reliability and consistency by 20% through code revamping and scheduling the application on Azkaban (Airflow)
  • Increased Kafka fault tolerance and throughput by 30% through configuring appropriate partitioning, retention, and other settings
  • Reduced BigQuery costs by 50% by archiving stale data and implementing a star schema
  • Implemented an end-to-end data pipeline resulting in a 40% reduction in dashboard creation time for Revenue in Looker Studio
  • Reduced manual intervention by 50% by populating dependent BigQuery tables using Google's scheduled query service
  • Enabled and scheduled a data pipeline on a Google Dataproc cluster with CleverTap as the data source, efficiently processing events and loading them into BigQuery tables; reduced data processing time by 50% and improved data ingestion speed by 30%
  • Enabled a Google Cloud Function, reducing ingestion time by 40% and automating data population into BigQuery tables based on events
  • Mentored junior Data Engineers, assigned tasks, and assisted in query resolution, improving team efficiency by 25%.

Sr. Software Engineer

Rafay System Pvt. Ltd
05.2022 - 10.2023
  • Created a data simulator for QA testing, boosting team efficiency by 20%
  • Implemented a Spark ETL pipeline to consolidate (roll up) time-series tables, reducing master database overhead by 25%
  • Designed and optimized the Time Series database, lowering costs and maintenance by 40%
  • Created real-time data cardinality insights using Elasticsearch and Java, enhancing data analysis capabilities by 25%
  • Established a Spark Stream Pipeline for cost monitoring, leading to a 12% reduction in expenses
  • Enhanced OpenTSDB with Java, adding join and sorting capabilities that improved query performance by 10%
  • Worked with Docker and Kubernetes, including image building and deployment from Docker Hub, expediting development by 30%
  • Designed HBase row keys for sorted data storage and implemented multi-tenancy, improving data retrieval speed by 25%
  • Possess intermediate-level proficiency in AWS EMR (Elastic MapReduce) for advanced big data processing and AWS S3 for efficient data storage and retrieval.

Lead Engineer

Moglix.com
12.2018 - 05.2022
  • Spearheaded 80% of Search API development and optimized caching, improving Search API response time by 30%
  • Orchestrated data migration and preparation with Spark, reducing migration time by 40% and improving efficiency.
  • Implemented a Data Warehouse from scratch using CDC (Debezium), Kafka, Spark, and Google BigQuery to visualize category and sales performance, increasing data visibility by 30%
  • Optimized category and brand suggestions, resulting in a 20% boost in search hits
  • Improved search results by 35% through better brand and category identification, achieving a 0.2% to 0.5% conversion rate
  • Implemented product scoring for brand and category promotion, resulting in a 30% rise in PDP views
  • Mentored two junior team members, resulting in a 20% improvement in their performance and productivity.

Software Engineer

ClearTrail Technologies Pvt. Ltd
08.2016 - 12.2018
  • Installed, configured, and maintained a 7-node Hadoop (HDP/HDF) Cluster
  • Collaborated on Java utility development, resulting in a 20% improvement in processing speed for analytical products
  • Designed, optimized, and deployed Spark pipelines, reducing data processing time by 20% and improving pipeline efficiency
  • Developed a Kafka producer for Spark Streaming, reducing lag detection time by 50% and helping identify processing delays
  • Created an indexer that sped up Apache Solr data retrieval from HBase by 40%
  • Configured a NiFi processor to efficiently transfer 10 GB of data from FTP to HDFS
  • Implemented a weekly HDFS data rotation using a Cron Job, cutting storage costs by 15%.

Education

B.Tech - Computer Science & Engineering

Maulana Azad National Institute of Technology, MANIT (NIT)
Bhopal
2016

Skills

  • Python, Spark, Java, Kafka, GCP, AWS, MySQL, Elasticsearch, NoSQL, Hadoop, Hive, Dataproc, Kubernetes, HDFS, Azkaban
