
Sumit Rathore

Gurugram

Summary

Technically proficient Data Engineer with hands-on experience in designing, developing, and managing Big Data applications and ETL pipelines. Adept at translating complex business and technical requirements into scalable and efficient data solutions.

Overview

3 years of professional experience

Work History

Data Engineer II

Airtel International LLP
Gurugram
06.2024 - Current
  • Hadoop LineReader (Open Source Enhancement): Modified the open-source Hadoop LineReader internally to support custom line separators, improving the parsing of special characters.
  • XML Parser: Developed an XML parser to extract data from complex nested tags and transform it into a structured DataFrame. Integrated the parser into a high-performance data extraction pipeline built on Apache Spark, enabling efficient processing for ETL operations.
  • MongoDB in Spark: Added incremental support for MongoDB to the Spark framework.
  • Exactly Once for Spark Batch (Open Source Enhancement): Developed a solution that provides exactly-once semantics for Spark Batch jobs with multiple sources and multiple sinks, overcoming the limitations of the standard exactly-once guarantee and enabling reliable data processing at large scale.

Software Developer (Big Data)

Airtel International LLP
Gurugram
06.2022 - 06.2024
  • Subscriber Location Decoding: Achieved a 45% increase in the accuracy of tagging subscriber locations to the network by enhancing the decoding logic at the byte level and creating specialized transformers, improving performance for leading telecom vendors such as ZTE and Huawei.
  • Big Data ETL Solutions: Implemented scalable ETL pipelines processing 20+ TB of data daily across 14 countries, using UDFs, UDAFs, and transformations to extract insights. Optimized storage with ORC and Hudi, including upsert support, on a petabyte-scale cluster in an agile environment.

Software Developer (Intern)

Airtel International LLP
Gurugram
01.2022 - 06.2022
  • Apache NiFi: Modified the workflow to enable seamless data ingestion between source and destination.

Education

B.Tech - Computer Science

Maulana Azad National Institute of Technology (NIT-Bhopal)
Bhopal
06.2022

Skills

  • C
  • SQL
  • Java
  • Hadoop (HDFS)
  • Apache Hudi
  • Apache Spark
  • PySpark
  • Kafka
  • Apache Airflow
  • NiFi
  • PostgreSQL
  • MongoDB
  • Git
  • Problem Solving (Data Structures & Algorithms)
  • OOP
  • HTML
  • Bootstrap
  • Web Development
  • On-premises cloud

Projects

  • Hand Recognition Calculator: Performs calculations by recognizing digits shown by hand. Built with Python, OpenCV, and OOP concepts.
  • Trending YouTube Data Analysis: A Flask application that gives users trending video data based on date and category. Built with Flask, MySQL, and Python.
  • Portfolio: sumitdude11.github.io. Built with GitHub, HTML, CSS, JavaScript, and Bootstrap.

Accomplishments

  • CIO - Spot Award
  • Gold badges in SQL and DSA on HackerRank
  • 500+ DSA problems solved
