Nitesh Singh Kataria

Bengaluru

Summary

With over 9 years of experience in the IT industry, including more than 7.5 years specializing in Big Data technologies and the Hadoop ecosystem, I am a highly organized and detail-oriented Big Data Developer. I excel at preparing and mining large datasets, and possess strong communication, planning, critical thinking, and judgment skills. Experienced in working within Agile environments.

Overview

11 years of professional experience

Work History

Software Engineer

Stripe
Bengaluru
01.2022 - Current

Project: Stripe Fees

  • Implemented a Stripe fees report, providing merchants with detailed breakdowns of the fees they pay, addressing transparency issues in the market.
  • Developed a Spark batch pipeline to extract relevant information from the Stripe Ledger, processing ~250TB of data daily for ~7M customers.
  • Proposed a design that would improve the job SLA from 24 hours to 3 hours by converting the snapshot architecture to an incremental architecture, and would reduce infrastructure costs from $140K/month to $20K/month.
  • Coordinated with 30+ teams for product feature approval related to Stripe fees and enhanced documentation for better user understanding.
  • Successfully managed the report's lifecycle from Alpha to General Availability (GA), ensuring its effective rollout to users and clear communication of its benefits.

Senior Big Data Developer

Agoda (Booking Holdings)
Bangkok
02.2020 - 10.2022

Led the development and optimization of large-scale data processing projects, ensuring high performance and reliability.

ETL Tool Project:

  • Developed a UI tool enabling users to execute SQL queries and retrieve data.
  • Replaced Sqoop with Spark JDBC, boosting performance for ~190 teams.
  • Automated an ETL pipeline, transforming and aggregating ~100TB of data daily, enhancing efficiency.

Data Validation Using Spark:

  • Implemented a data validation framework and integrated it into Spark jobs.
  • Created a Go-Gin API to insert ~900K rules into the database.
  • Ensured data quality by cleansing ~10TB of data daily for ~190 teams, and deployed solutions on Kubernetes and Docker to improve scalability and deployment efficiency.

Big Data Developer

Cemtics Pvt. Solutions Ltd.
Gurgaon
10.2017 - 12.2019

Project: (ImsiProfile [Customer 360]).

IMSI profiling is a product that provides brief analytics of the network, giving a real picture of the network performance that the customer experiences. Major accomplishments:

  • Created a pipeline and implemented custom API code to retrieve data from Druid, making data available for ~344 million subscribers.
  • Optimized the Spark job that creates aggregates (TBs/day) from CDRs.

Software Developer

Orange Business Services
Gurgaon
01.2017 - 09.2017

PROJECT: (Global Order Lifecycle Development)

Processed SequenceFiles in Spark, aggregated the data, and saved the results as Parquet in Hadoop.

Software Developer

Newgen Software Technologies
Noida
10.2015 - 12.2016

PROJECT: (Corporate Loan originating system)

  • Integrated MongoDB with Spring Data and developed various APIs for efficient data retrieval, tailored to client requirements.
  • Optimized data storage in MongoDB.

Programmer Analyst

Cognizant Technology Solutions
Bengaluru
03.2014 - 09.2015

Project: (Corelogic)

  • Created Hive tables and developed MapReduce code to parse and structure log files into a tabular format.
  • Developed UNIX shell scripts to generate reports from Hive data.

Education

B.Tech in Information Technology

Dehradun Institute of Technology (UTU)
12.2016

Skills

  • Spark
  • Druid
  • Hive
  • Java
  • Scala/Go
  • Python
  • MongoDB
  • Kubernetes
  • Monitoring Tools
  • CI/CD
  • Airflow/Oozie

Languages

English
First Language

Timeline

Software Engineer

Stripe
01.2022 - Current

Senior Big Data Developer

Agoda (Booking Holdings)
02.2020 - 10.2022

Big Data Developer

Cemtics Pvt. Solutions Ltd.
10.2017 - 12.2019

Software Developer

Orange Business Services
01.2017 - 09.2017

Software Developer

Newgen Software Technologies
10.2015 - 12.2016

Programmer Analyst

Cognizant Technology Solutions
03.2014 - 09.2015

B.Tech in Information Technology

Dehradun Institute of Technology (UTU)