Sanjeev Jha

Gurugram

Summary

  • Around 10 years of technical experience across the financial and telecom domains, with hands-on experience in Big Data analytics design and development.
  • 4+ years of relevant experience in Big Data analytics and data manipulation using Hadoop ecosystem tools such as MapReduce, HDFS, YARN/MRv2, Hive, HBase, Spark, Kafka, Sqoop, Oozie, Avro, and Kerberos, including Spark integration with Cassandra and ZooKeeper, along with ADF, Event Hubs, Synapse, Databricks, and Azure Functions.
  • Rich experience in designing and developing applications, including NRT processes, with Apache Spark, PySpark, Scala, Java, Kafka, Python, and Hive in the Hadoop ecosystem.
  • Strong experience with Hadoop distributions, particularly Cloudera.
  • Hands-on experience with Spark and PySpark, using the RDD, DataFrame, and Dataset APIs to process structured and unstructured data.
  • Proficient in writing real-time processing and core jobs using Spark Structured Streaming with Kafka as the data pipeline system.
  • Well versed in writing Spark and Hive jobs for data extraction, transformation, and aggregation from multiple file formats, including Parquet, Avro, XML, JSON, CSV, and ORC, and from compressed files using codecs such as GZIP and Snappy.

Overview

9
years of professional experience
2012
years of post-secondary education

Work History

Associate Staff Engineer

Nagarro
11.2023 - Current
  • Company Overview: Currently working with Nagarro for the client Dublin Airport Authority (DAA).
  • Coordinated with business analysts on specifications and modifications related to defects and enhancements.
  • Responsible for building scalable data solutions using ADF.
  • Performed data modelling and data analysis.
  • Translated business requirements from the high-level design into transformations and actions in ADF pipelines.
  • Created staging layer to accommodate and process raw data from all data sources.
  • Engaged in deployment activities using the deployment framework.
  • Modified existing database objects to implement business logic.
  • Monitored query execution plans of existing SQL procedures and suggested solutions for performance improvement.

Data Engineer

Impetus Technology
02.2016 - 01.2017
  • Company Overview: Client: Government Agency.
  • Created a Spark engine that consumes streaming messages from Kafka topics, schedules them into batches, applies transformations, filters, and DataFrame SQL queries as each use case requires, and writes the output to the HBase NoSQL database in Avro and Parquet formats with distinct schemas.
  • Performed data mining and analytics to implement new business concepts.
  • Created a data access layer in Java that serves client requests and populates data in the existing ClearInsight visualization application.
  • Automated all tasks by creating and scheduling shell and Python scripts.

Education

Bachelor of Engineering - IT

WBUT University

Skills

  • Hadoop distributed architectures
  • object-oriented design
  • cloud application architectures
  • multi-tenant system architecture
  • microservices architecture
  • ETL design
  • data cleansing
  • data processing
  • Spark
  • database tuning
  • optimization
  • stored procedure design
  • Java
  • Scala
  • PySpark
  • Python
  • shell scripting
  • REST

Disclaimer

I hereby declare that the above-mentioned information is correct to the best of my knowledge and belief, and I bear responsibility for the correctness of the above-mentioned particulars.
Gurugram, 03/13/25

Honors And Awards

Received awards from application and business users for developing and setting up the entity profiling platform.

Timeline

Associate Staff Engineer

Nagarro
11.2023 - Current

Data Engineer

Impetus Technology
02.2016 - 01.2017

Bachelor of Engineering - IT

WBUT University