Hi, I’m

Srijan Srivastava

Big Data Engineer
Patna

Summary

A data engineer with over 8 years of experience and a strong background in developing robust big data solutions and designing enterprise applications across technologies including Apache Spark, Databricks, AWS, and the Hadoop ecosystem. Demonstrated expertise in programming and processing complex data, and in configuring databases to meet business data requirements. Proficient in using Snowflake for data warehousing and analytics, with a focus on efficient data management and reliable data pipelines.

Overview

8 years of professional experience
2 languages
4 years of post-secondary education

Work History

EPAM Systems, Inc
Gurugram

Senior Software Engineer
03.2021 - Current

Job overview

  • Developed and executed Apache Spark jobs within Databricks and AWS EMR to clean, normalize, and aggregate claims data, ensuring high data quality and consistency.
  • Integrated Snowflake as a data warehouse to facilitate real-time data analysis, utilizing Snowpipe for seamless data ingestion.
  • Collaborated with business analysts, development teams, and infrastructure specialists to design and implement solutions based on project requirements.
  • Performed data validation, cleansing, and transformation on input datasets, including the creation of configuration files.
  • Stored large volumes of structured and semi-structured data across various data layers using AWS S3, Delta Lake, Snowflake, and the Big Data Hadoop Framework.
  • Engaged in all phases of the Software Development Life Cycle (SDLC), including analysis, design, development, testing, and deployment, delivering unit-tested systems within customer-prescribed timeframes.
  • Conducted internal code reviews in Bitbucket/GitHub, providing constructive feedback to enhance overall product quality and team collaboration.
  • Identified and analyzed issues, delivering effective solutions to improve system performance and reliability.
  • Mentored junior developers, fostering professional growth and enhancing team productivity.
  • Contributed to the design and development of technical specification documents to guide project direction and implementation.

Emids Technologies
Bangalore

Software Engineer
10.2019 - 03.2021

Job overview

  • Worked with a 14-node production Hadoop cluster and a 7-node development cluster to manage large-scale data processing.
  • Performed data validation, cleansing, and transformation on input datasets to ensure high data quality and consistency.
  • Overcame challenges in storing large volumes of structured and semi-structured data in a data lake using the Big Data Hadoop Framework, Hive, and the Snowflake data warehouse.
  • Utilized RDDs, DataFrames, Spark joins, and Spark SQL with Scala to perform complex data processing tasks.
  • Leveraged HBase as a metastore and used RDBMS solutions like MS SQL and PostgreSQL as data sources for effective data management.
  • Gained an understanding of healthcare domain concepts and terminology to drive data-related initiatives.
  • Acquired in-depth knowledge of the Cotiviti data lake framework to enhance data architecture.
  • Engaged in all phases of the Software Development Life Cycle (SDLC), including analysis, design, development, testing, and deployment of applications in the Hadoop cluster.
  • Contributed to the design and development of technical specification documents to support project implementation.

ITC Infotech India Limited
Bangalore

Big Data Developer
08.2016 - 10.2019

Job overview

  • Developed big data solutions using Hadoop, Hive, Spark, and Neo4j to meet complex data processing needs.
  • Standardized practices for data ingestion, cleansing, and analysis to deliver high-quality solutions.
  • Created data pipelines to convert Hive tables into Spark DataFrames, ensuring efficient data transformation and output.
  • Overcame challenges related to storing and processing large volumes of structured data using the Hadoop Framework.
  • Utilized SQL for data manipulation in both RDBMS and Apache Hive, enhancing data accessibility and reporting.
  • Contributed to the design and development of technical specification documents to guide project implementation.

Skills

Apache Spark/PySpark

Personal Information

  • Date of Birth: 02/03/1994
  • Gender: Male

Education

University of Pune
Pune, India

Bachelor of Engineering
08.2012 - 06.2016

Timeline

Senior Software Engineer

EPAM Systems, Inc
03.2021 - Current

Software Engineer

Emids Technologies
10.2019 - 03.2021

Big Data Developer

ITC Infotech India Limited
08.2016 - 10.2019

University of Pune

Bachelor of Engineering
08.2012 - 06.2016