Hi, I’m

Srijan Srivastava

Lead Data Engineer

Patna

Summary

Lead data engineer with over 9 years of experience and a strong background in designing, developing, and maintaining robust big data solutions and enterprise applications across technologies including Apache Spark, Databricks, AWS, and the Hadoop ecosystem. Demonstrated expertise in programming, processing complex data, and configuring databases to meet business data requirements. Proficient in using Snowflake for data warehousing and analytics, ensuring efficient data management and reliable data pipelines.

Work History

Luxoft

Senior Software Developer

04.2025 - Current

Job overview

Delivered high-quality code on time by effectively managing project timelines and prioritizing tasks accordingly.
Developed and updated configuration files and internal metadata while performing data quality, validation, cleansing, and transformation of incoming datasets.
Designed and maintained data layers for high-volume, structured data using Hadoop-based frameworks and data warehouse platforms.
Implemented PySpark operations including joins, Spark SQL and data transformations using Python and Pandas, optimizing large-scale data processing workflows.
Developed Shell scripts to enhance automation, improve workflow efficiency, and support data processing operations.
Performed complex data extraction, analysis, and reporting by writing optimized SQL and PL/SQL queries across relational and cloud-based databases such as Hive, Impala, PostgreSQL, and MS SQL.
Developed and maintained comprehensive technical specification documents, defining architecture, data flows and integration requirements for project implementation.
Engineered and optimized data models using advanced modeling techniques, while enforcing data governance, ensuring referential integrity, consistency, and security across enterprise systems.
Executed end-to-end Software Development Life Cycle (SDLC) activities, including requirement analysis, design, coding, unit testing, deployment, and productionization of scalable applications.
Performed root cause analysis, debugging, and issue resolution to optimize system performance, enhance reliability, and ensure operational stability.
Collaborated with cross-functional teams to integrate software components seamlessly into existing systems.
Mentored junior developers, providing guidance on best practices and coding techniques for improved productivity.

EPAM Systems, Inc
Gurugram

Senior Software Engineer

03.2021 - 04.2025

Job overview

Developed and executed Apache Spark jobs within Databricks and AWS EMR to clean, normalize, and aggregate claims data, ensuring high data quality and consistency.
Integrated Snowflake as a data warehouse to facilitate real-time data analysis, utilizing Snowpipe for seamless data ingestion.
Collaborated with business analysts, development teams, and infrastructure specialists to design and implement solutions based on project requirements.
Performed data validation, cleansing, and transformation on input datasets, including the creation of configuration files.
Stored large volumes of structured and semi-structured data across various data layers using AWS S3, Delta Lake, Snowflake and the Big Data Hadoop Framework.
Engaged in all phases of the Software Development Life Cycle (SDLC), including analysis, design, development, testing, and deployment, delivering unit-tested systems within customer-prescribed timeframes.
Conducted internal code reviews in BitBucket/GitHub, providing constructive feedback to enhance overall product quality and team collaboration.
Identified and analyzed issues, delivering effective solutions to improve system performance and reliability.
Mentored junior developers, fostering professional growth and enhancing team productivity.
Contributed to the design and development of technical specification documents to guide project direction and implementation.

Emids Technologies
Bangalore

Software Engineer

10.2019 - 03.2021

Job overview

Worked with a 14-node production Hadoop cluster and a 7-node development cluster to manage large-scale data processing.
Performed data validation, cleansing, and transformation on input datasets to ensure high data quality and consistency.
Overcame challenges in storing large volumes of structured and semi-structured data in a data lake using the Big Data Hadoop Framework, Hive and Snowflake data warehouse.
Utilized RDDs, DataFrames, Spark joins, and Spark SQL with Scala to perform complex data processing tasks.
Leveraged HBase as a metastore and used RDBMS solutions like MS SQL and PostgreSQL as data sources for effective data management.
Gained an understanding of healthcare domain concepts and terminology to drive data-related initiatives.
Acquired in-depth knowledge of the Cotiviti data lake framework to enhance data architecture.
Engaged in all phases of the Software Development Life Cycle (SDLC), including analysis, design, development, testing, and deployment of applications in the Hadoop cluster.
Contributed to the design and development of technical specification documents to support project implementation.

ITC Infotech India Limited
Bangalore

Big Data Developer

08.2016 - 10.2019

Job overview

Developed big data solutions using Hadoop, Hive, Spark, and Neo4j to meet complex data processing needs.
Standardized practices for data ingestion, cleansing, and analysis to deliver high-quality solutions.
Created data pipelines to convert Hive tables into Spark DataFrames, ensuring efficient data transformation and output.
Overcame challenges related to storing and processing large volumes of structured data using the Hadoop Framework.
Utilized SQL for data manipulation in both RDBMS and Apache Hive, enhancing data accessibility and reporting.
Contributed to the design and development of technical specification documents to guide project implementation.

Skills

Apache Spark/PySpark

Languages: Python, Scala, SQL, Shell Scripting

AWS - EMR, S3, Step Functions, CodePipeline, CloudWatch, Athena

Data Warehousing - Snowflake, Apache Hive

RDBMS - Oracle SQL, PostgreSQL

Query Engine - Impala, Spark SQL, AWS Athena

Databricks

Version Control - BitBucket, GitHub

CI/CD - AWS CodePipeline, Jenkins

Personal Information

Date of Birth: 02/03/1994
Gender: Male

Education

University of Pune
Pune, India

Bachelor of Engineering

08.2012 - 06.2016

Certification

Databricks Certified Data Engineer Associate

Timeline

Senior Software Developer

Luxoft

04.2025 - Current

Databricks Certified Data Engineer Associate

01.2025

Senior Software Engineer

EPAM Systems, Inc

03.2021 - 04.2025

Software Engineer

Emids Technologies

10.2019 - 03.2021

Big Data Developer

ITC Infotech India Limited

08.2016 - 10.2019

University of Pune

Bachelor of Engineering

08.2012 - 06.2016
