
Sitanshu Tripathi

New Delhi

Summary

Accomplished Data Engineer with advanced Python, SQL, and Spark proficiency and extensive experience with Azure cloud services (Databricks, ADF), Snowflake, and dbt (data build tool). Demonstrated expertise in designing and optimizing high-performance data pipelines, improving data accessibility, and enabling data-driven strategies.

Overview

5 years of professional experience
1 Certification

Work History

Big Data Team

CAE
03.2022 - Current
  • Increased efficiency of data processing pipelines through parallelization and optimization techniques.
  • Developed a data catalog exposing metadata and column/table profiling using FastAPI and Angular, reducing metadata search time by 50% and improving data accessibility for stakeholders.

Big Data Team

EVS
03.2021 - 03.2022
  • Engineered Snowflake-based scripts and Airflow orchestration for data processing, reducing processing time by 60% and costs by 40%.
  • Conducted comprehensive analyses to identify patterns and anomalies within large datasets, guiding strategic decision-making processes.

Big Data Team

NCR
01.2020 - 03.2021
  • Implemented a Python automation script for the Banking and Commerce Dashboard, reducing manual effort by over 90% and streamlining report generation and data analysis for 50+ business users.

Education

B-Tech - Computer Science & Technology

GLA
Mathura
05.2020

Skills

  • Python
  • SQL
  • Snowflake
  • Spark
  • Data Warehousing & Data Lakes
  • S3
  • Azure DE (ADF, Databricks)
  • dbt
  • GCP (DE Stack)
  • BigQuery

Certification

  • Data Engineering - Azure Cloud Specialization (DP-203)
  • Snowflake Data Application Badge

Accomplishments

  • Achieved a rating of 4.5 for doubt resolution during internship
  • MNIT Rank 2 in project presentation
  • Qualified for the semi-final of Code Gladiators 2021
  • Best team award at EVS

Projects

Data Catalog
Technologies: Python, Databricks, FastAPI, Angular

  • Built a data catalog framework similar to DataHub.
  • Created Python scripts to generate table and column profiling.

Data Ingestion Pipeline
Technologies: Python, Data Lake, API Integration

  • Implemented data fetch from ServiceNow API to Data Lake using pagination and incremental load techniques.
  • Developed a robust system for handling large datasets and ensuring data consistency during updates.

Pyxis
Technologies: Snowflake, SQL, Python, Alteryx, Airflow

  • Built the end-to-end pipeline in Snowflake using SQL and JavaScript.
  • Developed pipelines for sector filtering, business rule transformations, and loading into gold tables.

Sales SAS Report to Snowflake Migrations
Technologies: Snowflake, SAS, SQL, Python

  • Built SQL scripts in Snowflake for various SAS reports.
  • Automated and scheduled processes using Airflow for daily, weekly, and monthly execution.
  • Ingested data from multiple disparate sources (SAS datasets, Cognos, SQL Server).

