Swastik Satyapragyan Sahu

Senior Data Engineer
Bengaluru

Summary

I am a Senior Data Engineer with over 8 years of IT experience specializing in ETL development, data engineering, and SQL analytics. I have extensive knowledge of the banking and finance domain, along with experience in the retail, telecom, healthcare, and FMCG industries. I am results-focused, with strong leadership and mentoring skills, a proven track record of leading successful data engineering projects and building high-performing teams, and a history of delivering solutions that improve data quality and accessibility for organizations.

Overview

9
years of professional experience
4
years of post-secondary education
3
certifications

Work History

Senior Data Engineer

Accenture Strategy and Consulting
Bengaluru
08.2021 - Current
  • At Accenture, I work as a Senior/Lead Cloud Data Engineer as part of the Data and AI practice.
  • I designed and implemented a generic, JSON-configuration-driven codebase for batch ETL ingestion and structured streaming using Python and PySpark. It processes data through the medallion layers (bronze, silver, gold) with Databricks as the compute engine and is supported on multiple clouds (Azure, GCP, and AWS). I have been involved in deploying this low-code/no-code application for clients across the retail, healthcare, and banking and finance domains, where it processes millions to billions of records, reduces day-to-day code maintenance effort by about 60%, reduces pipeline-building effort by about 70%, and, because it is configuration driven, reduces resourcing effort by about 50%. It currently handles more than 60 curation, aggregation, and AI-based rules. A simplified sketch of the configuration-driven pattern follows this list.
  • My role includes ensuring data quality through testing, validation, business data quality assessments, and data profiling.
  • I am also involved in other capabilities such as client demos and strategic planning sessions with stakeholders to assess business needs related to data engineering initiatives and automation. At times I take the lead, mentoring junior engineers, guiding the team's technical direction, and collaborating with data scientists and analysts to support data-driven decision-making.
  • I also play a key role in implementing data governance and security best practices.
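For illustration, below is a minimal PySpark sketch of a JSON-configuration-driven bronze-to-silver ingestion step, assuming Delta Lake on Databricks; the configuration keys, paths, and curation rule are hypothetical placeholders rather than the actual framework described above.

import json
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("config-driven-ingestion").getOrCreate()

# A pipeline run is described entirely by configuration (keys are illustrative).
config = json.loads("""
{
  "source": {"format": "json", "path": "/mnt/landing/orders/"},
  "bronze": {"path": "/mnt/bronze/orders"},
  "silver": {"path": "/mnt/silver/orders",
             "curation_rules": [{"column": "order_id", "rule": "not_null"}]}
}
""")

# Bronze: land the raw data as Delta, stamped with ingestion metadata.
raw = (spark.read.format(config["source"]["format"]).load(config["source"]["path"])
       .withColumn("_ingested_at", F.current_timestamp()))
raw.write.format("delta").mode("append").save(config["bronze"]["path"])

# Silver: apply the configured curation rules (here, a simple not-null check).
curated = spark.read.format("delta").load(config["bronze"]["path"])
for rule in config["silver"]["curation_rules"]:
    if rule["rule"] == "not_null":
        curated = curated.filter(F.col(rule["column"]).isNotNull())
curated.write.format("delta").mode("overwrite").save(config["silver"]["path"])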

Data Engineer | Senior Consultant

Ernst & Young LLP
Bengaluru
05.2021 - 08.2021
  • I was part of the Business Consulting Risk team at EY, working as a Data Engineer for banking clients such as HSBC.
  • My role was to build data pipelines that moved data from existing on-premises systems (Hadoop and databases) to a newly defined single source of truth, the Risk Data Mart, using PySpark and Google Cloud Platform (Dataproc, BigQuery, GCS), and to create separate layers for end users such as data analysts and data scientists. This reduced processing time by more than 50% and also cut storage costs by more than 50%.
  • As a data engineer, my responsibility was to build the data pipelines using Google Cloud Storage, Airflow, GitHub, PySpark, and Databricks. I used databases and GCS as sources, followed the medallion layers (bronze, silver, gold) to write data into GCS, then created external tables and views in BigQuery, with pipeline orchestration in Airflow; a simplified sketch of this orchestration follows below.
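For illustration, a minimal Airflow DAG sketch of the orchestration pattern above (PySpark curation on Dataproc, then a BigQuery external table over GCS); the project, region, bucket, cluster, dataset, and table names are hypothetical placeholders, and the Airflow Google provider package is assumed to be installed.

from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

PROJECT = "example-project"        # placeholder GCP project
REGION = "asia-south1"
BUCKET = "example-risk-data-mart"  # placeholder GCS bucket

with DAG(
    dag_id="risk_data_mart_daily",
    start_date=datetime(2021, 6, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    # Run the PySpark curation job on Dataproc (writes bronze/silver/gold to GCS).
    curate = DataprocSubmitJobOperator(
        task_id="curate_to_gcs",
        project_id=PROJECT,
        region=REGION,
        job={
            "reference": {"project_id": PROJECT},
            "placement": {"cluster_name": "risk-dataproc"},
            "pyspark_job": {"main_python_file_uri": f"gs://{BUCKET}/jobs/curate.py"},
        },
    )

    # Expose the gold layer to analysts as a BigQuery external table over GCS.
    publish = BigQueryInsertJobOperator(
        task_id="create_external_table",
        configuration={
            "query": {
                "query": f"""
                    CREATE OR REPLACE EXTERNAL TABLE risk_mart.exposures_gold
                    OPTIONS (format = 'PARQUET',
                             uris = ['gs://{BUCKET}/gold/exposures/*.parquet'])
                """,
                "useLegacySql": False,
            }
        },
    )

    curate >> publish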

Data Engineer Sr Analyst

Synchrony Financial (formerly known as GE Capital)
Hyderabad
01.2020 - 05.2021
  • Company overview: a premier US-based consumer financial services company delivering customized financing programs across key industries including retail, health, travel, and home, along with award-winning consumer banking products.
  • Built pipelines to migrate 12-month rolling Gap customer data to the data lake (Hadoop) and analyzed customer engagement behavior data through machine learning.
  • As a data engineer, my responsibility was to build end-to-end data pipelines (ETL, exploratory data analysis, dimensional modelling, and feeding data to ML). Most of the code was developed in Unix shell scripting, PySpark, and Hive, with machine learning components, so that architectural risk could be mitigated.

Data Engineer Analyst | Application Developer

Tata Consultancy Services
Hyderabad
08.2016 - 01.2020
  • I worked as a data engineer; my role was to decommission Teradata tables, migrate historical and current data to the Hadoop ecosystem, and create Hive tables for the corresponding Teradata tables.
  • I developed around 30 Ab Initio ETL graphs for unloading historical data along with current-day data from Teradata tables, and created Hive external tables as the target location for the data. This reduced compute effort by about 50% and storage cost by about 50%.
  • Validated processes with SMEs and created Control-M jobs for scheduling.
  • I reconciled the processes by means of row-count and data matches between the target Hive tables and the source Teradata tables; once they matched, the Teradata tables were decommissioned. A simplified sketch of this reconciliation check follows.
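For illustration, a simplified PySpark sketch of the count-match and data-match reconciliation described above; the JDBC URL, credentials, and table names are hypothetical placeholders, both tables are assumed to share the same schema, and the Teradata JDBC driver is assumed to be on the Spark classpath.

from pyspark.sql import SparkSession

spark = (SparkSession.builder.appName("td-hive-reconciliation")
         .enableHiveSupport().getOrCreate())

TD_URL = "jdbc:teradata://example-host/DATABASE=sales"   # placeholder JDBC URL

hive_df = spark.table("sales_hist.orders")               # migrated Hive external table
td_df = (spark.read.format("jdbc")
         .option("url", TD_URL)
         .option("dbtable", "sales.orders")
         .option("user", "svc_user")
         .option("password", "***")
         .load())

# 1. Count match: row counts must be identical on both sides.
counts_match = hive_df.count() == td_df.count()

# 2. Data match: no rows present on one side but missing from the other.
data_match = (hive_df.exceptAll(td_df).rdd.isEmpty()
              and td_df.exceptAll(hive_df).rdd.isEmpty())

print("Safe to decommission Teradata table:", counts_match and data_match)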

Education

Bachelor of Technology - Mechanical Engineering

GIET (Affiliated to BPUT)
Gunupur, Odisha, India
08.2012 - 05.2016

Skills

Data analysis expertise

Certification

GCP Associate Cloud Engineer (Certified)

Timeline

Senior Data Engineer

Accenture Strategy and Consulting
08.2021 - Current

Data Engineer | Senior Consultant

Ernst & Young LLP
05.2021 - 08.2021

Data Engineer Sr Analyst

Synchrony Financial (formerly known as GE Capital)
01.2020 - 05.2021

Data Engineer Analyst | Application Developer

Tata Consultancy Services
08.2016 - 01.2020

Bachelor of Technology - Mechanical Engineering

GIET (Affiliated to BPUT)
08.2012 - 05.2016