Swastik Satyapragyan Sahu

Bengaluru

Summary

Senior data engineer with over 8 years of IT experience specializing in ETL development, data engineering, and SQL analytics. I have extensive knowledge of the banking and finance domain, along with experience in the retail, telecom, healthcare, and FMCG industries. I am results-focused, with strong leadership and mentoring skills, a proven track record of leading successful data engineering projects and building high-performing teams, and innovative solutions that consistently improve data quality and accessibility for organizations.

Overview

8 years of professional experience
1 certification

Work History

Sr Data Engineer

Accenture Strategy and Consulting
08.2021 - Current
  • At Accenture, I work as a Senior/Lead Cloud Data Engineer within the Data and AI practice.
  • I develop both batch and streaming data pipelines over complex data models. As a Senior/Lead Data Engineer, I am responsible for designing, building, and maintaining robust data pipelines that ensure data quality, accessibility, and scalability, primarily using PySpark, Python, Databricks, Azure, GCP, and Kafka.
  • I lead and mentor junior engineers, guide the team's technical direction, and collaborate with data scientists and analysts to support data-driven decision-making.
  • I also play a key role in implementing data governance and security best practices.
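To illustrate the batch-versus-streaming distinction above: a streaming pipeline processes events incrementally in small groups rather than in one bulk load. A minimal micro-batch loop in plain Python (the real pipelines use PySpark Structured Streaming with Kafka; the event source and batch size here are simulated, not taken from any actual project):

```python
def micro_batches(events, batch_size):
    """Group an event iterator into fixed-size micro-batches,
    mirroring how a streaming engine consumes data incrementally."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch

# Simulated event stream of five transactions, processed two at a time.
events = [{"txn": i} for i in range(5)]
batches = list(micro_batches(events, batch_size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```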

Data Engineer | Senior Consultant

Ernst & Young LLP
05.2021 - 08.2021
  • I was part of the Business Consulting Risk team at EY, working as a Data Engineer for banking clients such as HSBC.
  • My role was to build data pipelines moving data from existing on-premises sources (Hadoop and relational databases) into a newly defined single source of truth, the Risk Data Mart, using PySpark and Google Cloud Platform (Dataproc, BigQuery, GCS), and to create consumption layers for end users such as data analysts and data scientists.
  • I built the pipelines with Google Cloud Storage, Airflow, GitHub, PySpark, and Databricks: databases and GCS served as sources, data flowed through medallion layers (bronze, silver, and gold) written to GCS, external tables and views were created in BigQuery, and orchestration was handled by Airflow.
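The medallion flow above (bronze for raw landing, silver for cleansing, gold for consumption) can be sketched in plain Python. This is an illustrative sketch only: the field names and transforms are invented for the example, and in the actual pipeline each step would be a PySpark job writing to GCS, with BigQuery external tables over the gold layer:

```python
def to_bronze(raw_rows):
    """Bronze: land source records as-is, tagging provenance."""
    return [dict(row, _source="source_db") for row in raw_rows]

def to_silver(bronze_rows):
    """Silver: cleanse and conform (here, drop rows missing a customer id)."""
    return [row for row in bronze_rows if row.get("customer_id") is not None]

def to_gold(silver_rows):
    """Gold: aggregate for end users (total balance per customer)."""
    totals = {}
    for row in silver_rows:
        totals[row["customer_id"]] = totals.get(row["customer_id"], 0) + row["balance"]
    return totals

raw = [
    {"customer_id": 1, "balance": 100},
    {"customer_id": None, "balance": 50},  # rejected at the silver layer
    {"customer_id": 1, "balance": 25},
]
gold = to_gold(to_silver(to_bronze(raw)))
print(gold)  # {1: 125}
```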

Data Engineer Sr Analyst

Synchrony Financial (formerly GE Capital)
01.2020 - 05.2021
  • Company overview: a premier US-based consumer financial services company delivering customized financing programs across key industries including retail, health, travel, and home, along with award-winning consumer banking products.
  • Built a pipeline migrating 12 months of rolling Gap customer data to the data lake (Hadoop) and analyzed customer engagement behavior data through machine learning.
  • As a data engineer, I owned the data pipeline end to end (ETL, exploratory data analysis, dimensional modelling, and feeding data to ML models). Most of the code was developed in Unix shell scripting, PySpark, and Hive so that architectural risk could be mitigated.

Data Engineer Analyst | Application Developer

Tata Consultancy Services
08.2016 - 01.2020
  • As a data engineer, my role was to decommission Teradata tables by migrating historical and current data to the Hadoop ecosystem and creating Hive tables corresponding to each Teradata table.
  • I developed around 30 Ab Initio graphs to unload historical data along with current-day data from the Teradata tables, and created Hive external tables as the target location for the data.
  • Validated processes with SMEs and created Control-M jobs for scheduling.
  • I reconciled the processes via row-count and data matches between the target Hive tables and the source Teradata tables; only once they matched were the Teradata tables decommissioned.
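The reconciliation step above can be sketched as a minimal check in plain Python. This is an illustration under assumptions, not the actual implementation: in practice the two inputs would be query results from Teradata and Hive, and the key column and row shapes are invented for the example:

```python
def reconcile(source_rows, target_rows, key):
    """Compare row counts and keyed row contents between source and target.

    Returns True only when the counts match and every keyed row is
    identical, i.e. the source table is safe to decommission.
    """
    if len(source_rows) != len(target_rows):
        return False  # count mismatch: migration incomplete
    src = {row[key]: row for row in source_rows}
    tgt = {row[key]: row for row in target_rows}
    return src == tgt  # data match, regardless of row order

teradata = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}]
hive = [{"id": 2, "amt": 20}, {"id": 1, "amt": 10}]
print(reconcile(teradata, hive, key="id"))  # True
```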

Education

Bachelor of Technology - Mechanical Engineering

GIET (Affiliated to BPUT)
Gunupur, Odisha, India
05-2016

Skills

  • Big Data Analytics
  • ETL processing and Data Warehousing
  • Data analysis
  • Risk Data Governance and Quality Assurance
  • Research, Reporting & Documentation
  • Agile Framework
  • Data Modelling
  • Spark 3.0
  • HDFS, Yarn, Apache Hive
  • Python
  • Unix shell scripting
  • Pyspark
  • Ab Initio (3.2 and 3.3)
  • SQL and databases (Teradata, Oracle, MySQL)
  • GCP (Dataproc, BigQuery, GCS, Cloud Composer, Pub/Sub)
  • Azure (ADLS, Azure SQL, Cognitive Search, Fabric workspace, Purview, Event Hubs)
  • Databricks (Azure cloud, GCP cloud, Notebooks, workflow, Unity catalog)
  • Kafka
  • Airflow
  • File formats: CSV, JSON, Parquet, Delta

Certification

  • GCP ASSOCIATE CLOUD ENGINEER CERTIFIED
  • DATABRICKS ASSOCIATE DATA ENGINEER CERTIFIED

Timeline

Sr Data Engineer

Accenture Strategy and Consulting
08.2021 - Current

Data Engineer | Senior Consultant

Ernst & Young LLP
05.2021 - 08.2021

Data Engineer Sr Analyst

Synchrony Financial (formerly GE Capital)
01.2020 - 05.2021

Data Engineer Analyst | Application Developer

Tata Consultancy Services
08.2016 - 01.2020

Bachelor of Technology - Mechanical Engineering

GIET (Affiliated to BPUT)