
Sunil Kumar Pandey

Lead Data Engineer
Bangalore

Summary

Astute Lead Data Engineer with a data-driven and technology-focused approach. Communicates clearly with stakeholders and builds consensus around well-founded models. Talented in writing applications and reformulating models.

Overview

11 years of professional experience

Work History

Lead Data Engineer

Wells Fargo
06.2021 - Current
  • Built an enterprise data lakehouse using S3, the Iceberg table format and Dremio
  • Implemented an end-to-end, low-code/no-code, metadata-driven ETL framework based on PySpark
  • Implemented a federated model design for data discovery and data traceability
  • Designed the end-to-end service architecture for a self-service portal, metadata repository and data lineage
  • Led multiple cross-project initiatives benefiting multiple lines of business (LOBs).
  • Spearheaded initiatives to reduce technical debt within the team by refactoring legacy code and introducing modern development practices, fostering a more efficient work environment.

Senior Software Engineer

Tech Mahindra
02.2019 - 05.2021
  • Developed a data ingestion framework using Spark and Scala
  • Optimized performance at both the Spark configuration level and the application level
  • Developed a shell script wrapper to launch jobs and define dependencies between them.

Senior Associate

Société Générale
07.2016 - 02.2019
  • Developed Spark programs using the Scala APIs to compare the performance of Spark with Hive and SQL
  • Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning and bucketing
  • Handled large datasets using partitions, Spark's in-memory capabilities, broadcasts, efficient joins and other transformations during the ingestion process itself.

Senior Consultant

Capgemini
08.2013 - 05.2016
  • Developed Informatica mappings based on business requirements to extract, transform and load data from different sources into different target stages
  • Achieved data completeness, data quality, performance and scalability using Informatica ETL jobs
  • Performed count validation, dimensional analysis, statistical analysis and data quality validation during data migration.

Education

BACHELOR OF ENGINEERING - Electronics and Communication

RGPV
Bhopal, IN

Skills

Software Architecture Design

Data Pipeline Design

Python Programming

Advanced SQL

Data Modeling

Performance Tuning

Spark Framework

ETL development

Computer science expertise

  • Programming Languages: Python, Scala, SQL, Java
  • Frameworks: Spark / PySpark, Hive, Pandas, Spring Boot, Kafka
  • Databases and Storage: MongoDB, SQL Server, Oracle, S3, Hive, Dremio, Iceberg, Delta, Hadoop
  • Certifications: Azure Data Engineer Associate, Azure Data Fundamentals
