Chhandita Talapatra

Pune

Summary

Results-oriented Data Engineer with 3 years of experience in the fintech sector, specializing in designing, implementing, automating, and optimizing enterprise-scale ETL pipelines and data integration processes across distributed systems. Proven track record of accelerating data migration projects and improving data quality for financial services clients. Eager to apply big data and cloud technologies to complex challenges in the finance domain and to support high-impact, data-driven decision-making.

Overview

3 years of professional experience
1 Certification

Work History

Data Engineer

IBM
Pune
01.2023 - Current

Client: Barclays - a UK-based banking and finance company.

  • Contributed to an enterprise data migration and application development project from Cloudera Hadoop to open-source Hadoop by designing and implementing optimized ETL workflows in PySpark and HiveQL, resulting in a 30% performance improvement across distributed systems.
  • Conducted requirement gathering and performance analysis to optimize project outcomes.
  • Configured multi-environment (DEV, SIT, OAT, PROD) Hadoop clusters, implemented database setup, and applied data modeling and warehouse concepts while automating table creation and job orchestration for seamless deployments.
  • Optimized ETL code with thorough validation, unit testing, and debugging, ensuring stability and reliability during global deployment.
  • Developed automated data reconciliation utilities to debug migration issues, enforce data quality governance, and ensure accuracy across distributed systems in collaboration with global application teams.
  • Built automated data validation, transformation, and quality-check pipelines using PySpark and Hive to process large-scale datasets with improved data quality and performance.
  • Analyzed data quality issues using HQL and resolved them through ETL optimization, Hive scripting, and relevant data engineering techniques.
  • Collaborated with clients and cross-functional teams on flow graph development, web services, and batch processing, while managing stakeholder communication in an Agile delivery environment.
  • Developed and enhanced IBM TWS job definitions and schedules through command-line operations, while monitoring batch workflows via the TWS UI and performing quick root-cause analysis and recovery for job failures.
  • Hands-on with Ab Initio GDE, Unix Shell Scripting, Hadoop, Hive, IBM Tivoli Workload Scheduler (similar to Autosys), Git (source control), JIRA, and Incident Management for end-to-end workflow automation and monitoring.
  • Shadowed the Barclays project, gaining valuable enterprise-level data engineering experience along with practical exposure to risk management, compliance frameworks, and data warehouse architecture in the banking and finance domain.

Client: Truist Bank - a US-based bank.

  • Identified and fixed data quality issues caused by invalid date values in source data using the Ab Initio Reformat component, converting bad dates to the previous valid date to ensure stable ETL processing.

Education

Bachelor of Technology (Hons) - Electronics and Communication Engineering

Netaji Subhash Engineering College
Kolkata, India
07.2022

Skills

  • Big Data Technologies: Hue, Hadoop, Hive, Apache Spark (batch & streaming)
  • Programming Languages: Python (NumPy, Pandas, PySpark), Shell Scripting
  • ETL Tools: Ab Initio (GDE, PSET, Plan, Co-op)
  • Cloud & DevOps: AWS S3, CI/CD, Git
  • Scheduling Tools: IBM TWS
  • Databases & Warehousing: IBM DB2, SQL Databases, Data Modeling, Teradata (basic familiarity)
  • Operating Systems: Unix, Linux
  • Other: ETL Automation & Data Integration, Data Reconciliation & Validation, Data Quality, Data Visualization, Agile (JIRA), Incident Management

Certification

  • AWS Certified Cloud Practitioner (CLF-C02)
  • PCEP Certified Entry-Level Python Programmer

Languages

  • English, Bilingual or Proficient (C2)
  • Bengali, Bilingual or Proficient (C2)
  • Hindi, Intermediate (B1)
