RAKESH DASH

Bengaluru

Summary

Senior Data Engineer | Data Platform Engineer | Data Analytics Engineer

Python • PySpark • SQL • Snowflake • Data Quality • ETL/ELT • Data Governance

I'm a Senior Data Engineer with 8+ years of experience designing scalable data platforms, building enterprise-grade ETL/ELT pipelines, and delivering data-driven solutions across Banking, Marketing Technology, and Aerospace domains.

Expertise in Python, PySpark, SQL, Snowflake, Data Quality Engineering, Workflow Automation, and Cloud-based Data Processing. Proven track record of processing multi-terabyte datasets, optimizing distributed data pipelines, improving data reliability, and enabling business-critical analytics through scalable engineering solutions.

Experienced in data governance, data validation frameworks, CI/CD implementation, workflow orchestration, and cross-functional collaboration with engineering, analytics, and business stakeholders. Strong focus on automation, performance optimization, and delivering high-quality data products that drive business outcomes.

Overview

8 years of professional experience
1 Certification

Work History

Senior Data Analytics Engineer & Sr. Data Engineer

HSBC Electronic Data Processing India
Bengaluru
07.2023 - Current

• Designed and implemented enterprise-scale Python and PySpark data pipelines processing more than 2TB of ESG and financial datasets across multiple business domains.

• Led development of automated data quality and validation frameworks supporting 15+ critical data assets and regulatory reporting requirements.

• Built scalable ETL workflows and CI/CD integration using Jenkins and workflow orchestration platforms to improve deployment reliability and operational efficiency.

• Standardized vendor onboarding and ingestion pipelines across 8+ source systems, improving processing efficiency by 35%.

• Reduced Spark pipeline execution time by 45% through optimization, automation, and process redesign while maintaining 99.8% data accuracy.

• Eliminated more than 25 hours of weekly manual validation effort through automated completeness, reconciliation, and quality checks.

• Collaborated with Data Governance, Vendor Management, and Engineering teams to establish enterprise-wide data quality standards and monitoring practices.

• Enhanced Document AI workflows through automation, testing frameworks, and operational monitoring improvements, increasing extraction accuracy by 25%.

Data Analyst

ZETA GLOBAL
Hyderabad
07.2021 - 05.2023

• Designed and developed end-to-end ETL/ELT pipelines using Python, PySpark, and SQL to process and transform large-scale customer datasets for analytics and marketing platforms.

• Built and optimized Snowflake-based data warehouse solutions supporting high-volume data ingestion, transformation, and reporting workloads.

• Developed advanced customer identity resolution and fuzzy matching frameworks using NLP techniques, probabilistic matching algorithms, and data quality controls.

• Engineered scalable data models and optimized Snowflake query performance, improving analytical workload efficiency and reducing processing costs.

• Automated data migration and transformation workflows for datasets exceeding 50 million records while maintaining high data quality standards.

• Integrated data from multiple enterprise systems and third-party platforms to create unified datasets for reporting, analytics, and customer insights.

• Partnered with Analytics, Product, and Engineering teams to deliver trusted datasets and self-service reporting capabilities across business functions.

• Developed Tableau dashboards and automated reporting solutions that significantly reduced time-to-insight for business stakeholders.

Key Achievements

• Improved customer identity matching accuracy by 35%, contributing to enhanced audience targeting and supporting revenue growth initiatives exceeding $2.5M.

• Optimized Snowflake data warehouse architecture and migration processes for 50M+ records, reducing report generation time by 50%.

• Reduced storage costs by 25% through schema optimization, data modeling improvements, and efficient data lifecycle management.

• Processed and validated over 15 million customer records, reducing data quality issues by 90% through automated validation frameworks.

• Consolidated data from 10+ source systems into unified reporting platforms, reducing business reporting time from 4 hours to under 15 minutes.

• Improved campaign performance and business decision-making through scalable analytics solutions, contributing to an 18% increase in campaign ROI.

Reliability Analyst

CYIENT LTD
Hyderabad
04.2018 - 07.2021

• Developed Python-based automation solutions to extract, transform, and process structured and semi-structured data from XML, HTML, CSV, and relational database sources.

• Designed scalable data processing workflows to support engineering, operational, and maintenance analytics across large enterprise datasets.

• Built automated data validation and cleansing frameworks that improved data quality, consistency, and reporting accuracy.

• Developed Power BI dashboards and KPI monitoring solutions enabling real-time operational visibility and data-driven decision-making.

• Performed exploratory data analysis, trend analysis, and predictive modeling to identify operational improvement opportunities and support business objectives.

• Collaborated with engineering, operations, and business stakeholders to deliver analytics solutions aligned with organizational goals.

• Automated recurring reporting and data preparation processes, significantly reducing manual effort and improving delivery timelines.

• Supported end-to-end data lifecycle activities including ingestion, transformation, validation, analysis, and visualization.

Key Achievements

• Automated processing of more than 10,000 XML and HTML files monthly, reducing execution time from approximately 8 hours to less than 15 minutes.

• Improved data accuracy by 95% through implementation of automated validation, reconciliation, and exception handling frameworks.

• Developed enterprise reporting dashboards tracking more than 15 operational KPIs, reducing manual reporting effort by 80%.

• Built predictive analytics models that improved maintenance forecasting accuracy by 30% and contributed to a 25% reduction in operational downtime.

• Delivered operational efficiency improvements generating approximately $500K in annual cost savings through process optimization and automation initiatives.

• Enabled near real-time reporting capabilities that accelerated decision-making and improved visibility across engineering programs.

Education

Bachelor of Engineering - Mechanical

Institution of Engineers (India)
Bhubaneswar
01.2016

Skills

Core Competencies

Data Engineering
Data Platform Development
ETL / ELT Pipelines
Python Development
PySpark
Apache Spark
SQL Development
Data Warehousing
Data Modeling
Snowflake
Data Quality Engineering
Data Governance
Workflow Automation
CI/CD
Jenkins
Airflow
Distributed Data Processing
Cloud Data Engineering
Data Validation Frameworks
Performance Optimization
Business Intelligence
Analytics Engineering

Technical Skills

Programming Languages:
Python, SQL, PySpark, Scala

Data Engineering:
ETL, ELT, Data Pipelines, Data Warehousing, Data Modeling, Data Quality, Data Governance

Big Data Technologies:
Apache Spark, PySpark

Databases:
Snowflake, PostgreSQL, Oracle, MySQL

Cloud Platforms:
AWS, Azure, GCP

Workflow & DevOps:
Jenkins, Git, CI/CD, Airflow

Analytics & Reporting:
Power BI, Tableau

Testing & Automation:
Pytest, Unittest, Automated Testing Frameworks

Libraries:
Pandas, NumPy, BeautifulSoup

Languages

English: Upper Intermediate (B2)
Hindi: Upper Intermediate (B2)

Websites, Portfolios and Profiles

https://rakeshd3.github.io/CV

Certification

• Analyzing Data with Python – edX

• Structuring Machine Learning Projects – DeepLearning.AI

• Design Databases with PostgreSQL – Codecademy

• Analyze Business Metrics with SQL – Codecademy
