SIBANANDA ROUTRAY

Summary

Accomplished Senior Data Engineer at TEK-Systems, specializing in ETL pipeline development and PySpark programming. Enhanced data processing efficiency by 30% through innovative solutions in cloud environments. Proven ability to collaborate effectively with cross-functional teams, ensuring data integrity and driving impactful analytics for informed decision-making.

Overview

6 years of professional experience

Work History

Senior Data Engineer

TEK-Systems (Client: John Deere)
03.2025 - Current
  • Built robust and scalable ETL pipelines in Databricks using PySpark, following the Medallion Architecture (Bronze, Silver, and Gold layers) to ensure modularity, lineage, and data quality.
  • Automated data ingestion from multiple structured and semi-structured sources, such as ServiceNow, SAP, and flat files: staged data in the Raw (Bronze) S3 layer, processed and enriched it in the Cleansed (Silver) layer, and served analytics-ready data in the Curated (Gold) layer.
  • Implemented Delta Lake features such as time travel, schema enforcement, and ACID transactions to ensure data reliability and version control.
  • Developed reusable PySpark modules for data transformation, cleaning, validation, and auditing.
  • Wrote optimized SQL queries and Spark SQL code for large-scale data processing and reporting use cases, reducing processing time by over 30%.
  • Worked extensively with Databricks Notebooks, jobs, and workflows to orchestrate and schedule daily batch pipelines.
  • Maintained and enhanced the organization’s Enterprise Data Lake (EDL) by integrating data from ServiceNow and other systems, ensuring data freshness and reliability for downstream analytics.
  • Created detailed data models for raw and curated datasets, maintaining lineage, documentation, and compliance with governance policies.
  • Collaborated with stakeholders to gather reporting requirements and developed insightful Power BI dashboards for asset tracking (e.g., laptops, tractors, and hardware inventory).
  • Handled end-to-end Power BI development, including data modeling, DAX measures, slicers, drill-through, and performance tuning.
  • Used Python for scripting, API integration, and data validation tasks outside of Spark workflows.
  • Developed and integrated 50+ PySpark transformations into an existing project, improving data processing efficiency by 25% and reducing runtime by 15%.
  • Implemented a CI/CD (Continuous Integration and Continuous Deployment) pipeline for code deployment.
  • Created technical documents covering functional requirements, impact analysis, and technical design.
  • Spearheaded the adoption of innovative BigQuery solutions, improving query performance by 60%.
  • Used GitHub for repository management, branching strategy, and pull requests to streamline the development workflow and code review process.
  • Maintained and optimized GitHub repositories, including documentation and release management, for efficient project lifecycle management.
  • Designed and implemented data pipelines to ingest and process large datasets in support of analytics and reporting needs.

Payroll

OLA ELECTRIC
10.2024 - 03.2025
  • Company Overview: Big Data (BIE-1)
  • Designed and implemented end-to-end scalable data pipelines using PySpark to ingest, transform, and process large volumes of structured and unstructured data from diverse sources.
  • Optimized PySpark jobs for batch and real-time data processing, significantly improving pipeline efficiency and reducing latency.
  • Built robust ETL pipelines in PySpark and SQL, integrating data from multiple sources into a centralized data lake for downstream analytics and machine learning models.
  • Collaborated with cross-functional teams to design PySpark workflows that ensured data integrity, quality, and consistency throughout the pipeline lifecycle.
  • Automated data processing workflows using PySpark and Apache Airflow, enabling seamless data flow and reducing manual intervention by 80%.
  • Developed fault-tolerant PySpark pipelines with error-handling mechanisms to ensure continuous data flow in large-scale distributed environments.
  • Integrated PySpark pipelines with AWS for storage, processing, and advanced analytics.
  • Designed and scheduled workflows using Apache Oozie to orchestrate and manage complex Hadoop jobs for data ingestion, transformation, and processing.
  • Implemented Oozie workflows to automate ETL processes, ensuring seamless coordination between MapReduce, Hive, and PySpark jobs.
  • Developed error-handling mechanisms in Oozie workflows to ensure data pipeline reliability and reduce job failures by 30%.

Payroll

EXL Services (Client: US-based bank)
09.2022 - 04.2024
  • Company Overview: Consultant, Big Data
  • Evaluated technology stack for cloud-based analytics solutions.
  • Conducted extensive research to identify and implement the best strategies and tools for building end-to-end analytics solutions on the cloud, leading to a 30% improvement in data processing efficiency.
  • Extensive experience in data analytics for the banking domain.
  • Led analytics projects that increased data-driven decision-making by 40%, directly contributing to a 15% growth in loan approvals and a 10% reduction in customer churn.
  • Optimized Banking SAS Code for the Spark Platform.
  • Migrated and optimized over 200 SAS scripts to run on the Spark platform using Microsoft Cloud Technologies, reducing processing time by 50% and cutting cloud costs by 20%.
  • Data collection and preparation.
  • Collected, cleaned, and transformed 1 TB of raw credit card data into actionable datasets, including Loan Data, Customer Data, and Payments Data.
  • Improved data accuracy by 30%, leading to better predictive modeling outcomes.
  • Worked with Hadoop file formats.
  • Efficiently managed and processed over 500 GB of data using Hadoop file formats like Parquet, ORC, and AVRO, resulting in a 35% improvement in data retrieval speed.
  • Used SQL and Snowflake's native features (such as Streams, Tasks, and Stored Procedures) to transform raw data.
  • Good understanding of Spark architecture on Databricks, including Structured Streaming.
  • Set up Databricks on AWS and Microsoft Azure, including Databricks workspaces for business analytics, cluster management, and machine learning lifecycle management.
  • Oversaw stakeholder management for banking accounts, led onshore and offshore teams to launch insightful projects for credit cards, and generated revenue of over $2M per annum.

Analyst

Clarivate (Fusion Technosoft)
12.2019 - 09.2022
  • Design and Develop ETL Integration Patterns.
  • Developed and implemented ETL integration patterns using Python on Spark, resulting in a 30% improvement in data processing efficiency and reducing ETL job failures by 20%.
  • Monitoring and Error Resolution.
  • Monitored and resolved errors across multiple environments, improving job success rates by 15% and reducing average error resolution time by 40%.

Education

B.Com

UTKAL UNIVERSITY
06.2018

Skills

  • AWS
  • ETL pipeline development
  • PySpark programming
  • Microsoft Azure
  • GCP
  • BigQuery
  • Dataproc
  • SQL Server
  • Data Factory
  • Databricks
  • ADF
  • Azure DevOps
  • Agile
  • AWS Serverless Services
  • Lambda
  • Athena
  • EC2
  • AWS Glue
  • RDS
  • GitHub
  • Snowflake
  • Snowpipe
  • SnowSQL
  • Power BI
  • Tableau
  • MicroStrategy
  • Advanced Excel
  • SQL
  • Python
  • PySpark
  • SAS
  • Shell scripting
  • Hadoop
  • Spark
  • Hive
  • Kafka
  • MapReduce
  • Oozie
  • CI/CD
  • ETL
  • Apache Airflow
  • Cloud Data Warehousing
  • Teradata
  • Docker
  • Kubernetes

Languages

  • English
  • Hindi
  • Odia
  • Bengali

Personal Information

Gender: Male
