Dipak Rout

Bangalore

Summary

IT professional with 6+ years of experience in the Spark and big data ecosystem, including Hive, Sqoop, Hadoop, Impala, and Python (PySpark). Skilled in Scala, Java, and MSSQL Server, with experience on both AWS and Azure. Seeking to grow further in the field of big data and to apply strong interpersonal and technical skills in a collaborative team environment. Committed to contributing to organizational growth and achieving job satisfaction.

Work History

Senior Data Engineer

Wipro

Bangalore

06.2025 - Current

Designed and implemented scalable Hive data models for the Consumer Vehicle Lending (CVL) domain to support analytics and downstream reporting.
Built and maintained end-to-end data ingestion pipelines using PySpark and Hive to load data from Oracle and Teradata into production Hive tables.
Developed high-level HQL from SaaS code to load data into Hive.
Performed ETL data validation and quality checks, ensuring data accuracy, completeness, and consistency before production loads.
Conducted regression testing and production data verification, reducing data discrepancies and improving deployment reliability.
Managed Hyper RDBMS (HyRi) ingestion processes and optimized Hive table performance using partitioning and data modeling best practices.
Worked in a Linux environment, using Python for automation and operational support.
Validated Hive/ETL jobs in lower environments (DEV/UAT) with structured test cases and sample data validation.
Automated build and deployment pipelines using Jenkins, enabling controlled and repeatable releases.
Used Ansible for configuration management and deployment of Hive scripts, HQL, and ETL components across environments.
Monitored Jenkins build logs, resolved deployment failures, and ensured successful promotion of code to UAT and Production.
Followed version control and release management best practices for smooth CI/CD operations.
Architected and implemented a real-time fraud detection platform using Apache Kafka and Spark Structured Streaming, processing 5–10M+ financial transactions per day.
Designed scalable event-driven data pipelines and a medallion data lake architecture (Raw → Cleansed → Curated) on S3 using partitioned Parquet, optimizing performance and enabling schema evolution.
Developed advanced fraud detection logic including velocity checks, geo-location anomaly detection, merchant risk scoring, and behavioral feature engineering integrated with real-time ML scoring services.

Data Engineer

Accenture

02.2024 - Current

Built an XML and JSON ingestion framework using PySpark.
Optimized data processing by implementing efficient ETL pipelines and streamlining database design.
Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
Worked on flattening complex-type structures.
Created complex-type Hive tables for loading complex ingestion data such as XML and JSON.
Created flattened Hive and Impala views on top of the complex-type Hive tables for client-side queries.
Created a compaction script in Scala Spark for small Hadoop files.
Created numerous Python scripts for mailing, deleting old files, and automation.
Managed daily ingestion jobs with the AutoSys tool.
Implemented parallelization in the ingestion framework for better performance.
Developed a Glue ETL job for batch processing of data from S3 and loaded the transformed data into a Redshift Serverless cluster.
Created an automated pipeline to ingest static batch data, incremental batch data, and schema-drift batch data into Redshift.
Implemented SCD Type 2 for incremental loads.
Implemented SNS for schema evolution data.
Integrated multiple files from AWS S3 based on business requirements and published them to multiple vendors.
Handled schema evolution using the Glue Data Catalog during Glue ETL jobs.
Created complex SQL queries for extracting, transforming, and loading (ETL) data from different data sources.
Migrated on-premises data pipelines to AWS, leveraging S3 and Redshift to reduce storage cost.
Worked on Spark job optimization using salting techniques and by enabling Spark AQE and speculative execution.
Worked on ingesting Salesforce data into Hive.

Analyst

TCS

Bangalore

02.2020 - 02.2024

Experience in preparing business and functional requirement documents.
Expertise in database modeling, data mapping, ETL, data quality management, data analysis, requirements gathering, SQL, and reporting processes.
Experience in Apache Spark and Python programming.
Experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into the target destination.
Experience with the AWS ecosystem: IAM, S3, Glue, Athena, and Redshift.
Good knowledge of Hadoop, Sqoop, and Hive.
SQL Server and T-SQL experience: joins, data warehousing, data modeling, OLTP, OLAP.
Handled seven databases using SQL Server Management Studio.
Excel: HLOOKUP, VLOOKUP, pivot tables, and other advanced functions.
Worked on optimizing Spark Jobs.

Education

B-Tech - Mechanical Engineering

IGIT, SARANG

Dhenkanal, Odisha

05-2019

Skills

Spark
PySpark
Python
SQL
Impala
Sqoop
Hive
Hadoop
MSSQL Server
MySQL
Oracle
Libraries: NumPy, pandas, Boto3, fastparquet
Linux Bash scripting
Core Java
Scala
AWS S3
AWS Glue
AWS Redshift
AWS Step Functions
Athena
Data governance
Data Quality Checks
Spark framework
Performance tuning
Big data processing
Data warehousing
Data modeling
Data pipeline design
ETL development

Certification

Azure AZ-900, Completed

Skills

Autosys

Eclipse

Ansible

Jenkins

Bitbucket

WinSCP

JIRA
