
Rajdeep Roy

Kolkata, WB

Summary

Senior AWS Data Engineer experienced across all stages of the data pipeline, including acquisition, integration, storage, and data marts. Adept at working quickly and efficiently in close collaboration with analytics, engineering, and other stakeholders.

Overview

11 years of professional experience

Work History

AWS Senior Data Engineer

IBM
09.2023 - Current
  • The migration and modernization of analytical data pipelines project covers the terabyte-scale migration of the company's brand-campaign data from Hive and GCP into a modernized AWS data platform
  • The scope involves remodeling the data, rebuilding the analytical data pipelines in AWS, storing the data in an AWS S3 data lake, and integrating with the AWS cloud DW for BI use cases
  • The technology landscape includes Amazon Redshift, Redshift Spectrum, AWS Glue, Athena, RDS, S3, Step Functions, EMR, AWS Lambda, Amazon EventBridge, PySpark, Hive, Google BigQuery, Google Data Transfer, Google Analytics APIs, and Google Cloud Storage
  • Contribution: Leading 3 critical modules to migrate and modernize the Hive- and GCP BigQuery-based Genentech Digital Insight analytical data pipelines into AWS
  • Extracted data from the Google Search Console API and transitioned the as-is process from Google Cloud Platform (GCP) to Amazon Web Services (AWS)
  • Additionally, developed multiple accelerators to ensure timely project delivery.

AWS Data Engineer

IBM
08.2020 - 08.2023
  • Project name: Enterprise Data Platform
  • Building a cloud-based data lake using a framework that rapidly ingests and curates data with a number of AWS cloud services
  • Data can be consumed through multiple technologies within this architecture, with Amazon Simple Storage Service (S3) as the primary platform for cloud object storage
  • Data is then consumed by compute and analytics engines such as EMR and Redshift
  • Contribution: Worked as a Data Engineer for a US-based banking client, building a cloud-based data lake
  • Responsible for managing data from various sources and for the ingestion and curation of those flat files
  • Wrote extensive Hive queries to transform the data according to business requirements
  • Built and implemented a framework of automated Python test scripts for ingestion and curation.

Big Data Engineer

Retail Client
04.2017 - 07.2019
  • Involved in developing an enterprise data lake within IBM Cloud infrastructure, leveraging the Digital Insights framework to create market-leading analytics
  • Contribution: Responsible for all deliverables and documentation while developing data lake pipelines for data preparation
  • Developed the Python scripts used to ingest flat-file extracts into the raw zone
  • Loaded historical and incremental files using wrapper scripts
  • Implemented a metadata validation framework
  • Proactively identified improvement areas and implemented automated solutions for them
  • Created test scripts and test cases for the EDL quality assurance team
  • Implemented security policies to control access to different Hive databases and HDFS locations
  • Documented reports on the project's activities
  • Implemented data integrity and data quality checks in Hadoop using Hive and Linux scripts
  • Automated the DDL creation process in Hive by mapping MySQL data types.

Data Specialist-Informatica

Telecom Client
02.2014 - 03.2017
  • Involved in building and maintaining an application that extracts data from a legacy system, transforms it per business requirements, and loads it into an Oracle database
  • Contribution: Understood user requirements and analyzed the mapping sheet to trace the data flow from different sources and the transformations to be performed on the raw data
  • Tuned a long-running job, significantly reducing the application's overall processing time
  • Wrote scripts as countermeasures for a few frequent job failures.

Education

Bachelor of Technology in Electronics & Instrument

Bengal Institute of Technology And Management
West Bengal
07.2012

Skills

  • Big Data Technologies (Spark, Hadoop)
  • Programming Languages (Python, SQL)
  • Data Warehouse (Hive, Redshift)
  • Orchestration (Control-M, Step Functions, Airflow)
  • Version Control (Git, Bitbucket)
  • ETL Tools (AWS Glue, Informatica)
  • AWS Services (Athena, RDS, S3, EMR, AWS Lambda, Amazon EventBridge)

Languages

English
Hindi
Bengali
