Biswajeet Mahato

Summary

5+ years of experience with a core focus on building and maintaining ETL pipelines in Azure and AWS. Experienced in onboarding and upskilling new team members on projects. Developed strong skills in Spark, cloud technology, and ETL technologies such as Databricks and AWS EMR while working on several projects.

Overview

4 years of post-secondary education

5

Certifications

Work History

Consultant - Data Engineer

Capgemini Invent

Bengaluru

Preclinical Pipeline - data42 (Novartis)

Built an allometric scaling pipeline (measures the relation between CL & VSS).
Wrote efficient PySpark scripts to transform data and integrate it with the preclinical pipeline.
Implemented scientific use cases in the preclinical pipeline.
Tech: PySpark, Python, Foundry

Data Lake Development for multiple regions (APAC, LATAM, EMEA etc.) - Roche

Wrote various utilities for data extraction, data ingestion, and data transformation.
Onboarded and upskilled new team members on the project.
Created plans and communicated deadlines to ensure project sprints were completed on time.
Created STTM (Source to Target Mapping) to apply business logic to transformed data.
Resolved various production issues (Airflow DAGs, Spark jobs).
Tech: Python, PySpark, SQL, AWS (S3, EMR, Athena, Redshift, Glue)

Epidemiology Parquet Ingestion (EPI) - Gilead Sciences

Built ADF pipelines and integrated them with Azure Databricks.
Developed PySpark scripts to apply business logic to data.
Created Standard Operating Procedure (SOP) documents for the Operations team.
Created project plans and sent daily updates on project status.
Tech: Databricks, ADF, ADLS, Python, PySpark, SQL

Enterprise Data Lake 2.0 (EDL) - Amgen

Analyzed and implemented production issue fixes across environments (Dev, Test, and Production) after obtaining change request approval from the Project Manager.
Built a utility to extract data from various sources.
Experienced in Spark code optimization and Spark configuration optimization.
Quickly learned new skills and applied them to daily tasks, improving efficiency and productivity during the project migration from Cloudera to Databricks.
Created project plans, tracked delivery milestones, and provided daily and weekly project status updates to the client and top management.
Tech: Hadoop, Hive, Sqoop, Python, SQL, Spark, Databricks

Education

PG Diploma - Big Data Analytics

Centre For Development of Advanced Computing

Bengaluru, India

Bachelor of Technology - Electronics And Communication Engineering

Tezpur University

Tezpur, Assam

08.2013 - 06.2017

Skills

Python
PySpark
SQL
Databricks
Azure Data Lake Storage
Azure Synapse Analytics
Azure Data Factory
AWS (EMR, S3, Athena, Glue, Redshift)
Hadoop (Hive, Sqoop)
Airflow
Git

Data Domains

Preclinical
Patients
Healthcare Professionals
Drug Manufacturing
Pharmaceutical Drug Market Data

Certification

Microsoft Azure Data Fundamentals

Timeline

Bachelor of Technology - Electronics And Communication Engineering

Tezpur University

08.2013 - 06.2017

Consultant - Data Engineer

Capgemini Invent

PG Diploma - Big Data Analytics

Centre For Development of Advanced Computing
