Biswajeet Mahato

Consultant - Data Engineer
Bengaluru

Summary

5+ years of experience with a core focus on building and maintaining ETL pipelines in Azure and AWS. Experienced in onboarding new team members and building their skills on the project. Developed strong skills in Spark, cloud technology, and ETL platforms such as Databricks and AWS EMR while working on several projects.

Overview

4 years of post-secondary education
5 Certifications

Work History

Consultant - Data Engineer

Capgemini Invent
Bengaluru

Preclinical Pipeline - data42 (Novartis)

  • Built an allometric scaling pipeline (measures the relation between CL and VSS).
  • Wrote efficient PySpark scripts to transform data and integrate it with the preclinical pipeline.
  • Implemented scientific use cases in the preclinical pipeline.
  • Tech: PySpark, Python, Foundry

Data Lake Development for multiple regions (APAC, LATAM, EMEA etc.) - Roche

  • Wrote various utilities (data extraction, data ingestion, data transformation).
  • Onboarded and upskilled new team members on the project.
  • Created plans and communicated deadlines to ensure project sprints were completed on time.
  • Created STTM (Source to Target Mapping) documents to apply business logic to transformed data.
  • Resolved various production issues (Airflow DAGs, Spark jobs).
  • Tech: Python, PySpark, SQL, AWS (S3, EMR, Athena, Redshift, Glue)

Epidemiology Parquet Ingestion (EPI) - Gilead Sciences

  • Built ADF pipelines and integrated them with Azure Databricks.
  • Developed PySpark scripts to apply business logic to data.
  • Created Standard Operating Procedure documents for the Operations team.
  • Created project plans and sent daily updates on project status.
  • Tech: Databricks, ADF, ADLS, Python, PySpark, SQL

Enterprise Data Lake 2.0 (EDL) - Amgen

  • Analyzed and implemented production issue fixes across environments (Dev, Test, and Production) after obtaining change request approval from the Project Manager.
  • Built a utility to extract data from various sources.
  • Optimized Spark code and Spark configuration.
  • Quickly learned new skills and applied them to daily tasks, improving efficiency and productivity during the project migration from Cloudera to Databricks.
  • Created project plans, tracked delivery milestones, and provided daily and weekly project status updates to the client and top management.
  • Tech: Hadoop, Hive, Sqoop, Python, SQL, Spark, Databricks

Education

PG Diploma - Big Data Analytics

Centre for Development of Advanced Computing
Bengaluru, India

Bachelor of Technology - Electronics and Communication Engineering

Tezpur University
Tezpur, Assam
08.2013 - 06.2017

Skills

  • Python

  • PySpark

  • SQL

  • Databricks

  • Azure Data Lake Storage

  • Azure Synapse Analytics

  • Azure Data Factory

  • AWS (EMR, S3, Athena, Glue, Redshift)

  • Hadoop (Hive, Sqoop)

  • Airflow

  • Git

Data Domains

  • Preclinical
  • Patients
  • Healthcare Professionals
  • Drug Manufacturing
  • Pharmaceutical Drug Market Data

Certification

Microsoft Azure Data Fundamentals
