
Prathamesh Shewale

Pune

Summary

4.2+ years of software development expertise, specializing in data analytics solutions for real-world business challenges. Proficient in handling complex legal and manufacturing documents. Skilled in the complete software development lifecycle, delivering user-centric solutions within tight timelines. Experienced with technologies such as PySpark, AWS services, and Python.

Overview

4+ years of professional experience

Work History

Clinical Hub for Adverse Event Reporting Solution

Saama Technologies India pvt
Pune
12.2023 - Current
  • The CHAERS solution automates adverse event reporting for USMA-managed outsourced studies that are not supported by the internal AERO safety reporting pipeline.
  • Designed the module's architecture, selecting which AWS services to use based on budget.
  • Designed and implemented scalable data pipelines using AWS Glue jobs.
  • Built ETL workflows using Step Functions and loaded data from various sources into target data warehouses.
  • Wrote Python and PySpark scripts to fetch data from APIs and load it into tables, with Athena used for data validation.
  • Processed XML and JSON files in PySpark (see the sketch after this list).
  • Developed a CI/CD pipeline to copy code from GitLab to an S3 bucket.
  • Automated deployments across environments using AWS CloudFormation.
  • Produced documentation, including the data dictionary.
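For illustration, a minimal PySpark sketch of this kind of JSON/XML ingestion. The bucket names, paths, and rowTag are placeholders rather than the project's actual configuration, and reading XML this way assumes the spark-xml package is attached to the Glue/EMR job.

# Illustrative PySpark ingestion sketch; paths, bucket names, and the XML
# rowTag below are placeholders, and the XML read assumes the spark-xml
# package (com.databricks:spark-xml) is available on the cluster.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("chaers-ingest-sketch").getOrCreate()

# JSON adverse-event payloads pulled from the API and landed on S3.
json_df = spark.read.json("s3://example-bucket/raw/adverse_events/json/")

# XML safety reports; rowTag names the repeating element that becomes a row.
xml_df = (
    spark.read.format("xml")
    .option("rowTag", "SafetyReport")
    .load("s3://example-bucket/raw/adverse_events/xml/")
)
xml_df.printSchema()  # inspect the parsed XML structure

# Light normalisation before writing to the curated zone queried by Athena.
curated = json_df.withColumn("ingest_date", F.current_date())
curated.write.mode("append").partitionBy("ingest_date").parquet(
    "s3://example-bucket/curated/adverse_events/"
)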

Project: Genentech PCT Module

Saama Technologies India pvt
Pune
05.2023 - 11.2023
  • Designed and implemented scalable data pipelines using PySpark, processing large volumes of data in the Life Science domain for analytics and reporting purposes.
  • Built ETL workflows using AWS Glue and loaded data from various sources into target data warehouses.
  • Developed and maintained data ingestion processes from different data sources, ensuring data quality and consistency: files delivered by Product Owners land in an Amazon S3 location, where their format and columns are checked and the data is validated before processing.
  • Used Airflow DAGs to orchestrate the pipeline jobs; business logic is implemented in PySpark and runs on Amazon EMR clusters (see the DAG sketch after this list).
  • Used Athena for data validation.
  • Collaborated with cross-functional teams, including data scientists and business analysts, to understand data requirements, deliver effective solutions, and maintain data governance standards.
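For illustration, a minimal Airflow DAG sketch of this kind of orchestration. The DAG id, schedule, and task bodies are placeholders, with the EMR/Glue submissions stubbed out; this is not the project's actual pipeline.

# Illustrative Airflow DAG sketch; DAG id, schedule, and task bodies are
# placeholders (the real pipeline submits PySpark business logic to EMR).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def validate_s3_files(**context):
    # Placeholder: check the format and columns of files dropped on S3.
    print("validating incoming files")


def run_pyspark_transform(**context):
    # Placeholder: the real step submits the PySpark business logic to EMR.
    print("submitting transform job")


with DAG(
    dag_id="pct_pipeline_sketch",
    start_date=datetime(2023, 5, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    validate = PythonOperator(task_id="validate_files", python_callable=validate_s3_files)
    transform = PythonOperator(task_id="pyspark_transform", python_callable=run_pyspark_transform)

    validate >> transform  # validation gates the transform step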

Project: Smart Data Quality

Saama Technologies India pvt
Pune
07.2022 - 04.2023
  • Automated and accelerated data management processes.
  • With SDQ, data discrepancies are identified automatically as they are captured, reducing the time to raise a query from over 25 days to under 2 days.
  • Developed Flask APIs for CRUD operations on PostgreSQL tables (a minimal sketch follows this list).
  • Wrote test cases for the Flask APIs using Postman, integrated with Jenkins.
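For illustration, a minimal Flask CRUD sketch against PostgreSQL. The table, columns, and connection string are placeholders rather than the SDQ schema, and Flask-SQLAlchemy is assumed as the database access layer.

# Illustrative Flask CRUD sketch; the table, columns, and connection string
# are placeholders (not the SDQ schema), and Flask-SQLAlchemy is assumed as
# the PostgreSQL access layer.
from flask import Flask, jsonify, request
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql://user:password@localhost/sdq"
db = SQLAlchemy(app)


class DataIssue(db.Model):
    # Hypothetical table tracking discrepancies flagged by the SDQ checks.
    id = db.Column(db.Integer, primary_key=True)
    description = db.Column(db.Text, nullable=False)
    status = db.Column(db.String(32), default="open")


with app.app_context():
    db.create_all()


@app.route("/issues", methods=["POST"])
def create_issue():
    payload = request.get_json()
    issue = DataIssue(description=payload["description"])
    db.session.add(issue)
    db.session.commit()
    return jsonify({"id": issue.id}), 201


@app.route("/issues/<int:issue_id>", methods=["GET"])
def read_issue(issue_id):
    issue = DataIssue.query.get_or_404(issue_id)
    return jsonify({"id": issue.id, "description": issue.description, "status": issue.status})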

Project: Deep Learning Intelligent Assistant

Saama Technologies India pvt
Pune
04.2021 - 06.2022
  • DALIA is an AI-based assistant that provides easy-to-use, content- and domain-aware conversational experiences with key data and insights from Saama's award-winning Life Science Analytics Cloud (LSAC).
  • Developed and deployed the chatbot on the server.
  • Implemented automatic generation of complex queries for different questions and intents.
  • Extended the original single-intent implementation to handle multiple intents per question (see the sketch after this list).
  • Optimized the code to reduce response time and implemented entity auto-suggestion functionality.
  • Scaled the service to process up to 100 users per second.
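For illustration, a hypothetical sketch of multi-intent dispatch. The intent names, entities, and query builders are placeholders, not DALIA's internals.

# Hypothetical multi-intent dispatch sketch; the intent names, entities, and
# query builders are placeholders, not DALIA's internals.
from typing import Callable, Dict, List


def build_enrollment_query(entities: Dict[str, str]) -> str:
    return f"SELECT COUNT(*) FROM subjects WHERE study_id = '{entities['study']}'"


def build_adverse_event_query(entities: Dict[str, str]) -> str:
    return f"SELECT * FROM adverse_events WHERE study_id = '{entities['study']}'"


# Registry mapping each detected intent to its query builder.
QUERY_BUILDERS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "enrollment_count": build_enrollment_query,
    "adverse_events": build_adverse_event_query,
}


def handle_intents(intents: List[str], entities: Dict[str, str]) -> List[str]:
    # A single question may carry several intents; build one query per intent.
    return [QUERY_BUILDERS[name](entities) for name in intents if name in QUERY_BUILDERS]


print(handle_intents(["enrollment_count", "adverse_events"], {"study": "ABC-123"}))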

Project: CDH

Saama Technologies India pvt
Pune
01.2020 - 03.2021
  • Clinical Data Hub is designed to handle every task related to data generated in clinical trials, from registering the study to ensuring the results conform to CDISC standards.
  • Developed an automated pipeline for converting raw data to the standard clinical format.
  • Created Flask APIs for collecting source data information from S3 and SFTP sources.
  • Reduced data reading and loading times by using Dask and Modin dataframes (see the sketch after this list).
  • Wrote data-cleaning Python scripts using pandas and stored the data in the target layer.
  • Modularized the existing code as part of enhancements, using Python OOP concepts.
  • Wrote test cases to eliminate bugs and inconsistencies in the code.
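For illustration, a minimal Dask/pandas cleaning sketch. The file paths and cleaning steps are placeholders rather than the CDH data model, and the S3 reads/writes assume s3fs and pyarrow are installed.

# Illustrative Dask/pandas cleaning sketch; file paths and cleaning steps are
# placeholders (not the CDH data model); S3 access assumes s3fs and pyarrow.
import dask.dataframe as dd

# Dask reads the raw CSV partitions lazily, keeping large files out of memory.
raw = dd.read_csv("s3://example-bucket/raw/study_data/*.csv", dtype=str)

# Basic cleaning: drop fully empty rows and normalise column names.
cleaned = raw.dropna(how="all")
cleaned.columns = [c.strip().lower().replace(" ", "_") for c in cleaned.columns]

# Materialise the result as a pandas DataFrame for downstream standardisation,
# then write it to the target layer as Parquet.
result = cleaned.compute()
result.to_parquet("s3://example-bucket/target/study_data/cleaned.parquet", index=False)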

Education

Post Graduate Diploma - Big Data Engineering

CDAC
Pune
2019

Bachelor of Engineering - Electronics and Telecommunication

PVG College of Engineering
Nashik, MH
2018

Skills

  • Python
  • Pandas
  • Flask API
  • Postgres
  • SQL
  • PySpark
  • AWS Services
  • GitHub
  • Postman
  • Jenkins

Accomplishments

  • Received the 'Shining Star of the Month' award for June 2022 and October 2023

Timeline

Clinical Hub for Adverse Event Reporting Solution

Saama Technologies India pvt
12.2023 - Current

Project: Genentech PCT Module

Saama Technologies India pvt
05.2023 - 11.2023

Project: Smart Data Quality

Saama Technologies India pvt
07.2022 - 04.2023

Project: Deep Learning Intelligent Assistant

Saama Technologies India pvt
04.2021 - 06.2022

Project: CDH

Saama Technologies India pvt
01.2020 - 03.2021

Post Graduate Diploma - Big Data Engineering

CDAC

Bachelor of Engineering - Electronics and Telecommunication

PVG College of Engineering