Sunit Jena

Pune

Summary

Results-driven Data Engineer with 3+ years of experience and a passion for data analytics and ETL platforms, seeking a Data Engineer role working with large datasets and transforming them into usable products that inform and improve business decisions. Eager to build analytics platforms on large datasets that help customers and the organization make better business decisions. Proven record of driving multiple projects and contributing as an emerging leader in training workshops. Identified trends in datasets, accurately read data models and code, and developed data rules from that analysis. Developed solutions for real-time distributed data processing and computational pipelines. Proficient in Databricks, Informatica, and SQL.

Overview

2 years of professional experience

Work History

OMNI MDM INTEGRATION HUB

ZS ASSOCIATES
07.2024 - Current
  • Revamping the OMNI MDM pipeline using a new, configurable framework, enabling faster onboarding of over 30 data sources, including IQVIA Onekey, Leads, Symphony, and others.
  • Developed and implemented an ingress pipeline in Databricks, replicating existing IICS logic and adapting it to meet business requirements across multiple data sources.
  • Utilized configuration tables for standardization, delta detection, and JSON creation, streamlining data processing for diverse sources and enhancing flexibility.
  • Integrated data from diverse sources into Reltio for master data management, ensuring seamless data loading and merging for downstream teams.
  • Collaborated with cross-functional teams to ensure efficient data publishing and mastering, supporting accurate and up-to-date data for business operations.

CUSTOMER BRIDGING & SALES DEDUPLICATION

ZS ASSOCIATES
01.2023 - 07.2023
  • Developed an efficient MDM (Master Data Management) system for a pharmaceutical client, leveraging IQVIA data to identify and eliminate duplicate customers across multiple sources
  • Analyzed existing BDM (Business Data Model) mappings, collated necessary changes, and proposed enhancements to optimize the system by removing duplicate customer and sales data
  • Created parameterized ETL pipelines using Python and SQL on Databricks, enabling code execution across multiple environments for seamless data processing
  • Utilized AWS S3 for efficient archival and file storage, ensuring secure storage of configuration files
  • Implemented a Databricks workflow leveraging Apache Spark and collaborative notebooks to streamline and automate data pipelines
  • Orchestrated workflows using Tidal, ensuring smooth execution and scheduling of data processing tasks
  • Utilized Git integration for version control, enabling code management and collaboration across multiple environments
  • Helped the team design project timelines based on client requirements, contributed to the development of enhancements and requirements, and conducted comprehensive testing (unit testing, system integration testing, and user acceptance testing)
  • Worked diligently to ensure end-to-end data integrity and quality checks post-production, guaranteeing the smooth functioning of the system in the live production environment

IN-VIVO DATA MANAGEMENT AND INSIGHTS

ZS ASSOCIATES
08.2022 - 02.2023
  • Developed an intuitive ETL platform for pharma In-Vivo data, enabling seamless data consumption and transformation for end users
  • Empowered users with self-service capabilities for data ingestion, transformation, and loading, enhancing operational efficiency
  • Conducted comprehensive data profiling, exploratory data analysis (EDA), and data standardization to ensure data quality and consistency, aligning with project requirements
  • Leveraged IICS functionalities, including mappings, mapping tasks, mapplets, and task flows, to construct end-to-end data pipelines
  • Streamlined data integration and transformation, optimizing the workflow process for efficient data processing
  • Implemented fact and dimension tables (SCD Type 1 and Type 2) in Oracle DBMS, utilizing Python and SQL on IICS to build ETL pipelines
  • Ensured parameterization of mappings for reusability and scalability
  • Scheduled task flows using the Tidal orchestration tool, monitoring job execution and notifying the operations team via email in case of failures to keep data processing uninterrupted
  • Utilized Git integration for version control, facilitating efficient code management, collaboration, and version tracking
  • Mentored and trained new associates, fostering their successful integration into the team and promoting a knowledge-sharing culture
  • Assisted the team in designing project timelines and interpreting client requirements, and played a key role in the development of enhancements
  • Conducted thorough testing, including unit testing, system integration testing, and user acceptance testing, ensuring high-quality deliverables and meeting project milestones

Education

Bachelor of Technology - Information Technology

Army Institute of Technology
Pune
06.2021

12th -

Army Public School
Secunderabad
05.2016

10th -

Army Public School
Secunderabad
05.2014

Skills

  • Databricks
  • Informatica
  • Oracle Server
  • MySQL
  • GitHub
  • Power BI
  • SQL
  • Spark SQL
  • PySpark
  • Tidal
  • Data Analytics
  • Data Modeling
  • AWS (S3, AWS IAM)
