Project 1: Enterprise Data Migration Validation – UDAP Platform
Environment: Azure Databricks, PySpark, SQL, ADLS Gen2, MFT
Responsibilities
- Performed data validation for enterprise data migration from multiple source systems into the UDAP data platform.
- Validated datasets ingested from EDP Storage, EDW systems, and business file sources.
- Executed Databricks PySpark notebooks to verify ingestion and transformation results.
- Conducted source-to-target validation, including record count checks, column-level validation, and data reconciliation.
- Verified data processing across Raw, Staging, and Curated layers in the data platform.
- Performed post-production validation (PPV) to confirm that production data was processed correctly after deployment.
- Conducted data quality checks on migrated datasets to ensure completeness and accuracy.
- Validated the monthly data delivery process via MFT to the client's Q-Drive location.
- Verified that the latest monthly datasets were delivered successfully to downstream systems.
Project 2: Medicaid Claims Data Pipeline Enhancements
Environment: Azure Data Factory, Azure Databricks, SQL, PySpark, ADLS Gen2, Azure DevOps
Responsibilities
- Performed QA validation for enhancements made to existing Medicaid claims data processing pipelines.
- Validated execution of Azure Data Factory pipelines triggering Databricks notebooks.
- Monitored pipeline execution scheduled through Azure DevOps CI/CD workflows.
- Executed Databricks notebooks and verified output datasets after pipeline execution.
- Performed source-to-target validation on Medicaid claims datasets, including Medicaid IDs and insurance claim records.
- Logged defects for inconsistencies found in processed datasets and coordinated with developers for fixes.
- Conducted re-testing and regression testing after bug resolution.
- Created and executed UAT test cases for pipeline enhancements and data processing logic.
- Performed post-production validation (PPV) after deployment to confirm that production data pipelines executed successfully.
- Ensured data integrity, completeness, and compliance with business rules across claims datasets.