This project builds a data lake for the client by loading data from RDBMS sources such as PostgreSQL and MySQL into the database environment using Airflow jobs and storing it in AWS S3 folders.
- Loading data from RDBMS sources (PostgreSQL, MySQL) into the database environment using Airflow jobs and storing it in AWS S3 folders.
- Testing and validating the data in the database by comparing it with the RDS data.
- Validating data using AWS Athena or SQL queries from the CLI.
- Testing the functionality of each build and verifying the partitions created in S3 storage.
- Regression testing to verify that new functionality works with the existing code.
- Raising bugs found during testing with the responsible developers, sharing the logs produced by the job, and tracking the bugs in Jira.
- Following up and working with the developers to resolve the bugs raised.
- Participating in daily stand-ups with the team and in client meetings when required.
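
The validation steps listed above (comparing loaded data with the RDS source and verifying S3 partitions) could be sketched roughly as follows. This is a minimal illustration, not the project's actual test code: the function names, the table names, and the Hive-style `dt=YYYY-MM-DD` partition layout are all assumptions, and the row counts are expected to come from `COUNT(*)` queries run separately against RDS and Athena.

```python
from typing import Dict, Iterable, List


def compare_row_counts(source_counts: Dict[str, int],
                       target_counts: Dict[str, int]) -> List[str]:
    """Return one description per table whose counts disagree.

    Both arguments map a table name to its row count (hypothetical inputs,
    e.g. collected from COUNT(*) queries on the source and the data lake).
    """
    mismatches = []
    for table in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(table)
        tgt = target_counts.get(table)
        if src is None:
            mismatches.append(f"{table}: missing in source")
        elif tgt is None:
            mismatches.append(f"{table}: missing in target")
        elif src != tgt:
            mismatches.append(f"{table}: source={src}, target={tgt}")
    return mismatches


def missing_partitions(expected_values: Iterable[str],
                       s3_keys: Iterable[str],
                       partition_column: str = "dt") -> List[str]:
    """Return expected partition values that do not appear in any S3 key.

    Assumes Hive-style keys such as 'lake/orders/dt=2024-01-01/part-0.parquet';
    the column name 'dt' is an illustrative default.
    """
    prefix = partition_column + "="
    found = set()
    for key in s3_keys:
        for segment in key.split("/"):
            if segment.startswith(prefix):
                found.add(segment[len(prefix):])
    return sorted(set(expected_values) - found)
```

Any non-empty result from either helper would be a candidate for a bug report, with the job logs attached.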