
Client:
Description:
Roles and Responsibilities:
• Managed data from sources like S3 and worked extensively with various file formats for data extraction
and transformation.
• Developed EMR steps to read data from and write data to S3 on a scheduled basis.
• Ran PySpark transformation jobs and monitored them using CloudWatch.
• Loaded data from S3 into Redshift, applying custom transformations per client requirements.
• Designed and optimized databases and data schemas in Amazon Redshift.
• Orchestrated and scheduled data pipeline jobs using Airflow.
• Participated in sprint planning, sprint reviews, retrospectives, backlog grooming sessions, and peer reviews.
Client:
Description:
Roles and Responsibilities:
• Employed DML operations to modify existing records, add new data, and remove obsolete entries based
on business requirements.
• Defined data structures, enforced data integrity constraints, and optimized database performance
through appropriate indexing strategies.
• Utilized built-in methods and operators to perform operations on different data types, ensuring efficient
data handling and manipulation.
• Designed and implemented custom functions using the def keyword, encapsulating reusable blocks of
code to promote modularity and code reuse.
• Optimized PySpark code and modularized it into reusable utilities.
Title: CLOUD DATA PROFESSIONAL