Experienced and certified AWS Data Engineer with 6.6 years of expertise in developing and optimizing ETL processes. Proficient in PySpark, AWS, SQL, and Python, delivering scalable and efficient data solutions. Demonstrated success in optimizing workflows, ensuring data quality, and collaborating with cross-functional teams to achieve business objectives. Built scalable, robust ETL processes on AWS for efficient data extraction, transformation, and loading, and developed and maintained data pipelines using AWS Glue, leveraging PySpark for data transformations and workflow performance tuning.
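A minimal sketch of the kind of AWS Glue PySpark pipeline described above; the database, table, and bucket names are hypothetical placeholders, not a specific production job.

# Minimal AWS Glue PySpark job sketch; catalog and S3 names are hypothetical.
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a table registered in the Glue Data Catalog (placeholder names).
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders")

# Apply transformations with plain Spark DataFrame operations.
df = dyf.toDF()
df = (df.filter(F.col("status") == "COMPLETED")
        .withColumn("order_date", F.to_date("created_at")))

# Write curated output back to S3 as partitioned Parquet.
df.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-bucket/orders/")

job.commit()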
• Implemented and managed data lakes on Amazon S3, ensuring secure and efficient storage of large datasets.
• Delivered an event-driven system using AWS SNS and SQS for seamless communication and coordination with upstream systems.
• Used AWS Lambda for serverless computing, automating data processing tasks, improving overall system efficiency, and reducing manual effort.
• Integrated diverse data sources, including Amazon S3, the AWS Glue Data Catalog, and Amazon EMR, into cohesive pipelines.
• Collaborated closely with stakeholders to gather data requirements and translate them into effective Spark SQL queries and DataFrame operations.
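A minimal sketch of the event-driven pattern above: an SQS-triggered Lambda handler that unwraps an SNS-delivered message and starts a downstream Glue job. The job name and payload fields are illustrative assumptions.

# SQS-triggered Lambda handler sketch; job name and payload shape are assumptions.
import json
import boto3

glue = boto3.client("glue")

def handler(event, context):
    for record in event["Records"]:           # one entry per SQS message
        body = json.loads(record["body"])
        # SNS-to-SQS delivery wraps the payload in a "Message" field.
        payload = json.loads(body["Message"]) if "Message" in body else body
        # Kick off a Glue job for the object the upstream system announced.
        glue.start_job_run(
            JobName="curate-orders",          # hypothetical job name
            Arguments={"--source_key": payload.get("key", "")})
    return {"processed": len(event["Records"])}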
EDP DATA LAKE DATA INTEGRATION HUB
Data ingestion and integration frameworks provide a standardized, easy, and reliable way for Barclays departments to load their data into the data warehouse. The frameworks are built using enterprise ETL tools; they shield users from coding complexity and tool-specific knowledge while giving applications enterprise-level features such as security, data lineage, high performance, and high reliability. They offer a standardized, reusable solution for ingesting and integrating data from sources across the bank into central data warehouses, and are used by Barclays UK, Cards and Payments, Corporate, Investment Bank, Group Risk and Technology, Legal, Fin-Crime, HR, Compliance Tech, and Wealth.
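An illustrative, simplified sketch of a config-driven ingestion pattern like the one described, where each feed is registered once and loaded the same standardized way; the feed names and S3 paths are hypothetical, not the actual framework code.

# Config-driven ingestion loop sketch; feeds and paths are hypothetical.
from pyspark.sql import SparkSession

SOURCES = [  # one entry per feed registered with the framework
    {"name": "cards_txns", "format": "csv", "path": "s3://example-landing/cards/"},
    {"name": "risk_feed", "format": "json", "path": "s3://example-landing/risk/"},
]

spark = SparkSession.builder.appName("ingestion-framework").getOrCreate()

for src in SOURCES:
    df = spark.read.format(src["format"]).option("header", "true").load(src["path"])
    # Standardized write: every feed lands in the warehouse zone the same way.
    df.write.mode("append").format("parquet").save(
        f"s3://example-warehouse/{src['name']}/")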
• Collaborating with stakeholders to gather requirements and understand data integration needs.
• Developing data integrations using AWS services and defining integration patterns.
• Monitoring and optimizing the performance of data integration workflows.
• Orchestrating ETL jobs using AWS Step Functions (see the sketch after this list).
• Developing and maintaining ETL jobs using AWS Glue for data extraction, transformation, and loading.
• Developing scripts to define data transformation logic.
• Working with other roles to establish data partitioning, organization, and backup strategies.
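A brief sketch of the Step Functions orchestration mentioned above: starting a state machine execution that drives the ETL jobs. The state machine ARN and input payload are placeholder assumptions.

# Starting a Step Functions execution via boto3; ARN and input are placeholders.
import json
import boto3

sfn = boto3.client("stepfunctions")

response = sfn.start_execution(
    stateMachineArn="arn:aws:states:eu-west-1:123456789012:stateMachine:etl-pipeline",
    input=json.dumps({"run_date": "2024-01-31"}),  # illustrative input payload
)
print(response["executionArn"])  # track this run in the Step Functions console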
Cloud
AWS CERTIFIED DATA ENGINEER - ASSOCIATE