Project Summary:
The project focuses on building Assisted Data Engineering (ADE), a framework designed to streamline and automate the data transformation process. ADE lets users define transformation job steps from predefined blocks: modular components that encapsulate common data manipulation tasks such as filtering, aggregating, and joining datasets. Using these blocks, users can construct complex data workflows without extensive coding knowledge.
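To make the block concept concrete, here is a minimal sketch of how a block-based job step might be composed; the block functions, paths, and column names are hypothetical illustrations, not the actual ADE interface.

    from pyspark.sql import DataFrame, SparkSession
    from pyspark.sql import functions as F

    def filter_block(df: DataFrame, condition: str) -> DataFrame:
        # Hypothetical "filter" block: keep rows matching a SQL condition
        return df.filter(condition)

    def aggregate_block(df: DataFrame, keys: list, measures: list) -> DataFrame:
        # Hypothetical "aggregate" block: group by keys and sum each measure
        return df.groupBy(*keys).agg(*[F.sum(m).alias(f"total_{m}") for m in measures])

    spark = SparkSession.builder.appName("ade-sketch").getOrCreate()
    orders = spark.read.parquet("s3://example-bucket/raw/orders/")  # hypothetical path

    # Compose predefined blocks into a workflow without hand-written transform code
    curated = aggregate_block(
        filter_block(orders, "order_status = 'COMPLETE'"),
        keys=["region"],
        measures=["order_amount"],
    )
    curated.write.mode("overwrite").parquet("s3://example-bucket/curated/orders_by_region/")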
Roles:
Currently serving as a Senior Cloud Data Architect (individual contributor), with a strong emphasis on hands-on development, coding, and DevOps in AWS environments. Collaborating closely with AWS U.S. teams (AWS ProServ Engagement), I architect and engineer scalable, secure, and high-performing data solutions.
Project Summary:
The TTRT Self-Service Engine project centers on a self-service engine: an intuitive user interface that lets business users run queries with customizable filters. Users can launch queries without writing complex SQL, streamlining how they gather data insights.
At the core of the system, a Python-based query engine dynamically builds the actual queries from user inputs, submits them to Amazon Athena for execution, and fetches the results back to an on-premises UI server for display to the user.
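As an illustration, here is a minimal sketch (assuming boto3) of how such an engine might assemble a query from user-selected filters and submit it to Athena; the database, table, bucket, and filter names are hypothetical.

    import time
    import boto3

    athena = boto3.client("athena", region_name="us-east-1")

    def run_filtered_query(filters: dict) -> list:
        # In production, filter names and values should be validated against a
        # whitelist before being interpolated into SQL
        where = " AND ".join(f"{col} = '{val}'" for col, val in filters.items())
        sql = f"SELECT * FROM analytics_db.sales WHERE {where} LIMIT 100"
        qid = athena.start_query_execution(
            QueryString=sql,
            QueryExecutionContext={"Database": "analytics_db"},
            ResultConfiguration={"OutputLocation": "s3://example-results-bucket/athena/"},
        )["QueryExecutionId"]
        # Poll until the query finishes, then fetch rows for the UI layer
        while True:
            state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
            if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
                break
            time.sleep(1)
        if state != "SUCCEEDED":
            raise RuntimeError(f"Athena query ended in state {state}")
        return athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]

    rows = run_filtered_query({"region": "EMEA", "product": "cards"})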
In addition to the self-service functionality, the project includes a Business-As-Usual (BAU) pipeline built with AWS Glue and PySpark. This pipeline establishes a data lake in Amazon S3, curates the data, and prepares it for reporting and ad hoc analysis; the curated data is made available to the data science team for further processing and insight generation.
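A minimal sketch of what one Glue PySpark job in such a BAU pipeline might look like; the job parameters, paths, and column names are hypothetical.

    import sys
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from pyspark.sql import functions as F

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    spark = glue_context.spark_session
    raw = spark.read.option("header", "true").csv("s3://example-raw-bucket/trades/")

    curated = (
        raw.dropDuplicates(["trade_id"])
           .withColumn("trade_date", F.to_date("trade_date", "yyyy-MM-dd"))
           .withColumn("year", F.year("trade_date"))
    )
    # Partitioned Parquet keeps Athena scans cheap for reporting and ad hoc analysis
    curated.write.mode("overwrite").partitionBy("year").parquet("s3://example-lake-bucket/curated/trades/")
    job.commit()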
The overall solution delivers both efficient, user-friendly query execution for business users and a robust, scalable infrastructure for advanced analytics and reporting. The combination of AWS services and Python-driven query generation makes for a powerful, high-performance data processing and reporting solution.
Role:
I served as an individual contributor AWS data architect and also led the delivery team at Barclays.
Project Summary:
The OneReg project aims to modernize an on-premises Rainstor archival data store by migrating it to AWS, with a focus on processing large-scale data for reporting and predictive analytics. The primary objectives include rebuilding the data ingestion process with PySpark, Python, and AWS Glue; creating a centralized data lake on Amazon S3; and applying AWS Lake Formation for data governance. The project also uses Amazon Athena for ad hoc query execution and Amazon Redshift as the data warehouse to handle large datasets efficiently.
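As an illustration of the warehouse loading step, here is a minimal sketch using the boto3 Redshift Data API; the cluster, IAM role, schema, and table names are hypothetical, not the project's actual configuration.

    import boto3

    rsd = boto3.client("redshift-data", region_name="us-east-1")

    # COPY curated Parquet from the S3 data lake into a Redshift table
    copy_sql = """
        COPY archive.regulatory_events
        FROM 's3://example-lake-bucket/curated/regulatory_events/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-load-role'
        FORMAT AS PARQUET
    """

    resp = rsd.execute_statement(
        ClusterIdentifier="example-onereg-cluster",
        Database="analytics",
        DbUser="etl_user",
        Sql=copy_sql,
    )
    # describe_statement can be polled to confirm the load finished
    print(rsd.describe_statement(Id=resp["Id"])["Status"])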
Tableau will be integrated as the reporting layer, providing end users with interactive dashboards and insights. The project also includes training machine learning models for predictive analytics, ensuring that data processing, storage, and analytics capabilities are optimized in the cloud environment.
This modernization will enable the organization to harness the power of AWS services for scalable, efficient, and secure data management and analytics, ultimately empowering data-driven decision-making and advanced predictive insights.
Project Summary:
The project focused on designing and implementing a centralized Data Lake solution on the Azure Cloud Platform for multiple business verticals of Johnson & Johnson (JnJ), including Medical Devices (MD), Pharmaceuticals, Supply Chain, and Consumer divisions.
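As an illustration, here is a minimal PySpark sketch of a curation step against an ADLS Gen2 data lake of this kind; the storage account, container, and column names are hypothetical, not JnJ's actual layout.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("jnj-lake-sketch").getOrCreate()

    raw_path = "abfss://raw@exampleaccount.dfs.core.windows.net/md/device_sales/"
    curated_path = "abfss://curated@exampleaccount.dfs.core.windows.net/md/device_sales/"

    raw = spark.read.option("header", "true").csv(raw_path)
    curated = (
        raw.dropna(subset=["device_id"])
           .withColumn("sale_date", F.to_date("sale_date", "yyyy-MM-dd"))
    )
    # A raw zone and a curated zone per business vertical keeps downstream reporting consistent
    curated.write.mode("overwrite").parquet(curated_path)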
The project enabled JnJ to leverage a robust data foundation, ensuring faster insights, improved decision-making, and optimized business processes across all verticals.
Role:
I served as the data architect, with hands-on expertise in the Azure platform, and led a team of 16+ data engineers on this JnJ project.
Project Summary:
The iHub v1 program was initiated by Vodafone UK to develop a centralized and consolidated database solution aimed at enhancing KPI (Key Performance Indicator) reporting across Vodafone’s diverse local market landscape.
The implementation of iHub v1 empowered Vodafone to drive better business insights, promote transparency, and streamline performance reporting processes across its operations.
Roles:
Served as Senior Data Engineer and Offshore Lead Data Engineer, with strong expertise in data warehousing, ETL development, AWS cloud services, and Oracle Data Integrator (ODI). Experienced in managing end-to-end data solutions, from requirements analysis to production deployment, across both onshore (London) and offshore (India) environments.
Project Summary:
The LOAD Engine is a comprehensive financial information system built on the Oracle E-Business Suite (OeBS) R12 platform, designed to support three major carriers—CNC, ANL, and DELMAS—along with several agents. The system operates within an Oracle-based environment, comprising four key databases: Engine, OeBS, Reporting, and DEA.
The system ensures seamless financial data management, robust accounting operations, and real-time reporting capabilities, enabling efficient financial oversight across multiple carriers and agents.
Roles:
An experienced BI engineer with a strong focus on ODI (Oracle Data Integrator) development and LOAD Engine application support, I played a dual role spanning development and AMS (Application Management Support) activities, developing and updating ODI scenarios to meet evolving business requirements while ensuring smooth, efficient support of critical applications.
I was a reliable contributor to both the ODI development team and the AMS team, consistently delivering high-quality solutions and ensuring client satisfaction.
Data Architecture
Data Engineering
Data Warehousing and Data Modeling
Redshift
Oracle
Azure Synapse
Snowflake
Spark
PySpark
AWS Glue
Databricks
EMR
Kinesis
Azure Data Factory (ADF)
Athena
Python
SQL and PL/SQL
PostgreSQL
GitHub, Bitbucket, Jira
Airflow
Autosys
AWS CloudFormation
Infrastructure as Code (IaC)