Job Profile
Mar 2023 - Jan 2024
Role: Data Lead
- Work closely with Clients and Solution Architects to understand the requirements
- Perform Data Modeling
- Build and Maintain Terraform IaC for ETL Pipelines
- Write Code for ETL Pipelines in PySpark
- Translate Business Requirements into User Stories, and create Tasks and Milestones to achieve the desired outcome
- Document Solutions for future reference
- Create and review Pull Requests for Code Changes
- Propose Data Product Architecture
Feb 2022 - Feb 2023
Role: Lead Software Engineer
- Work closely with Clients to understand the requirements
- Propose Microservice Architecture
- Backend Development
- Document Solutions for future reference
Awards / Certifications / Specific Achievements
- Completed Architect Academy - an elite Academy from GlobalLogic Practices
- Completed Architect Apprenticeship Program from GlobalLogic
- Received Quarterly Eminence award for Q4 2022
- Received Customer Award in Sep 2023 for exceptional contribution to Data Pipelines and Data Infrastructure
- Improved Data Ingestion Strategy to reduce load on source database and batch latency
- Created Utility module that can be shared across Glue Jobs
Project 1: Data Engineering (Mar 2023 to date)
Roles: Data Lead, DevOps Engineer
Tech Stack: Glue, Lambda, Terraform, GitHub CI/CD, Snowflake, QuickSight, PySpark, RDS, SNS, S3, Athena, CloudWatch, EventBridge, EC2, EKS, React, Java
Responsibilities:
- Review / Build and maintain data pipelines in AWS Glue / PySpark
- Automate data ingestion and data processing using cloud-native tools
- Data Modeling
- Continuously improve data infrastructure to increase efficiency and scalability
- Terraform scripting to create Data Pipeline resources in AWS
Project 2: User Discovery (Sep 2022 - Feb 2023)
Project 3: Unified User Data Modeling (UUDM) (Feb 2022 - Aug 2022)
Role: Software Engineer Lead
Tech Stack: CDK, Node.js / TypeScript, Lambda, DynamoDB, S3, SNS, Kafka, OpsGenie, CloudWatch
Responsibilities / Achievements:
- Designed Serverless Solution for User Discovery Microservice that consumes millions of messages and publishes User Churn and User Segmentation messages to Kafka
- Backend Development in TypeScript in line with Organization's principles / patterns - Domain-Driven Design (DDD) and Test-Driven Development (TDD)
- Developed CI/CD pipeline following the GitOps model
- Microservice receives millions of messages and publishes output without any batch latency
- 100% Serverless and highly cost-effective
Project 4: CDK Typescript Accelerator for Serverless
Project Type: Self Initiative - published in Organization Practices portal
Tech Stack: CDK, Node.js / TypeScript, S3, Lambda, SQS, SNS, IAM, API Gateway, Custom Authorizer Lambda, CloudFront, Cognito, EventBridge and S3 Events, Custom Resource Manager, CloudFormation, DynamoDB, KMS, RDS
Role: Individual Contributor
- Helps deploy the following resources on Day 1 for a serverless Microservice: Lambda, API Gateway, RDS, DynamoDB, S3, SNS, SQS, Kafka Streaming, EventBridge, IAM Roles, IAM Policies, Cognito, Custom Authorizer & KMS
Project 5: POV on Hadoop Migration
Project Type: Self Initiative
Highlights:
- Technical Strategy for migrating Petabytes of Hadoop Data to GCP with zero downtime