Project: Deltalake Housekeeper
Developed a Spark-based SaaS application and Java client that integrates with any pipeline, optimizes Spark job runtimes, and performs Delta table maintenance activities.
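A minimal sketch of the maintenance step such a housekeeper automates, assuming Delta Lake's Scala API; the table path and retention window are illustrative:

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object HousekeeperSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("deltalake-housekeeper")
      // Delta Lake needs these two settings on a vanilla Spark session.
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog",
              "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()

    // Hypothetical table path; in the real product this would come from tenant config.
    val table = DeltaTable.forPath(spark, "s3://bucket/warehouse/orders")

    // Compact small files, then drop stale files outside the retention window.
    table.optimize().executeCompaction()
    table.vacuum(168) // retain 7 days of history
  }
}
```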
Roles and Responsibilities:
Project: Schema Manager
A Spark- and Kubernetes-based multi-tenant application built to create version-controlled, contract-enforcing, maintenance-friendly tables across Delta Lake, Postgres, and Snowflake.
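A minimal sketch of how a versioned table contract might be modeled and enforced; `TableContract` and `enforce` are hypothetical names standing in for the real multi-backend implementation:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.StructType

// Hypothetical contract model: a versioned schema the table must obey.
final case class TableContract(name: String, version: Int, schema: StructType)

object ContractEnforcer {
  // Fail fast if a writer's DataFrame drifts from the contracted schema.
  def enforce(contract: TableContract, df: DataFrame): DataFrame = {
    val expected = contract.schema.fields.map(f => f.name -> f.dataType).toMap
    val actual   = df.schema.fields.map(f => f.name -> f.dataType).toMap
    val missing  = expected.keySet.diff(actual.keySet)
    val drifted  = expected.collect {
      case (column, t) if actual.get(column).exists(_ != t) => column
    }
    require(missing.isEmpty && drifted.isEmpty,
      s"${contract.name} v${contract.version}: missing=$missing drifted=$drifted")
    df.select(contract.schema.fieldNames.map(df.col): _*) // contracted column order
  }
}
```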
Roles and Responsibilities:
Achievements:
Project: Egress Share
Project: Low Code Platform
Roles and Responsibilities:
Achievements:
Project: Nike Communications Platform
Platform for running user campaigns on Nike-owned apps.
Roles and Responsibilities:
1) Built scalable, maintenance-friendly data solutions to provide reliable and consistent data to stakeholders.
2) Built and managed batch and streaming workflows, consuming from and writing to sources such as Kafka, S3, and Postgres (see the sketch after this list).
3) Analyzed customer needs and translated them into requirements.
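A minimal sketch of the streaming pattern from item 2, assuming Spark Structured Streaming with illustrative broker, topic, and bucket names:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object KafkaToS3 {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("comm-events").getOrCreate()

    // Stream campaign events from Kafka (topic name is illustrative).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "campaign-events")
      .load()
      .select(col("key").cast("string"), col("value").cast("string"))

    // Land raw events on S3 as Parquet, checkpointing for fault tolerance.
    events.writeStream
      .format("parquet")
      .option("path", "s3://bucket/raw/campaign-events/")
      .option("checkpointLocation", "s3://bucket/checkpoints/campaign-events/")
      .start()
      .awaitTermination()
  }
}
```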
Project: Nike Data Quality Framework
Designed and developed a configurable data quality solution that works seamlessly with Airflow DAGs without any changes to existing code or DAGs. The solution provides alerting, EOD reports, Slack interactivity, and more.
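A minimal sketch of one config-driven check of the kind such a framework evaluates; `DqRule` is a hypothetical rule model, and the Airflow integration and alerting fan-out are not shown:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical rule model: one declarative check per table/column.
final case class DqRule(table: String, column: String, maxNullPct: Double)

object DqRunner {
  // Evaluate a rule and report pass/fail plus the observed null percentage.
  def run(rule: DqRule, df: DataFrame): (Boolean, Double) = {
    val total = df.count()
    val nulls = df.filter(col(rule.column).isNull).count()
    val pct   = if (total == 0) 0.0 else nulls.toDouble / total * 100
    (pct <= rule.maxNullPct, pct)
  }
}
```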
Project: SLA Monitoring Framework
A framework built on top of Airflow metadata to monitor pipelines and raise alarms as needed. Built with AWS Lambda, Postgres, and an Airflow DAG, it integrates with Slack interactive messages to provide reports on demand.
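A minimal sketch of the kind of metadata query at the core of such a framework, assuming Airflow's standard `dag_run` table and illustrative connection details (the real product runs such checks from AWS Lambda):

```scala
import java.sql.DriverManager

object SlaCheck {
  def main(args: Array[String]): Unit = {
    // Connection details are illustrative; production code would read them from config.
    val conn = DriverManager.getConnection(
      "jdbc:postgresql://airflow-db:5432/airflow", "airflow", "secret")
    try {
      // Flag DAG runs still unfinished 2 hours past their start.
      val rs = conn.createStatement().executeQuery(
        """SELECT dag_id, execution_date
          |FROM dag_run
          |WHERE state = 'running'
          |  AND start_date < now() - interval '2 hours'""".stripMargin)
      while (rs.next())
        println(s"SLA breach: ${rs.getString("dag_id")} @ ${rs.getTimestamp("execution_date")}")
    } finally conn.close()
  }
}
```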
Achievements:
1) Implemented a Spark-based streaming framework, significantly improving developer productivity.
2) Designed processes for sanity checks, monitoring, and consistency of streaming and batch jobs.
3) The DQ solution was adopted for 100+ tables and 60+ DAGs in the vertical.
Project: Albertsons
Roles and Responsibilities:
Achievements:
NB Churn/Customer Attrition:
A product aimed at building an attrition model and supporting marketing teams in retaining customers on a monthly-to-quarterly basis.
Habitual AI:
The company's flagship product, aimed at improving customer engagement via prescriptive analytics and keeping customers engaged with client services across domains. Serves as a single point for all product recommendations.
Technology used: Hortonworks Ambari, Hadoop, Hive, StreamSets, Spark/Scala, PostgreSQL.
Responsibilities:
1) Contributed to product development by creating data pipelines and implementing models with Spark and Scala.
2) Worked closely with internal and external key stakeholders on the performance and evaluation of deliveries.
3) Identified the right features to improve model metrics.
4) Designed pipelines and architecture for automated model training, tuning, and data preparation for monthly activities (see the sketch after this list).
5) Evaluated and improved model performance by identifying and adding new features based on banking-domain knowledge and exploratory data analysis.
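A minimal sketch of the automated training-and-tuning step from item 4, using Spark ML's `CrossValidator`; the label column, feature columns, and parameter grid are illustrative:

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.GBTClassifier
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.DataFrame

object MonthlyTrainer {
  // Train and tune the attrition model in one automated step.
  def train(df: DataFrame, featureCols: Array[String]) = {
    val assembler = new VectorAssembler()
      .setInputCols(featureCols).setOutputCol("features")
    val gbt = new GBTClassifier().setLabelCol("churned").setFeaturesCol("features")
    val pipeline = new Pipeline().setStages(Array(assembler, gbt))

    // Small illustrative grid; a real monthly job would sweep more parameters.
    val grid = new ParamGridBuilder()
      .addGrid(gbt.maxDepth, Array(4, 6))
      .addGrid(gbt.maxIter, Array(50, 100))
      .build()

    new CrossValidator()
      .setEstimator(pipeline)
      .setEvaluator(new BinaryClassificationEvaluator().setLabelCol("churned"))
      .setEstimatorParamMaps(grid)
      .setNumFolds(3)
      .fit(df) // returns the best model found on the grid
  }
}
```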
Achievements:
1) Implemented Spark/Scala code to dynamically generate 400+ features for the attrition model, significantly improving model accuracy (see the sketch after this list).
2) Managed billions of data points on constrained hardware by tuning and improving the code design and following Spark tuning guidelines, delivering on time to clients.
3) Achieved a recall of 87%+ and model accuracy above 90% for banking clients.
4) Designed pipelines across two banks to ensure smooth functioning of the monthly recommendation process.
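A minimal sketch of the dynamic feature generation from item 1: features come from a cross product of base columns, aggregate functions, and lookback windows rather than being written by hand (all column and window names are illustrative):

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions._

object FeatureGen {
  // Cross multiply base columns x aggregate functions x lookback windows;
  // widening any of the three lists scales the output to hundreds of features.
  def generate(txns: DataFrame): DataFrame = {
    val baseCols = Seq("txn_amount", "txn_count", "balance")
    val aggs: Seq[(String, Column => Column)] = Seq(
      "sum"    -> ((c: Column) => sum(c)),
      "avg"    -> ((c: Column) => avg(c)),
      "max"    -> ((c: Column) => max(c)),
      "stddev" -> ((c: Column) => stddev(c))
    )
    val windows = Seq(30, 90, 180) // lookback days

    val features: Seq[Column] = for {
      c       <- baseCols
      (fn, f) <- aggs
      d       <- windows
    } yield f(when(col("days_ago") <= d, col(c))).alias(s"${c}_${fn}_${d}d")

    txns.groupBy("customer_id").agg(features.head, features.tail: _*)
  }
}
```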
Project: Barclays Compliance Application Support
Technology used: Spark, Spark Streaming, PySpark, Kafka, Kafka SQL, Databricks, Pandas, Glue Catalog.