Principal Engineer with over 13 years of experience, including 3 years dedicated to core Data Engineering and a 10-year foundation in Software Engineering and Machine Learning. Self-taught and upskilled, with a proven record of building scalable data pipelines, optimizing ETL workflows, and designing cloud-based big data solutions. Hands-on experience with Data Modeling, Columnar Storage, Apache Spark, Kafka, Airflow, Snowflake, BigQuery, AWS Glue, and real-time streaming. Passionate about driving innovation, data governance, automation, and distributed computing.
Project: Learning Portal (Client: CSOD)
1. Led and developed the core data ETL from the application to the data lake for Looker-based dashboards, and powered UI-based reports by hosting rollup data in Elasticsearch.
2. Major business deliverables: revamped the ETL from a multi-tenant to a single-tenant DB architecture using AWS Glue, DBT, and BigQuery; developed an event-driven pipeline for near-real-time (NRT) leaderboard score computation with a 1-hour window; delivered an asynchronous, distributed data deletion framework, based on a Python SDK, for decommissioned business orgs across AWS and GCP, automating 100% of the GDPR compliance deletions.
3. Minor enhancements: delivered multiple customer-requested features, including building new ETL pipelines from scratch to power AI tools and to feed downstream tools such as Workato.
1. Built a real-time monitoring framework for CDC connector health and automated ETL data synchronization validation, storing insights in BigQuery for daily Tableau reporting.
2. Implemented Apache Airflow orchestration with dynamic job scheduling, reducing pipeline delays, cutting processing time by 50%, and accelerating analytics for business users.
3. Optimized resources (BigQuery, Cloud Run, Cloud Functions) based on analysis of execution time, memory pressure, and request throttling.
4. Optimized Elasticsearch by redistributing oversized shard data (over 500 GB) across multiple shards through cluster reindexing, reducing the data ingestion bottleneck.
5. Created export and import jobs to migrate data from production to lower environments for data evaluation.
1. Led architecture discussions with senior management and business analysts, provided technical guidance to teams, followed a develop-early, fail-fast approach, and scheduled pilot demos early to gather business feedback and ensure robust delivery.
2. Led a 4-member team, mentoring junior engineers and helping them get up to speed quickly on the project.
Project: CRM Sales Fusion Application
Product Development