
Innovative and results-oriented data engineer with over 8 years of experience and a strong background in building high-performance data pipelines and REST API frameworks. Expert in PySpark, Databricks, and binary CAN data extraction from vehicles. Demonstrated success in implementing event-based data logging approaches and enhancing data processing systems for efficient use in digital twin projects. Accustomed to working closely with system architects, software architects, and design analysts to understand business and industry requirements and develop scalable data pipeline applications.
Data Engineering
PySpark
Databricks
Event-Based Data Logging
Binary CAN Data Processing
REST API Development (FastAPI)
Version Control (e.g., Git)
Continuous Integration (CI)
Team Collaboration and Workflow Optimization
Azure
Real-Time Data Processing
Git
Cloudera
Spark 3, Databricks
Python (Flask, FastAPI)
Oozie
AWS (EMR, Lambda, Kinesis, CloudWatch), Azure
Splunk, Tableau
Hive
MongoDB
Airflow
Iceberg
Trino
OpenMetadata
Grafana