
Data Engineer with 4+ years of experience designing, building, and maintaining scalable, production-grade data pipelines across enterprise environments. Strong hands-on expertise in Apache Spark (PySpark), SQL, Python, ETL automation, and cloud-ready architectures, with growing specialization in Databricks-style lakehouse patterns. Proven experience working with high-volume financial and ERP datasets, integrating relational systems via JDBC, implementing data quality and reconciliation frameworks, and optimizing performance for large-scale transformations.
Apache Spark (PySpark), Spark SQL
ETL Pipeline Design (Read → Transform → Write)
Performance Optimization on Distributed Data
Parquet, Structured Data Processing
SQL Server, MySQL (JDBC integration)
Data Modelling, Data Warehousing & Schema Design
Azure Databricks