Data Engineer
- Improved enterprise data warehouse query performance through schema redesign and ETL enhancements.
- Analyzed the existing warehouse schema to identify query-performance bottlenecks.
- Re-engineered data models and normalized schema structures to reduce redundancy and improve scalability.
- Implemented partitioning, bucketing, and indexing strategies in Spark SQL, reducing query execution time by 30-40%.
- Refactored ETL pipelines using PySpark to handle large-scale data loads with improved fault tolerance.
- Collaborated with analysts and business users to make data readily accessible for reporting and dashboards.
