Results-driven Data Engineering Leader with 15 years of experience designing and managing large-scale distributed systems and high-performance data platforms. Skilled in building real-time, low-latency data pipelines and OLAP systems handling terabytes of structured and semi-structured data. Proven ability to deliver scalable, reliable data infrastructure for analytics, reporting, and machine learning applications.
Key Projects:
Languages & Programming: Python, Java, Scala, SQL
Big Data & Distributed Systems: Apache Spark, Apache Flink, Apache Kafka, Hadoop, HDFS, Hive, HBase, Iceberg
Databases & OLAP Systems: Apache Pinot, Apache Druid, Apache Cassandra, Aerospike, Postgres, MS SQL Server
Cloud Platforms: AWS, GCP
Workflow & Orchestration: Apache Airflow, Apache NiFi
Data Visualization: Power BI, Apache Superset
Search & Query Engines: Elasticsearch, Presto
Frameworks: Microservices, REST APIs