Sanjay Aswani

Associate Director – Data Engineering
Mumbai

Summary

Results-driven Data Engineering Leader with 15 years of experience designing and managing large-scale distributed systems and high-performance data platforms. Skilled in building real-time, low-latency data pipelines and OLAP systems handling terabytes of structured and semi-structured data. Proven ability to deliver scalable, reliable data infrastructure for analytics, reporting, and machine learning applications.

Overview

15 years of professional experience
4 years of post-secondary education
2 languages

Work History

Associate Director – Data Engineering

Media.net
12.2011 - Current
  • Lead a cross-functional team of 18 engineers (Data Engineers, SDEs, and SREs), overseeing hiring, mentoring, project planning, and performance reviews.
  • Designed and developed multiple products from scratch, including data ingestion and transformation pipelines for structured and semi-structured data.
  • Built scalable ETL pipelines using Apache Spark, Hive, Iceberg, HDFS, and Kafka; implemented validation and data quality checks.
  • Developed high-throughput real-time systems using Apache Flink and Kafka, processing up to 2000 QPS.
  • Set up and optimized OLAP platforms (Apache Pinot, Druid) handling 20TB+ of data with sub-second query latency.
  • Designed scalable APIs for data access and ingestion using Python and Java (Spring Boot).
  • Created dashboards and reporting tools using Power BI and Apache Superset.
  • Implemented Airflow/NiFi-based orchestration for complex ETL workflows.

Key Projects:

  • Real-time Budgeting System: Built a real-time ad spend accounting system using Flink + Kafka + Cassandra to process 150,000 QPS. Enabled dual data center sync for real-time spend management.
  • OLAP Platform for Reporting: Set up Pinot & Druid clusters managing 20TB+ of hot data with sub-second latency for 100+ QPS. Used extensively for business analytics and optimization.
  • ML Model Feedback Pipeline: Built Flink-based streaming system (500M+ rows/day) to deliver near-real-time feedback to ML models using Aerospike as a sink.
  • Generic Analytics Migration: Migrated analytics platform from Druid to Pinot to support backfill/reprocessing use cases; deployed scalable REST APIs in Java for platform access.
  • Spam Detection Pipeline: Implemented Spark-based data quality pipeline to flag spam traffic using rule-based classification logic on HDFS-backed data.

Data Engineer

Financial Technologies
08.2010 - 11.2011
  • Designed data models and ETL pipelines for financial data processing.
  • Built star-schema-based data warehouses and OLAP cubes (SSAS) for business reporting.
  • Developed data APIs for stats auditing, customer reporting, and account reconciliation.
  • Collaborated with cross-functional teams to deliver analytics features aligned with business needs.

Education

B.Tech

Rajasthan Technical University
01.2006 - 01.2010

Skills

Languages & Programming: Python, Java, Scala, SQL
Big Data & Distributed Systems: Apache Spark, Apache Flink, Apache Kafka, Hadoop, HDFS, Hive, HBase, Iceberg
Databases & OLAP Systems: Apache Pinot, Apache Druid, Apache Cassandra, Aerospike, Postgres, MS SQL Server
Cloud Platforms: AWS, GCP
Workflow & Orchestration: Apache Airflow, Apache NiFi
Data Visualization: Power BI, Apache Superset
Search & Query Engines: ElasticSearch, Apache Presto
Frameworks: Microservices, REST APIs

Interests

Swimming
Traveling
Trekking
