Rishikesh Srinivas

Bengaluru

Summary

Data Engineer with 4+ years of experience building large-scale, fault-tolerant data platforms and multi-tenant reporting solutions. Expertise spans AWS S3/Iceberg, Spark, Airflow, Trino, PostgreSQL/Citus, and Java Spring Boot application engineering. Proven ability to architect and deliver secure, high-performance data infrastructure handling multi-terabyte workloads and millions of transactions per day while meeting PCI DSS and SOC compliance.

Overview

5 years of professional experience
1 Certification

Work History

Data Engineer

M2P Solutions
08.2021 - Current

Software Development Engineer II / Data Engineer – Team Lead & Solutions Architect
(Promoted from SDE I – Sept 2024)

DataOps & Platform Architecture

  • Designed a multi-tenant AWS S3 + Apache Iceberg data lake with Trino connector for high-performance SQL analytics across 9+ tenants.
  • Deployed Apache Airflow with S3-synced DAGs and a Spark cluster in client mode, enabling auto-scaling worker nodes and reducing operational costs.
  • Implemented a schema-evolution framework with versioned models and dbt testing, ensuring seamless upstream changes and non-breaking pipelines.
  • Built a reusable Python alerting middle layer providing on-call and custom notifications for all data tools and pipelines, centralising monitoring and improving incident response.
  • Implemented Citus-sharded PostgreSQL with per-shard replicas and automated fail-over, distributing load and improving query performance by ~40 %.
  • Engineered data-encryption pipelines for S3 and data-warehouse loads, ensuring full PCI DSS and SOC audit compliance.

Data Engineering & Reporting

  • Built the initial Python/Pandas ETL pipeline; integrated Airbyte ingestion and dbt transformations for scalable, maintainable data flows.
  • Developed a multi-threaded Python reporting stack for transaction monitoring, reducing report generation time by 60 % and increasing compliance-team productivity by 30 %.
  • Created custom data connectors and optimised SQL queries, improving dashboard refresh speed by 50 %.

Application Engineering & Architecture

  • PDF Generation Service: Co-architected and coded a Java Spring Boot microservice capable of generating 5 million PDFs/day, with dynamic templates, encryption, and secure delivery.
  • Reporting Portal: Designed and maintained a multi-tenant reporting portal that allows users to configure reports and deliver them to email, SFTP, S3, or other blob stores.
    Supported connectors for Trino, MySQL, and PostgreSQL, enabling simultaneous multi-database report generation without performance degradation.
    Included advanced features such as encryption, hashing, and role-based security controls, meeting stringent financial-data requirements.
  • Designed service APIs, caching strategy, and concurrency controls to maintain sub-second response times during peak loads.
  • Implemented CI/CD pipelines and Kubernetes deployments ensuring 99.99 % uptime.

Leadership & Collaboration

  • Led a 4-member cross-functional team (1 Data Modelling/Superset engineer, 2 Application engineers, plus self).
  • Provided indirect mentorship to an SDE II across Data Engineering and Application Engineering, guiding architecture and code reviews.
  • Partnered with product managers and compliance leads to prioritise roadmap items and establish coding/monitoring standards.

Machine Learning & Anomaly Detection (Earlier Work)

  • Built sentiment-analysis and contactability-score models (F1 0.90), improving customer-contact success rate by 25 %.
  • Developed an anomaly-detection model for financial transactions with 98 % fraud-detection accuracy.

Data Science Intern

Origa.ai (Acquired by M2P Solutions)
12.2020 - 07.2021
  • Developed a multilingual WhatsApp loan-collections chatbot using Dialogflow, cutting support response time by 30 % and increasing resolution rate by 25 %.
  • Integrated Rocket.Chat for live-agent handoff and created 10+ Superset dashboards, reducing data-retrieval time by 40 %.

Education

Bachelor in Computer Science

Vidya Vikas Institute of Engineering And Technology
Mysore, India
07-2021

Skills

    Languages: Python, SQL, Java, PySpark, Shell

    Data & Cloud: AWS S3, Apache Iceberg, Trino, PostgreSQL, Citus, NoSQL

    Orchestration & ETL: Apache Airflow, dbt, JupyterHub, Airbyte

    Application Engineering: Java Spring Boot, REST APIs, PDF generation, Multi-tenant reporting portals

    BI & Visualisation: Apache Superset, Star-schema modelling, Keycloak SSO

    Security & Compliance: End-to-end encryption, hashing, PCI DSS & SOX audit readiness

    DevOps/Infra: Docker, Kubernetes, Linux, CI/CD, Git, Monitoring & Alerting

Certification

  • The Machine Learning Pipeline on AWS
  • Data Warehousing on AWS
  • Deep Learning on AWS
  • MLOps Engineering on AWS

Selected Project Highlights

  • Multi-Tenant Analytics Platform – Forked and customised Apache Superset with Keycloak SSO and star-schema modelling, delivering secure dashboards for 9+ tenants with role-based access control and real-time analytics.
  • High-Volume ETL & Alerting – Engineered Spark + Airflow pipelines processing 100 GB+/day, integrated a unified Python alerting layer for on-call and custom notifications, and enabled automated cluster scaling for cost-efficient performance.
  • Secure Data Pipeline & Compliance – Implemented end-to-end encryption and hashing at ingestion and storage, ensuring seamless PCI DSS & SOX audit readiness across multi-cloud environments.
  • Citus-Sharded Data Warehouse – Deployed a Citus PostgreSQL cluster with automatic shard replication and fail-over, achieving high availability, load distribution, and ~40 % faster query performance compared to a single-instance setup.
  • JupyterHub Development Platform – Designed and deployed a multi-tenant JupyterHub with Spark in headless mode and custom plugins, enabling isolated development and testing environments that replaced Databricks for cost-effective analytics.