Expert in architecting data solutions and establishing data pipelines that efficiently handle real-time and batch processing of extensive datasets through Lakehouse and data warehouse frameworks.
Over 10 years of experience in the data domain.
Overview
10 years of professional experience
Work History
Lead Data Engineer
HackerRank
09.2022 - Current
Architected a centralized data platform to bring all product data under one robust, scalable architecture.
Designed and implemented large-scale batch and real-time ETL pipelines (Spark, Kafka, Trino, Hudi).
Led a team of data engineers; owned solution design, reviews, and delivery across the release lifecycle.
Partnered with DevOps to manage infra and reliability (AWS, Spark/Kafka clusters, governance & monitoring).
Implemented a modern data lakehouse solution using HMS + Hudi, with Trino/StarRocks for fast analytics.
Reduced data platform cost by ~65% and saved ~$25,000 per month on AWS through cost/performance initiatives.
Enabled self-serve BI/monitoring via Looker, Redash, Metabase; integrated streaming via Kinesis & Kafka Connect.
Senior Software Engineer
Rishabh Software Pvt. Ltd.
Vadodara
04.2018 - 07.2021
Designed, developed, tested, released, and maintained data pipeline solutions per business requirements.
Coordinated with clients on project requirements and possible solutions.
Established the overall application release plan and effort estimates.
Guided junior team members on solution approach and technical bottlenecks.
Largest digital media owner in the UK
Client requirement: provide compliance on ad copy played on booked inventory across the EMEA and APAC regions. Developed the data processing component, designed the data-flow architecture, and expanded the product to additional markets (countries).
Used Spark Streaming with Kinesis for ingestion. Cleaned raw datasets, applied business rules via Spark batch jobs (Spark SQL) with unit and integration tests, and routed outputs to Cassandra and/or Parquet/CSV on AWS S3 for visualization.
Used Cassandra as a data warehouse and to showcase ongoing ingestion via a Java Spring Boot application. Deployed the end-to-end data flow on AWS, with CI/CD via Jenkins, orchestration via Apache Airflow, and Redash on AWS Athena for analysis and visualization by support users.
Senior Systems Engineer
Infosys Ltd.
Pune
08.2015 - 03.2018
Managed key account activities such as resource onboarding and project implementation tracking.
Developed solutions for complex bottlenecks in the build process.
Debugged the application and prepared it for release to the client.
Coordinated with the client on project requirements and possible solutions.
Developed the application and performed thorough unit testing until release.
Client requirement: build reports on real-time bidding against inventory, including deal details and revenue. Ingested data in Avro format from a data lake (AWS S3).
Designed and developed a Spark application to denormalize nested structures, combine supply and demand entities, and write outputs to S3.
Performed aggregations (Spark SQL) and generated revenue details for markets.