Manoj Saini

Lead Data Engineer | Data Architect | Data Platform Leader
Gurgaon

Summary

Results-driven Data Engineering Leader with 15+ years of experience designing and delivering enterprise-scale data platforms across AWS and Snowflake ecosystems. Expert in building and governing ETL/ELT pipelines, real-time streaming systems, and analytical data warehouses using Snowflake, Redshift, SQL Server, Spark, Airflow, Glue, Kafka, Kinesis, Python, and SQL. Proven record of leading cross-functional teams, defining data strategy, modernizing legacy architectures, implementing governance frameworks, and building scalable batch and real-time data solutions. Experienced in stakeholder management, cloud transformation, platform engineering, and driving business value through data-driven innovation.

Overview

3 years of post-secondary education
15+ years of professional experience

Work History

Lead of Data Engineering

Wortgage Finance Private Limited (WeRize)
09.2025 - Current
  • Led and scaled a high-performing data engineering team of 15 or more data engineers across multiple projects.
  • Led cloud migration programs involving AWS, Snowflake, Redshift, Glue, Airflow, Spark, Kafka, and serverless technologies.
  • Reduced data quality issues by 60% through the implementation of automated validation and reconciliation frameworks.
  • Successfully migrated over 300 ETL workflows and more than 500 datasets from legacy platforms to cloud-native architectures.
  • Designed and maintained scalable data lakes on AWS S3 for storing structured, semi-structured, and unstructured data.
  • Developed RESTful APIs for secure data ingestion, extraction, and integration with internal and external systems.
  • Designed and implemented ACID-compliant data lakes using Delta Lake to support reliable and scalable data processing.
  • Reduced data processing failures by 30% through the adoption of Dynamic Frames for schema-evolving source systems.
  • Optimized Lambda performance through memory tuning, concurrency controls, and cold-start mitigation strategies.
  • Developed and maintained Snowflake databases, schemas, tables, views, materialized views, Dynamic Tables, and secure views to support analytics and reporting requirements.
  • Optimized complex Snowflake queries, reducing execution times by up to 40% through clustering strategies and query tuning.
  • Developed metadata-driven ELT frameworks using Snowflake, AWS Glue, and PySpark for enterprise data integration.
  • Designed and implemented real-time data streaming pipelines using Apache Kafka to process millions of events daily.
  • Developed real-time ETL pipelines integrating Kinesis, Lambda, S3, and Snowflake for near real-time analytics.
  • Defined data engineering standards, best practices, and architectural guidelines across the organization.
  • Implemented distribution styles, sort keys, and compression encodings in Amazon Redshift to improve query performance.
  • Implemented role-based access control (RBAC) and least-privilege access models across cloud data platforms.
  • Designed and managed enterprise-grade workflow orchestration using Apache Airflow for complex ETL and ELT pipelines.
  • Processed structured, semi-structured, and streaming data from various enterprise sources using PySpark.
  • Built metadata-driven Glue jobs using PySpark to process large-scale datasets from multiple source systems.
  • Conducted technical interviews and hiring initiatives to build and strengthen data engineering capabilities.
  • Increased engineering productivity by 40% through reusable components and automation.

Associate Principal Architect (Data Engineering)

LTIMindtree
06.2024 - 12.2024
  • Collaborated with Product Owners, Business Analysts, Data Scientists, and DevOps teams to align technical solutions with business objectives.
  • Designed and implemented streaming data solutions using Apache Pulsar, Kinesis, Kafka, Snowflake streaming features, Apache Airflow, and AWS services.
  • Led data integration initiatives to move data from MongoDB to Redshift and Snowflake, ensuring scalability and reliability.
  • Optimized Amazon Redshift workloads through query tuning, distribution strategies, and performance enhancements.
  • Developed real-time ETL pipelines integrating Kinesis, Lambda, S3, and Snowflake for near real-time analytics.
  • Automated enterprise data pipelines using Airflow, Glue, Redshift, and PySpark, reducing manual operational effort by 80%.
  • Established and enforced data security and compliance measures to protect sensitive information.
  • Leveraged zero-copy cloning to create lower environments and support testing without increasing storage costs.
  • Built high-performance ELT pipelines to load data from AWS S3 into Snowflake using Snowpipe, COPY commands, and automated ingestion frameworks.
  • Mentored team members on Snowflake performance optimization, data modeling, and cloud data engineering concepts.
  • Utilized Glue Crawlers and Data Catalog to automate schema discovery and metadata management.

Sr. Technical Architect

NexGen IOT Solutions Pvt. Ltd.
12.2022 - 05.2024
  • Led a team of 20+ data engineers in designing and implementing cloud-native data platforms using AWS services, Airflow, Snowflake, Redshift, and Spark.
  • Performed workload management, query tuning, and capacity optimization to improve Redshift warehouse efficiency.
  • Developed ETL solutions using AWS Glue Dynamic Frames to process semi-structured and schema-evolving datasets.
  • Built decoupled, fault-tolerant data workflows using Amazon SQS and SNS for asynchronous processing, pipeline resilience, event notifications, and alerts.
  • Designed API endpoints for real-time data exchange between cloud applications and data platforms.
  • Designed secure IAM roles and policies to enforce least-privilege access across AWS data services.
  • Optimized SQL Server queries and stored procedures to improve database performance, reduce execution time, and enhance system reliability.
  • Developed serverless applications using AWS Lambda to automate data processing and orchestration tasks.
  • Designed and implemented data pipelines leveraging Snowflake features such as Snowpipe, Streams, Tasks, Dynamic Tables, and Materialized Views to automate data ingestion, transformation, and analytics workflows.
  • Improved ETL processing performance by 50% using Snowflake Streams, Tasks, and optimized ELT design.
  • Implemented time travel and fail-safe capabilities in Snowflake to support data recovery and compliance requirements.
  • Optimized complex Snowflake queries, reducing execution times by up to 60% through clustering strategies and query tuning.
  • Configured masking policies and row-level security to protect sensitive business information.
  • Integrated Kafka with Spark Streaming, Snowflake, Redshift, and AWS services for real-time analytics and reporting.
  • Leveraged Kinesis Firehose to automate data delivery into S3, Redshift, and Snowflake.
  • Maintained 99.9% uptime for data pipelines ingesting streaming and transactional data from 8 to 10 primary sources, using Redshift, S3, Apache Pulsar, Python, and AWS Lambda.
  • Developed reusable Airflow operators, sensors, and custom plugins to standardize pipeline development.
  • Developed multiple analytics APIs integrated with applications across Android, iOS, Roku, Smart TVs, and web platforms, enabling consistent and scalable data access.
  • Built CI/CD pipelines using Jenkins to automate the deployment of data pipelines, ETL workflows, and analytics services.
  • Integrated Glue jobs with S3, Lambda, EventBridge, Kinesis, and Airflow for end-to-end data orchestration.
  • Built reusable transformation frameworks using Spark DataFrames and Spark SQL.
  • Implemented Spark-based data quality, validation, and reconciliation frameworks.

Associate Director - Technology & Data Engineering

Zivore (Libas)
04.2022 - 11.2022
  • Managed AWS environments for application deployment, ensuring scalability, security, and high availability.
  • Collaborated with Product Owners, Business Analysts, Data Scientists, and DevOps teams to align technical solutions with business objectives.
  • Automated data ingestion workflows from multiple sources into S3 using AWS services and custom ETL pipelines.
  • Performed workload management, query tuning, and capacity optimization to improve Redshift warehouse efficiency.
  • Implemented Dynamic Frames for data lake ingestion workloads where source system structures frequently changed.
  • Managed SQL Server databases, ensuring availability, performance, and data integrity.
  • Implemented Lambda functions for data validation, transformation, and enrichment before loading into downstream systems.
  • Implemented monitoring and alerting solutions using CloudWatch, SNS, and custom dashboards.
  • Improved data warehouse query performance by 60% through Redshift schema optimization and workload tuning.
  • Designed and optimized Snowflake architecture for multi-domain data processing and enterprise-scale analytics workloads.
  • Designed and implemented serverless, real-time data pipelines using Amazon Kinesis for continuous event processing.
  • Developed dynamic and metadata-driven Airflow DAGs to orchestrate data ingestion, transformation, and loading processes.
  • Leveraged Spark's distributed architecture to reduce processing times and improve data pipeline scalability.
  • Automated enterprise data pipelines using Airflow and Glue, reducing manual operational effort by 80%.

Solution Architect (Senior Manager)

Genpact India Pvt. Ltd.
09.2019 - 04.2022
  • Designed and managed cloud solutions using AWS Elastic Beanstalk, EC2, Lambda, Load Balancer, CloudWatch, S3, Secrets Manager, ACM, CloudFront, Route 53, RDS, DynamoDB, DocumentDB, Kinesis, Glue, Athena, EventBridge, CloudFormation, CodeCommit, CodeBuild, and CodeDeploy.
  • Integrated Snowflake with AWS services, including S3, Lambda, Glue, and EventBridge, to support automated data processing.
  • Implemented role-based access control (RBAC) and least-privilege access models across Snowflake environments.
  • Developed high-performance ETL pipelines to load large-scale datasets into Redshift from S3.
  • Designed and maintained scalable data lakes on AWS S3 for storing structured, semi-structured, and unstructured data.
  • Integrated Airflow (MWAA) with AWS services, including S3, Glue, Lambda, Redshift, Snowflake, and EMR.
  • Designed and developed scalable ETL pipelines using AWS Glue for batch and incremental data processing.
  • Optimized complex Snowflake queries, reducing execution times by up to 60% through clustering strategies and query tuning.
  • Built and maintained high-performing teams through technical coaching, knowledge-sharing sessions, and career development initiatives.
  • Successfully led a team of 12 data engineers in delivering a cloud-based data platform that processes more than 3 TB of data daily.
  • Improved team productivity by 30% through reusable ETL frameworks, standardized development practices, and mentoring initiatives.
  • Owned production support and incident management processes, ensuring adherence to SLAs and rapid issue resolution.
  • Reduced end-to-end data latency from 4 hours to less than 2 minutes through the implementation of a Kafka-based event streaming architecture.
  • Improved ETL processing performance by 50% using Snowflake Streams, Tasks, and optimized ELT design.
  • Led a team of data engineers in successfully migrating over 300 tables from legacy systems to Snowflake with minimal downtime.
  • Developed serverless applications using AWS Lambda to automate data processing and orchestration tasks.
  • Optimized Lambda performance through memory tuning, concurrency controls, and cold-start mitigation strategies.
  • Built scalable, microservices-based APIs using Python (Flask/FastAPI) to support data engineering workflows.
  • Developed large-scale, distributed data processing applications using PySpark.
  • Built real-time and batch processing pipelines, integrating Spark with Kafka, Kinesis, S3, Snowflake, and Redshift.

Lead Data Engineer

Cars24
04.2019 - 09.2019
  • Developed and maintained Snowflake databases, schemas, tables, views, materialized views, and secure views to support analytics and reporting requirements.
  • Successfully led a team of 10 or more engineers in delivering a cloud-based data platform that processes more than 1 TB of data daily.
  • Optimized complex Snowflake queries, reducing execution times by up to 60% through clustering strategies and query tuning.
  • Led the migration of legacy ETL workloads to the AWS cloud, resulting in 35% cost savings and improved scalability.
  • Built scalable REST APIs supporting over 100,000 requests per day, with high availability and security controls.
  • Created event-driven architectures leveraging Lambda triggers from S3, SNS, SQS, and EventBridge.
  • Optimized Lambda performance through memory tuning, concurrency controls, and cold-start mitigation strategies.
  • Implemented time travel and fail-safe capabilities to support data recovery and compliance requirements.
  • Designed and implemented a real-time data pipeline to process structured and semi-structured data using Python, Amazon S3, and Amazon RDS.
  • Implemented Change Data Capture (CDC) solutions using Kafka to enable near real-time data synchronization across systems.
  • Automated ETL processes handling millions of rows of data, reducing the monthly manual workload by 75%.

Lead Data Architect (Manager)

Jabong.com (Novarris & Jade Pvt. Ltd.)
08.2012 - 04.2019
  • Automated data ingestion workflows from multiple sources into S3 using AWS services and custom ETL pipelines.
  • Managed secure access to S3 buckets using IAM roles and bucket policies.
  • Automated data ingestion processes using Lambda and APIs, reducing manual effort by 80%.
  • Developed high-performance ETL pipelines to load large-scale datasets into Redshift from S3.
  • Improved Redshift query performance by implementing appropriate distribution styles, sort keys, and compression encoding.
  • Integrated third-party APIs to automate data acquisition and business process workflows.
  • Developed RESTful APIs for secure data ingestion, extraction, and integration with internal and external systems.
  • Integrated Lambda with API Gateway to expose scalable, serverless APIs for data services.
  • Led and mentored a team of 8+ data engineers, driving the design, development, and support of enterprise-scale data platforms.

VBA & SQL Developer (Analyst)

Genpact
08.2011 - 08.2012
  • Designed, built, and maintained data pipelines that automate the extraction, transformation, and loading (ETL) of data from various sources (APIs, DBs, CSV, or JSON files).
  • Optimized complex SQL queries.

VBA & SQL Developer

T&M Services Consulting Pvt. Ltd.
07.2009 - 01.2011
  • Created an inventory management tool using VBA and SQL.

Data Analyst

Barclays Finance Pvt. Ltd. (ATS & Penta)
01.2008 - 04.2009
  • Automated manual reports using VBA, SQL, Excel, and Access.

Education

Bachelor of Arts - Art Education

Kurukshetra University
Kurukshetra, Haryana
06.2004 - 06.2007

Skills

Snowflake

Amazon Redshift

PySpark

Amazon Web Services (AWS)

ETL / ELT Pipeline Design

Data Pipeline Orchestration (Airflow/MWAA)

Batch & Real-Time Data Processing

Data Integration & Ingestion

Data Migration

Python

SQL / T-SQL

Relational Databases

NoSQL Databases

Query & Performance Tuning

Data Security & Governance

CI/CD Pipelines

API Development & Integration

Timeline

Lead of Data Engineering

Wortgage Finance Private Limited (WeRize)
09.2025 - Current

Associate Principal Architect (Data Engineering)

LTIMindtree
06.2024 - 12.2024

Sr. Technical Architect

NexGen IOT Solutions Pvt. Ltd.
12.2022 - 05.2024

Solution Architect (Senior Manager)

Genpact India Pvt. Ltd.
09.2019 - 04.2022

Lead Data Engineer

Cars24
04.2019 - 09.2019

Lead Data Architect (Manager)

Jabong.com (Novarris & Jade Pvt. Ltd.)
08.2012 - 04.2019

VBA & SQL Developer (Analyst)

Genpact
08.2011 - 08.2012

VBA & SQL Developer

T&M Services Consulting Pvt. Ltd.
07.2009 - 01.2011

Data Analyst

Barclays Finance Pvt. Ltd. (ATS & Penta)
01.2008 - 04.2009

Bachelor of Arts - Art Education

Kurukshetra University
06.2004 - 06.2007

Associate Director - Technology & Data Engineering

Zivore (Libas)
04.2022 - 11.2022