Rajnikant Vanpratiwar

Pune

Summary

Dynamic data platform professional with over 18 years of experience in data management and engineering, specializing in agile methodologies and cross-functional collaboration. Proven expertise in modernizing legacy systems using AWS technologies, transitioning to real-time ETL processes, and designing scalable data pipelines with Python, PySpark, and DBT. Skilled in architecting advanced data warehousing solutions and integrating cloud-native workflows while promoting secure and cost-effective cloud practices. Strong leadership abilities focused on mentoring teams, delivering impactful analytics solutions, and driving data-driven decision-making through innovative statistical modeling and real-time architecture.

Overview

19 years of professional experience
1 Certification

Work History

Senior Solution Architect

CitiusTech
11.2023 - Current
  • Tech Stack: Python, SQL, AWS Services (Athena, Glue, EMR, IAM, EC2, Redshift, SageMaker, ECS/EKS), DBT, Databricks, etc.
  • Roles:
  • Modernized legacy data systems using AWS (S3, Glue, Athena, Redshift, EC2, EKS, CloudWatch). Designed end-to-end architecture for ingestion, processing, and analytics. Led migration from batch to real-time ETL.
  • Built scalable pipelines using Python (OOP), Spark SQL, DBT for modular modeling, and PL/SQL for legacy integration.
  • Deployed containerized apps using Docker and Kubernetes (EKS). Enabled observability with CloudWatch and enforced cost-efficient, secure cloud practices.
  • Designed optimized Redshift warehouses with partitioning and compression. Managed S3-based data lakes integrated with Glue Data Catalog.
  • Supported RFPs and POCs with pre-sales teams. Presented solutions aligned with client needs and business goals.
  • Led delivery teams across the lifecycle. Mentored engineers on cloud adoption, pipeline orchestration, and data engineering best practices.
  • Implemented GenAI-based solutions, including a data integration service using Apache NiFi, a synthetic data generator, SQL-based platform migration and modernization, multi-cloud data synchronization, and conversational analytics.

Associate Principal-Architecture

LTIMindtree
07.2021 - 11.2023
  • Tech Stack: Spark, Python, Unix Shell Scripting, GCP DataProc, Compute VM, GCS, BigQuery, Cloud Function, Airflow, EC2, Redshift, Databricks, PostgreSQL, Confluent Kafka, Looker/Tableau
  • Roles:
  • GCP Data Migration: Conceptualized the migration strategy and implemented the migration of the on-prem data warehouse to GCP.
  • GCP Data Migration: Used a GCP Cloud Function to parse on-prem DDL files and create the config table for the generic ingestion process. Used GCP DataProc to execute PySpark-based ETL jobs.
  • GCP Data Migration: Used Apache Airflow to orchestrate Databricks/DataProc jobs with built-in Airflow operators.
  • GCP Data Migration: Used BigQuery to produce summarized views of datasets published to Tableau/Data Studio for visualization.
  • GCP Data Migration: Used GCP Log Explorer to collect customized job logs and application logs, and the GCP Monitoring API for resource monitoring.
  • Created a service-led delta architecture to prepare the Operational Data Store (ODS) in Databricks. Designed and implemented SCD Type 1 and Type 2 workflows in the Raw and Refined zones based on Cloud Postgres and BigQuery.
  • Implemented Delta Live Tables (DLT) workflows in Databricks to capture data from the Landing Zone to the Raw/Refined layers. Implemented Deequ/DVT (Data Validation Tool) in Python for data consistency checks between source and target zones.
  • Designed data models for complex analysis needs. Collected, cleansed and provided modeling with analyses of structured and unstructured data used for business initiatives for cost cutting.
  • Performed impact analysis, performance tuning, and capacity planning for the enterprise data warehouse and its infrastructure as new source systems, integrations, and business rules were introduced.
  • Prototyped machine learning applications and quickly determined application viability.
  • Created customized applications to make critical predictions, automate reasoning and decisions and calculate optimization algorithms.
  • Designed, implemented and evaluated new models and rapid software prototypes to solve problems in machine learning and systems engineering.

Big Data Consultant

Vodafone Shared Services India Pvt Ltd
11.2018 - 07.2021
  • Tech Stack: Spark, Python, GCP DataProc, BQ, Airflow, Splunk, Tableau, Data Studio, PagerDuty, Hive, Impala, Oozie, etc.
  • Roles:
  • Contributed to the analysis and design of APIX/DxL products.
  • Prepared market-level (European markets) and API-specific Splunk dashboards for Ops users.
  • Designed and presented Tableau dashboards (based on the on-prem Hadoop cluster) for business users.
  • Migrated Datameer-based jobs to PySpark for improved performance and stability.
  • Set up the end-to-end workflow for APIX Analytics jobs on GCP using Apache Airflow.
  • POC – Data Studio matrix report from Stackdriver logs in GCP, to assess whether Splunk logs can be used in GCP with Dataflow.
  • POC – PagerDuty setup for alerting mechanism from Splunk dashboards.
  • Used the GCP Monitoring API to set up failure alerts for daily job execution. Prepared dashboards for performance measurement and resource consumption.
  • Spark Performance Tuning: Optimized all KPIs to improve response time by changing the in-stream data format (Avro to JSON) and the output format to Parquet when loading native BigQuery tables. Implemented a purge policy to automate deletion of expired files.

Assistant Consultant

Tata Consultancy Services Ltd
08.2013 - 11.2018
  • Tech Stack: Spark, Python, Big Data Hadoop, Cloudera Manager, Navigator, Shell Script, Hive/Impala, Apache Ignite, etc.
  • Roles:
  • Collaborated with Hadoop administrators to design and configure CDH/Hadoop clusters, ensuring successful sign-off and deployment.
  • Defined and implemented Application Onboarding and Change Management processes, streamlining the application lifecycle on the data lake platform.
  • Handled capacity and workload management, monitored cluster health, and supported platform upgrades and maintenance.
  • Published best practice guidelines for platform users by conducting POCs in access control, security, workload management, and design patterns.
  • Designed data ingestion patterns for various source systems and developed reusable utilities and APIs to streamline data loading.
  • Conducted performance benchmarking and defined use cases for ecosystem tools like Hive-on-Spark, Impala, Hive MR, and Spark SQL.
  • Set up and managed Kafka brokers and agents, defined Kafka topics to standardize streaming ingestion.
  • Implemented Kerberos authentication to secure HDFS and data objects, ensuring compliance with enterprise security standards.
  • Developed robust monitoring and auditing tools to enforce platform standards and provide visibility into platform usage.
  • Researched and evaluated industry solutions for use cases like real-time ingestion using in-memory databases and cross-platform job scheduling tools.
  • Led lift-and-shift migration of legacy Hadoop applications onto the newly established Data Lake platform.

Tech Lead

iNautix Technologies, The Bank of New York Mellon Corporation
11.2012 - 08.2013
  • Tech Stack: Unix Shell Script, DB2, etc.
  • Roles:
  • Performed detailed impact analysis for new requirements and system changes to ensure stability and compliance across modules.
  • Prepared functional specifications and technical design documents in alignment with business requirements and system architecture.
  • Led development and unit testing efforts, ensuring timely delivery and adherence to quality standards.
  • Coordinated with cross-functional teams including QA, infrastructure, and business analysts to ensure smooth end-to-end project execution.
  • Managed project implementation across multiple releases, ensuring successful deployment and post-release support.

Senior Software Engineer

HSBC Global Technology India Pvt Ltd
03.2007 - 11.2012
  • Roles:
  • Conducted requirement gathering through direct engagement with business users and stakeholders.
  • Developed proof-of-concept (POC) solutions to validate approaches and gain client approvals for finalizing functional and technical specifications.
  • Applied strong domain knowledge in Account Systems, Bill Payments, Funds Transfers, ACH, and Wires to support end-to-end system design and implementation.
  • Held sole responsibility for system analysis, requirement documentation, system design, testing, and implementation support.
  • Provided Level 2 and Level 3 production support aligned with the ITIL framework, including Problem Management (IFS) and Change Management.
  • Served as an Onsite Coordinator in Buffalo, NY, successfully delivering multiple Internet Banking projects and coordinating with offshore and cross-functional teams.

Education

Executive Program - AI & ML: Data Science

IIM Lucknow
Lucknow
01.2023

Bachelor of Engineering

Shri Guru Gobind Singhji Institute of Engineering & Technology
Nanded, MS
01.2006

Skills

  • Database Technologies: SQL, MySQL, Postgres, Impala/Hive, Splunk
  • Cloud Platforms: AWS - EC2, Athena, Glue Catalog, CloudWatch, ECS/EKS, SageMaker, S3, Redshift, EMR, IAM, K8S, etc.
  • Cloud Platforms: GCP - Cloud Composer, DataProc, Monitoring API, BigQuery, Data Studio, Apache Airflow, Cloud Functions, Cloud Postgres SQL, Databricks
  • Industry Standards & Tools: HL7, FHIR
  • Programming: Python (OOP), Unix shell scripting, Scala

Certification

  • AWS Certified AI Practitioner
  • Google Certified Professional Data Engineer
  • Google Certified Professional Data Architect
  • Databricks Developer Essentials | Generative AI Fundamentals
  • Coursera - USA Healthcare
  • Coursera - Functional Programming Principles in Scala | Big Data Analysis with Scala and Spark | Hadoop Platform and Application Framework | Hive Optimization & Big Data Analysis

Personal Details

Languages Known: English, Hindi, Marathi
