Pratyush Kanti Kar

Bangalore

Summary

Results-driven Senior Data Engineer with over 8 years of experience building and optimizing scalable, production-ready data solutions in cloud-native environments (GCP, AWS). Proven success in reducing data pipeline runtimes, improving forecasting accuracy, and delivering high-impact ETL solutions across the healthcare and financial domains. Skilled in Python, SQL, BigQuery, Snowflake, Spark, and Airflow. Known for architecting efficient data models, implementing observability best practices, and enabling data-driven decision-making across enterprise-scale systems.

Work History

Senior Data Engineer

Cardinal Health International India

Bangalore

10.2022 - Current

Developed and deployed 20+ robust ETL pipelines using Python, SQL, and Airflow, reducing operational delays and improving shipment accuracy.
Engineered ML-ready datasets for time-series forecasting and regression models, helping reduce overtime by 18% and labor costs by 12%.
Delivered end-to-end ingestion from MS SQL, Oracle, MySQL, and Manhattan WMS into BigQuery, processing over 500 GB per day.
Reduced pipeline runtime by 50% and cut deployment cycle time by 30% by redesigning workflows and leveraging Vertex AI notebooks.
Revamped schema design using normalized and denormalized tables with SCD handling, cutting Tableau dashboard refresh times by 40%.
Implemented containerized Airflow jobs on Kubernetes, improving fault tolerance and reducing pipeline recovery time by 60%.

Data Engineer

Legato Health Technologies LLP, Anthem Inc.

Bangalore

08.2021 - 10.2022

Created 15+ scalable ETL pipelines with PySpark and SQL, integrating data from HDFS, RDBMS, and Teradata into Hive, S3, and Snowflake.
Optimized Spark jobs to reduce memory usage by 35% and execution time by 30%, leading to approximately $25K in annual cloud cost savings.
Designed a Python/SQL-based tool for automated Teradata script impact analysis, reducing manual QA time by 90%.
Implemented partitioning and bucketing strategies, increasing data processing throughput 3x and improving query performance.
Provided L2 production support, resolved critical issues within SLA, and improved data pipeline reliability.

ETL Developer

TATA CONSULTANCY SERVICES INDIA

Bangalore

09.2016 - 08.2021

Built and maintained PySpark-based ETL pipelines that processed over 1 TB per month and loaded structured and unstructured data into Hive.
Led a performance optimization initiative, reducing ETL latency by 40% by profiling and tuning Spark jobs.
Collaborated with data analysts and business stakeholders to understand requirements, ensuring clean, audit-ready data delivery.
Wrote advanced SQL scripts for Oracle to support analytics teams and proactively resolved schema discrepancies.
Managed release cycles, prepared technical documentation, and ensured successful UAT and production deployments.

Education

Bachelor of Technology

Biju Patnaik University of Technology

Bhubaneswar, Odisha

04-2016

Skills

Programming & Scripting: Python, SQL, PySpark
Cloud Platforms: GCP (BigQuery, Vertex AI, Cloud Storage), AWS (S3, Lambda, Glue)
Data Warehousing: BigQuery, Snowflake, Redshift, Hive, Teradata, Oracle, MS SQL
Big Data & Processing: Apache Spark, Hadoop, YARN
Workflow Orchestration: Apache Airflow
Infrastructure & DevOps: Kubernetes, Docker, GitHub
Data Governance & Observability: Data quality frameworks, metadata management, JIRA, Confluence
Visualization Tools: Tableau, Looker, LucidChart
