Kummetha Jayasimha Reddy

Bangalore

Summary

Data Engineer with 3 years of experience at TCS, specializing in architecting scalable data pipelines and optimizing performance using PySpark and Azure. Proven track record of improving data processing speed by 40% and ensuring data integrity through automated validation. Strong collaborator and problem-solver, adept at delivering actionable insights for business intelligence.

Overview

3 years of professional experience

Work History

Data Engineer

TCS
Bangalore
07.2022 - Current
  • Architected and implemented scalable, fault-tolerant data pipelines as a key member of the TCS data engineering team, enabling business intelligence teams to gain actionable insights from large-scale data.
  • Designed and optimized end-to-end data pipelines using PySpark, Hadoop, and Airflow that process over 15 TB of data daily, improving data processing speed by 40%.
  • Implemented real-time data ingestion solutions using Kafka and AWS Lambda, handling high-velocity data streams for analytics and reporting.
  • Built and deployed scalable data models in Azure Synapse Analytics, resulting in a 30% improvement in query performance for data analysis.
  • Led the design of ETL workflows to ingest data from AWS S3, apply transformations with PySpark, and load to Redshift, ensuring data integrity and consistency across platforms.
  • Automated data validation processes (null checks, uniqueness, foreign key constraints) to ensure accurate data in production systems.
  • Collaborated with product teams to build custom reporting solutions using Tableau and Looker.

Education

B.Tech - Computer Science and Engineering

Karunya University
Coimbatore
06.2022

Skills

  • Programming: Python, SQL, Java, Scala
  • Big Data Technologies: Apache Spark (PySpark), Hadoop, Kafka, Hive
  • Cloud Platforms: Azure (Synapse, Data Factory), GCP (BigQuery, Dataflow, Pub/Sub)
  • Data Pipelines: Apache Airflow, Data Factory
  • Data Warehousing: BigQuery, Synapse
  • Libraries/Frameworks: MapReduce, Hadoop
  • DevOps/Automation: Docker, Terraform, Jenkins, CI/CD pipelines
  • Tools: Git, Jupyter Notebooks

Projects

Protegrity Data Security Implementation

Role: Data Engineer / Security Integration Specialist, Tools & Tech: Protegrity, Hadoop, Spark, Hive, HDFS, Policy Manager

  • Implemented enterprise-grade data encryption and tokenization solutions using Protegrity to secure sensitive PII and financial data.
  • Integrated Protegrity seamlessly into the big data ecosystem (Hadoop, Spark), optimizing policies to minimize latency and preserve performance.
  • Collaborated with data governance and compliance teams to align with industry regulations (e.g., PCI DSS, GDPR).
  • Performed end-to-end testing and fine-tuning of security workflows to ensure operational efficiency and scalability.

Hortonworks to In-House Platform Data Migration

Role: Data Engineer / Migration Specialist, Tools & Tech: Hortonworks, Spark, Hive, HDFS, Airflow, Python, SQL

  • Led the migration of enterprise data pipelines from Hortonworks to an internally hosted Big Data environment, ensuring compatibility and optimized performance.
  • Analyzed and redesigned existing ETL workflows for scalability, rebuilding them using Spark and Hive with best practices for data processing efficiency.
  • Tuned Spark jobs and Hive queries to improve execution times and reduce resource consumption.
  • Managed schema mapping, data validation, and reconciliation to maintain data integrity and consistency throughout the transition.
  • Ensured minimal downtime by conducting detailed impact assessments, dry runs, and post-migration performance monitoring.
