DAYANAND PALLEMONI

Hyderabad, Telangana

Summary

Results-driven Data Engineer with 6+ years of expertise in building and optimizing large-scale data processing solutions. Skilled in Spark (Scala), SQL, Hadoop, Hive, and Google Cloud Platform (GCP), I specialize in designing high-performance data pipelines for seamless ingestion, transformation, and analysis. My experience includes developing and optimizing ETL workflows with Spark and SQL over massive datasets, building scalable data architectures on GCP, and orchestrating workflows with Apache Airflow for reliable, automated execution. With a strong focus on performance tuning and resource optimization, I improve data processing speed while ensuring data integrity, security, and compliance with industry best practices. Working with cross-functional teams, I have improved pipeline efficiency, reduced processing time, and automated workflows, supporting better decision-making and operational excellence. I am passionate about delivering scalable, cost-effective, high-performance data solutions that drive innovation and business success.

Overview

7 years of professional experience
1 Certification

Work History

Senior Data Engineer

Techno Identity
01.2025 - Current
  • Designed and optimized ETL workflows using Spark Scala for large-scale data processing.
  • Developed and maintained efficient data pipelines utilizing SQL, Hadoop, and Hive for structured and unstructured data.
  • Built scalable data architectures on Google Cloud Platform (GCP) to support batch and real-time processing.
  • Automated data pipeline orchestration and deployment using Apache Airflow, ensuring seamless workflow execution.
  • Optimized Spark jobs through performance tuning techniques such as partitioning, bucketing, and caching to enhance processing speed (see the sketch after this list).
  • Collaborated with cross-functional teams to gather business requirements and implement data-driven solutions.
  • Integrated multiple data sources into Hadoop ecosystems, ensuring data consistency and accessibility.
  • Developed and fine-tuned SQL queries for data transformation, reporting, and analytics.
  • Ensured data quality, governance, and compliance with industry best practices.
  • Troubleshot and resolved pipeline failures, optimizing system performance and reliability.
  • Managed big data processing with Hive, Spark SQL, and Hadoop, leveraging distributed computing frameworks for efficiency.
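
A minimal PySpark sketch of the partitioning, bucketing, and caching techniques named above; the table names, column names, and partition counts are hypothetical placeholders, not production values.

```python
from pyspark.sql import SparkSession

# Hypothetical illustration of the tuning techniques above;
# table/column names and partition counts are placeholders.
spark = (
    SparkSession.builder
    .appName("etl-tuning-sketch")
    .config("spark.sql.shuffle.partitions", "400")  # right-size shuffle parallelism
    .enableHiveSupport()
    .getOrCreate()
)

events = spark.table("raw_db.events")

# Repartition on the aggregation key so the shuffle below is spread evenly.
events = events.repartition(400, "customer_id")

# Cache a dataset that several downstream aggregations reuse.
events.cache()

daily = events.groupBy("customer_id", "event_date").count()

# Write partitioned + bucketed output so later reads can prune by date
# and join on customer_id without a full shuffle.
(
    daily.write
    .mode("overwrite")
    .partitionBy("event_date")
    .bucketBy(64, "customer_id")
    .sortBy("customer_id")
    .saveAsTable("curated_db.daily_events")
)
```

Repartitioning on the key used downstream evens out shuffle skew, while bucketed output lets subsequent jobs join on customer_id without reshuffling.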

Data Engineer

Turbito Infotainment Private Limited
10.2018 - 11.2024
  • Led the design and enhancement of ETL workflows utilizing PySpark and Spark Scala for efficient data processing.
  • Architected and implemented scalable data solutions on Google Cloud Platform (GCP).
  • Built and maintained robust data pipelines for both batch and real-time processing using PySpark and Spark Scala.
  • Partnered with cross-functional teams to gather business requirements and deliver optimized data solutions.
  • Optimized Spark jobs to enhance processing efficiency, reduce execution time, and improve resource utilization.
  • Streamlined data pipeline deployment through automation using Apache Airflow (see the DAG sketch after this list).
  • Integrated multiple data sources and ensured seamless data ingestion and transformation for analytics.
  • Developed and optimized complex SQL queries for data extraction, transformation, and reporting.
  • Ensured data integrity, quality, and compliance with industry standards and best practices.
  • Conducted root cause analysis and troubleshooting to resolve pipeline failures and improve system reliability.
  • Implemented performance tuning techniques such as partitioning, bucketing, and caching in Spark to enhance query performance.
  • Worked with distributed computing frameworks, leveraging Hadoop, Hive, and Spark SQL for big data processing.
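
As a sketch of the Airflow automation referenced above, here is a minimal DAG chaining two spark-submit steps; the DAG id, schedule, owner, and job paths are hypothetical.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical sketch of the pipeline automation described above;
# DAG id, schedule, and spark-submit arguments are placeholders.
default_args = {
    "owner": "data-eng",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_customer_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",  # run daily at 02:00
    catchup=False,
    default_args=default_args,
) as dag:
    ingest = BashOperator(
        task_id="ingest",
        bash_command="spark-submit --master yarn jobs/ingest.py",
    )
    transform = BashOperator(
        task_id="transform",
        bash_command="spark-submit --master yarn jobs/transform.py",
    )
    ingest >> transform  # transform runs only after ingest succeeds
```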

Education

B.Tech - Mechanical Engineering

JNTUH
Hyderabad
01.2017

Skills

  • Big Data Technologies: PySpark, Spark, Hive, HDFS
  • Data Integration & Storage: Sqoop, MapReduce, HBase, M7 Database
  • Databases & Cloud: MySQL, PostgreSQL, Google Cloud Platform (GCP), Amazon Web Services (AWS)
  • Programming & Operating Systems: Python, Scala, SQL, Windows / Linux

Accomplishments

  • Reduced data processing time by 35% through advanced performance optimization.
  • Successfully migrated on-premises data pipelines to Google Cloud Platform, improving scalability and reducing costs by 20%.
  • Automated over 90% of ETL processes, significantly reducing manual effort and enhancing operational efficiency.
  • Enhanced data security by implementing strict IAM roles and policies (see the sketch after this list).
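
As a hedged illustration of the IAM hardening noted above, the sketch below grants a read-only role on a Cloud Storage bucket via the google-cloud-storage client; the project, bucket, and group names are placeholders.

```python
from google.cloud import storage

# Hypothetical example: grant a group read-only access to a bucket.
# Project, bucket, and group names are placeholders.
client = storage.Client(project="my-gcp-project")
bucket = client.bucket("my-data-lake")

policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.objectViewer",       # read-only, least privilege
    "members": {"group:data-analysts@example.com"},
})
bucket.set_iam_policy(policy)
```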

Certification

Projects

GCP Data Lake Implementation

Technologies: GCP (BigQuery, Dataflow, Cloud Storage, Cloud Pub/Sub), PySpark, Python, SQL

  • Led the end-to-end implementation of a centralized GCP-based data lake, enabling enterprise-wide data management.
  • Designed and developed real-time and batch data ingestion pipelines using Cloud Pub/Sub and Dataflow, enhancing data availability and reducing processing latency.
  • Utilized BigQuery for scalable data warehousing and complex analytical queries, enabling faster business insights (see the load sketch after this list).
  • Built PySpark-based data transformation workflows, ensuring data quality, consistency, and reliability across multiple sources.
  • Automated data pipeline deployment and monitoring using Apache Airflow, reducing manual intervention by 90%.
  • Integrated machine learning models into data pipelines, enhancing predictive analytics for improved business decisions.
  • Implemented data security best practices, including IAM roles, encryption, and audit trails to ensure compliance.
  • Conducted cost optimization and capacity planning, leading to a 20% reduction in cloud expenses.
  • Provided comprehensive documentation and conducted training sessions for internal teams, ensuring a smooth transition and effective platform maintenance.
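
A minimal sketch of the batch path into BigQuery described above, using the google-cloud-bigquery client to load curated Parquet files from Cloud Storage; the project, dataset, table, and bucket names are hypothetical.

```python
from google.cloud import bigquery

# Hypothetical batch load into the lake's warehouse layer;
# project, dataset, table, and bucket names are placeholders.
client = bigquery.Client(project="my-gcp-project")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(
    "gs://my-data-lake/curated/daily_events/*.parquet",
    "my-gcp-project.analytics.daily_events",
    job_config=job_config,
)
load_job.result()  # block until the load job finishes
print(f"Loaded {load_job.output_rows} rows")
```
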
Customer Matching System

Technologies: PySpark, Python, SQL, Hive, Spark (Scala), GCP

  • Developed and optimized data pipelines to streamline customer matching and profiling for a financial services client (see the sketch after this list).
  • Utilized Apache Spark (Scala and PySpark) for efficient large-scale data processing and transformation.
  • Designed ETL workflows to ensure timely and accurate data processing for reporting and analytics.
  • Collaborated with business stakeholders to align data solutions with operational and compliance requirements.
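
A minimal PySpark sketch of a rule-based match like the one described above, assuming a normalized-email join key; the source tables, columns, and the matching rule itself are placeholders rather than the client's actual logic.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical matching rule: exact join on a normalized email key.
spark = (
    SparkSession.builder
    .appName("customer-matching-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

def with_match_key(df):
    # Normalize the field used as the match key.
    return df.withColumn("email_key", F.lower(F.trim(F.col("email"))))

crm = with_match_key(spark.table("raw_db.crm_customers"))
txn = with_match_key(spark.table("raw_db.txn_customers"))

# Pair CRM profiles with transaction-system profiles on the match key.
matches = (
    crm.alias("c")
    .join(txn.alias("t"), "email_key", "inner")
    .select(
        F.col("c.customer_id").alias("crm_id"),
        F.col("t.customer_id").alias("txn_id"),
    )
)

matches.write.mode("overwrite").saveAsTable("curated_db.customer_matches")
```
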
Collection & Recovery Data Platform

Technologies: Apache Spark, Spark SQL, Scala, Hive, Shell Scripting, SQL

  • Engineered and maintained scalable data pipelines to support collection and recovery operations for financial services.
  • Leveraged Apache Spark for large-scale data ingestion, transformation, and analytics, ensuring timely insights.
  • Optimized ETL processes to enhance data accuracy, consistency, and performance.
  • Partnered with cross-functional teams to develop custom data solutions tailored to business needs.
  • Designed and managed data warehouses using Hive, improving query performance and accessibility for reporting teams (see the sketch after this list).
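
A minimal Spark SQL sketch of the Hive-backed reporting layer described above; database, table, and column names are hypothetical.

```python
from pyspark.sql import SparkSession

# Hypothetical refresh of a Hive-backed reporting table;
# database, table, and column names are placeholders.
spark = (
    SparkSession.builder
    .appName("collections-reporting-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Aggregate recovery activity with Spark SQL over existing Hive tables.
recoveries = spark.sql("""
    SELECT account_id,
           snapshot_date,
           SUM(amount_recovered) AS total_recovered
    FROM collections_db.recovery_events
    GROUP BY account_id, snapshot_date
""")

# Write back to a partitioned Hive table so reporting queries
# can prune by snapshot_date.
(
    recoveries.write
    .mode("overwrite")
    .partitionBy("snapshot_date")
    .saveAsTable("collections_db.recovery_summary")
)
```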
