Dileep A

Data Engineer
Hyderabad, TG

Summary

Data Engineer with over 4 years of experience designing and implementing scalable data pipelines and optimizing data infrastructure. Expertise in big data tools, cloud platforms, and data visualization, complemented by strong skills in Python, SQL, and distributed computing. Demonstrated ability to support machine learning and deep learning models, with a focus on data-driven decision-making.

Overview

4 years of professional experience
1 Certification

Work History

Data Engineer

Sony India Software Center
Bangalore
08.2022 - 07.2024
  • Designed and implemented real-time streaming data pipelines using Azure Event Hubs, Azure Stream Analytics, and Apache Spark, reducing data latency by 40%.
  • Developed ETL pipelines with Azure Data Factory (ADF) and PySpark, automating data ingestion and transformation for large-scale datasets.
  • Built and optimized Azure Synapse Analytics-based data warehouses, improving query performance by 35% through indexing, partitioning, and caching strategies.
  • Integrated machine learning models into data pipelines using Azure Databricks, enabling automated insights for business intelligence and analytics teams.
  • Designed and managed data orchestration workflows using Azure Data Factory (ADF) and Apache Airflow, ensuring automated and fault-tolerant data processing.
  • Implemented data monitoring and logging frameworks using Azure Monitor, Log Analytics, and the ELK stack, improving system reliability and debugging efficiency.

Technologies: Azure Data Factory (ADF), Azure Synapse Analytics, Azure Databricks, PySpark, Apache Kafka, Azure Event Hubs, Airflow, Azure Blob Storage, Azure Monitor, ELK Stack, Docker, and Kubernetes.

Data Engineer

Olam
Chennai
08.2021 - 05.2022
  • Developed ETL pipelines for ingesting and processing satellite imagery and environmental data using Python, Airflow, and SQL.
  • Designed scalable workflows to support news analytics and web scraping.
  • Optimized data workflows using Apache Airflow, reducing manual processing time by 40%.
  • Managed geospatial datasets with PostgreSQL/PostGIS and QGIS, improving spatial analysis capabilities.
  • Deployed data workflows on cloud platforms (Azure), ensuring high-performance data processing.
  • Developed and maintained scalable ETL pipelines for ingesting and processing hazelnut crop yield datasets, improving data accessibility for analytics.
  • Designed real-time data ingestion solutions using Azure Functions and Azure Cosmos DB, reducing manual data processing efforts by 50%.
  • Created interactive dashboards using Tableau and Power BI, leveraging SQL-based queries for data extraction and visualization.
  • Implemented data preprocessing techniques for satellite imagery and climate data, enhancing the predictive capabilities of crop yield models.
  • Worked on spatial analytics using GIS data, integrating location-based insights into agricultural strategy planning.

Technologies: Python, SQL, Tableau, Power BI, AWS Lambda, DynamoDB, Airflow, GIS

Junior Data Engineer

Quanted Technologies
Bangalore
04.2020 - 07.2021
  • Developed and optimized Apache Spark (PySpark) jobs on Hadoop clusters, improving data retrieval performance and handling large-scale datasets efficiently.
  • Designed and implemented HDFS-based data pipelines, ensuring efficient storage and processing of structured and unstructured data.
  • Tuned Hive and Impala queries, reducing query execution time and improving overall system performance for analytics use cases.
  • Integrated Apache Kafka for real-time event processing, enabling seamless data ingestion and stream processing.
  • Monitored and troubleshot YARN-based job executions, identifying performance bottlenecks and optimizing resource allocation for more efficient job execution.

Technologies: Hadoop, PySpark, SQL, Apache Kafka, Cloudera, Apache Hive, Apache Pig.

Education

MTech - Data Science

Amrita Vishwa Vidyapeetham
Bangalore, India
01.2022

BTech - ECE

Aurora Engineering College
Hyderabad, India
01.2019

Skills

  • Apache Spark
  • Apache Kafka
  • Apache Airflow
  • Hadoop
  • DVC
  • Python
  • SQL
  • Java
  • Scala
  • Azure
  • AWS
  • PostgreSQL
  • MySQL
  • MongoDB
  • Cassandra
  • PySpark
  • Tableau
  • Power BI
  • Matplotlib
  • Seaborn
  • TensorFlow
  • PyTorch
  • Scikit-Learn
  • OpenCV
  • Git
  • GitHub
  • Azure DevOps
  • Docker
  • Kubernetes
  • Windows
  • Linux
  • ETL processes
  • Dashboard visualization

Certification

  • Azure Data Factory for Data Engineers - Udemy
  • The Data Analyst Course - Data Analyst Complete Bootcamp - Udemy
  • Python for Computer Vision with OpenCV and Deep Learning - Udemy
  • Amazon Web Services Machine Learning Essential Training - LinkedIn Learning
  • SQL - Udemy
  • Python - Coursera
  • Big Data - Coursera

Accomplishments

  • Dileep and Dr. Beena, "Global Distribution and Price Prediction of Electric Vehicles using Machine Learning," 2nd International Conference on Intelligent Technologies (CONIT), IEEE, 2022.

Timeline

Data Engineer

Sony India Software Center
08.2022 - 07.2024

Data Engineer

Olam
08.2021 - 05.2022

Junior Data Engineer

Quanted Technologies
04.2020 - 07.2021

MTech - Data Science

Amrita Vishwa Vidyapeetham

BTech - ECE

Aurora Engineering College