
Ramesh Sunkaraboina

Hyderabad

Summary

  • Data Processing: Expertise in constructing and maintaining large-scale data pipelines, with hands-on experience handling petabyte-scale batch and streaming data using Apache Spark and Apache Pulsar.
  • Cloud Platforms: Skilled in Google Cloud Platform (GCP) services including BigQuery, Dataflow, Pub/Sub, and Cloud Storage, with proficiency in ETL/ELT processes, data modeling, and real-time data streaming.
  • Data Pipeline Architecture: Proven track record of designing and building data pipelines, optimizing data through cleaning, transformation, and management across cloud sources such as AWS, GCP, Redshift, and Azure.
  • Data Storage Solutions: Proficient in designing and implementing Data Warehouses, Data Lakes, and Data Lakehouses, adhering to data warehousing principles.
  • Core Competencies: Well-versed in data engineering and SQL development, with particular emphasis on ELT/ETL processes and Apache Spark.
  • Project Execution and Management: Built end-to-end pipelines integrating sources such as Teradata and Oracle, following Scrum for agile delivery.
  • Data Quality and Performance Optimization: Implemented data quality checks at the ingestion stage, validated data consistency across systems, and fine-tuned dataset performance.
  • Transformation and Flow Management: Designed and operated ELT pipelines, overseeing 10+ pipelines and 20+ workflows that streamline cleaning and transformation and improve data accessibility.
  • Continuous Integration and Deployment: Managed release cycles using Jenkins, resolving build issues promptly and ensuring on-time delivery of commitments.
  • Technological Agility: Adept at leveraging programming languages and cloud technologies for data processing, analytics, orchestration, and quality assurance, including Python, PySpark, ELT/ETL, and Hadoop.

Overview

7 years of professional experience

Work History

Senior Data Engineer

Synechron Technologies Pvt Ltd
08.2024 - Current
  • The project involved managing massive volumes of data from retail source systems such as Teradata, Oracle, and CDP, originating from POS systems, e-commerce platforms, supply chains, and customer interactions. Efficient management of this data was critical for inventory optimization, which required building scalable, reliable, and cost-effective pipelines for processing and analyzing data with Google Cloud Dataproc (managed Apache Spark and Hadoop) and Apache Airflow (Cloud Composer) for orchestration.
  • Built pipelines from various sources, including Teradata, Oracle, Minerva, and CDP.
  • Implemented data quality checks at the initial stage while bringing raw data to GCP storage buckets.
  • Applied schema changes and validated data consistency across systems.
  • Implemented data cleaning, deduplication, and enrichment logic for retail datasets (e.g., sales, inventory, customer behavior).
  • Managed the Portfolio table across all platforms and core IDs.
  • Performed performance tuning on necessary datasets.
  • Built end-to-end pipelines to bring data from CDP to GCP and made necessary transformations.
  • Created BigQuery views on top of Hive tables and adjusted them accordingly.
  • Managed backdated updates from source to target in Hive tables and performed data validations.
  • Provided production support for the pipeline builds during month-end processes for a smooth transition to the derived layer team.
  • Made necessary changes in the standard layer according to requirements.
  • Resolved TeamCity and Jenkins build issues.
  • Participated in continuous releases using Jenkins.
  • Environment: Teradata, Oracle, CDP, Google Cloud Dataproc, Apache Airflow, BigQuery, Hive, Jenkins, TeamCity.
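The ingestion-stage quality checks and deduplication described above can be sketched as follows. This is an illustrative stand-in in plain Python, not the actual PySpark jobs; the record fields and rules are assumptions for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SaleRecord:
    txn_id: str
    store_id: str
    amount: float

def quality_check(rec: SaleRecord) -> bool:
    # Reject rows with missing keys or non-positive amounts
    return bool(rec.txn_id) and bool(rec.store_id) and rec.amount > 0

def clean(records: list[SaleRecord]) -> list[SaleRecord]:
    """Drop invalid rows, then deduplicate on txn_id (first occurrence wins)."""
    seen: set[str] = set()
    out: list[SaleRecord] = []
    for rec in records:
        if not quality_check(rec) or rec.txn_id in seen:
            continue
        seen.add(rec.txn_id)
        out.append(rec)
    return out
```

In the real pipeline these rules would run as Spark transformations on raw files landed in the GCP storage buckets, before promotion to the standard layer.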

Data Engineer

The Modern Data Company
07.2022 - 07.2024
  • Developed and optimized a data processing platform for a retail company using Apache Spark, handling both batch and real-time data. Streamlined the processing of transactional and customer data from various sources, including databases, cloud storage, and APIs. Designed and implemented scalable ETL pipelines to ensure efficient data transformation and loading into analytics platforms. Improved data accuracy and reporting speed, supporting better decision-making and enhancing customer experience across the retail network.
  • Designed and constructed an end-to-end Data Lakehouse independently, following Medallion Architecture principles, utilizing tools powered by Apache Iceberg.
  • Processed a large volume of batch and streaming data daily from diverse sources, including Amazon Redshift, Amazon S3, and REST APIs, for the data pipeline.
  • Designed and developed ELT pipelines for processing extensive volumes of both batch and streaming data, reaching into the petabyte range, leveraging the capabilities of Apache Spark and Apache Pulsar.
  • Established over 10 pipelines and 20 workflows to streamline data cleaning and transformation processes, channeling output into Amazon S3 buckets and making it accessible to the Search Portal serving over 1 million customers.
  • Analyzed logs in Splunk to debug issues.
  • Addressed and implemented 50+ change requests and bug fixes, ensuring the seamless operation of the data pipeline.
  • Participated in continuous releases using Jenkins.
  • Responsible for requirements analysis, technical design, implementation, testing, and documentation.
  • Ensured on-time delivery of reports as per defined timelines.
  • Environment: PySpark, Python, Scala, Data Integration, Metadata Management, Data Lineage, Teradata, Oracle, SQL, Spark, Hive, Git, CI/CD, Hadoop, ETL, Data Modeling, Data Quality, Extraction, Performance Tuning.
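The Medallion Architecture behind the Lakehouse build layers data as bronze (raw), silver (cleaned and conformed), and gold (business aggregates). A simplified stand-in for that layering, using plain Python rather than the actual Spark-and-Iceberg pipelines (field names are illustrative):

```python
import json

def to_bronze(raw_lines: list[str]) -> list[dict]:
    """Bronze: land raw events as-is, parsed but unvalidated."""
    return [json.loads(line) for line in raw_lines]

def to_silver(bronze: list[dict]) -> list[dict]:
    """Silver: cleaned and conformed -- drop malformed rows, normalize types."""
    silver = []
    for row in bronze:
        if "sku" not in row or "qty" not in row:
            continue
        silver.append({"sku": str(row["sku"]), "qty": int(row["qty"])})
    return silver

def to_gold(silver: list[dict]) -> dict:
    """Gold: business-level aggregate (units sold per SKU)."""
    totals: dict[str, int] = {}
    for row in silver:
        totals[row["sku"]] = totals.get(row["sku"], 0) + row["qty"]
    return totals
```

Each layer is persisted separately so that downstream consumers (here, the Search Portal) read only curated gold tables, while silver and bronze remain available for replay and auditing.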

Associate Data Engineer

National Institute of Indian Medical Heritage (NIIMH)
07.2018 - 07.2022
  • Developed a real-time data processing system for a healthcare organization using Apache Spark and Kafka, enabling the integration and analysis of patient data from various sources, including electronic health records (EHR), IoT devices, and lab results. Designed ETL pipelines to ensure timely and accurate data transformation, supporting predictive analytics and real-time monitoring of patient health. Implemented robust data cleaning and validation processes to maintain data integrity and compliance with healthcare regulations. Enhanced decision-making capabilities, improving patient outcomes and operational efficiency across the healthcare network.
  • Involved in requirement gathering, design, and deployment of the application using Scrum (Agile) as the development methodology.
  • Developed Hive SQL queries, mappings, and tables for analysis across different banners, and worked on partitioning, optimization, compilation, and execution.
  • Implemented Spark using Scala for faster processing of data.
  • Utilized batch processing in Spark to improve performance.
  • Imported data from various sources into Spark RDD for processing.
  • Used Sqoop to import data from RDBMS to Hadoop.
  • Created Hive target tables to hold the data after all ETL operations using HQL.
  • Employed Cloudera Manager for the installation and management of the Hadoop cluster.
  • Processed data in Hive tables using Spark SQL.
  • Worked with Amazon Web Services (AWS) EMR and S3 for data processing and storage.
  • Environment: Apache Spark, Kafka, Hive SQL, Scala, Sqoop, Cloudera Manager, AWS EMR, S3.
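The real-time patient monitoring described above amounts to windowed aggregation over a stream of readings. A toy sketch of that idea, with plain Python standing in for the Spark/Kafka streaming jobs (the metric and window size are assumptions for illustration):

```python
from collections import deque

class SlidingAverage:
    """Streaming window aggregate: mean of the last `size` readings,
    updated as each event arrives (e.g., heart-rate samples from an IoT device)."""

    def __init__(self, size: int):
        # deque(maxlen=...) evicts the oldest reading automatically
        self.window: deque = deque(maxlen=size)

    def push(self, value: float) -> float:
        self.window.append(value)
        return sum(self.window) / len(self.window)
```

In a production stream processor the same computation would be expressed as a sliding-window aggregation keyed by patient, with alerts fired when the windowed value crosses a clinical threshold.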

Technical Consultant

Symbioun Technologies
06.2018 - 07.2019
  • Developed and maintained a user-friendly support interface, enhancing customer interaction and issue-resolution efficiency. Implemented real-time chat features, automated ticketing systems, and integrated knowledge bases to streamline support processes and improve user satisfaction.
  • Gathered functional and technical requirements for the application.
  • Made necessary changes to the application that improved its performance, modifying workflow components like forms and links as needed.
  • Adjusted the user interface according to requirements.
  • Developed the client-side using Angular 6 and the server-side using PHP and MySQL.
  • Hired technical personnel for various projects.

Education

Bachelor of Technology - Computer Science & Engineering

Jawaharlal Nehru Technological University

Skills

  • SQL
  • PL/SQL
  • BigQuery
  • Teradata
  • Oracle
  • Python
  • PySpark
  • Apache Spark
  • DataProc
  • ELT/ETL
  • Data Modeling
  • Hadoop
  • GCP
  • Microsoft Azure
  • CDP (Cloudera Data Platform)
  • Airflow
  • Docker
  • Apache Pulsar
  • Data Lakehouse
  • Databases
  • FastAPI
  • Soda (data quality framework)
  • OpenMetadata
  • Benthos
  • Git
  • Unix

Personal Information

Title: Data Engineer

Accomplishments

Received Star Award at Modern Data

Domains

  • Banking
  • Insurance
  • HealthCare
  • Retail

Overall Experience

6 Years 9 Months

Timeline

Senior Data Engineer

Synechron Technologies Pvt Ltd
08.2024 - Current

Data Engineer

The Modern Data Company
07.2022 - 07.2024

Associate Data Engineer

National Institute of Indian Medical Heritage (NIIMH)
07.2018 - 07.2022

Technical Consultant

Symbioun Technologies
06.2018 - 07.2019

Bachelor of Technology - Computer Science & Engineering

Jawaharlal Nehru Technological University