Prashanth Kumar Majji

Bengaluru

Summary

Data Engineer with over 3 years of experience building scalable big data solutions on cloud platforms (AWS, Azure). Proficient in PySpark and Databricks, with experience designing end-to-end data pipelines. Track record of optimizing large-scale ETL workflows, reducing cloud costs, and maintaining data integrity with validation frameworks such as Deequ and Great Expectations.

Overview

3 years of professional experience

Work History

Data Analyst

HCL Technologies
Bengaluru
12.2023 - Current
  • Built real-time data ingestion pipelines using AWS SQS, SNS, and Databricks.
  • Developed reusable PySpark ingestion frameworks to process data from APIs, SFTP, Oracle, and S3.
  • Created a data quality validation framework using Deequ and Great Expectations.
  • Achieved approximately $36,000 annual cost reduction in Databricks through performance tuning.
  • Designed ETL workflows for large-volume data processing, minimizing shuffles and optimizing resource usage.

Analyst Programmer

In Technet
Hyderabad
09.2022 - 12.2023
  • Developed Azure-based data pipelines for pharmaceutical clients in the FDA domain.
  • Executed business rule validations and transformations utilizing PySpark.
  • Streamlined file sanity checks while orchestrating workflows through Azure Data Factory.

Education

Bachelor of Science - Electrical Engineering

VRSEC
Vijayawada, India
06-2022

Skills

  • Languages: Python, Scala, SQL
  • Big Data: Spark, Hadoop, Hive, HBase, Sqoop
  • Cloud platforms (Azure): Data Factory, Databricks, ADLS, Azure Functions
  • Cloud platforms (AWS): S3, SQS, SNS, Redshift, EC2

  • Tools and frameworks: Databricks, PySpark, Great Expectations, Deequ, Git, REST APIs, GraphQL
  • Data Engineering: ETL Frameworks, Delta Lake, Data Validation
  • Databases: Oracle, Redshift, HDFS, NoSQL
  • Other: Agile, Unit Testing (PyTest), Performance Tuning, CI/CD

Achievements

  • Built a scalable ingestion framework with PySpark for 10+ products.
  • Reduced Databricks costs by approximately $36,000 annually through optimization.
  • Designed an automated data quality check framework using Deequ and Great Expectations.
  • Created REST and GraphQL APIs for efficient data serving.

Languages

English
First Language
