
Bhanu Sree Maanvi Kolli

Summary

Data Engineer with over four years of experience in designing and implementing robust data solutions. Proficient in Python, PySpark, SQL, Pandas, Apache Spark, and AWS, with a proven ability to build scalable data pipelines, optimize processing workflows, and ensure high data quality.

  • Built and maintained large-scale ETL pipelines using Python and PySpark for complex data transformation and processing tasks.
  • Optimized SQL queries, achieving a 20% reduction in query execution time through performance tuning and indexing strategies (see the sketch after this list).
  • Designed and deployed data warehouse solutions on AWS, enabling efficient storage, retrieval, and analysis of structured data.
  • Developed data integration workflows to ensure seamless data movement across diverse systems and platforms.
  • Collaborated with data scientists to deploy machine learning models on Apache Spark, supporting real-time data processing and predictive analytics.
  • Performed comprehensive data quality assessments, implementing data cleansing and validation techniques to maintain accuracy and integrity.
  • Partnered with cross-functional teams to gather requirements and deliver tailored technical solutions, aligned with data-driven business objectives.
  • Demonstrated strong problem-solving skills and deep expertise in big data technologies and cloud-based architectures.
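
For illustration, here is a minimal, self-contained sketch of the indexing strategy mentioned in the query-tuning bullet above. It uses Python's built-in sqlite3 as a stand-in engine; the table and column names are hypothetical, not the production schema.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i % 1000, i * 0.5) for i in range(100_000)],
    )

    query = "SELECT SUM(total) FROM orders WHERE customer_id = 42"

    # Before indexing: the planner reports a full table scan.
    print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())

    # An index on the filter column turns the scan into a lookup --
    # the kind of change behind the execution-time reduction above.
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())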

Education

Bachelor of Technology - Computer Science Engineering

Aditya Engineering College
Surampalem
06-2022

Skills

    Programming Languages: Python

    Big Data Technologies: PySpark

    Databases: SQL

    Cloud Platforms: Amazon Web Services (AWS)

    Data Processing and Analysis: Pandas, Redshift, Athena

    ETL Tools: Apache Airflow, AWS Glue

    Version Control: Git

    IDE: Jupyter Notebook

Objective

To thrive in a dynamic and challenging environment where I can effectively apply my skills and knowledge, contributing meaningfully to both organizational success and my own continuous growth.

Project 1

  • Project name: Data Lakehouse with ETL and Optimization
  • Period:
  • Technologies used: Python, AWS, PySpark, SQL
  • Description: Built a scalable data lakehouse architecture to store, process, and serve structured and semi-structured data, using Delta Lake on AWS S3 for ACID-compliant storage and AWS Glue for ETL orchestration.
  • Responsibilities:

1. Designed data zones (raw, curated, analytics) with Delta Lake formats.
2. Built PySpark ETL jobs with partitioning and caching strategies (sketched after this list).
3. Deployed jobs using AWS Glue, scheduling via triggers and workflows.
4. Queried transformed data with Athena and Redshift Spectrum using SQL.
5. Optimized schema and storage for performance and cost efficiency.
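
A minimal sketch of the raw-to-curated promotion step in item 2. Bucket names, paths, and columns are illustrative; it assumes a Spark session with the delta-spark package configured (under AWS Glue, Delta support is typically enabled through job parameters rather than in code, and s3:// paths resolve natively).

    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder
        .appName("raw-to-curated")
        .config("spark.sql.extensions",
                "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )

    # Read semi-structured events from the raw zone.
    raw = spark.read.json("s3://example-lake/raw/events/")

    # Cleansing and typing before promotion to the curated zone.
    curated = (
        raw.dropDuplicates(["event_id"])
           .withColumn("event_ts", F.to_timestamp("event_ts"))
           .withColumn("event_date", F.to_date("event_ts"))
    )

    # Cache when the same frame feeds multiple downstream writes.
    curated.cache()

    # Partitioning by date lets Athena and Redshift Spectrum prune scans.
    (
        curated.write.format("delta")
        .mode("overwrite")
        .partitionBy("event_date")
        .save("s3://example-lake/curated/events/")
    )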

Project 2

  • Project name: Data Pipeline Automation
  • Period:
  • Technologies used: Python, SQL, AWS, Pandas
  • Description: Developed an automated data pipeline using Python and AWS services to extract, transform, and load data from various sources into a centralized data repository.
  • Responsibilities:

1. Designed and implemented the data pipeline architecture.
2. Developed ETL scripts and workflows (a sketch follows this list).
3. Handled data transformation and cleansing.
4. Automated data ingestion processes.
5. Monitored and troubleshot pipeline issues.
6. Collaborated with stakeholders to understand data requirements.
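
A minimal sketch of one extract-transform-load step in this pipeline (item 2 above). Bucket, key, and column names are hypothetical; it assumes configured boto3 credentials and pyarrow for Parquet output.

    import io

    import boto3
    import pandas as pd

    s3 = boto3.client("s3")

    def run_step(bucket: str, source_key: str, target_key: str) -> None:
        # Extract: pull the raw CSV object into a DataFrame.
        body = s3.get_object(Bucket=bucket, Key=source_key)["Body"].read()
        df = pd.read_csv(io.BytesIO(body))

        # Transform: deduplicate and normalize types; bad values become NaN/NaT.
        df = df.drop_duplicates()
        df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
        df["created_at"] = pd.to_datetime(df["created_at"], errors="coerce")

        # Load: write the cleansed data back as Parquet for the central repository.
        buf = io.BytesIO()
        df.to_parquet(buf, index=False)
        s3.put_object(Bucket=bucket, Key=target_key, Body=buf.getvalue())

    run_step("example-data-lake", "raw/orders.csv", "curated/orders.parquet")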

Declaration

I hereby declare that all information furnished by me is true to the best of my knowledge.
