Hi, I’m

Pravalika Bollampalli

Berlin

Summary

Data Engineer with 6+ years of experience architecting and delivering end-to-end data platforms on AWS. Proficient in building Spark pipelines in Python and Scala to ingest and transform data in cloud-based ETL workflows. Passionate about optimizing data pipelines and surfacing insights for faster, better-informed decision-making. Skilled at working with stakeholders to translate business goals into technical requirements. Currently expanding technical and strategic acumen through a Master's in Data Science and Engineering.

Overview

6+ years of professional experience

Work History

Athena Health

Senior Decision Management Associate
01.2024 - 06.2025

Job overview

  • Managed a data engineering team of five professionals, overseeing data strategy, governance, and analytics.
  • Collaborated with senior leadership to translate business goals into data-driven strategies.
  • Implemented cloud-based data solutions and automations, cutting operational costs.
  • Defined performance metrics to measure data initiatives' success and their impact on decision-making.
  • Ingested and processed 3 TB of healthcare claims data daily in the Billing Workflow Architecture via Spark streaming and batch jobs, ensuring low-latency delivery and high availability.
  • Built and maintained core data tables and ETL pipelines on AWS/Databricks, sourcing from S3, Kafka, Kinesis, and PostgreSQL to power BI reporting and analytics.
  • Spearheaded autoscaling strategies, triggered by volume and lag metrics, to optimize compute usage, yielding $180K in annual cost savings.
  • Proposed and architected a codeless workflow using DMN to reduce custom code, an initiative highly appreciated by leadership.
  • Transitioned legacy streaming pipelines to a hybrid, partial-streaming architecture, cutting compute spend by 35% while retaining near real-time insights.
  • Implemented AWS Lambda–backed data quality checks and CloudWatch alerts, reducing data errors by 50% and bolstering stakeholder trust.
  • Led storage‑optimization efforts using compression on Delta Lake, driving a 40% reduction in data storage costs.
  • Collaborated with software engineers, data scientists, and product owners to deliver mission‑critical data solutions on schedule.
  • Automated CI/CD for data pipelines in Jenkins, slashing manual deployments by 30% and standardizing release processes.
  • Experimented with Parquet, Delta Lake, and other storage formats to enhance the throughput and reliability of large-scale workflows.
  • Designed, built, and optimized scalable ETL pipelines and data models that processed terabytes of data daily.
  • Built Power BI dashboards on the processed data, enabling the business team to derive analytical insights.
  • Helped automate the workflow using AWS Step Functions to orchestrate the pipelines and reduce manual effort.
  • Worked with stakeholders to ensure data solutions aligned with business objectives.
  • Managed daily agile ceremonies, created JIRAs for sprints, and refined backlogs.

LTIMindtree

Senior Data Engineer
04.2022 - 12.2023

Job overview

  • Led the Whole Loans and Performance/Risk Attribution teams, orchestrating and developing data solutions.
  • Implemented data loaders to extract, transform, and load data from XLS files into Snowflake, utilizing Databricks modules and AWS services.
  • Supported multiple ad-hoc client requests involving complex SQL queries.
  • Created Denodo and Tableau views, enhancing business portfolio analysis and showcasing proficiency in BI reporting and dashboard development.
  • Transitioned legacy Java code to Spark with Scala, actively contributing to the Apache big data stack.
  • Automated anomaly detection in daily trades through delinquency checks and dashboard reports using AWS services such as S3, EKS, Lambda, IAM, EC2, EMR, VPC, CodePipeline, and Redshift, showcasing expertise in ETL workflow management.
  • Managed and led a team of seven developers, fostering collaboration and ensuring project success.
  • Supported day-to-day production activities, incidents, and change requests for the Western Asset Management client.
  • Developed a Denodo view to track changes in loan custodian information using Snowflake.
  • Utilized Hive in data processing tasks to ensure efficient data handling and analysis.
  • Integrated Redshift and Netezza in data processing tasks, ensuring optimal performance and scalability.
  • Maintained code repositories on GitLab and implemented CI/CD processes using Jenkins.
  • Proactively contributed to the TDD approach by writing end-to-end unit tests.
  • Led a team of six junior engineers—providing coaching, assigning support tasks, and producing daily operations reports to forecast staffing and support volume.
  • Actively contributed to SDLC using Agile with SCRUM and Kanban Methodologies.

Tredence Analytics

Senior Software Engineer
11.2021 - 04.2022

Job overview

  • Contributed as a senior data engineer for the Informa client, building an orchestration ecosystem in Apache Airflow.
  • Rewrote multiple validation modules in PySpark to automate the validation pipeline in Spark SQL.
  • Implemented and deployed the workflow of each task in a DAG using Airflow operators such as Bash, SSH, and Python.
  • Developed Spark jobs to extract data from XLS files, transforming and loading it into PostgreSQL database tables.
  • Loaded and transformed data in multiple file formats, such as CSV, Parquet, and text, in HDFS.
  • Played a key role in migrating Oracle database tables to a PostgreSQL environment, ensuring a seamless transition.
  • Implemented a series of Airflow steps to enhance the quality of delivered data to clients by introducing quality checks.
  • Significantly improved workflow efficiency by removing shell scripts and automating processes as Airflow steps.
  • Utilized Snowflake in the development of data solutions for the client.

TCS

Software Engineer
05.2019 - 11.2021

Job overview

  • Orchestrated and developed RPD+ modules in Apache Airflow using Spark, Scala and Python technologies.
  • Wrote a series of Scala and Python automations that handled 90% of the manual touch points in Media Computations Applications, earning notable client appreciation.
  • Led major developments, including the direct writing of local HDFS data into AWS S3 and the removal of NAS in the entire application.
  • Significantly redefined the application's architecture by migrating the legacy TIBCO orchestration to Apache Airflow.
  • Managed Hive tables for data processing tasks to ensure efficient data handling and analysis.
  • Integrated Redshift and Netezza in data processing tasks, ensuring optimal performance and scalability.
  • Developed data validation scripts using the Spark framework, demonstrating proficiency in Hadoop development.
  • Conducted technical sessions on Scala and other relevant technologies.
  • Handled large volumes of data on AWS, fine-tuning EMR cluster usage to load the complete US TV Projections data from Oracle and PostgreSQL databases into the AWS S3 bucket using Spark.
  • Developed an SQS sensor connecting the Airflow environment to the Amazon SQS queue at the architecture level.
  • Utilized Snowflake, Redshift, and Netezza in the development of data solutions for the client.
  • Maintained code repositories on Git and Bitbucket, demonstrating experience in version control.
  • Served as primary support lead for Nielsen Media’s analytics platform—managing application and database (Postgres, Redshift, Hive) performance monitoring, tuning, and incident resolution to meet SLAs.
  • Troubleshot complex data‐processing issues across Spark, Scala, and Python pipelines, automating 90% of manual touchpoints and reducing mean time to repair.
  • Collaborated with DBAs, architects, and engineering teams to assess system capacity and define operational strategies—guiding cloud migrations (TIBCO→Airflow, Netezza→Redshift, Sybase→Postgres) and OS upgrades (RHEL→CentOS).
  • Developed and maintained ETL support utilities on AWS (EMR, S3, SNS/SQS, IAM, auto‑scaling and CloudWatch alarms) to ensure high availability, self‑healing workflows, and cost‑effective storage/versioning.
  • Authored runbooks, walkthroughs, and documentation (design architecture, maintenance manuals), and conducted regular knowledge-transfer sessions to streamline on-call rotations and hand-offs.

Education

Rajalakshmi Institute Of Technology
Chennai

B.E. in Computer Science and Engineering
03.2019

University Overview

CGPA: 7.8

University Of Europe For Applied Sciences
Potsdam, Berlin (pursuing)

Master of Science in Data Science and Engineering

Skills

  • Languages & Scripting: Scala, Java, Python, SQL, Shell scripting
  • Big Data & Analytics: Apache Spark, Hadoop, Hive, Databricks, AWS EMR
  • Cloud Platforms & Services: AWS (Lambda, Step Functions, Glue, Athena, S3, EC2, EKS, RDS, IAM, VPC, Redshift, CloudWatch)
  • Data Warehousing & Databases: Snowflake, Oracle, PostgreSQL, Amazon Redshift
  • Workflow Orchestration & Scheduling: Apache Airflow, Autosys
  • DevOps & CI/CD: Git, Bitbucket, Jenkins
  • Monitoring & Observability: Grafana, OpenSearch, AWS CloudWatch
  • Methodologies & Frameworks: Scaled Agile, Scrum, Kanban, DMN (Decision Model & Notation)

Timeline

Senior Decision Management Associate
Athena Health
01.2024 - 06.2025
Senior Data Engineer
LTIMindtree
04.2022 - 12.2023
Senior Software Engineer
Tredence Analytics
11.2021 - 04.2022
Software Engineer
TCS
05.2019 - 11.2021
Rajalakshmi Institute Of Technology
B.E. in Computer Science and Engineering
University Of Europe For Applied Sciences
Master of Science in Data Science and Engineering

Accomplishments

  • Top 5 finalist in the Smart India Hackathon 2018 Hardware Edition in Medical Devices and Health Care.
  • Finalist in the Smart India Hackathon 2018 Software Edition, held in Bangalore for the Defense Ministry.
  • Student Council Secretary for RIT, 2017-2019.

Key Highlights

  • Received the Spot Award for Outstanding Contributions to 21C Product Deliverables at Greenway Health.
  • Twice honoured with the 'Star of the Month' Award in 02/22 and 08/22 by Greenway Health.
  • Winner of the Innovation Award at the Smart India Hackathon 2018 Software Edition, presented by the Ministry of Human Resource Development, Government of India.
  • Ranked among the Top 5 finalists in the Smart India Hackathon 2018 Hardware Edition.
  • Guided a team in the research, proof of concept, and implementation of HL7 USCDI standards in a patient-facing application.

Values

  • Customer-Centric
  • Adaptability
  • Analytical Thinking
  • Attention to Detail
  • Communication Skills
  • Problem-Solving
  • Leadership Qualities

Skills

  • Data Engineering & ETL Pipeline Development
  • Data Visualization Dashboards
  • Scalable, Cloud-Based ETL Architectures
  • Real‑Time & Batch Data Pipelines
  • Test-Driven Development (TDD)
  • Site Reliability, Production Support & Maintenance
  • Infrastructure as Code (IaC)
  • CI/CD Pipeline Design
  • SRE & Observability Practices
  • Agile & Scrum Methodologies
  • Version Control with Git
  • Customer-Centric Product Engineering
  • Cross-Functional Team Collaboration
  • Strong Analytical & Problem-Solving Abilities
  • Communication & Leadership Skills

Appreciations

  • Received notable appreciation from the client and was awarded Star of the Month (07/01/21) for Stormy Check Automation, which significantly reduced quarterly expenses.
  • Awarded the 'On Spot' award by TCS for quick development of the Simulcast changes (09/01/21).
  • Awarded multiple on-spot awards at LTIMindtree for continuous contributions to the project.
  • Awarded Star of the Month (07/01/23) at LTIMindtree for significant productivity.
  • Awarded the Excellence Engineering award at Athena Health (December 2024) for my initiative toward codeless architecture.
  • Won the Best Collaborator award in the Athena hackathon (March 2025) for the best PI, which proposed savings of around $70,000 per year by removing legacy scripts and automating the data regression pipelines.
  • Awarded Best Team Player for June 2025 at Athena Health.

Research Projects & Thesis (Masters)

Perishable Inventory Management Dashboard (Power BI) (Retail)

  • Objective:
    Enable small‑scale retailers to move from manual spreadsheets to an interactive, data‑driven approach for tracking, forecasting, and replenishing perishable stock—minimizing waste and stock‑outs.
  • Solution:
    Developed a Power BI analytics dashboard that:
    Ingests sales, spoilage and supplier data via Power Query
    Models it into a star schema (fact “Sales” table + “Products,” “Suppliers,” “Calendar” dimensions)
    Defines dynamic DAX measures for spoilage rates, sales velocity, and turnover gaps (see the Python sketch after this list)
  • Key Features:
    Real-Time Expiry & Stock Alerts: Automatic low-stock and impending-expiry warnings, delivered via Power BI subscriptions or mobile alerts.
    Spoilage Analytics: Interactive treemaps and bar charts showing spoilage cost by category and product, plus “spoilage‑to‑sales” efficiency matrices.
    Time‑Series Demand Forecasting: Built‑in line charts and heatmaps slicing by day‑of‑week and season to expose peak demand periods.
    Supplier Performance Comparison: Clustered visuals comparing sales volume vs. spoilage rate for national vs. local suppliers.
    Promotion ROI Dashboard: Scatter plots correlating discount levels with sales lift and post‑promo spoilage “echo.”
    Turnover Gap Heatmaps: Matrix visuals of Days‑in‑Stock distributions to flag slow‑moving SKUs.
  • Technologies Used:
    Power BI (Power Query & DAX), CSV/Excel data sources (POS exports), Azure SQL (or local SQL Server), Power BI Service for sharing and mobile notifications.
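
For illustration, the core measure logic behind this dashboard can be sketched in code. The snippet below is a minimal Python/pandas analogue of the DAX measures (spoilage rate and slow-mover flagging), not the dashboard's actual implementation; all column names, sample values, and the 7-day slow-mover threshold are hypothetical stand-ins.

  import pandas as pd

  # Hypothetical extract of the "Sales" fact joined to the "Products" dimension;
  # column names and values are illustrative only.
  sales = pd.DataFrame({
      "sku":           ["A", "A", "B", "B", "C"],
      "category":      ["dairy", "dairy", "produce", "produce", "bakery"],
      "units_sold":    [120, 95, 40, 55, 10],
      "units_spoiled": [8, 12, 15, 9, 6],
      "days_in_stock": [3, 4, 6, 5, 9],
  })

  per_sku = sales.groupby(["category", "sku"]).agg(
      sold=("units_sold", "sum"),
      spoiled=("units_spoiled", "sum"),
      avg_days_in_stock=("days_in_stock", "mean"),
  )

  # Spoilage rate = spoiled units / total units handled, the same ratio the
  # dashboard's DAX measure evaluates within each slicer context.
  per_sku["spoilage_rate"] = per_sku["spoiled"] / (per_sku["sold"] + per_sku["spoiled"])

  # Flag slow movers for the turnover-gap heatmap (threshold is an assumption).
  per_sku["slow_moving"] = per_sku["avg_days_in_stock"] > 7

  print(per_sku.round(3))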

Real‑Time E‑Commerce Personalization via Hybrid Recommendation Engine (Retail)

  • Objective:
    Overcome the limitations of static, batch‑mode recommenders by delivering highly relevant, context‑aware product suggestions in real time—boosting engagement and conversion on e‑commerce platforms.
  • Solution:
    Designed and prototyped a hybrid recommendation architecture that seamlessly ingests streaming user interactions, updates models on the fly, and serves low-latency predictions through a scalable pipeline (see the sketch after this list).
  • Key Features:
    Streaming Data Ingestion: Captures clickstreams, searches, add-to-cart events, and purchase logs via Apache Kafka for immediate processing.
    Explainable Recommendations: Generates feature‑level attributions (via attention scores) so stakeholders can interpret “why” an item was suggested.
  • Technologies Used:
    Data & Streaming: Apache Kafka, Apache Flink (or Spark Streaming), Redis/Feast feature store
    Modeling: Python, Streamlit
    Deployment: Docker, AWS Lambda for real-time scoring
    Evaluation & Monitoring: Grafana dashboards
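
The hybrid scoring idea can be illustrated with a small, self-contained Python sketch: a collaborative co-occurrence signal updated one event at a time (as the Kafka consumer does in the prototype) blended with a static content-similarity score. This is an illustrative simplification with the streaming consumer replaced by an in-memory event list; the events, similarity values, and the alpha blending weight are assumptions, not the prototype's data or parameters.

  from collections import defaultdict
  from itertools import combinations

  # Simulated clickstream events; in the prototype these arrive via Kafka
  # and are processed as they stream in. All values are illustrative.
  events = [
      {"user": "u1", "items_viewed": ["shoes", "socks"]},
      {"user": "u2", "items_viewed": ["shoes", "laces"]},
      {"user": "u1", "items_viewed": ["socks", "laces"]},
  ]

  # Content-similarity scores per item pair (hypothetical; in practice derived
  # from product attributes or embeddings). Keys are alphabetically sorted pairs.
  content_sim = {
      ("laces", "shoes"): 0.8,
      ("laces", "socks"): 0.3,
      ("shoes", "socks"): 0.6,
  }

  co_counts = defaultdict(int)  # collaborative signal: co-view counts

  def update(event):
      # Online update, one event at a time, as a stream processor would do.
      for a, b in combinations(sorted(event["items_viewed"]), 2):
          co_counts[(a, b)] += 1

  def hybrid_score(a, b, alpha=0.7):
      # Blend collaborative and content signals; alpha is an assumed weight.
      pair = tuple(sorted((a, b)))
      collab = co_counts[pair] / (1 + max(co_counts.values(), default=0))
      return alpha * collab + (1 - alpha) * content_sim.get(pair, 0.0)

  for e in events:
      update(e)

  # Recommend the best complement for "shoes" among two candidate items.
  print(max(["socks", "laces"], key=lambda c: hybrid_score("shoes", c)))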

Languages

  • English
  • Telugu
  • Tamil
  • Malayalam
  • Hindi
  • German (pursuing)