Harika Saroja Ivaturi

Denton

Summary

Data Engineer with over 3 years of experience in Business Intelligence reporting, Google Cloud services, Big Data/Hadoop ETL, and supply chain product development. Proficient in building and managing large-scale data pipelines using Python and PySpark. Skilled in data cleaning, transformation, and visualization, with a strong background in Google Cloud Platform (GCP) and container orchestration.

Work History

AI/ML Intern

Inclined Analytics

01.2025 - Current

Handled large Medicare Part B datasets using Pandas and NumPy for data manipulation and cleaning.
Designed ML pipelines for healthcare fraud detection using K-Means clustering, Mahalanobis distance, and Z-score analysis.
Engineered features using weighted embeddings based on service volume and charges.
Conducted anomaly detection to flag irregular billing patterns and over-utilization of HCPCS codes.
Built scalable Python-based data pipelines and machine learning models.
Performed statistical analysis to reduce false positives and improve fraud detection accuracy.
Created insightful visualizations (scatter plots, box plots) using Seaborn and Matplotlib to present findings to stakeholders.

Data Engineer

Accenture

10.2021 - 12.2022

Built end-to-end data engineering workflows using MySQL, Hadoop, Oracle, and NoSQL databases (HBase, Cassandra).
Defined and automated ETL pipelines using Oozie; supported MySQL-to-Hadoop migration efforts.
Participated in database architecture planning and SQL performance tuning (execution plans, indexing, materialized views).
Deployed Oracle databases on AWS EC2 instances.
Used PySpark to optimize Hive SQL queries, including non-equi joins.
Conducted POCs on Hive table bucketing and partitioning to assess performance.
Used Apache Sqoop for data migration and managed datatype handling post-transfer.
Utilized Python collections for data manipulation and processing of custom objects.

Data Engineer

TEK Systems Global Services

09.2020 - 10.2021

Built and optimized ETL pipelines using Python, PySpark, Hive SQL, and Presto, including data transformation and cleansing to ensure high-quality, reliable data for reporting and analysis.
Migrated SAS programs and on-prem data pipelines to Python and cloud platforms (AWS, GCP), improving performance and scalability.
Developed backend systems and automated scripts for data aggregation, migration, and ingestion from APIs and flat files.
Created SQL and PL/SQL procedures, Oracle Reports, and dashboards using Matplotlib and Plotly for business insights and forecasts.
Used AWS services (S3, Redshift) and Kubernetes for scalable, containerized deployments and data archival strategies.
Conducted functional and system testing; implemented logging, monitoring, and documentation for data workflows.
Applied data masking, anonymization, and compliance techniques (GDPR, HIPAA) for secure data handling.
Improved ETL efficiency through parallel processing, partitioning strategies, and Spark performance tuning.
Coordinated with stakeholders to translate business requirements into technical data solutions and reporting logic.
Ensured data integrity with reconciliation checks and root cause analysis across diverse data sources.

Education

Master of Science - Advanced Data Analysis

University of North Texas

Denton, TX

05.2024

Bachelor of Technology - Electronics and Communications Engineering

GVPCEW

Visakhapatnam, India

06.2020

Skills

Python and SQL
PL/SQL and Hive
Hadoop and PySpark
Cloud platforms (GCP and AWS)
Data visualization (Power BI, Seaborn, Matplotlib)
Container orchestration (Docker and Kubernetes)
Version control (GitLab)
Data ingestion (Apache Sqoop)

Certification

Architecting with Google Kubernetes
AWS Fundamentals
Professional Google IT Support
Cybersecurity
AZ 900: Microsoft Azure Fundamentals
Supervised Machine Learning: Regression and Classification

Projects

In-Vehicle Coupon Recommendation System

Built a machine learning model using Python and scikit-learn to recommend coupons based on user behavior and real-time location data.
Achieved a 20% increase in coupon redemption rates through personalized suggestions.

Customer Churn Prediction

Developed a machine learning model using Python and scikit-learn to predict customer churn for a telecom company.
Improved retention by 15% using Logistic Regression and XGBoost for classification.
