
Hemanth Sai Satti

Data Engineer
Hyderabad

Summary

Data Engineer with over 4 years of experience in designing and implementing scalable data solutions. Proficient in Azure cloud services, with hands-on expertise in building real-time data pipelines, optimizing databases, and automating workflows to reduce costs and enhance performance. Skilled in cross-platform data integration, cloud architecture, and troubleshooting complex data issues. Strong analytical capabilities combined with a collaborative mindset to drive business agility, support data-driven decisions, and optimize data environments for high efficiency and scalability.

Overview

5 years of professional experience
3 Certifications

Work History

Data Engineer

Maslo Consultancy (ASDA Stores, UK)
04.2023 - Current

Project: AI-Driven Shelf & Stock Management Integration


Description: Implemented an AI-powered shelf and inventory monitoring pipeline by integrating retail data with third-party computer vision platforms. Enabled automated shelf analytics, including stock availability, planogram compliance, and pricing accuracy, using scheduled data transfers orchestrated through Azure Databricks.


Roles and Responsibilities:

  • Designed and developed end-to-end data pipelines to support AI-driven shelf and stock management across retail stores using Azure Databricks.
  • Coordinated secure daily/weekly transfers of inventory, pricing, planogram, sales, and metadata datasets to third-party AI vendors for model training and real-time analytics.
  • Integrated API-based and SFTP-based data exchange processes, validating authentication workflows using Postman (API) and FileZilla (SFTP) during POC/testing.
  • Implemented production-grade data delivery workflows in Databricks, including scheduling, automated retries, monitoring, and exception handling.
  • Deployed workflows using Databricks Asset Bundles to ensure version-controlled, reproducible, and CI/CD-aligned releases.
  • Collaborated with engineering, vendors, and store operations teams to ensure high-quality data feeds, improving shelf-availability insights and reducing manual audits.
  • Enhanced operational efficiency by automating data handoff processes, enabling faster and more consistent AI-based shelf monitoring across pilot stores.
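
The production delivery workflow ran as a scheduled Databricks job with retries and monitoring; below is a minimal sketch of the export-and-SFTP-handoff idea only, with a hypothetical source table, paths, and vendor endpoint (credentials would come from a secret scope rather than literals).

  import glob
  import time
  import paramiko  # illustration only; connection details are placeholders

  def export_daily_feed(spark, out_dir="dbfs:/tmp/shelf_feed"):
      # Write today's inventory snapshot as a single CSV part file (hypothetical source table).
      (spark.table("retail.inventory_snapshot")
            .where("snapshot_date = current_date()")
            .coalesce(1)
            .write.mode("overwrite").option("header", "true").csv(out_dir))
      # Spark writes part files; locate the CSV through the /dbfs fuse mount.
      return glob.glob("/dbfs/tmp/shelf_feed/part-*.csv")[0]

  def deliver_over_sftp(local_path, host, user, password, retries=3, wait_seconds=60):
      # Push the exported file to the vendor endpoint, retrying on transient failures.
      for attempt in range(1, retries + 1):
          try:
              transport = paramiko.Transport((host, 22))
              transport.connect(username=user, password=password)
              sftp = paramiko.SFTPClient.from_transport(transport)
              sftp.put(local_path, "/inbound/inventory_daily.csv")  # hypothetical remote path
              sftp.close()
              transport.close()
              return
          except Exception:
              if attempt == retries:
                  raise
              time.sleep(wait_seconds)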


Project: Unified Customer Matching Engine


Description: Developed a unified customer matching engine by integrating and reconciling data from diverse sources, applying rule-based logic to assign a unique customer identifier for consistent entity resolution across systems.


Roles and Responsibilities:

  • Designed and implemented a unified customer matching engine to streamline customer data extraction and classification across physical and digital sales channels.
  • Integrated encrypted transactional data from online and offline sources using PySpark and Azure Databricks, enabling secure and scalable data unification.
  • Applied deterministic matching logic and rule-based algorithms to assign unique customer identifiers for consistent cross-platform recognition.
  • Collaborated with cross-functional teams to improve customer data accuracy, resulting in a 35% increase in unified customer identification.
  • Enhanced personalization strategies for targeted marketing campaigns, contributing to a 20% improvement in marketing effectiveness through enriched customer profiles.
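
A minimal sketch of the deterministic matching step in PySpark, assuming hypothetical column names (loyalty_card_id, email, phone): keys are normalized, the highest-priority non-null key is chosen, and a stable identifier is derived from it.

  from pyspark.sql import functions as F

  def assign_customer_id(online_df, offline_df):
      # Normalize the matching keys on both sides (column names are hypothetical).
      def normalize(df):
          return (df
                  .withColumn("email_norm", F.lower(F.trim(F.col("email"))))
                  .withColumn("phone_norm", F.regexp_replace(F.col("phone"), "[^0-9]", "")))

      unified = normalize(online_df).unionByName(normalize(offline_df), allowMissingColumns=True)

      # Deterministic rule: highest-priority non-null key wins (loyalty card, then email, then phone).
      match_key = F.coalesce(F.col("loyalty_card_id"), F.col("email_norm"), F.col("phone_norm"))

      # Derive a stable unique customer identifier from the winning key.
      return unified.withColumn("unified_customer_id", F.sha2(match_key, 256))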


Project: Rewards Data Ingestion


Description: Developed a streaming data warehousing solution for mobile rewards data by ingesting Event Hub streams via Databricks, handling schema evolution, and implementing a medallion architecture to deliver business-specific insights through stakeholder collaboration.


Roles and Responsibilities:

  • Designed and implemented a real-time data pipeline using Azure Databricks to ingest 150 million daily records from a mobile application, supporting a rapidly growing retail network with 125,000 daily active users across multiple stores.
  • Ingested streaming rewards data from a mobile application via Azure Event Hub and APIM, and processed it in Azure Databricks by decoding binary payloads into JSON-formatted strings in the Bronze layer, then parsing relevant fields in the Silver layer to build structured Delta tables for downstream consumption.
  • Collaborated with business stakeholders to identify and extract key payload elements relevant to customer rewards and personalization.
  • Applied the medallion architecture to build Bronze, Silver, and Gold streaming tables, ensuring scalable and organized data transformation layers.
  • Migrated raw and processed data to Azure Data Lake Storage (ADLS), improving data accessibility and marginally reducing processing costs.
  • Optimized pipeline throughput and streaming model performance, resulting in a 25% boost in decision-making speed and unlocking new revenue opportunities.
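
A condensed sketch of the Bronze/Silver streaming flow described in the bullets above, assuming a Databricks notebook where spark is predefined; the hub name, payload schema, and checkpoint paths are hypothetical, and the Kafka-compatible surface of Event Hubs is used here for illustration (the production pipeline may use the native Event Hubs connector behind APIM).

  from pyspark.sql import functions as F
  from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

  # Hypothetical subset of the rewards payload schema.
  rewards_schema = StructType([
      StructField("customer_id", StringType()),
      StructField("event_type", StringType()),
      StructField("points", DoubleType()),
      StructField("event_ts", TimestampType()),
  ])

  # Bronze: land the raw stream, decoding the binary body into a JSON string.
  # SASL/JAAS authentication options for the Event Hubs endpoint are elided.
  bronze = (spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", "<namespace>.servicebus.windows.net:9093")
            .option("subscribe", "rewards-events")  # hypothetical hub name
            .option("kafka.security.protocol", "SASL_SSL")
            .option("kafka.sasl.mechanism", "PLAIN")
            .load()
            .select(F.col("value").cast("string").alias("body"),
                    F.col("timestamp").alias("ingest_ts")))

  (bronze.writeStream.format("delta")
         .option("checkpointLocation", "/mnt/checkpoints/rewards_bronze")  # hypothetical path
         .toTable("rewards_bronze"))

  # Silver: parse the relevant fields out of the JSON body into a structured Delta table.
  silver = (spark.readStream.table("rewards_bronze")
            .withColumn("payload", F.from_json("body", rewards_schema))
            .select("ingest_ts", "payload.*"))

  (silver.writeStream.format("delta")
         .option("checkpointLocation", "/mnt/checkpoints/rewards_silver")
         .toTable("rewards_silver"))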


Project: Sales Adjustments Pipeline


Description: Built a scalable sales adjustments pipeline using 15 source tables to generate basket- and item-level sales data, enabling accurate supplier billing, SAP integration, and daily finance reporting for evaluating reward-driven sales across stores.


Roles and Responsibilities:

  • Designed and developed a scalable sales adjustments pipeline using 15 source tables to compute rewards-based sales at both the basket and item levels.
  • Implemented campaign-type-specific logic (e.g., repeatable/non-repeatable missions, coupon-based earnings, and star product rewards) to accurately derive sales adjustments across varied promotional structures.
  • Modeled item-level sales adjustment data to generate SAP-ready outputs, streamlining integration with ERP systems for financial reconciliation.
  • Automated daily file generation capturing store-level sales deltas with debit-credit adjustments, supporting market-sheet reporting for the finance team.
  • Collaborated with finance stakeholders to ensure adjusted sales data met audit and reporting requirements for evaluating rewards program performance.
  • Built and maintained downstream tables to support the supplier billing portal, enabling accurate tracking of product utilization and supplier settlements.
  • Orchestrated the end-to-end pipeline using Databricks Asset Bundles, enabling scalable deployment and optimized performance through dynamic overwrite partitioning (see the sketch below); handled late-arriving data by running D-2 logic and supported backfill across multiple dates in case of source delays.
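
The partitioned write with D-2 handling can be sketched as follows; the table and column names are hypothetical and the adjustment computation itself is elided.

  import datetime
  from pyspark.sql import functions as F

  def write_sales_adjustments(adjustments_df, run_date=None, lag_days=2):
      # Default to D-2 so late-arriving source data is picked up; backfills pass explicit dates.
      run_date = run_date or (datetime.date.today() - datetime.timedelta(days=lag_days))

      (adjustments_df
          .where(F.col("sales_date") == F.lit(run_date))
          .write.format("delta")
          .mode("overwrite")
          .option("partitionOverwriteMode", "dynamic")  # replace only the partitions present in this write
          .saveAsTable("finance.sales_adjustments"))    # hypothetical Delta table partitioned by sales_date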


Project: Data Ingestion from Third-Party Sources


Description: Developed a metadata-driven, reusable ingestion pipeline in Azure Data Factory to onboard legacy third-party data into the new Azure data platform. Integrated dbt for scalable transformation, data modeling, and quality checks across Delta Lake layers, enabling end-to-end orchestration and improved governance during the platform migration.


Roles and Responsibilities:

  • Designed and implemented a reusable, metadata-driven ingestion pipeline in Azure Data Factory to onboard legacy third-party data feeds, ensuring seamless migration and business continuity during the transition to the new Azure data platform.
  • Developed support for multiple file formats (CSV and Parquet) with flexible load types, including incremental, full, partition overwrite, and upsert logic, driven entirely by metadata configuration.
  • Integrated Event Hub triggers to orchestrate the ingestion flow based on file drop notifications, automating job execution and reducing operational overhead.
  • Performed schema-driven data quality checks and encrypted PII columns before loading curated data into the staging and enriched Delta Lake layers.
  • Incorporated dbt to build modular, maintainable SQL transformation pipelines on top of Delta Lake, covering layered modeling (staging → intermediate → mart models); automated schema tests, uniqueness checks, and data constraints; reusable macros for business rules and standard transformations; and documentation and lineage tracking with dbt docs.
  • Enabled scalable and structured data storage using separate raw, staging, and enriched zones in ADLS, improving traceability, maintainability, and downstream usability.
  • Built audit tracking capabilities in Synapse Serverless SQL pools to monitor pipeline health and step-level status across all ingestion workflows.
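
A simplified sketch of the metadata-driven load dispatch (the real orchestration sits in Azure Data Factory and dbt); the metadata keys and table names are hypothetical, with the upsert path using Delta Lake's merge API.

  from delta.tables import DeltaTable

  def load_feed(spark, df, meta):
      # Dispatch on the metadata-configured load type; the `meta` keys and table names are hypothetical.
      target = meta["target_table"]

      if meta["load_type"] == "full":
          df.write.format("delta").mode("overwrite").saveAsTable(target)

      elif meta["load_type"] == "partition_overwrite":
          (df.write.format("delta").mode("overwrite")
             .option("partitionOverwriteMode", "dynamic")
             .saveAsTable(target))

      elif meta["load_type"] == "upsert":
          cond = " AND ".join(f"t.{k} = s.{k}" for k in meta["merge_keys"])
          (DeltaTable.forName(spark, target).alias("t")
              .merge(df.alias("s"), cond)
              .whenMatchedUpdateAll()
              .whenNotMatchedInsertAll()
              .execute())

      else:  # incremental append
          df.write.format("delta").mode("append").saveAsTable(target)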

Software Developer

Reckonsys TechLabs
04.2021 - 03.2023

Project: Signedly


Description: Built Signedly, a SaaS platform for secure digital document signing, with a strong focus on data infrastructure and pipeline optimization. Designed scalable data ingestion, transformation, and storage mechanisms using Python and MongoDB, and deployed cloud-native systems on Microsoft Azure. Specialized in data modeling, containerization, and cost-efficient cloud architecture, tailored for high-volume document processing.


Roles and Responsibilities:

  • Engineered scalable data pipelines and backend data services to support high-throughput ingestion and retrieval of signed documents.
  • Designed optimized MongoDB schemas to manage document metadata and content, enabling efficient indexing and query performance.
  • Implemented document version control, audit trails, and metadata indexing for regulatory compliance and secure access.
  • Deployed infrastructure on Azure Virtual Machines (VMs) and stored documents in Azure Blob Storage for durability and availability.
  • Containerized services with Docker to support reproducible deployments and CI/CD pipelines.
  • Leveraged Redis for caching and asynchronous message queues to optimize processing throughput.
  • Reduced infrastructure and storage costs using indexing strategies and Azure cost management tools for monitoring and optimization.
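
An illustrative sketch of the kind of MongoDB indexing used for document metadata and audit trails; the collection, field, and connection names here are hypothetical.

  from pymongo import MongoClient, ASCENDING, DESCENDING

  client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
  db = client["signedly"]                            # hypothetical database name

  # Unique lookup by document id, plus a compound index for per-owner listing queries.
  db.documents.create_index([("document_id", ASCENDING)], unique=True)
  db.documents.create_index([("owner_id", ASCENDING), ("created_at", DESCENDING)])

  # Audit-trail entries indexed for compliance queries by document and time.
  db.audit_events.create_index([("document_id", ASCENDING), ("event_ts", DESCENDING)])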


Project: Pharmallama


Description: Developed Pharmallama, a prescription-based medicine ordering SaaS platform, with a focus on secure, real-time data processing and scalable infrastructure. Led backend data architecture, automated data-driven workflows, and optimized prescription handling pipelines using MongoDB, Redis, and Azure.


Roles and Responsibilities:

  • Designed and implemented data services and pipelines for ingesting, transforming, and storing prescription data at scale.
  • Architected MongoDB-based data models to handle large volumes of prescriptions and ensure high-performance queries.
  • Integrated Redis queues to enable fast, asynchronous processing for notifications and workflow automation (sketched below).
  • Automated workflows, such as cost sheet generation, delivery scheduling, and prescription lifecycle updates, with backend job orchestration.
  • Deployed backend services using Docker containers on Azure VMs, with sensitive documents securely stored in Azure Blob Storage.
  • Implemented monitoring, logging, and performance tuning strategies to ensure high availability and cost-effective operations.
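
An illustrative sketch of the Redis-backed asynchronous queue pattern, with hypothetical queue and handler names; the production services wrapped this in proper workers and error handling.

  import json
  import redis

  r = redis.Redis(host="localhost", port=6379)  # placeholder connection

  def enqueue_notification(prescription_id, event):
      # Producer side: push a small JSON task onto the queue.
      r.lpush("notification_queue", json.dumps({"prescription_id": prescription_id, "event": event}))

  def send_notification(task):
      # Hypothetical downstream handler.
      print(f"notify {task['prescription_id']}: {task['event']}")

  def run_worker():
      # Consumer side: block until a task arrives, then process it.
      while True:
          _, raw = r.brpop("notification_queue")
          send_notification(json.loads(raw))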

Education

Bachelor of Engineering - Electronics and Communication

Vellore Institute of Technology (VIT)
Chennai, India
04.2021

Skills

Python: PySpark, Pandas, Django

Certification

DP-900 Azure Data Fundamentals

Timeline

Data Engineer

Maslo Consultancy (ASDA Stores, UK)
04.2023 - Current

Software Developer

Reckonsys TechLabs
04.2021 - 03.2023

Bachelor of Engineering - Electronics and Communication

Vellore Institute of Technology (VIT)
04.2021