SAPNA HARGUNANI

Lead Data Engineer

Summary

Technical Lead and Domain SME with over 9 years of experience in Software Development, focusing on Data Engineering, Data Analysis, Data Warehousing, Data Modeling, Data Lineage, and Data Governance. My expertise includes leveraging technologies and frameworks such as Snowflake, Data Vault 2.0, Spark, AWS, Airflow, Python, Java, and the Hadoop and MapReduce ecosystem to deliver scalable and efficient data solutions. I specialize in building robust Extract, Transform, Load (ETL) data pipelines, designing data warehouses using the Data Vault methodology, client communication, and query optimization.

Overview

9 years of professional experience
3 years of post-secondary education
2 Certifications

Work History

Lead Data Engineer

Accenture Technologies PVT. LTD.
01.2022 - Current
  • Led the design and implementation of robust ETL pipelines, ingesting data from Oracle, SAP HANA, and FTP servers into Amazon S3 data lakes. Developed transformation workflows using AWS Step Functions, Lambda, and Glue Jobs with PySpark, ultimately loading processed data into Athena tables and Amazon RDS for PostgreSQL. Contributed to infrastructure automation by creating Terraform templates and enabling seamless deployments through CI/CD pipelines.
  • Designed a Snowflake Data Warehouse using the Data Vault 2.0 methodology, leveraging VaultSpeed for automation. Delivered enterprise-grade solutions by developing Data Marts based on the Star Schema, integrating data from five distinct sources into Snowflake. Wrote and optimized Snowflake queries across multiple data layers to ensure efficient performance and scalability.
  • Parsed ETL job lineage information from Databricks PySpark jobs using Spline and stored the metadata in Amazon Neptune. Exposed the data through a Consumer API backed by Gremlin queries executed in AWS Lambda and triggered through API Gateway. This API enabled the UI team to visualize end-to-end data lineage on the portal.
  • Developed and implemented an on-demand data load pipeline to efficiently load multi-terabyte Timestream data into InfluxDB hosted on ECS. The pipeline uses parallel processing, which accelerated data loading by 30%.
  • Streamed CloudWatch logs to Splunk through Amazon Kinesis Data Firehose.

Data Engineer

Impetus Technologies PVT. LTD.
05.2018 - 01.2022
  • Gathered requirements and designed and developed ETL pipelines within the Hadoop ecosystem. Built data transformation workflows using MapReduce and later optimized performance by migrating to the Tez engine. Developed PySpark and Hive scripts to process both historical and incremental data for data lake population. Created custom Hive UDFs in Java, utilized Sqoop for data ingestion from MySQL to Hive, and ensured end-to-end data pipeline validation using Amazon EMR as the compute unit.
  • Optimized ETL processes, Hive queries, and MapReduce and PySpark jobs, improving efficiency by 55%.
  • Provisioned AWS Lambda functions in Java to validate and process incremental data files before further processing in the Hadoop framework. Integrated SQS and SNS for handling failures and sending completion notifications.
  • Migrated data loads from MongoDB to Elasticsearch and created comprehensive Kibana dashboards for data visualization.
  • Utilized Apache Airflow for orchestrating REST calls, initiating data load processes with HTTP Operators, performing SQL operations using SQL Operators, and monitoring SQL tables with SQL Sensors.
  • Developed solutions implementing organization-specific business logic on a custom version of the Salesforce CRM tool using Java, MongoDB, AWS Elasticsearch, and REST APIs.

Data Engineer

Cisco Video Technologies (Client Site)
10.2016 - 04.2018
  • Enhanced functionality according to requirements.
  • Developed Logstash configurations and created visualizations in the Kibana dashboard, with queries tuned to optimize search.

Education

Master's Degree - Computer Application

RGPV University
01.2012 - 01.2015

Skills

ETL development

Certification

AWS Certified Developer - Associate

Accomplishments

  • Awarded Star of the Month twice by Impetus Technologies Pvt. Ltd.
  • Star Awards by Accenture Solutions Pvt. Ltd.
  • Recognized for EXEMPLIFY CLIENT CENTRICITY by Accenture Solutions Pvt. Ltd.
  • Excellence Award by Accenture Solutions Pvt. Ltd.
