Harpreet Singh

Bangalore

Summary

Innovative technology professional with 8+ years of diverse experience. Skilled in enhancing systems and aligning technical solutions with business objectives. Proven success in leading projects from start to finish and contributing to organizational growth and success.

Overview

9 years of professional experience
1 Certification

Work History

Lead Data Engineer

JP Morgan Chase
06.2024 - Current
  • Led the design of data pipelines, data models, and a data quality framework for the JP Morgan Research Markets web pages, covering readership data on JP Morgan customers accessing JP Morgan research papers.
  • Worked with cross-functional teams to extract data from sources such as Adobe Analytics data feeds, raw nested JSON, CSV and Parquet files, and databases.
  • Gained strong leadership skills by managing projects from start to finish.
  • Led a team of 5 engineers to modernize and develop data pipelines serving 10+ business intelligence, machine learning, and product teams that generate insights on readership trends for the JP Morgan Markets website.
  • Improved Glue ETL job runtime performance and cost by 65% by analyzing Glue worker configuration and metric outliers and by optimizing Spark read performance.
  • Skills: AWS Glue ETL, Data Pipeline Design, Team Management, Python, Spark, Jenkins, Bitbucket, Data Modelling, Data Architecture

Senior Data Engineer

Allen Digital
03.2024 - 05.2024
  • Collaborated with cross-functional teams to design innovative data solutions.
  • Led architecture discussions, driving innovation in ETL design and implementation strategies.
  • Developed an end-to-end framework using Spark, Scala, and Iceberg to build an ACID-compliant data lake that supports updates and deletes.
  • Designed an end-to-end solution to build and orchestrate ETL pipelines using Airflow and AWS CodePipeline, reducing the manual effort required to deploy pipelines to production by 95%.
  • Designed real-time data pipelines ingesting from multiple sources such as MySQL, MongoDB, DynamoDB, and application data.
  • Provided guidance and mentored less-experienced staff members.
  • Skills: AWS Redshift, Kafka (Amazon Managed Streaming for Apache Kafka), Glue ETL, EMR, QuickSight, AWS DMS, Airflow (Amazon Managed Workflows for Apache Airflow), Python, Scala, Team Management, Data Modelling, Data Architecture

Data Engineer 2

Amazon Web Services
01.2019 - 03.2024
  • Designed and developed Spark jobs on EMR and Glue, and ETL processes with Redshift and S3 as sources and destinations.
  • Worked with ACID-compliant table formats such as Hudi and Iceberg to build a transactional data lake.
  • Proactively identified opportunities for process improvement, addressing inefficiencies and implementing data engineering best practices. Saved ~80 PB of storage and USD 1.4MM+ (internal costing) by redefining processes and SOPs.
  • Optimized existing Spark-based ETL workloads, achieving a 25% runtime reduction through partition pruning, Spark configuration tuning, and storage format changes.
  • Designed scalable pipelines handling terabytes of data.
  • Led end-to-end implementation of multiple high-impact projects from requirements gathering through deployment and post-launch support stages.
  • Worked with the Legal and Security teams to meet data governance and data sovereignty requirements, resulting in governance features such as row-level security on a petabyte-scale data lake.
  • Worked with multiple service teams, including EMR, AWS Lake Formation, Glue ETL, and Athena, to launch and integrate the Lake Formation fine-grained access control (FGAC) feature.
  • Developed QuickSight BI dashboards showing internal cost and SLA metrics and trends.
  • Skills: AWS Redshift, Glue ETL, EMR, QuickSight, AWS Step Functions, Python, AWS Lake Formation, AWS S3, Apache Hudi, Spark, PySpark, Hive, Apache Ranger, AWS Athena

Cloud Engineer - Data Analytics

Amazon Web Services
06.2018 - 01.2019
  • Provided technical support to AWS Enterprise and Business customers using AWS data analytics services such as OpenSearch, Redshift, Kinesis, Managed Kafka, and QuickSight.
  • Achieved subject matter expert certification from AWS for AWS OpenSearch Service.
  • Trained and mentored 50+ engineers across AWS sites on Amazon Managed Streaming for Apache Kafka.
  • Developed an internal training tool that lets new engineers work through real troubleshooting scenarios as part of onboarding. The tool gives feedback and scores on troubleshooting, and has helped 400+ engineers and new hires provide quality support to AWS customers.
  • Skills: AWS Redshift, Glue ETL, AWS Athena, AWS QuickSight, Amazon Managed Streaming for Apache Kafka, Python

Systems Engineer

Infosys Limited
05.2016 - 06.2018
  • Worked in an Agile (Scrum) environment.
  • Monitored and tested application performance to identify potential bottlenecks, developed solutions, and created effective documentation.
  • Skills: Java, Session Initiation Protocol, LDAP

Education

Bachelor of Technology - Computer Science and Engineering

Galgotias University
Greater Noida
05.2016

Skills

  • AWS Cloud
  • Python
  • Redshift Data Warehouse
  • Spark
  • ETL Design
  • Data Governance and Security
  • Apache Airflow

Accomplishments

  • Received the Bias for Action award from the Amazon Web Services Commerce Platform team
  • Received the Best Performer award from the Amazon Web Services Premium Support team
  • Received the Insta Award from Infosys Limited

Certification

  • AWS Certified Data Analytics - Specialty

  • AWS Certified Cloud Practitioner

  • AWS Certified Developer-Associate

Hobbies and Interests

  • Music

  • Badminton

  • Gaming
