Yukti Singh

Lead Software Engineer
Jaipur

Summary

Lead Software Engineer at Persistent Systems Limited with 4.7+ years of experience designing scalable PySpark solutions and developing high-performance scripts for large datasets. Proven track record in optimising data workflows, improving vehicle transportation efficiency, and minimising costs through automation. Expertise in AWS services including Lambda, S3, DynamoDB, Glue, and Step Functions for data validation, integration processing, and machine learning applications. Skilled in data warehousing, ETL processes, cloud computing, and data visualisation to support data-driven decision-making. Committed to delivering structured documentation and conducting thorough data quality assessments to ensure accuracy and alignment with project goals. Career focus on using advanced technologies to drive innovation and improve operational efficiency.

Overview

5 years of professional experience
2 years of post-secondary education
2 languages

Work History

Lead Software Engineer

Persistent Systems
02.2025 - Current

Amazon – Data Spark Automation Projects (Offshore_1)
Duration: February 2025 – May 2025
Client: Amazon
Role: PySpark Developer

Designed and implemented scalable PySpark solutions to support the training of a language model that automates the analysis of migration changes between Apache Spark versions.

Built modular, high-performance PySpark scripts for processing large-scale datasets used in model training and evaluation.

Authored clear, structured documentation detailing script logic, data pipelines, and output formats for client reference and maintenance.

Conducted comprehensive data quality assessments to ensure output accuracy, consistency, and alignment with project requirements.

Data Engineer

Tech Mahindra
01.2023 - 02.2025

Client: Corten Logistics

  • Objective: Optimize vehicle transportation for clients like Hyundai and BMW by automating data workflows, improving efficiency, and reducing costs.
  • AWS Lambda: Serverless processing for data validation and integration.
  • Amazon S3: Storage for raw data.
  • Amazon DynamoDB: Scalable NoSQL database for structured data storage.
  • AWS Glue: ETL processing for data transformation.
  • AWS Step Functions: Workflow automation and orchestration.

Key Features:

  • Automated data ingestion from emails using Amazon SES.
  • Real-time data processing with AWS Lambda and SageMaker.
  • Scalable data storage and management in S3 and DynamoDB.
  • Workflow automation with Step Functions and CloudWatch for monitoring.
  • Data-driven decision-making with QuickSight reporting.
  • Outcome: Streamlined cargo management, enhanced efficiency, cost reduction, and improved customer satisfaction.

Data Engineer

Nationwide Mutual Insurance Company
10.2021 - 12.2022
  • Project: United Healthcare Data Management
  • Objective: Streamline customer records and improve data quality, reporting accuracy, and operational efficiency.
  • Challenges:
      • Fragmented data systems (Salesforce, Market, NetSuite).
      • Poor data quality and misalignment.
      • Inefficiencies in reporting and operations.
  • Phases:
      • Client Admin: Manage broker and client admin access, including ACA features.
      • Partner Admin ACA: Provide comprehensive data views for associated companies.
      • Non-Partner Integration: Integrate SSO and HR Logics for seamless authentication and data exchange.
  • Data Framework:
      • Storage: AWS Glue (new data), Salesforce (old data).
      • Processing: Normalization, security enhancements, and quality checks.
      • Tools: Openpyxl, FuzzyWuzzy, and Parquet formats for efficiency.
  • Workflow:
      • Roles: UHC – Broker – Client – User.
      • Revenue: UHC profits, brokers earn commissions, clients receive policies.
      • Logic: Handle scenarios for missing/existing partners or clients.
      • Integration: ThinkHR tracks policy updates and renewals.
  • Outcome: A unified, secure, and scalable data management system enabling better business decisions.

Data Engineer

BambooHR
09.2020 - 08.2021
  • Project: Optimization of Project Costing
  • Objective: Leverage machine learning to predict budgets, estimate operational costs, and enhance resource management.
  • Key Goals:
      • Forecast project costs to avoid over/underspending.
      • Estimate daily operational expenses.
      • Improve resource allocation and efficiency.
  • Process:
      • Clean and transform data for analysis.
      • Use XGBoost and regression models for predictions.
      • Deploy models via Django and monitor with AWS CloudWatch.
  • Outcomes:
      • Achieved 94% accuracy with XGBoost.
      • Provided reliable cost ranges for budgeting.
  • Result: Accurate budgeting, cost control, and enhanced project efficiency.

Education

Master of Business Administration (MBA)

Rajasthan Technical University
01.2012 - 01.2014

Skills

Python

SQL

Data Warehousing

ETL

Cloud Computing

Data Mining

Data Visualization

Data Analysis

Data Modeling

Apache Spark

AWS

Data Pipeline

Data Migrations

Databricks

Pandas

AWS Lambda

Amazon S3

Amazon DynamoDB

AWS Glue

Personal Information

  • Father's Name: Shri Satyaveer Singh
  • Mother's Name: Smt. Sudesh Devi
  • Gender: Female
  • Marital Status: Married

Awards

Employee of the Month, 06/01/24
