Yashi Mishra

AI Data Engineer
Jaipur, RJ

Summary

AI & Data Engineer with 4 years of experience bridging large-scale data engineering with production AI systems: Python (PySpark), SQL, Databricks, and cloud-native architectures across Azure/GCP/AWS, with a focus on data quality and real-time pipelines. Proven track record of authoring 300+ data-quality rules for UAE government pipelines on Microsoft Fabric + Azure Databricks, fine-tuning multilingual LLMs with parallel/batch training pipelines, cutting SQL query latency by 30%+ to under 2 seconds, and prototyping real-time Voice AI systems (Whisper + Deepgram + ElevenLabs) with full production architecture estimates. Delivered onsite with enterprise and government stakeholders in Dubai and Abu Dhabi, translating ministry-level requirements into production data systems across UAE/UK/India time zones.

Overview

4
years of professional experience

Work History

Data Engineer

FiftyFive Technologies
Jaipur, RJ
08.2022 - 05.2026
  • Owned scoping, architecture, and production delivery for enterprise and government clients across UAE, UK, and India; led onsite engagements with ministry stakeholders.
  • Client Engagements:
  • AI Interview Assistant (Voice & NLP PoC): Built a proof-of-concept real-time conversational pipeline - Whisper + Deepgram + ElevenLabs - for a client exploring AI-driven interviews. Researched the production architecture end-to-end, sized the engineering estimate, and presented the build plan to the client.
  • AGY Logistics (Cold-Chain IoT, GCP): Owned the alerting and workflow automation layer - automated Slack notifications on critical reefer-temperature thresholds, orchestrated via N8N + Make.com. Supported the broader Bronze→Diamond medallion architecture on GCP with data quality checks and end-to-end testing.
  • QxLab AI (LLM Fine-Tuning Research): Fine-tuned multiple large language models on the Hugging Face stack for production inference. Built and optimized parallel and batch training pipelines on distributed Azure (Ray) infrastructure for multilingual LLM workloads.
  • ODA - UAE Aid (with Artefact, Abu Dhabi/Dubai, Onsite): Stood up the data quality platform for the UAE Official Development Assistance program - authored 300+ validation rules across schema, business-logic, and cross-system integrity on Microsoft Fabric + Azure Databricks for audit-grade ministry pipelines. Designed the logical/physical data model and KPI framework that standardized cross-ministry reporting, and rolled out Unity Catalog for centralized lineage and access control.
  • Smart Cloud Kitchen (Query Perf): Rewrote critical SQL queries and stored procedures across PostgreSQL + BigQuery - cut frontend query latency 30%+ to under 2 seconds for live ops dashboards. Built FastAPI services on Cloud Run and Looker Studio dashboards surfacing weekly trends and per-store failure patterns.
  • ZipApply (Job Platform Backend): Designed the backend data architecture for candidate-recruiter matching using a life-quality scoring signal with integrated email workflows. Built HPCC ECL ETL pipelines (Roxie real-time queries, cron-based scheduling) and OpenAI-powered tools for resume enhancement and cover-letter generation.

Education

B.Tech - Computer Science & Engineering

GLA University
06.2023

Skills

  • Whisper STT
  • Deepgram
  • ElevenLabs
  • OpenAI APIs
  • Hugging Face
  • RAG
  • Conversational AI
  • Prompt Engineering
  • Ray
  • PySpark
  • SQL
  • Databricks
  • BigQuery
  • Microsoft Fabric
  • Unity Catalog
  • dbt
  • Snowflake
  • Airflow
  • ETL/ELT
  • Star/Snowflake Schemas
  • Data Quality
  • Medallion Architecture
  • Real-Time & Batch Processing
  • Python
  • FastAPI
  • REST APIs
  • PostgreSQL
  • MySQL
  • MongoDB
  • Azure
  • GCP
  • AWS
  • Terraform
  • Docker
  • CI/CD
  • ReactJS
  • Looker Studio
  • Power BI
  • Tableau
  • N8N
  • Make.com

Technologies

Whisper STT, Deepgram, ElevenLabs, OpenAI APIs, Hugging Face, RAG, Conversational AI, Prompt Engineering, Ray, PySpark, SQL, Databricks, BigQuery, Microsoft Fabric, Unity Catalog, dbt, Snowflake, Airflow, ETL/ELT, Star/Snowflake Schemas, Data Quality, Medallion Architecture, Real-Time & Batch Processing, Python, FastAPI, REST APIs, PostgreSQL, MySQL, MongoDB, Azure (Databricks/ADF/ADLS Gen2/AzureML), GCP (BigQuery/Pub-Sub/Cloud Run/Functions), AWS (S3/Lambda/Glue), Terraform, Docker, CI/CD, ReactJS, Looker Studio, Power BI, Tableau, N8N, Make.com
