G VSN SAI YASHASWEA BHARADWAJ

Guntur, Andhra Pradesh

Summary

Data Scientist with 4+ years of experience building data engineering pipelines with Azure Data Factory (ADF), Databricks, and related tools. Focused on data engineering, with continuing skill development in ETL workflows, pipeline automation, and scalable data processing. Aims to deliver efficient data solutions that improve business operations.

Overview

5 years of professional experience

Work History

Data Scientist

Tiger Analytics
09.2020 - Current

Azure IoT Edge & Microsoft Fabric Integration

  • Designed and implemented Azure IoT Edge devices for a retail project, including containerizing code and managing device mount paths.
  • Developed data transfer pipelines between Azure Blob Storage and Microsoft Fabric Lakehouse, enabling efficient storage and processing.
  • Created PySpark notebooks within Microsoft Fabric to apply transformations and store data in Delta tables for reporting, improving data accessibility and performance.

ML Pipeline Operationalization Using Vertex AI Pipelines

  • Deployed and operationalized a client's ML pipeline on Google Cloud Platform (GCP), utilizing BigQuery for data storage and Vertex AI for orchestration.
  • Designed a three-stage pipeline consisting of data extraction, parallel modeling per target, and parallel post-processing, ensuring scalability and efficiency.
  • Orchestrated the execution with Cloud Scheduler and Cloud Functions, ensuring reliability and streamlined operations.

Medallion Architecture-Based Data Pipeline Implementing a Common Data Model (CDM)

  • Built an end-to-end data pipeline using Azure Data Factory, Azure Databricks, and Delta Lake, following the Medallion architecture (Bronze, Silver, Gold layers) for structured data processing.
  • Developed metadata-driven pipelines supporting incremental and full loads based on a configuration file.
  • Implemented a Common Data Model (CDM) and Slowly Changing Dimensions (SCD Type 2) for historical tracking, ensuring high data integrity and compliance.

LLM-Based Product Data Enhancement

  • Led the development of an AI-driven pipeline to enhance product titles, descriptions, and attributes for SKUs using Gemini models in Vertex AI.
  • Designed a text-based LLM pipeline and a Visual Question Answering (VQA) model pipeline to extract attributes like color from images.
  • Integrated extracted attributes to generate enhanced product descriptions, improving data consistency, searchability, and customer experience.

Code Migration & Real-Time Monitoring with Kibana Dashboards

  • Migrated Pandas-based code to PySpark, improving efficiency for large-scale data processing.
  • Automated job flows using Airflow DAGs and AutoSys JIL files, optimizing task execution and pipeline management.
  • Designed and deployed Kibana dashboards for real-time and daily model monitoring, providing actionable insights for stakeholders.

Education

Bachelor of Technology

National Institute of Technology Tiruchirappalli
Tamil Nadu
05-2020

Skills

  • Python
  • SQL
  • PySpark
  • Azure Data Factory
  • Databricks
  • Azure Functions
  • Microsoft Fabric
  • Vertex AI
  • BigQuery
  • Apache Airflow
  • Kibana
  • AWS Lambda
