Aishwarya Balachandran

Senior Data Scientist
Coimbatore

Summary

Senior Data Scientist with 6+ years of experience designing and deploying production-grade machine learning systems across retail, restaurant, agriculture, healthcare, and e-commerce domains. Proven expertise in demand forecasting, large-scale time series modeling, GenAI/LLM systems, and ML platform architecture. Strong track record of translating business problems into scalable AI solutions adopted by global enterprises. Adept at end-to-end ownership, from data engineering and modeling to deployment and monitoring.

Overview

9 years of professional experience
6 Certifications

Work History

Data Scientist

Wavicle Data Solutions
01.2025 - Current
  • Developed a robust, end-to-end pipeline for demand forecasting, processing large-scale time series data.
  • Engineered and optimized exogenous, static, and dynamic variables to refine forecast accuracy and enhance model explainability.
  • Evaluated multiple forecasting models, including ARIMA, Prophet, Temporal Fusion Transformer (TFT), and DeepAR, selecting the champion model based on MAPE, RMSE, and other performance metrics.
  • Applied hierarchical analysis to forecast demand at different levels, such as product granularity and item-customer unique combinations.
  • Built custom feature engineering techniques, including lag variables, rolling averages, seasonality decomposition, and external factor incorporation, to improve model robustness.
  • Addressed demand drifts using adaptive moving average methods to maintain forecast accuracy over time.
  • Conducted post-processing and error analysis to fine-tune forecasting strategies and optimize predictive insights for business decision-making.
  • Integrated forecasting outputs with business intelligence dashboards, enabling real-time visualization and actionable insights.
  • Leveraged Azure ML Studio for model training and deployment, using Azure Compute for scalable processing and Azure Blob Storage for efficient data handling. Executed forecasting models within Azure Notebooks, integrating with cloud-based workflows and automated retraining pipelines.
  • Used parallel processing and distributed computing techniques to handle large-scale forecasting tasks efficiently.

Data Scientist

Wavicle Data Solutions
04.2021 - Current
  • "Just Walk Out" technology tracks customers throughout the store, using Internet of Things (IoT) and artificial intelligence (AI) capabilities to power its innovation efforts.
  • Built an object detection system that recognizes products in a restaurant and lets customers pick them up and walk out, giving them a new experience.
  • Dataset Conception and Processing.
  • Data access - Analysis & categorize.
  • Code to automatically categorize the image data.
  • Code and compare the performance of models.
  • Apply image preprocessing techniques & setup darknet.
  • Dataset Augmentation and Annotation.
  • Develop different annotation formats for YOLO.
  • Assess annotation quality, fix unbalanced classes, and de-duplicate images.
  • Data Augmentation - Angle Flipping, Probability, Rotations.
  • Data files preparation for YOLO consumption.
  • Model Development and Training.
  • Configuring cuDNN on Colab for YOLOv4.
  • Darknet configuration for YOLOv4 - Makefile.
  • Set up yolov4-tiny weights.
  • Set up Roboflow and obtain the dataset in Darknet format.
  • Develop custom training code.
  • Model Training config code and Evaluation metrics.
  • Model Prediction Performance / Monitoring.
  • Check and develop combinations of annotations.
  • Tune the hyperparameters and redevelop the training code.
  • Error Handling and Successful Model Train.
  • Develop prediction code and unit testing - Final Output.

Data Scientist

Wavicle Data Solutions
09.2020 - Current
  • Text Analytics with Python.
  • Dataset creation for fine tuning and testing.
  • Text Analytics workflow to modify the parent and child process for optimization.
  • Statistical Data Analysis with EDA report for MCD.
  • Data Stratification and evaluation metrics.
  • Evaluation process automation.
  • Multi label topic classification.
  • Comparative analytics with Evaluation metrics.

Senior Data Scientist | ML Engineer

Wavicle Data Solutions
06.2020 - Current

Crop Yield Forecasting Platform

  • Architected and deployed a fully automated, production-grade forecasting platform on Azure, handling millions of weekly records across 200+ growers and multiple crop categories.
  • Designed a multi-layer forecasting pipeline (Preprocessing → Modeling → Postprocessing → Monitoring), enabling zero-touch weekly execution.
  • Built a hybrid forecasting engine combining ARIMA, DeepAR, TFT, Random Forest, and XGBoost, with champion–challenger model governance.
  • Implemented crop-specific anomaly handling, capping, and season normalization, significantly improving forecast stability in noisy, real-world data.
  • Delivered SHAP-based explainability and executive-ready insights to enable trust, diagnostics, and root-cause analysis.
  • Built production monitoring using Azure Log Analytics and KQL, enabling proactive alerting for pipeline failures, data drift, and compute issues.

Restaurant and Menu Analytics

  • Delivered text analytics, demand modeling, and statistical analysis for restaurant datasets, including menu items, transactions, and customer feedback.
  • Active Insights – Restaurant NLP Analytics: Built multi-label NLP and sentiment analysis pipelines using BERT, SpaCy, and AWS Comprehend.
  • Developed text analytics pipelines for menu and customer feedback analysis.

GenAI-Based BI and Analytics Validation Platform

  • Designed a GenAI-powered system to validate and compare Tableau and Power BI dashboards using LLMs and computer vision.
  • Automated extraction of charts, tables, layouts, and contextual metrics from images and PDFs.
  • Evaluated and deployed Qwen Vision, Gemini, and Claude, optimizing for reasoning accuracy, latency, and scalability.
  • Built Flask-based APIs and evaluated AWS SageMaker and EC2 for cost-efficient deployment.

GenAI Web Intelligence – Security Classification Platform

  • Led development of a GenAI-driven web intelligence system to identify whether communities use manual or automated security systems.
  • Built large-scale semantic web scraping pipelines to collect public data from forums, websites, and community pages.
  • Applied semantic similarity, entity extraction, and LLM reasoning to classify security posture.
  • Integrated Gemini APIs, Tavily search, and custom prompts to automate evidence discovery and classification.

Data Engineering and Analytics Modeling (dbt + Redshift)

  • Led a dbt-based data transformation program covering 70+ source-to-target Redshift models.
  • Designed modular, reusable dbt models aligned with analytics engineering best practices.
  • Built a GenAI POC for automated dbt model and YAML documentation generation, leveraging Jinja templating, auto-generated macros, and LLM-assisted documentation.

BI Asset Intelligence (E-commerce)

  • Built data extraction pipelines using MicroStrategy REST APIs to process over 50,000 BI assets.
  • Developed semantic similarity and NLP models (fuzzy match, SpaCy, BERT, clustering) to auto-map assets to business domains.
  • Reduced manual BI asset mapping effort by 60% through automated classification.

ML Operations and Data Engineering (Automobile)

  • Built scalable data ingestion, validation, and ML automation pipelines using Jenkins, Docker, and AWS Athena.
  • Productionized 20+ ML pipelines, handling historical backfills and schema evolution.
  • Improved reliability through automated CI/CD and monitoring.

Clinical Data Analytics Platform

  • Delivered end-to-end clinical data ingestion and analytics pipelines using AWS SageMaker, Glue, Redshift, and QLDB.
  • Automated model training with EMR ephemeral instances and serverless workflows using Lambda.
  • Built statistical analytics and reporting for medical study insights.

Senior Data Scientist

Wavicle Data Solutions
06.2020 - Current
  • Implemented a full-stack, production-grade forecasting platform on Azure using Fabric, Azure Blob Storage, Azure ML, Azure Alerts and notifications, and Python-based pipelines.
  • Built a multi-layered forecasting architecture (Preprocessing → Categorization → season-continuity logic with Patching → Capping → Modelling → Postprocessing) handling millions of weekly records for 200+ growers across multiple crops & greenhouse blocks.
  • Engineered a hybrid forecasting engine combining time series models (ARIMA, DeepAR, TFT) with predictive analytics models (Random Forest, XGBoost), including champion-challenger selection with rule-driven model fallback.
  • Designed cluster-aware, config-driven modeling pipelines supporting cold-start crops, limited-history growers, and dynamic exogenous variable creation.
  • Delivered model interpretability using SHAP-based RFE and feature importance analytics.
  • Implemented yield cleaning and normalization through a 3-step capping pipeline: weekly crop thresholds, seasonal average scaling, and season removal (especially for 2021 anomalies), with crop-specific logic (e.g., cucumber allows 3 exceedances, tomato/pepper 10).
  • Automated pipeline orchestration with Azure ML Schedules and an offset-driven hyper trigger system with a self-driving execution loop: each completed offset updates Blob Storage and triggers the next run through Azure ML Scheduled Jobs, supporting 19 offsets per cycle without manual intervention.
  • Implemented robust missing-data and season-continuity logic via patching algorithms: authored algorithms for 1-week gap patching, multi-week (>3) missing actuals segmentation, crop age recalculation, and leading/mid/trailing missing classification, enabling stable season-level modelling even with noisy agricultural datasets.
  • Built production-grade monitoring, alerts, and operational visibility using Log Analytics + KQL: created KQL queries to track pipeline cancellations, failures, compute node issues, dataset misalignments, and long-running jobs, and integrated these with Azure Monitor alerts for proactive notification to engineering & business teams.
  • Analyzed the architecture of an MCP-like forecasting intelligence reasoning agent: an LLM-powered agent capable of forecast explainability, root-cause diagnostics, cross-table reasoning, anomaly detection, seasonal behavior analysis, and what-if simulations, using the entire preprocessing + modelling + business rules system as its knowledge base.
  • Delivered fully automated weekly forecast generation supporting operational planning at scale with Azure: achieved a reliable weekly forecasting cycle in which preprocessing, modelling, postprocessing, auditing, and output delivery are fully automated, enabling zero-touch execution, accurate crop yield insights, and seamless integration with business planning systems.

Data Scientist

Wavicle Data Solutions
07.2024 - 01.2025
  • Developed an Analyzer that extracts data from the MicroStrategy BI tool through REST APIs, engineers the data, and maps it to its respective functional and master domains. Applied text similarity algorithms and data processing techniques to handle 50K data assets.
  • Built data extraction and aggregation pipelines that pull data through MicroStrategy REST APIs, then transform and aggregate it.
  • Processed, cleaned, and transformed large datasets for better organization and visualization.
  • Developed semantic similarity models using fuzzy matching, spaCy, BERT transformers, k-means, and a hybrid of these methods to calculate a threshold and decide which asset belongs to which domain.
  • Used NLP models to auto-group or cluster similar attributes and ran the similarity algorithm on domain names to predict matching domains.
  • Accelerated delivery by developing semantic similarity algorithms that examine multiple dimensions of a data asset and auto-map it to multiple functional and master domains based on a domain-to-keyword hash dictionary. This spared the team tedious manual mapping for 25k+ data assets and accelerated the process by 60 percent.
  • Developed microservices that parse multiple queries and extract attributes, metrics, and table names for use later in the pipeline.
  • Analytics on occurrences of unique combinations of attributes and table names to identify prioritized assets.
  • Research on BI and visualization tools such as MicroStrategy, Tableau, and SAP BO. Mapping of assets to relevant functional and master domains.
  • Frequency analysis of unique attributes and their occurrences across the entire dataset to understand the proficient data products.

Data Scientist

Wavicle Data Solutions
09.2022 - 11.2023
  • Responsible for data engineering, data science, and ML operations, including automation pipelines for analytical data ingestion via CI/CD and data analytics in Python notebooks to produce processed data.
  • Built pipelines to extract, clean, and process data for analytics.
  • Handled large datasets for ML models, analyzing and processing them into organized, analysis-ready formats.
  • Productionized ML models via pipelines, taking data through successive processing stages to produce structured results.
  • Handled data at scale through Jenkins pipelines and Athena.
  • Performed data validation and type checking, along with historical data backfills to reflect new changes.
  • Automated pipelines using Jenkins, Automic job flows, and Docker-based code structure.
  • Backtracked and resolved production data issues.
  • Handled 20 ML pipelines, performing data flow operations and data processing.
  • Experienced in handling data with AWS.

Data Scientist

Wavicle Data Solutions
06.2021 - 09.2022
  • Responsible for data engineering and data science activities: clinical data ingestion via pipelines and data analytics in SageMaker notebooks to produce statistical chart analysis.
  • Built pipelines to extract, clean, and process data for analytics.
  • Experienced in using AWS services for production-level algorithms.
  • AWS Redshift and QLDB as data stores for transactional data.
  • Used AWS EMR with ephemeral instances to automate model training and prediction.
  • AWS SageMaker training jobs to run tasks on ephemeral instances.
  • AWS Lambda for serverless deployment and trigger-based data insertion.
  • AWS Glue jobs for transforming and loading data as required.
  • Used Castor, a clinical data management tool, to integrate data through REST calls via Lambda.
  • CloudWatch log analysis.
  • Implemented API integration and data extraction via the Castor electronic data capture platform.
  • Experienced in writing wrappers to pull data via APIs.
  • Experienced with AWS Glue jobs for data extraction and ETL.
  • Basics of PySpark in Glue jobs.
  • Redshift query optimizations and operations.
  • Managed the entire end-to-end pipeline for data ingestion, analysis, and storage.
  • Responsible for pipeline management to confirm the process flow of medical data.
  • Basic experience in R to manage data science analysis jobs with Lambda and SageMaker notebooks.
  • Experienced in working with JSON-structured data extraction and pivot analysis of tables for data merge and transformation operations.

Data Scientist

Wavicle Data Solutions
07.2020 - 08.2020
  • Automated ML Platform for both the Data Scientist and Business User to upload data and predict with ease.
  • Automatic data process.
  • Logic Formulation for automated process.
  • Image Analytics.
  • Object Detection CNN.
  • TensorFlow, OpenCV, ImageAI.
  • Missing Data Interpolation.
  • One-Hot Encoding.
  • Encapsulated functions into Flask Framework.
  • REST API to access the automated process.

Data Scientist

Wavicle Data Solutions
06.2020 - 08.2020
  • Gesture recognition is a type of perceptual computing that lets users interact with a computer screen without touching it.
  • It can be used in many areas, including assisting people with hearing or speech disabilities.
  • Applied it in the restaurant domain to eliminate touch at kiosks.
  • It can also be used for virtual zoom-in, zoom-out, and tuning operations on devices.
  • Research on ASL and virtual gesture tuning.
  • Capturing and recording Images of Gestures by OpenCV.
  • Image Recognition (hand).
  • Data Preprocessing and Manipulations.
  • Building CNN with Keras, TensorFlow.
  • Hyperparameter Tuning.
  • Webcam enabled gesture prediction by OpenCV.
  • Trainable and custom modeling.
  • Image augmentation and data set tuning.

Machine Learning Engineer

Datinfi Pvt Ltd
08.2018 - 11.2019
  • Counterfeit Protection for E-commerce with Machine Intelligence.
  • Analytics on E-commerce Data.
  • Scraping E-commerce Data.
  • Image Processing for Image Similarity.
  • Facebook API Integration.
  • Analytics on Customer Reviews and Sentiment Analysis.
  • NLP ETL - Analytics - Data & Pattern Analysis.
  • Processing millions of invoices and getting the best approach for invoices to be accepted by clients.
  • Statistical Modeling for Sales and Revenue Analysis.
  • Stock Analysis for the Restaurant Domain.
  • Streak Analysis with Sales Data.
  • Popularity Boost, Best-Seller Analysis.
  • Ranking Models and Data Forecasting.

Data Scientist and Python Developer

Ndot Technologies
08.2017 - 08.2018
  • Destination Suggestion API: Provides personalized destination suggestions so a taxi customer can select an address with one click instead of typing it. Options include the customer's frequently visited places or an intended destination.
  • Pattern Analysis: Logistics Domain, Restaurant.
  • Demand Prediction: For a restaurant management product that needed to analyze and predict sales patterns with machine learning, I delivered the sales forecasting, statistical analysis, and supporting trend analytics in Python.

Education

Bachelor - Computer Science and Engineering

Sri Krishna College of Engineering & Technology
05.2018

Skills

Python

Data Science & Applied Machine Learning

Forecasting & Time Series Modeling

Feature Engineering & Model Evaluation

ETL & Data Engineering Pipelines (dbt)

Prompt Engineering & Model Evaluation

Text Analytics (Multi-Class, Multi-Label, NER)

Agentic AI (MCP – Foundations)

Vision Analytics - OpenCV, YOLOv4

RoboFlow - Annotations

Frameworks - TensorFlow, Scikit-Learn

AWS Comprehend, Redshift, SageMaker

JavaScript, React.js, GraphQL

PySpark

MongoDB, SQL

Power BI, MicroStrategy, Tableau

Flask, spaCy

Certification

DP-100, Microsoft, 2020-06-01
