Ankit Shrivastava

Bangalore

Summary

Innovative Senior Data Scientist recognized for high productivity and efficiency in task completion. Specialized in machine learning, predictive analytics, and big data processing, and adept at translating complex datasets into actionable insights. Applies critical thinking, problem-solving, and effective communication to drive project success and foster team collaboration.

Overview

9 years of professional experience

Work History

Senior Data Scientist

SAP Labs
Bangalore
10.2015 - Current

ABAP Foundation LLM (GenAI4ABAP)
Developing an LLM to support ABAP development with tasks like code completion, review, and explanation.

Purpose:
We are building a foundational LLM tailored to ABAP code. The model builds on open models such as StarCoder, Mistral, and Codestral, and undergoes pretraining and fine-tuning to improve performance on specific downstream ABAP development tasks.

Approach:

  • Pretraining: Autoregressive training followed by Fill-in-the-Middle (Prefix-Middle-Suffix) strategy.
  • Fine-tuning: Supervised fine-tuning with human feedback, focusing on real-world ABAP tasks.
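As a rough illustration of the Fill-in-the-Middle step above, a training example can be built by cutting a snippet into prefix, middle, and suffix and moving the middle to the end, so that ordinary left-to-right training learns to infill. The sentinel strings, the function names, and the serialization order used here are assumptions for illustration, not the project's actual format.

```python
import random

# Hypothetical sentinel tokens; a real tokenizer defines its own.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def make_fim_example(code: str, rng: random.Random) -> str:
    """Split code at two random points and emit prefix, suffix, then middle."""
    # Two sorted cut points, so prefix + middle + suffix == code.
    a, b = sorted(rng.randint(0, len(code)) for _ in range(2))
    prefix, middle, suffix = code[:a], code[a:b], code[b:]
    # The middle is placed last so the model predicts it from both sides.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


def recover_original(example: str) -> str:
    """Invert make_fim_example (assumes the code contains no sentinels)."""
    rest = example[len(FIM_PREFIX):]
    prefix, rest = rest.split(FIM_SUFFIX, 1)
    suffix, middle = rest.split(FIM_MIDDLE, 1)
    return prefix + middle + suffix


if __name__ == "__main__":
    snippet = "DATA lv_count TYPE i.\nlv_count = lv_count + 1.\n"
    print(make_fim_example(snippet, random.Random(0)))
```

The round trip through `recover_original` is a cheap check that the transform loses no characters.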

Data Curation: Raw ABAP dumps were processed into meaningful segments using Abstract Tree Structure (ATS) to generate more accurate training data.

Alignment Training:
Addressed underperforming scenarios, such as unclosed statements, through alignment training with the Direct Preference Optimization (DPO) approach, significantly improving model performance.
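For context, the standard DPO objective for a single preference pair can be sketched as below. The function name, the argument names, and the default `beta` are illustrative assumptions; the inputs are summed sequence log-probabilities from the trained policy and a frozen reference model, where "chosen" might be a properly closed statement and "rejected" an unclosed one.

```python
import math


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair, given sequence log-probabilities."""
    # How much more the policy favors each completion than the reference does.
    chosen_gain = policy_chosen - ref_chosen
    rejected_gain = policy_rejected - ref_rejected
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(margin)), written as softplus(-margin) for stability.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
    # Policy shifted toward the chosen completion: the loss drops below log(2).
    print(dpo_loss(-4.0, -9.0, -5.0, -7.0))
```

The loss falls as the policy raises the chosen completion's likelihood relative to the rejected one, measured against the reference model.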

Constraint Generation:
To accommodate ABAP’s Key User restrictions, we implemented constrained generation that dynamically forbids restricted tokens (e.g., those for direct database updates) during decoding, so the model produces only allowed actions. An ABAP parser identifies the restricted tokens and drives backtracking.
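The core idea of forbidding tokens during generation can be sketched as masking their scores before the next token is chosen. This is a toy example under stated assumptions: the vocabulary, the scores, and the function name `pick_next_token` are hypothetical, and the ABAP parser and backtracking described above are not shown.

```python
import math
from typing import Dict, Set


def pick_next_token(logits: Dict[str, float], forbidden: Set[str]) -> str:
    """Greedy choice over the vocabulary after masking forbidden tokens."""
    # Forbidden tokens get a score of -inf so they can never win.
    masked = {
        tok: (-math.inf if tok in forbidden else score)
        for tok, score in logits.items()
    }
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("every candidate token is forbidden")
    return best


if __name__ == "__main__":
    # Toy scores: UPDATE would win without the restriction.
    scores = {"UPDATE": 3.2, "SELECT": 2.9, "MODIFY": 1.0}
    print(pick_next_token(scores, forbidden={"UPDATE", "MODIFY"}))  # SELECT
```

In a real decoder the forbidden set would change at each step depending on the parse state, which is where a parser would come in.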

Benchmarking Library:
Created a comprehensive library to evaluate ABAP LLMs, supporting models from:

  • Local checkpoints
  • Hugging Face
  • GenAIHub (via sap-llm-commons)

Metrics Supported:

  • String-based: Exact Match, Fuzzy Match
  • Syntax: ABAP Syntax Checks
  • Semantic: ABAP Semantic Checks (API and KG4HANA-based)
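The two string-based metrics could look roughly like the sketch below. The function names are hypothetical, and the fuzzy score uses Python's standard `difflib` similarity ratio as a stand-in, since the metric the library actually uses is not stated here.

```python
from difflib import SequenceMatcher


def exact_match(prediction: str, reference: str) -> bool:
    """True when the two strings agree after trimming outer whitespace."""
    return prediction.strip() == reference.strip()


def fuzzy_match(prediction: str, reference: str) -> float:
    """Similarity in [0, 1] from difflib; 1.0 means identical strings."""
    return SequenceMatcher(None, prediction.strip(), reference.strip()).ratio()


if __name__ == "__main__":
    ref = "WRITE 'hi'."
    print(exact_match("WRITE 'hi'.  ", ref))
    print(fuzzy_match("WRITE 'ho'.", ref))
```

Exact match is strict and easy to interpret, while a fuzzy score gives partial credit for near misses such as a single wrong character.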

Custom Prompt Templates:
Developed templates for:

  • Code Completion
  • Natural Language Instructions

Inference Engines:
Deployed multiple inference engines for robust model serving:

  • vLLM for production
  • Hugging Face inference engine
  • SAP_genaihub
  • Nvidia_api_inference

Data Management (abap-genai-data-management):
Built data pipelines for training data recreation, fully integrated into Pachyderm for versioning and orchestration, with local execution support.

Model Training Framework:
Training runs on the NVIDIA NeMo framework.

SAP GPT4HANA:
Worked on the SAP foundational model, improving performance through training and fine-tuning. Ensured high-quality data collection from the S/4 system and enhanced accuracy using retrieval-augmented generation (RAG) backed by a vector database. Also experimented with XGBoost on similar datasets.

BTP Log Error Analysis:
Used RAG and vector databases to quickly identify log errors. Leveraged an LLM to automate ticket creation and suggest solutions, improving issue resolution efficiency.

Dcom Chatbot:
Developed a GPT-4-powered chatbot for session-related queries using RAG techniques. This significantly boosted the chatbot's accuracy and improved user engagement and satisfaction.

GenAI Studio:
Led the creation of a low-code/no-code platform offering LLM as a service. Delivered a successful proof of concept, showcasing its potential to empower users with AI capabilities.

Talk to Your PDF (Dcom 2023):
Organized a popular session using LangChain, achieving the highest participation rate. Conducted additional sessions on Metaflow and DMC Visual Inspection, driving knowledge sharing and team development.

DMC Visual Inspection:
Delivered a complete proof of concept (POC) for BYOM/BYOS using AI Core and Launchpad. Evaluated cloud solutions from GCP and Azure for specific use cases.

AI Core and AI Launchpad Expertise:
Specialized in model training and deployment using AI Core and Launchpad. Streamlined onboarding for internal partners and provided

Education

High School Diploma

Excellence BVN
Indore
05-2012

Skills

  • Deep learning
  • ML/AI
  • Model fine-tuning
  • Model pretraining
  • Model deployment
