Generative AI creates new content with business value for SAP business processes on the SAP Business Technology Platform (SAP BTP).
Generative AI transforms single- or multi-modal inputs into outputs such as text, images, audio, or code.
The SAP Generative AI Hub enables the integration of Large Language Models (LLMs) into SAP Business Processes.
Generative AI can generate new content for a wide range of AI scenarios, often using multi-modal foundation models that accept multiple input types such as text, speech, or images. Foundation models handle new downstream tasks through transfer learning and self-supervised learning capabilities.
Generative AI Prompts
Foundation models offer emergent capabilities: through self-supervised training they derive labels from the structure of the data itself, which allows them to be applied to many diverse tasks.
NLP use cases are typically implemented with Large Language Models (LLMs) that process language input and generate text, code, or image output. Typical Generative AI NLP tasks are classification, sentiment analysis, summarization, comparison, and generation.
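As an illustration of such a task, a sentiment-classification instruction can be expressed as a plain prompt template. The template text and the example message below are made up for illustration, not taken from an SAP product:

```python
# Hypothetical prompt template for a sentiment-classification NLP task.
PROMPT = """Classify the sentiment of the customer message as positive, neutral, or negative.

Message: {message}
Sentiment:"""

# Fill the template with a concrete message before sending it to an LLM.
prompt = PROMPT.format(message="The delivery arrived two weeks late.")
```

The same pattern covers the other listed tasks by swapping the instruction, for example "Summarize the following text in one sentence."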
Business Context Grounding
Input prompt instructions sent to deep learning models can be enriched with grounding techniques to improve the outputs of Generative AI solutions.
Prompt engineering and advanced search capabilities are typical options to optimize generated outputs for a custom business context without model retraining. Vector search on embedding models is commonly used across different data sources for RAG, similarity search, or recommendation scenarios.
Based on vector search techniques, Retrieval-Augmented Generation (RAG) augments prompts for generative AI solutions with information retrieved from custom data sources.
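A minimal sketch of the retrieve-then-augment flow is shown below. To stay self-contained it uses simple keyword overlap as a stand-in for real vector search on embeddings, and the document texts are invented examples:

```python
def score(query, doc):
    # Stand-in for vector similarity: fraction of query words found in the doc.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def retrieve(query, documents, top_k=1):
    # Return the top_k documents most relevant to the query.
    return sorted(documents, key=lambda d: score(query, d), reverse=True)[:top_k]

def build_prompt(query, documents):
    # Augment the prompt with retrieved context before calling the LLM.
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

docs = [
    "Invoices are approved by the finance department within five days.",
    "Vacation requests must be submitted two weeks in advance.",
]
prompt = build_prompt("How are invoices approved?", docs)
```

In a production RAG solution the `score` function would be replaced by similarity search over embeddings of the custom data sources.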
Large Language Models are foundation models trained on large text datasets to perform a variety of tasks. The reusability of these models reduces training effort and costs, which enables use cases with small training datasets or small machine learning teams.
Typical Natural Language Processing (NLP) application areas of LLMs are autonomous AI assistants and various kinds of document processing.
Generative AI data preparation typically includes tokenizing natural language input into small text pieces (tokens) and then converting these tokens into vectors.
Byte-Pair Encoding (BPE) is widely used to implement Generative AI tokenizers, for instance tiktoken for OpenAI models.
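The core BPE idea can be sketched in a few lines: repeatedly merge the most frequent adjacent token pair into a new token. This is a toy illustration on a made-up corpus, not the actual tiktoken implementation:

```python
from collections import Counter

def most_frequent_pair(tokens):
    # Count adjacent token pairs and return the most common one.
    return Counter(zip(tokens, tokens[1:])).most_common(1)[0][0]

def merge_pair(tokens, pair):
    # Replace every occurrence of the pair with a single merged token.
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = list("low lower lowest")   # start from single characters
for _ in range(3):                  # three merge steps for illustration
    tokens = merge_pair(tokens, most_frequent_pair(tokens))
```

After a few merges, frequent character sequences such as "low" become single tokens, which is how a trained BPE vocabulary compresses common subwords.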
Vectorization converts text tokens into embeddings, which are numerical vector representations optimized for machine learning processing.
Vector representations can be used in NLP analysis for similarity calculations with the Euclidean distance, dot product, or cosine similarity formulas. Use cases that interpret vector directions include comparisons for text mining, sentiment analysis, document clustering, and similarity search.
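The three similarity measures can be written out directly. The 3-dimensional example vectors below are invented for illustration; real embedding models use hundreds or thousands of dimensions:

```python
import math

def dot(a, b):
    # Dot product: large when vectors point in similar directions.
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    # Euclidean distance: 0.0 for identical vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    # Cosine similarity: 1.0 for the same direction, 0.0 for orthogonal vectors.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical embeddings: semantically close words get close vectors.
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.12]
apple = [0.1, 0.2, 0.95]
```

With these vectors, `cosine(king, queen)` is higher than `cosine(king, apple)`, which is exactly the property similarity search relies on.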
Transformer neural networks capture long-range dependencies and are able to process sequence-to-sequence tasks. Main capabilities of transformer models are processing whole sentences at once, self-attention, and positional embeddings.
Transformer architectures are separated into encoder and decoder parts, both with attention and feed-forward components.
Encoders transform language tokens into coordinates within the multidimensional vector spaces of semantic language models. The distance between tokens within these embedding models represents their semantic relationship. Embeddings are used for NLP analysis tasks such as summarization, key phrase extraction, sentiment analysis with confidence scores, and translation.
Decoders are able to generate new text sequences and enable decoder-only conversational or Generative AI solutions.
Attention layer weights in encoder or decoder components influence the choice of prediction results. Encoder attention layer weights quantify the meaning of words within text sequences. Decoder attention layers predict the most probable next output token in a sequence.
Traditional transformer models are composed of encoder-decoder components. Encoders tokenize input text and capture its semantic and syntactic structure in vector-based embeddings. Decoders generate new language sequences from the most probable text continuations.
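The attention mechanism described above can be sketched as scaled dot-product attention over tiny hand-made vectors. The 2-dimensional query, key, and value matrices are invented for illustration; real models use many dimensions and multiple attention heads:

```python
import math

def softmax(xs):
    # Turn raw scores into weights that sum to 1.
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention: each query scores every key,
    # and the resulting weights mix the value vectors into the output.
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs

# Toy 2-token sequence: one query attending over two key/value pairs.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = attention(q, k, v)
```

Because the query aligns with the first key, the first value vector receives the larger attention weight in the mixed output.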
Interaction with Generative AI assistants like Microsoft Copilot or SAP Joule improves the digital user experience through data-driven decisions supported by LLMs. Microsoft Copilot combines ChatGPT LLMs with data provided by Microsoft Graph and offers the option to build custom copilots for various business-specific tasks.
Open-source Large Language Model, inference server, and GenAI tool offerings are growing rapidly and can be used to implement solutions for intelligent downstream tasks. Some advantages of open-source GenAI are cost reduction, local deployments that fulfill advanced data security requirements, and flexible model fine-tuning. Examples of open-source GenAI model families are Llama, Mistral, and Falcon.
GenAI platforms allow Bring Your Own Model (BYOM) and serve models as containerized web applications. GenAI model inference servers can be defined by serving templates with parameters such as the number of replicas. Models stored in hyperscaler object stores can be called via inference request endpoints to return predictions to consuming customer services.
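A serving template for such a BYOM inference server might look like the following sketch. All field names and values are illustrative assumptions, not an actual platform schema:

```python
# Hypothetical serving template for a containerized BYOM inference server.
serving_template = {
    "name": "llama-chat",                          # deployment name (made up)
    "image": "registry.example.com/llama-server",  # container image (made up)
    "replicas": 2,                                 # number of server instances
    "resources": {"gpu": 1, "memory": "16Gi"},     # per-replica resources
    "model_uri": "s3://my-bucket/models/llama",    # hyperscaler object store path
}
```

A platform would read such a template, pull the model from the object store URI, and expose an inference request endpoint in front of the running replicas.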
The SAP BTP Generative AI Hub offers a selection of LLMs together with a toolset to engineer LLM prompts and integrate GenAI into business processes.