rag_Execute_Workflow
Execute complete RAG workflow to answer user questions using document context, handling embedding generation, semantic search, and context retrieval automatically.
Instructions
Execute complete RAG workflow to answer user questions based on document context. This tool handles the entire RAG pipeline in a single step when a user query is tagged with /rag.
WORKFLOW STEPS (executed automatically):
Configuration setup using configurable values from rag_config.yml
Store the user query after stripping the '/rag ' prefix
Generate query embeddings using either BYOM (ONNXEmbeddings) or IVSM functions based on config
Perform semantic search against precomputed chunk embeddings
Return context chunks for answer generation
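The prefix handling in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the names `RAG_PREFIX`, `is_rag_query`, and `strip_rag_prefix` are hypothetical, and the whitespace trimming is an assumption.

```python
# Hypothetical helper names; the tool's real implementation is not shown in this document.
RAG_PREFIX = "/rag "


def is_rag_query(message: str) -> bool:
    """Return True only when the message explicitly starts with the '/rag ' prefix."""
    return message.startswith(RAG_PREFIX)


def strip_rag_prefix(message: str) -> str:
    """Remove the '/rag ' prefix if present, then trim surrounding whitespace (assumed)."""
    if message.startswith(RAG_PREFIX):
        message = message[len(RAG_PREFIX):]
    return message.strip()
```

For example, `strip_rag_prefix("/rag What is the leave policy?")` yields the bare question, while a message without the prefix is returned unchanged apart from trimming.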
CONFIGURATION VALUES (from rag_config.yml):
version: 'ivsm' or 'byom' to select embedding approach
All database names, table names, and model settings are configurable
Vector store metadata fields are dynamically detected
Embedding parameters are configurable
Default chunk retrieval count is configurable
Default values are provided as fallback
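The fallback behavior described above could look roughly like the following sketch. Only the `version` key and its two allowed values come from this document. The `default_k` key, the default values, and the function name `resolve_config` are assumptions made for illustration, and the dictionary passed in stands for whatever was parsed from rag_config.yml.

```python
from typing import Any, Dict, Optional

# Assumed defaults for illustration; the real fallback values are not listed in this document.
DEFAULTS: Dict[str, Any] = {
    "version": "ivsm",   # 'ivsm' or 'byom'
    "default_k": 10,     # hypothetical key for the default chunk retrieval count
}

ALLOWED_VERSIONS = ("ivsm", "byom")


def resolve_config(loaded: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Overlay values parsed from rag_config.yml on top of the defaults."""
    config = {**DEFAULTS, **(loaded or {})}
    if config["version"] not in ALLOWED_VERSIONS:
        raise ValueError(f"version must be one of {ALLOWED_VERSIONS}, got {config['version']!r}")
    return config
```

Any key missing from the loaded file keeps its default, so a file that sets only `version: byom` still resolves to a complete configuration.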
TECHNICAL DETAILS:
Strips the '/rag ' prefix if present from user questions
Creates query table if it does not exist (columns: id, txt, created_ts)
BYOM approach: Uses mldb.ONNXEmbeddings UDF for tokenization and embedding
IVSM approach: Uses ivsm.tokenizer_encode and ivsm.IVSM_score functions
Both approaches store embeddings in configured output table
Uses cosine similarity via TD_VECTORDISTANCE for semantic search
Returns the top-k matching chunks from the configured vector store
Each result includes chunk text, similarity score, and metadata fields
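To make the ranking step concrete, here is a small in-memory sketch of cosine-similarity top-k retrieval. The tool itself performs this search in-database via TD_VECTORDISTANCE. The pure-Python version below only illustrates the idea, and the function names and the `(text, vector)` tuple layout are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every (text, embedding) pair against the query and keep the k most similar."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Each returned pair carries the chunk text and its similarity score, mirroring the result fields listed above (metadata fields are omitted here for brevity).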
CRITICAL ANSWERING RULES:
Answer ONLY using retrieved chunks - no external knowledge, speculation, or inference
Quote source content directly without paraphrasing, summarizing, or rewriting
If no relevant context is found, respond: "Not enough information found in the provided context"
If the context only partially answers the question, respond: "The available context does not fully answer the question"
Include document/page references when available (e.g., "On page 2 of 'demo_policy.pdf'...")
Execute entire workflow silently without showing function calls to user - only show final answer
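A rough sketch of how the quoting and fallback rules above might be applied to retrieved chunks follows. It covers only the empty-result case and the page reference format. Judging relevance or partial coverage is left to the answering model. The field names `text`, `doc_name`, and `page` and the function `format_answer` are hypothetical, since the actual metadata fields are detected dynamically.

```python
from typing import Any, Dict, List

NO_CONTEXT_MESSAGE = "Not enough information found in the provided context"


def format_answer(chunks: List[Dict[str, Any]]) -> str:
    """Quote each retrieved chunk verbatim, adding a document/page reference when available."""
    if not chunks:
        return NO_CONTEXT_MESSAGE
    lines = []
    for chunk in chunks:
        quote = '"' + chunk["text"] + '"'
        doc_name = chunk.get("doc_name")  # hypothetical metadata field
        page = chunk.get("page")          # hypothetical metadata field
        if doc_name and page is not None:
            lines.append(f"On page {page} of '{doc_name}': {quote}")
        else:
            lines.append(quote)
    return "\n".join(lines)
```

The chunk text is inserted unchanged, so no paraphrasing or summarizing takes place in this step.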
LANGUAGE RESTRICTIONS:
Do not say "According to the context" or "The context says" - just quote directly
Do not say "It can be inferred that..." - no inference allowed
Use exact or near-verbatim quotes only
No paraphrasing, summarizing, or adding transitions between quotes
Copy content precisely - no expansion or interpretation
BEHAVIORAL REQUIREMENTS:
Execute all RAG workflow steps silently without showing tool calls to users
Only display final answers based strictly on retrieved context chunks
Do not use external knowledge or speculation - only quote retrieved content directly
RAG mode only triggers with explicit /rag prefix - do not enter RAG mode implicitly
Hide all tool execution steps from user
Provide clean, direct response with only the answer based on retrieved chunks
This is a complete end-to-end RAG workflow - no additional tools needed
EXECUTION: Run completely silently - user only sees their question and the final context-grounded answer.
Input Schema
Name | Required | Description | Default |
---|---|---|---|
k | No | | |
question | Yes | | |