# rag
Answer questions by retrieving relevant information from a knowledge base and generating responses using customizable language models and search modes.
## Instructions
Performs a Retrieval-Augmented Generation (RAG) query with full parameter control.
This tool retrieves relevant context from the knowledge base and generates an answer with a language model. It supports all search modes (semantic, hybrid, and graph) and customizable generation parameters.
Args:

- `query`: The question to answer using the knowledge base. Required.
- `preset`: Preset configuration for common use cases. Options:
  - `"default"`: Basic RAG with gpt-4o-mini, temperature 0.7, 10 results
  - `"development"`: Hybrid search with higher temperature for creative answers, 15 results
  - `"refactoring"`: Hybrid + graph search with gpt-4o for code analysis, 20 results
  - `"debug"`: Minimal graph search with low temperature for precise answers, 5 results
  - `"research"`: Comprehensive search with gpt-4o for research questions, 30 results
  - `"production"`: Balanced hybrid search optimized for production, 10 results
- `model`: LLM model to use for generation. Examples:
  - `"vertex_ai/gemini-2.5-flash"` (default, fast and cost-effective)
  - `"vertex_ai/gemini-2.5-pro"` (more capable, higher cost)
  - `"openai/gpt-4-turbo"` (high performance)
  - `"anthropic/claude-3-haiku-20240307"` (fast)
  - `"anthropic/claude-3-sonnet-20240229"` (balanced)
  - `"anthropic/claude-3-opus-20240229"` (most capable)
- `temperature`: Generation temperature controlling randomness. Must be between 0.0 and 1.0. Lower values (0.0-0.3) give more deterministic, precise answers; medium values (0.4-0.7) balance creativity and accuracy (default: 0.7); higher values (0.8-1.0) give more creative, diverse answers.
- `max_tokens`: Maximum number of tokens to generate. Optional; uses the model default if not specified.
- `use_semantic_search`: Enable semantic/vector search for retrieval (default: True).
- `use_hybrid_search`: Enable hybrid search combining semantic and full-text search (default: False).
- `use_graph_search`: Enable knowledge graph search for entity/relationship context (default: False).
- `limit`: Maximum number of search results to retrieve. Must be between 1 and 100 (default: 10).
- `kg_search_type`: Knowledge graph search type: `"local"` for local context, `"global"` for broader connections (default: `"local"`).
- `semantic_weight`: Weight for semantic search in hybrid mode. Must be between 0.0 and 10.0 (default: 5.0).
- `full_text_weight`: Weight for full-text search in hybrid mode. Must be between 0.0 and 10.0 (default: 1.0).
- `full_text_limit`: Maximum full-text results to consider in hybrid search. Must be between 1 and 1000 (default: 200).
- `rrf_k`: Reciprocal Rank Fusion parameter for hybrid search. Must be between 1 and 100 (default: 50).
- `search_strategy`: Advanced search strategy (e.g., `"hyde"`, `"rag_fusion"`). Optional.
- `include_web_search`: Include web search results from the internet (default: False).
- `task_prompt_override`: Custom system prompt to override the default RAG task prompt. Useful for specializing AI behavior for specific domains or tasks. Optional.
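The hybrid-mode parameters (`semantic_weight`, `full_text_weight`, `rrf_k`) can be illustrated with the textbook weighted Reciprocal Rank Fusion formula, score(d) = Σᵢ weightᵢ / (rrf_k + rankᵢ(d)). The sketch below assumes that standard formula and the documented defaults; the tool's actual fusion internals may differ.

```python
def rrf_fuse(semantic_ranked, full_text_ranked,
             semantic_weight=5.0, full_text_weight=1.0, rrf_k=50):
    """Weighted Reciprocal Rank Fusion of two ranked doc-id lists.

    Illustrative sketch only: assumes the textbook formula
    score(d) = sum_i weight_i / (rrf_k + rank_i(d)).
    """
    scores = {}
    for weight, ranked in ((semantic_weight, semantic_ranked),
                           (full_text_weight, full_text_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rrf_k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

With the default 5:1 weighting, the semantic ranking dominates the fused order unless the full-text ranking disagrees strongly.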
Returns: Generated answer based on relevant context from the knowledge base.
Examples:

```python
# Simple RAG query
rag("What is machine learning?")
```
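More involved calls compose the same parameters. The snippet below uses a hypothetical stand-in for the `rag` tool (a stub that merely echoes its arguments, so the calls are runnable outside the tool host); the parameter names come from the schema, but the stub itself is not part of the tool.

```python
def rag(query, **params):
    """Hypothetical stand-in for the rag tool: echoes its arguments.

    The real tool performs retrieval and generation; this stub exists
    only so the example calls below can run anywhere.
    """
    return {"query": query, **params}

# Hybrid search with explicit weights and a precise, low-temperature answer
call = rag(
    "How does the auth middleware validate tokens?",
    use_hybrid_search=True,
    semantic_weight=5.0,
    full_text_weight=1.0,
    rrf_k=50,
    temperature=0.2,
    limit=20,
)

# Preset-driven call for code analysis
rag("Where is the retry logic implemented?", preset="refactoring")
```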
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Question to answer using the knowledge base | |
| preset | No | Preset configuration for common use cases | default |
| model | No | LLM model used for generation | vertex_ai/gemini-2.5-pro |
| temperature | No | Generation temperature, 0.0-1.0 | |
| max_tokens | No | Maximum number of tokens to generate | |
| use_semantic_search | No | Enable semantic/vector search | |
| use_hybrid_search | No | Enable hybrid (semantic + full-text) search | |
| use_graph_search | No | Enable knowledge graph search | |
| limit | No | Maximum number of search results, 1-100 | |
| kg_search_type | No | Knowledge graph search type, "local" or "global" | global |
| semantic_weight | No | Semantic search weight in hybrid mode, 0.0-10.0 | |
| full_text_weight | No | Full-text search weight in hybrid mode, 0.0-10.0 | |
| full_text_limit | No | Maximum full-text results in hybrid search, 1-1000 | |
| rrf_k | No | Reciprocal Rank Fusion parameter, 1-100 | |
| search_strategy | No | Advanced search strategy (e.g., "hyde", "rag_fusion") | |
| include_web_search | No | Include web search results | |
| task_prompt_override | No | Custom system prompt overriding the default RAG task prompt | |
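Client code may want to check arguments against the documented numeric ranges before invoking the tool. A minimal sketch follows; the range values come from the parameter descriptions above, while the helper function itself is hypothetical and not part of the tool's API.

```python
# Documented (min, max) ranges from the parameter descriptions above
RANGES = {
    "temperature": (0.0, 1.0),
    "limit": (1, 100),
    "semantic_weight": (0.0, 10.0),
    "full_text_weight": (0.0, 10.0),
    "full_text_limit": (1, 1000),
    "rrf_k": (1, 100),
}

def validate_rag_params(**params):
    """Raise ValueError for any known parameter outside its documented range."""
    for name, value in params.items():
        if name in RANGES:
            lo, hi = RANGES[name]
            if not (lo <= value <= hi):
                raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
    return params
```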