CTX: Context as Code (CaC) tool


README.md•8.47 KiB

# Feature Request: Configurable RAG Knowledge Stores with Custom Tools

## Summary

Refactor the RAG module to support multiple named storage configurations (servers + collections) and enable defining custom RAG tools via `context.yaml`. This improves AI understanding of tool purpose and allows project-specific knowledge stores with meaningful names and descriptions.

## Problem Statement

### Current Issues

1. **Generic Tool Names**: Current tools (`rag-search`, `rag-store`) don't convey purpose to AI
2. **Single Storage**: Only one Qdrant endpoint/collection per project
3. **No Customization**: Tool descriptions are hardcoded, not project-specific
4. **Tight Coupling**: Storage configuration mixed with feature toggle
5. **CLI Limitations**: Commands don't support multi-collection operations

### User Pain Points

- AI doesn't understand when/why to use RAG tools
- Can't use multiple knowledge bases (e.g., docs + architecture + team conventions)
- Tool names like `rag-search` are meaningless to AI agents
- No way to separate different knowledge domains

---

## Proposed Solution

### New Configuration Schema

```yaml
# context.yaml
rag:
  # 1. Define RAG servers (connection settings)
  servers:
    default:
      driver: qdrant
      endpoint_url: ${RAG_QDRANT_ENDPOINT:-http://localhost:6333}
      api_key: ${RAG_QDRANT_API_KEY:-}  # optional
    cloud:
      driver: qdrant
      endpoint_url: https://my-cluster.qdrant.io
      api_key: ${QDRANT_CLOUD_API_KEY}

  # 2. Define named collections (reference servers)
  collections:
    project-docs:
      server: default                # References server above
      collection: intruforce_docs    # Qdrant collection name
      description: "Project documentation and guides"
      # Optional per-collection overrides
      embeddings_dimension: 1536
      embeddings_distance: Cosine
      transformer:
        chunk_size: 1000
        overlap: 200
    architecture:
      server: default
      collection: intruforce_architecture
      description: "Architecture decisions and patterns"
      transformer:
        chunk_size: 2000  # Larger chunks for architecture docs
        overlap: 400
    shared-knowledge:
      server: cloud
      collection: team_knowledge
      description: "Shared team knowledge base"

  # 3. Global vectorizer settings (shared by all collections)
  vectorizer:
    platform: openai
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}

  # 4. Default transformer settings (can be overridden per collection)
  transformer:
    chunk_size: 1000
    overlap: 200

# 5. Define custom RAG tools
tools:
  - id: project-docs-search
    name: find-docs  # Optional: custom tool name (defaults to id)
    type: rag
    description: "Search project documentation for guides, tutorials, and how-to articles. Use this when you need to understand how to use project features."
    collection: project-docs  # Reference to named collection
    operations: [ search ]    # Available: search, store

  - id: project-docs-store
    type: rag
    description: "Store project documentation and insights for future retrieval."
    collection: project-docs
    operations: [ store ]

  - id: architecture-search
    name: arch-search  # Tool will be registered as "arch-search"
    type: rag
    description: "Search architecture decisions, patterns, and design rationale. Use when understanding why code is structured a certain way."
    collection: architecture
    operations: [ search ]

  - id: team-knowledge-search
    type: rag
    description: "Search shared team knowledge including coding conventions and best practices."
    collection: shared-knowledge
    operations: [ search ]

  - id: team-knowledge-store
    type: rag
    description: "Store shared team knowledge for the whole team to access."
    collection: shared-knowledge
    operations: [ store ]
```

### Tool Naming Convention

For each collection with operations `[search, store]`, generate tools:

- `{collection-id}-search` - Search tool
- `{collection-id}-store` - Store tool

### Generated Tool Schemas

**Search Tool Schema:**

```json
{
  "type": "object",
  "properties": {
    "query": {
      "type": "string",
      "description": "Search query in natural language"
    },
    "type": {
      "type": "string",
      "description": "Filter by type: architecture, api, testing, convention, tutorial, reference, general"
    },
    "sourcePath": {
      "type": "string",
      "description": "Filter by source path (exact or prefix match)"
    },
    "limit": {
      "type": "integer",
      "description": "Maximum number of results to return",
      "default": 10,
      "minimum": 1,
      "maximum": 50
    }
  },
  "required": [ "query" ]
}
```

**Store Tool Schema:**

```json
{
  "type": "object",
  "properties": {
    "content": {
      "type": "string",
      "description": "Content to store in the knowledge base"
    },
    "type": {
      "type": "string",
      "description": "Type: architecture, api, testing, convention, tutorial, reference, general",
      "default": "general"
    },
    "sourcePath": {
      "type": "string",
      "description": "Source path (e.g., 'src/Auth/Service.php')"
    },
    "tags": {
      "type": "string",
      "description": "Tags (comma-separated)"
    }
  },
  "required": [ "content" ]
}
```

---

## CLI Commands Update

All RAG commands will support `--collection` option:

```bash
# Index into specific collection
ctx rag:index docs --collection=project-docs

# Index into all collections (default behavior)
ctx rag:index docs

# Status for specific collection
ctx rag:status --collection=architecture

# Status for all collections
ctx rag:status

# Clear specific collection
ctx rag:clear --collection=project-docs

# Initialize specific collection
ctx rag:init --collection=shared-knowledge

# Reindex specific collection
ctx rag:reindex docs --collection=project-docs
```

---

## Backward Compatibility

### Legacy Format Support

The old format will continue to work and be internally converted:

```yaml
# LEGACY FORMAT (still supported)
rag:
  enabled: true
  store:
    driver: qdrant
    qdrant:
      endpoint_url: http://localhost:6333
      collection: ctx_knowledge

# Internally converts to:
# servers: { default: { driver: qdrant, endpoint_url: ... } }
# collections: { default: { server: default, collection: ctx_knowledge } }
```

### Detection Logic

```php
// In RagParserPlugin
if (isset($data['store'])) {
    // Legacy format - convert to new structure
    return $this->convertLegacyFormat($data);
}

// New format with servers/collections
return $this->parseNewFormat($data);
```

### Default Tools for Legacy Config

When using legacy format without custom tools defined, automatically register:

- `rag-search` (pointing to `default` collection)
- `rag-store` (pointing to `default` collection)

---

## Benefits

| Aspect               | Before                 | After                              |
|----------------------|------------------------|------------------------------------|
| Tool naming          | `rag-search` (generic) | `project-docs-search` (meaningful) |
| AI understanding     | Low                    | High (custom descriptions)         |
| Storage flexibility  | 1 collection           | Unlimited collections              |
| Configuration        | Hardcoded              | Declarative YAML                   |
| Team sharing         | Difficult              | Easy (multiple servers)            |
| Knowledge separation | None                   | By domain/purpose                  |

---

## Implementation Stages

See individual stage files for detailed implementation:

1. **Stage 1**: Configuration Infrastructure (ServerConfig, CollectionConfig)
2. **Stage 2**: Store Registry & Factory
3. **Stage 3**: Service Layer Updates (collection-aware services)
4. **Stage 4**: RAG Tool Type Support in Tool Parser
5. **Stage 5**: Dynamic Tool Generation
6. **Stage 6**: CLI Commands Update
7. **Stage 7**: Integration & Testing

---

## Open Questions

1. **Default Collection**: When no `--collection` specified in CLI, operate on all or require explicit choice?
   - **Decision**: Operate on all collections by default, with clear output per collection

2. **Tool Validation**: Fail at startup if referenced collection doesn't exist?
   - **Decision**: Yes, fail early with clear error message

3. **Collection Inheritance**: Should collections inherit from a "default" template?
   - **Decision**: Collections inherit global `transformer` settings unless overridden
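
The tool validation and collection inheritance decisions can be illustrated with a short, language-neutral sketch. It is written in Python purely for illustration and is not the project's PHP implementation; the function names `resolve_collections` and `validate_tools`, and the shape of the parsed config dictionaries, are assumptions made for this example.

```python
def resolve_collections(rag: dict) -> dict:
    """Merge the global `transformer` defaults into every collection.

    Per-collection keys win over global ones, so a collection that sets
    only `chunk_size` still inherits the global `overlap`.
    """
    defaults = rag.get("transformer", {})
    resolved = {}
    for name, cfg in rag.get("collections", {}).items():
        merged = {**defaults, **cfg.get("transformer", {})}
        resolved[name] = {**cfg, "transformer": merged}
    return resolved


def validate_tools(tools: list, collections: dict) -> None:
    """Fail early when a `type: rag` tool references an unknown collection."""
    for tool in tools:
        if tool.get("type") != "rag":
            continue  # other tool types are outside the scope of this sketch
        target = tool.get("collection")
        if target not in collections:
            known = ", ".join(sorted(collections)) or "(none)"
            raise ValueError(
                f"Tool '{tool.get('id')}' references unknown collection "
                f"'{target}'. Known collections: {known}"
            )


if __name__ == "__main__":
    rag = {
        "transformer": {"chunk_size": 1000, "overlap": 200},
        "collections": {
            "project-docs": {"server": "default", "collection": "intruforce_docs"},
            "architecture": {
                "server": "default",
                "collection": "intruforce_architecture",
                "transformer": {"chunk_size": 2000, "overlap": 400},
            },
        },
    }
    tools = [
        {"id": "project-docs-search", "type": "rag",
         "collection": "project-docs", "operations": ["search"]},
    ]
    collections = resolve_collections(rag)
    validate_tools(tools, collections)  # raises ValueError on a bad reference
    print(collections["project-docs"]["transformer"])
```

Running both steps once at startup, before any tool is registered, keeps a misconfigured reference from surfacing later as a failed search or store call.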

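
The legacy conversion described under Backward Compatibility can be sketched in the same way. This is an illustrative Python outline rather than the actual `convertLegacyFormat` method; the helper names `convert_legacy_format` and `default_legacy_tools` are hypothetical, and dropping the `enabled` flag from the converted structure is an assumption, since the source only shows the resulting `servers` and `collections` keys.

```python
def convert_legacy_format(data: dict) -> dict:
    """Convert a legacy `rag` section into the servers/collections layout.

    A config without a `store` key is treated as already being in the new
    format and is returned unchanged.
    """
    if "store" not in data:
        return data

    store = data["store"]
    driver = store["driver"]
    # Driver settings live under a key named after the driver (e.g. `qdrant`).
    settings = dict(store.get(driver, {}))
    # The collection name moves out of the connection settings.
    collection = settings.pop("collection", None)

    return {
        "servers": {"default": {"driver": driver, **settings}},
        "collections": {
            "default": {"server": "default", "collection": collection},
        },
    }


def default_legacy_tools() -> list:
    """Tools registered when a legacy config defines no custom tools."""
    return [
        {"id": "rag-search", "type": "rag",
         "collection": "default", "operations": ["search"]},
        {"id": "rag-store", "type": "rag",
         "collection": "default", "operations": ["store"]},
    ]


if __name__ == "__main__":
    legacy = {
        "enabled": True,
        "store": {
            "driver": "qdrant",
            "qdrant": {
                "endpoint_url": "http://localhost:6333",
                "collection": "ctx_knowledge",
            },
        },
    }
    print(convert_legacy_format(legacy))
```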