REGISTRY_BUNDLING_STRATEGY.md•23.4 KiB

# Registry Bundling Strategy: Collections, Discovery, and Distribution

**Date:** 2026-01-28

**Version:** 0.2.0

**Purpose:** Design framework for bundling tools, directives, and knowledge into shareable collections

---

## Executive Summary

This document outlines how Lilux/RYE lets users **bundle and share collections** of directives, tools, and knowledge, similar to GitHub but for AI-native content.

Users can:

1. **Create collections** - Bundle any mix of tools, directives, and knowledge
2. **Host collections** - On a registry or in a personal git space
3. **Discover collections** - Via registry RAG + metadata search
4. **Install collections** - Into their project `.ai/` directories
5. **Contribute to collections** - Fork, extend, and merge back
6. **Mix sources** - Combine core RYE with domain-specific collections

---

## Part 1: Collection Architecture

### Collection Definition

A **collection** is a versioned bundle of related content:

```toml
# .ai/collections/collection.toml
[metadata]
name = "data-processing"
version = "1.0.0"
description = "Tools and directives for data processing workflows"
author = "team@example.com"
license = "MIT"

# Git repository where this collection lives
source = "https://github.com/example/data-processing-collection"
registry = "rye"  # Which registry this publishes to

# RAG configuration
vector_config = "default"  # Which vector store to use for search

# Dependencies: other collections
[dependencies]
core = ">=0.1.0"      # Depends on core RYE
ml-utils = ">=1.0.0"  # Depends on another collection

# Content manifest
[content]
directives = [
    "directives/process-csv.md",
    "directives/validate-data.md",
    "directives/transform-pipeline.md",
]
tools = [
    "tools/csv-parser.py",
    "tools/validator.py",
    "tools/transformer.py",
]
knowledge = [
    "knowledge/patterns/batch-processing.md",
    "knowledge/patterns/error-handling.md",
]

# Tags for discovery
[tags]
categories = ["data", "processing", "automation"]
skill-level = ["intermediate", "advanced"]
domains = ["data-science", "ml-ops"]
```

### Collection Directory Structure

```
collections/data-processing/
├── collection.toml              # Metadata + manifest
├── README.md                    # Human-readable guide
├── directives/
│   ├── process-csv.md
│   ├── validate-data.md
│   └── transform-pipeline.md
├── tools/
│   ├── csv-parser.py
│   ├── validator.py
│   └── transformer.py
├── knowledge/
│   ├── patterns/
│   │   ├── batch-processing.md
│   │   └── error-handling.md
│   └── concepts/
│       └── data-validation.md
├── tests/
│   ├── test_directives.py
│   └── test_tools.py
└── examples/
    ├── example1.md
    └── example2.md
```

---

## Part 2: Registry Architecture

### Three-Tier Registry System

```
┌──────────────────────────────────────────────────────────────┐
│                      REGISTRY ECOSYSTEM                      │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  Tier 1: CORE REGISTRY (Official)                            │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ • RYE core directives, tools, knowledge                │  │
│  │ • Official collections (curated)                       │  │
│  │ • Vector embeddings (primary index)                    │  │
│  │ • RAG index (primary search)                           │  │
│  │ Repository: github.com/leolilley/rye                   │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  Tier 2: COMMUNITY REGISTRIES (Secondary)                    │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ • Domain-specific collections                          │  │
│  │ • Community-contributed packs                          │  │
│  │ • Optional vector embeddings                           │  │
│  │ • Can have own RAG indices                             │  │
│  │ Example URLs:                                          │  │
│  │   - github.com/org/ml-collection                       │  │
│  │   - github.com/user/personal-tools                     │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  Tier 3: LOCAL SPACE (Personal)                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ • User's own collections + mixes                       │  │
│  │ • Git-backed or local-only                             │  │
│  │ • Personal vector index (optional)                     │  │
│  │ Location: ~/.local/share/rye/collections/              │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
└──────────────────────────────────────────────────────────────┘
```

### Registry URL Conventions

```
# Official Core Registry
rye://core                            # Implicit: RYE core content
rye://core@0.1.0                      # Pinned version

# Community Collection (GitHub)
github://org/collection-name
github://org/collection-name@1.0.0

# Direct HTTPS
https://github.com/org/collection-name

# Local (user space)
local://collection-name
file:///path/to/collection
```

---

## Part 3: Vector Store & RAG Configuration

### Flexible Vector Config

Each registry/collection can have its own vector configuration:

```yaml
# lilux/config/vector_config.yaml
registries:
  # Core RYE registry
  rye:core:
    name: "RYE Core"
    embedding_model: "sentence-transformers/all-MiniLM-L6-v2"
    backend: "local"  # or: openai, huggingface, pinecone
    cache_dir: "~/.local/share/rye/cache/embeddings/rye-core"
    vector_db_path: "~/.local/share/rye/cache/vector-stores/rye-core"
    description: "Official RYE core content index"
    # Fallback to BM25 if vector search fails
    fallback_search: "bm25"
    # Metadata filtering
    filters:
      - field: "category"
        name: "Content Category"
      - field: "skill_level"
        name: "Skill Level"

  # Community ML registry
  ml-community:
    name: "ML Community Collection"
    embedding_model: "sentence-transformers/all-MiniLM-L6-v2"
    backend: "openai"
    api_key_env: "OPENAI_API_KEY"
    cache_dir: "~/.local/share/rye/cache/embeddings/ml-community"
    source_repo: "https://github.com/ml-community/collection"

  # User's personal space
  local:personal:
    name: "Personal Collections"
    embedding_model: "sentence-transformers/all-MiniLM-L6-v2"
    backend: "local"
    cache_dir: "~/.local/share/rye/cache/embeddings/personal"
    local_only: true
```

### Registry Discovery via RAG

**Search architecture:**

```
User Query
    ↓
┌─────────────────────────────────────────┐
│  Multi-Registry Search Pipeline         │
├─────────────────────────────────────────┤
│                                         │
│  1. Query tokenization & embedding      │
│  2. Query-specific registry selection   │
│     (which registries to search)        │
│  3. Vector search in each registry      │
│     (parallel queries)                  │
│  4. Metadata filtering                  │
│     (category, skill-level, etc.)       │
│  5. Relevance ranking                   │
│  6. Result aggregation                  │
│                                         │
└─────────────────────────────────────────┘
    ↓
Results (merged, ranked)
```

**Search examples:**

```
# Search across all enabled registries
search "data processing"
→ Results from: rye:core, ml-community, personal

# Search specific registry
search "data processing" from:ml-community
→ Results from: ml-community only

# Filter by metadata
search "data processing" skill-level:advanced
→ Results matching skill level

# Complex query
search "batch processing patterns" category:data domain:ml-ops
→ Filtered results from all registries
```

---

## Part 4: Collection Publishing Workflow

### Step 1: Create Collection Locally

```bash
# Create collection structure
mkdir -p ~/.local/share/rye/collections/my-data-tools
cd ~/.local/share/rye/collections/my-data-tools

# Create collection.toml
cat > collection.toml << 'EOF'
[metadata]
name = "my-data-tools"
version = "0.1.0"
description = "My personal data processing tools"
author = "me@example.com"
license = "MIT"
registry = "local"

[content]
directives = ["directives/process.md"]
tools = ["tools/parser.py"]
knowledge = ["knowledge/patterns.md"]
EOF

# Create content directories
mkdir -p directives tools knowledge
```

### Step 2: Develop & Test

```bash
# Tools, directives, and knowledge work normally in user space
# They're automatically discoverable via local registry

# Test via execute tool
lilux execute action run directive my-data-tools/process
```

### Step 3: Version & Publish

**Option A: Publish to GitHub (Community Registry)**

```bash
# Initialize git repo
git init
git add .
git commit -m "Initial version 0.1.0"
git tag v0.1.0

# Push to GitHub
git remote add origin https://github.com/username/my-data-tools
git push -u origin main
git push origin v0.1.0

# Now discoverable as:
#   github://username/my-data-tools
#   github://username/my-data-tools@0.1.0
```

**Option B: Publish to Official Registry**

```bash
# Via CLI (for curated/official content)
lilux publish collection --name my-data-tools --registry rye

# Requires:
# - Passing content validation
# - Proper metadata
# - Tests passing
# - Admin approval (for curated collections)
```

### Step 4: Discovery & Installation

```bash
# Search for collections
lilux search "data processing"

# Get collection info
lilux info github://username/my-data-tools

# Install collection into project
cd /path/to/my-project
lilux install github://username/my-data-tools

# What happens:
# 1. Clone/download collection
# 2. Verify collection.toml
# 3. Install content into .ai/directives, .ai/tools, .ai/knowledge
# 4. Update local registry metadata
# 5. (Optional) Download + index embeddings
```

---

## Part 5: Collection Dependency Management

### Dependency Resolution

Collections can depend on other collections:

```toml
# collection.toml
[dependencies]
core = ">=0.1.0"                      # RYE core
ml-utils = "1.0.0"                    # Pinned version
"github.com/user/helpers" = "^0.5.0"  # Semver range
```

### Installation with Dependencies

```bash
# Install my-ml-app (depends on ml-utils, core, etc.)
lilux install github://username/my-ml-app

# Automatic resolution:
# 1. Fetch collection metadata
# 2. Resolve dependencies recursively
# 3. Check version compatibility
# 4. Install in dependency order
# 5. Verify all content loads

# Result structure:
# ~/.local/share/rye/collections/
# ├── rye-core/              # Auto-installed
# ├── github-user-helpers/   # Auto-installed
# ├── github-user-ml-app/    # Explicitly requested
# └── my-personal-tools/     # Previously installed
```

### Conflict Resolution

```
# Two collections require different versions of a tool
# Solution: namespace isolation, copy-on-write
~/.local/share/rye/tools/
├── core/            # From rye:core
├── helpers_v0.5/    # From ml-utils
├── helpers_v1.0/    # From another collection
└── my-tools/        # Personal

# Directives can explicitly import:
import: "helpers_v1.0/utility.py"
```

---

## Part 6: Collection Lifecycle & Versioning

### Semantic Versioning

```
Collections follow semver: MAJOR.MINOR.PATCH

0.1.0 = Initial release (alpha)
0.2.0 = Additions/improvements (pre-release)
1.0.0 = Stable, no breaking changes
1.1.0 = Additions
2.0.0 = Breaking changes

# Breaking changes:
- Renamed directives/tools
- Changed tool signatures
- Removed public content
- Major dependency version jumps
```

### Version Compatibility Matrix

```toml
# collection.toml
[compatibility]
min_lilux_version = "0.1.0"
min_rye_version = "0.1.0"
python_version = ">=3.9"

# Declare which tools/directives work with which versions
[[versioning.directives.process-csv.breaking_changes]]
version = "2.0.0"
description = "Changed input signature"

[versioning.tools."parser.py"]
deprecated_in = "1.5.0"
removed_in = "2.0.0"
replacement = "tools/parser-v2.py"
```

### Changelog Convention

```markdown
# Changelog

## [1.0.0] - 2026-02-01

### Added
- New directive: `transform-with-validation`
- Knowledge entry: `data-quality-patterns`

### Changed
- Improved `process-csv` performance
- Updated `validator.py` to support async

### Deprecated
- `process-old.md` (use `process-csv.md` instead)

### Fixed
- Bug in CSV parser with special characters

### Removed
- Legacy tool `old-parser.py`

## [0.2.0] - 2026-01-28
...
```

---

## Part 7: Mix & Match: Personal Collections

### Creating a Custom Mix

Users can create **meta-collections** that combine content from multiple sources:

```toml
# ~/.local/share/rye/collections/my-workspace/collection.toml
[metadata]
name = "my-workspace"
version = "1.0.0"
description = "My personal ML workspace"
type = "meta"  # This is a collection of collections

[includes]
# Include entire collections
core = "rye://core@>=0.1.0"
ml-tools = "github://user/ml-tools@^1.0.0"
data-processing = "github://org/data-processing@1.5.0"

# Include specific items from collections
selected-patterns = [
    "github://community/patterns/batch-processing.md",
    "github://community/patterns/async-chains.md",
]

# Include from local
personal-tools = "local://my-scripts"

[overlays]
# Create aliases for commonly used items
"batch" = "data-processing/directives/batch-process.md"
"validate" = "data-processing/directives/validate-data.md"
```

### Collection Composition

```
Install meta-collection "my-workspace"
    ↓
Resolve dependencies:
├── rye:core (fully)
├── ml-tools (fully)
├── data-processing (fully)
└── personal-tools (fully)
    ↓
Create overlay:
~/.local/share/rye/overlays/my-workspace/
├── batch → data-processing/directives/batch-process.md
├── validate → data-processing/directives/validate-data.md
└── ...
    ↓
Available in searches:
search "batch"             # Finds overlay
search "data processing"   # Finds original
```

---

## Part 8: Implementation: Tools & Directives

### New Directives

```xml
<directive name="collection/publish" version="1.0.0">
  <metadata>
    <description>Publish a collection to a registry</description>
    <inputs>
      <input name="collection_path" type="path" required="true">
        Path to collection directory
      </input>
      <input name="registry" type="string" default="local">
        Target registry (local, rye, github)
      </input>
      <input name="version" type="string" required="true">
        Version number (semver)
      </input>
      <input name="dry_run" type="bool" default="true">
        Validate without publishing
      </input>
    </inputs>
  </metadata>
  <process>
    <step name="validate">
    </step>
    <step name="version">
    </step>
    <step name="publish">
    </step>
  </process>
</directive>
```

```xml
<directive name="collection/install" version="1.0.0">
  <metadata>
    <description>Install a collection into project</description>
    <inputs>
      <input name="collection_url" type="string" required="true">
        Collection URL (github://user/name, rye://core, etc.)
      </input>
      <input name="version" type="string" default="latest">
        Version constraint (e.g., ">=1.0.0")
      </input>
      <input name="target" type="path" default=".ai">
        Where to install content
      </input>
    </inputs>
  </metadata>
</directive>
```

```xml
<directive name="collection/search" version="1.0.0">
  <metadata>
    <description>Search registries for collections</description>
    <inputs>
      <input name="query" type="string" required="true">
        Search query (with optional filters)
      </input>
      <input name="registries" type="list" default="[]">
        Which registries to search (all if empty)
      </input>
      <input name="limit" type="int" default="10">
        Results limit
      </input>
    </inputs>
  </metadata>
</directive>
```

### Registry Tools

```python
# .ai/tools/registry/collection_manager.py
"""Collection management utilities"""
# Collection and ValidationResult are not defined in this document;
# postponed annotation evaluation lets the interface be declared without them.
from __future__ import annotations

from pathlib import Path
from typing import List, Optional


class CollectionManager:
    """Manage local and remote collections"""

    def load_collection(self, path: Path) -> Collection:
        """Load collection.toml and validate"""
        ...

    def publish_to_registry(self, collection: Collection, registry: str):
        """Publish to registry (github, rye, etc.)"""
        ...

    def install_from_url(self, url: str, version: str, target: Path):
        """Install collection from URL"""
        ...

    def resolve_dependencies(self, collection: Collection) -> List[Collection]:
        """Recursively resolve dependencies"""
        ...

    def search_registries(self, query: str, registries: Optional[List[str]] = None):
        """Search across registries (all registries if None)"""
        ...

    def validate_collection(self, collection: Collection) -> ValidationResult:
        """Validate collection structure and content"""
        ...
```

---

## Part 9: User Experience: Quick Start

### For Collection Authors

```bash
# 1. Create collection locally
mkdir -p ~/.local/share/rye/collections/my-tools
cd ~/.local/share/rye/collections/my-tools

# 2. Create collection.toml
cat > collection.toml << 'EOF'
[metadata]
name = "my-tools"
version = "0.1.0"
description = "My useful tools"
author = "me@example.com"
registry = "local"

[content]
tools = ["tools/helper.py"]
directives = ["directives/task.md"]
EOF

# 3. Create structure
mkdir -p tools directives

# 4. Add content (already tested in user space)
# Just move files here

# 5. Publish to GitHub
git init
git add .
git commit -m "Initial version"
git tag v0.1.0
git remote add origin https://github.com/username/my-tools
git push -u origin main
git push origin v0.1.0

# 6. Now shareable!
# People install with:
# lilux install github://username/my-tools
```

### For Collection Users

```bash
# 1. Search for collections
lilux search "data processing"
# Shows: Collections from all registries

# 2. Get info
lilux info github://user/data-tools
# Shows: Description, versions, dependencies, etc.

# 3. Install
lilux install github://user/data-tools
# Auto-installs dependencies, indexes content

# 4. Use in projects
cd my-project
lilux search "process data"
# Finds tools from installed collections

# Use directly
lilux execute action run tool data-tools/process-csv

# Or in directives
<directive name="workflow" ...>
  <imports>
    <import from="data-tools/directives/validate.md" />
  </imports>
</directive>
```

---

## Part 10: Multi-Registry Expansion

### Future Support for Multiple Registries

The system is designed for easy expansion:

```yaml
# Eventually support multiple registries per environment
registries:
  - name: "rye:core"
    url: "https://registry.lilux.dev/core"
    priority: 100  # Primary registry
  - name: "huggingface"
    url: "https://huggingface.co/lilux-collections"
    priority: 50
  - name: "pinecone"
    url: "https://pinecone.io/lilux"
    priority: 40
  - name: "local"
    path: "~/.local/share/rye/collections"
    priority: 10  # Lowest priority (local fallback)

# Search respects priority order
# Install auto-pulls from first available source
# User can override: lilux install @huggingface://collection
```

---

## Part 11: Summary & Benefits

### What Users Can Do

1. **Create** - Bundle their tools, directives, knowledge
2. **Host** - On GitHub, personal registry, or local
3. **Share** - Via git URL or registry publish
4. **Discover** - Via RAG + metadata search across registries
5. **Install** - Into any project, with dependency resolution
6. **Mix** - Combine collections into personal workspaces
7. **Contribute** - To community collections via PRs

### Core Design Principles

1. **Git-first** - Everything is version controlled
2. **Decentralized** - Content can live anywhere
3. **Discoverable** - Vector search + metadata
4. **Composable** - Collections depend on collections
5. **Flexible** - One core registry now, multiple later
6. **Safe** - Validation, testing, versioning built-in

### Architecture Benefits

- **Scalable**: The registry system does not become a bottleneck
- **Extensible**: Easy to add new registries/sources
- **Independent**: Core kernel stays stable while collections evolve
- **Developer-friendly**: Familiar git workflows
- **AI-native**: RAG search, semantic discovery
- **Community-driven**: Anyone can publish collections

---

## Implementation Roadmap

### Phase 1: Foundation (0.2.0)

- [ ] Collection metadata format (TOML)
- [ ] Local collection discovery
- [ ] Collection installation from git URLs
- [ ] Basic dependency resolution

### Phase 2: Registry Integration (0.3.0)

- [ ] Vector indexing for collections
- [ ] RAG-based search across registries
- [ ] Official registry (rye:core)
- [ ] Publish directive

### Phase 3: Community (0.4.0)

- [ ] Community collection curation
- [ ] Registry UI/web interface
- [ ] Collection ratings/reviews
- [ ] Multiple registry support

### Phase 4: Advanced (1.0.0)

- [ ] Collection composition
- [ ] Advanced dependency resolution
- [ ] Conflict detection
- [ ] Migration tooling

---

## Conclusion

**Collections as distribution units** enable:

- **Creators**: Share curated, versioned content easily
- **Users**: Discover and mix content from multiple sources
- **Community**: Build an ecosystem around Lilux/RYE
- **Scale**: Decentralized, git-backed, searchable

**One command to install everything you need:**

```bash
lilux install github://user/my-complete-system
```

---

_Document Status: Design for Implementation_

_Last Updated: 2026-01-28_

_Next: Implement collection metadata + discovery system_
