# Delia

A Model Context Protocol (MCP) server that cultivates your local LLM garden. Plant a seed, let Delia pick the right vine, and harvest a fresh melon.

*Delia - from Greek Δηλία, "from Delos" (the sacred island). Also, she grows watermelons.*

## Features

- **Smart Vine Selection**: Routes seeds to the right vine - quick (7B), coder (14B+), moe (30B+), or thinking
- **Multi-Garden Support**: Ollama, llama.cpp, and Gemini gardens with automatic failover
- **Context-Aware Routing**: Handles large seeds with appropriate context windows
- **Circuit Breaker**: Drought protection with graceful recovery
- **Parallel Processing**: Tends multiple seeds simultaneously
- **Authentication**: Optional greenhouse access control
- **Usage Tracking**: Per-gardener quotas and harvest monitoring
- **Dashboard**: Real-time garden status with watermelon-themed activity feed

## Requirements

### Hardware

| Component | Minimum | Recommended | Large Models |
|-----------|---------|-------------|--------------|
| GPU | 4GB VRAM | 12GB VRAM | 24GB+ VRAM |
| RAM | 8GB | 16GB | 32GB+ |
| Storage | 10GB | 30GB | 50GB+ |

### Software

- Python 3.11+
- [uv](https://docs.astral.sh/uv/) package manager
- One or more backends:
  - [Ollama](https://ollama.ai) (recommended)
  - [llama.cpp](https://github.com/ggerganov/llama.cpp)
  - Google Gemini API (optional cloud fallback)

## Quick Start

```bash
# Clone and install
git clone https://github.com/zbrdc/delia.git
cd delia
uv sync

# Pull models (examples - choose based on your hardware)
ollama pull qwen3:14b            # General purpose
ollama pull qwen2.5-coder:14b    # Code specialized
ollama pull qwen3:30b-a3b        # Complex reasoning

# Run server
uv run python mcp_server.py
```

## Integration

Delia works with AI coding assistants via MCP. Choose your tool:

### VS Code / GitHub Copilot

Add to `~/.config/Code/User/mcp.json`:

```json
{
  "servers": {
    "delia": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/delia", "python", "mcp_server.py"],
      "type": "stdio"
    }
  }
}
```

Reload VS Code to activate.

### Claude Code

Create `~/.claude/mcp.json`:

```json
{
  "mcpServers": {
    "delia": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/delia", "python", "mcp_server.py"]
    }
  }
}
```

Then run `claude` and use `@delia` to delegate tasks.
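The stdio configurations above all launch the same command, so you can sanity-check the server outside of any editor. The snippet below is a minimal sketch using the MCP Python SDK client (adjust `/path/to/delia` to your checkout); it connects over stdio and lists the tools Delia exposes:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Same command the editor configs above use to launch Delia over stdio.
    params = StdioServerParameters(
        command="uv",
        args=["run", "--directory", "/path/to/delia", "python", "mcp_server.py"],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            # Expect names like delegate, think, batch, health, models, ...
            print("Delia tools:", [tool.name for tool in tools.tools])


asyncio.run(main())
```

If this prints the tool list, the server itself is fine and any remaining issues are in the client configuration.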
### Gemini CLI

**Option 1: HTTP Mode (Recommended)**

```bash
# Start server
uv run python mcp_server.py --transport sse --port 8200
```

Add to `~/.gemini/settings.json`:

```json
{
  "mcpServers": {
    "delia": {
      "url": "http://localhost:8200/sse",
      "transport": "sse"
    }
  }
}
```

**Option 2: STDIO Mode**

Add to `~/.gemini/settings.json`:

```json
{
  "mcpServers": {
    "delia": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/delia", "python", "mcp_server.py"]
    }
  }
}
```

### GitHub Copilot CLI

Create `~/.copilot-cli/mcp.json`:

```json
{
  "servers": {
    "delia": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/delia", "python", "mcp_server.py"]
    }
  }
}
```

## Configuration

### Backend Configuration

Edit `settings.json` in the project root:

```json
{
  "backends": [
    {
      "id": "ollama-local",
      "name": "Ollama Local",
      "provider": "ollama",
      "type": "local",
      "url": "http://localhost:11434",
      "enabled": true,
      "priority": 1,
      "models": {
        "quick": "qwen3:14b",
        "coder": "qwen2.5-coder:14b",
        "moe": "qwen3:30b-a3b",
        "thinking": "deepseek-r1:14b"
      }
    }
  ],
  "routing": {
    "prefer_local": true,
    "fallback_enabled": true
  }
}
```

### Gemini Cloud Backend (Optional)

Add Gemini as a cloud fallback:

```bash
# Install dependency
uv add google-generativeai

# Set API key
export GEMINI_API_KEY="your-key-from-aistudio.google.com"
```

Add to `settings.json`:

```json
{
  "id": "gemini-cloud",
  "name": "Gemini Cloud",
  "provider": "gemini",
  "type": "remote",
  "url": "https://generativelanguage.googleapis.com",
  "enabled": true,
  "priority": 10,
  "models": {
    "quick": "gemini-2.0-flash",
    "coder": "gemini-2.0-flash",
    "moe": "gemini-2.0-flash"
  }
}
```

### Authentication (Optional)

For HTTP mode with multiple users:

```bash
# Quick setup
python setup_auth.py

# Or manually
export DELIA_AUTH_ENABLED=true
export DELIA_JWT_SECRET="your-secure-secret"
```

Supports username/password and Microsoft 365 OAuth.

## Transport Modes

```bash
# STDIO (default) - for VS Code, Claude Code, Copilot CLI
uv run python mcp_server.py

# HTTP/SSE - for Gemini CLI, web clients, remote access
uv run python mcp_server.py --transport sse --port 8200

# View all options
uv run python mcp_server.py --help
```

## Tools

Delia provides these MCP tools:

| Tool | Description |
|------|-------------|
| `delegate` | Execute tasks with automatic model selection |
| `think` | Extended reasoning for complex problems |
| `batch` | Process multiple tasks in parallel |
| `health` | Check backend status and statistics |
| `models` | List available models and tiers |
| `switch_backend` | Switch between backends at runtime |
| `switch_model` | Change model for a tier |
| `get_model_info` | Get model specifications |

## Vine Selection

Delia picks the right vine for every seed:

| Vine | Size | Best For |
|------|------|----------|
| Quick | 7B-14B | Summaries, simple questions |
| Coder | 14B-30B | Generation, review, debugging |
| MoE | 30B+ | Architecture, critique, analysis |
| Thinking | Specialized | Extended reasoning, research |

Override with hints in your prompt: "use the large model" or "quick answer".
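Conceptually, an override just means an explicit phrase in the prompt wins over the size-based default. The toy sketch below is an illustration of that idea only - it is not Delia's routing code, and the `coder` default it falls back to is an assumption:

```python
# Toy sketch of hint-based vine selection - illustration only, not Delia's
# actual routing logic.
HINT_PHRASES = {
    "use the large model": "moe",   # hint phrases quoted in this README
    "quick answer": "quick",
}


def pick_vine(prompt: str, default: str = "coder") -> str:
    """Return a tier name, letting an explicit prompt hint override the default."""
    lowered = prompt.lower()
    for phrase, tier in HINT_PHRASES.items():
        if phrase in lowered:
            return tier
    return default


print(pick_vine("Quick answer: what port does Ollama use?"))      # quick
print(pick_vine("Use the large model to critique this design"))   # moe
print(pick_vine("Refactor this function"))                        # coder (assumed default)
```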
## Troubleshooting

### Server won't start

```bash
# Check Ollama is running
curl http://localhost:11434/api/tags

# Test server import
uv run python -c "import mcp_server; print('OK')"
```

### MCP not connecting

- Verify the path in your client config points to the correct directory
- Reload VS Code / restart Claude Code
- Check logs: `~/.cache/delia/live_logs.json`

### "Unknown" responses

- Backend not running or unreachable
- Check `settings.json` configuration
- Run `curl http://localhost:11434/api/tags` to confirm Ollama responds

### Slow responses

- Try smaller models
- Check system resources (`nvidia-smi`, `htop`)
- Reduce context size in `settings.json`

## Performance

Typical harvest times (modern hardware):

- Quick vine: 2-5 seconds
- Coder vine: 5-15 seconds
- MoE/Thinking vines: 30-60 seconds

## License

BSD 3-Clause

## Acknowledgments

- [Ollama](https://ollama.ai) - Local LLM runtime
- [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk) - Protocol implementation
- [Qwen](https://qwenlm.github.io/) - Base models
