
delegate

Execute AI tasks on local or remote GPUs with intelligent model selection. Routes code review, generation, and analysis tasks to the optimal backend based on content size, task type, and GPU availability, keeping processing private.

Instructions

Execute a task on a local or remote GPU with intelligent 3-tier model selection. Routes to the optimal backend based on content size, task type, and GPU availability.

WHEN TO USE:

  • "locally", "on my GPU", "without API", "privately" → Use this tool

  • Code review, generation, analysis tasks → Use this tool

  • Any task you want processed on local hardware → Use this tool

Args:

  task: Task type; determines the model tier:
    - "quick" or "summarize" → quick tier (fast 14B model)
    - "generate", "review", "analyze" → coder tier (code-optimized 14B)
    - "plan", "critique" → moe tier (deep reasoning, 30B+)
  content: The prompt or content to process (required)
  file: Optional file path to include in context
  model: Force a specific tier - "quick" | "coder" | "moe" | "thinking" - or natural language: "7b", "14b", "30b", "small", "large", "coder model", "fast", "complex", "thinking"
  language: Language hint for better prompts - python | typescript | react | nextjs | rust | go
  context: Serena memory names to include (comma-separated: "architecture,decisions")
  symbols: Code symbols to focus on (comma-separated: "Foo,Bar/calculate")
  include_references: True if content includes symbol usages from elsewhere
  backend_type: Force backend type - "local" | "remote" (default: auto-select)
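
As a rough illustration, the task-to-tier mapping above could be written as a simple lookup. This is a minimal sketch with hypothetical names, not the server's actual implementation:

    # Hypothetical sketch of the task -> tier mapping described above.
    TASK_TIERS = {
        "quick": "quick", "summarize": "quick",                       # fast 14B model
        "generate": "coder", "review": "coder", "analyze": "coder",   # code-optimized 14B
        "plan": "moe", "critique": "moe",                             # deep reasoning 30B+
    }

    def resolve_tier(task: str, model: str | None = None) -> str:
        # An explicit `model` argument overrides the task-based default;
        # the fallback tier for unknown tasks is an assumption here.
        return model or TASK_TIERS.get(task, "coder")

For example, resolve_tier("review") returns "coder", while resolve_tier("plan", model="thinking") returns "thinking".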

ROUTING LOGIC:

  1. Content > 32K tokens → Uses backend with largest context window

  2. Prefer local GPUs (lower latency) unless unavailable

  3. Falls back to remote if local circuit breaker is open

  4. Load balances across available backends based on priority weights
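
A minimal sketch of the routing rules above, assuming a hypothetical Backend record with context-window, locality, weight, and circuit-breaker fields (not the server's actual code):

    import random
    from dataclasses import dataclass

    @dataclass
    class Backend:
        name: str
        is_local: bool
        context_window: int         # tokens
        priority_weight: float
        circuit_open: bool = False  # True when recent failures tripped the breaker

    def pick_backend(backends: list[Backend], content_tokens: int) -> Backend:
        healthy = [b for b in backends if not b.circuit_open]
        if not healthy:
            raise RuntimeError("no healthy backends")
        # 1. Oversized content goes to the largest available context window.
        if content_tokens > 32_000:
            return max(healthy, key=lambda b: b.context_window)
        # 2-3. Prefer local GPUs; fall back to remote if none are healthy.
        pool = [b for b in healthy if b.is_local] or healthy
        # 4. Weighted draw across the remaining pool.
        weights = [b.priority_weight for b in pool]
        return random.choices(pool, weights=weights, k=1)[0]

The weighted draw in the last step is one simple way to spread load by priority; the server may use a different balancing scheme.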

Returns: LLM response with metadata footer showing model, tokens, time, backend

Examples:

  delegate(task="review", content="", language="python")
  delegate(task="generate", content="Write a REST API", backend_type="local")
  delegate(task="plan", content="Design caching strategy", model="moe")
  delegate(task="analyze", content="Debug this error", model="14b")
  delegate(task="quick", content="Summarize this article", model="fast")
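
Because delegate is exposed as an MCP tool, the calls above map onto a standard tool invocation from any MCP client. The sketch below uses the Python MCP SDK; the launch command ("uvx delia") is an assumption and should be adjusted to your installation:

    import asyncio
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    async def main():
        # Assumed launch command for the server; adjust to your setup.
        params = StdioServerParameters(command="uvx", args=["delia"])
        async with stdio_client(params) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                result = await session.call_tool(
                    "delegate",
                    {
                        "task": "review",
                        "content": "def add(a, b): return a - b",
                        "language": "python",
                    },
                )
                print(result.content)

    asyncio.run(main())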

Input Schema

Name                Required  Description                                             Default
task                Yes       Task type; determines the model tier                    -
content             Yes       The prompt or content to process                        -
file                No        File path to include in context                         -
model               No        Force a specific tier or model size                     -
language            No        Language hint for better prompts                        -
context             No        Serena memory names to include (comma-separated)        -
symbols             No        Code symbols to focus on (comma-separated)              -
include_references  No        Whether content includes symbol usages from elsewhere   -
backend_type        No        Force backend type ("local" | "remote")                 auto-select

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zbrdc/delia'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.