swarm-mcp
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {"listChanged": false} |
| prompts | {"listChanged": false} |
| resources | {"subscribe": false, "listChanged": false} |
| experimental | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| run | Run a single Claude agent in a Docker container. Returns the agent's text output and metadata. Args: prompt: The task prompt for the agent. sandbox: Named sandbox spec (from ~/.claude/sandboxes/) or inline JSON. Overrides below are merged on top. network: Whether the container has network access (default: true; needed for API calls). tools: Comma-separated list of allowed Claude tools (default: Read,Write,Glob,Grep,Bash). mounts: JSON array of mount specs: [{"host_path": "...", "container_path": "...", "readonly": true}]. model: Claude model to use (default: sonnet). Options: haiku, sonnet, opus. timeout: Max execution time in seconds (default: 120). system_prompt: System prompt injected via --system-prompt (role, persona, instructions). claude_md: Project instructions written to workspace CLAUDE.md. output_schema: JSON schema string for structured output (--json-schema). mcps: JSON array of MCP server names to attach: ["database-mcp", "whatsapp"]. effort: Effort level: low, medium, high, max. max_budget: Explicit USD budget cap. env_vars: JSON object of environment variables: {"KEY": "value"}. input_files: JSON object of files to inject: {"/path": "content"}. memory: Docker memory limit (e.g. "2g"). cpus: Docker CPU limit (e.g. 2.0). gpu: Pass --gpus all to Docker for GPU access (default: false). Acquires the "gpu" resource pool (capacity 1). resources: JSON array of named resource pools to acquire before execution (e.g. '["gpu", "database"]'). Agents wait for all resources. Configure capacity via SWARM_RESOURCE_= env vars. input_type: Natural language type describing what the agent receives (e.g. "research notes", "[code-review]"). output_type: Natural language type describing what the agent must produce (e.g. "[mcp-server] with [test-suite]"). |
| par | Run multiple Claude agents in parallel. Each task can have its own config. Args: tasks: JSON array of task objects. Each supports all sandbox fields (prompt, model, tools, sandbox, system_prompt, claude_md, output_schema, mcps, effort, etc.). max_concurrency: Max agents running simultaneously (default: 5). |
| map | Apply a prompt template to each input in parallel. Use {input} as the placeholder. Args: prompt_template: Prompt template with {input} placeholder(s). inputs: JSON array of input strings: ["input1", "input2", ...]. sandbox: Named sandbox spec or inline JSON. network: Whether containers have network access (default: true; needed for API calls). tools: Comma-separated list of allowed Claude tools. model: Claude model to use (default: sonnet). timeout: Max execution time per agent in seconds (default: 120). max_concurrency: Max agents running simultaneously (default: 5). system_prompt: System prompt injected via --system-prompt. claude_md: Project instructions written to workspace CLAUDE.md. output_schema: JSON schema string for structured output. mcps: JSON array of MCP server names to attach. effort: Effort level: low, medium, high, max. |
| chain | Run agents sequentially as a pipeline. Each stage receives the prior stage's output as context. Args: stages: JSON array of stage objects. Each supports all sandbox fields (prompt, model, tools, sandbox, system_prompt, etc.). |
| reduce | Synthesise multiple results into one. Accepts plain strings or structured AgentResult objects (auto-extracts .text fields), so you can pipe par/map output directly without manual unwrapping. Args: results: JSON array of either plain strings ["text1", "text2"] or AgentResult objects [{"text": "...", ...}]. synthesis_prompt: Instructions for how to synthesise the results. sandbox: Named sandbox spec or inline JSON. network: Whether the container has network access (default: true; needed for API calls). tools: Comma-separated list of allowed Claude tools. model: Claude model to use (default: sonnet). timeout: Max execution time in seconds (default: 120). mcps: JSON array of MCP server names to attach to the reducer agent. system_prompt: System prompt for the reducer agent. |
| map_reduce | Map a prompt over inputs in parallel, then reduce the results into one, all in a single call. Fan-out then synthesise: map produces N results, reduce consumes them, no manual plumbing. Args: prompt_template: Prompt template with {input} placeholder(s). inputs: JSON array of input strings: ["input1", "input2", ...]. synthesis_prompt: Instructions for how to synthesise the map results. sandbox: Named sandbox spec or inline JSON (used for map agents). network: Whether containers have network access (default: true; needed for API calls). tools: Comma-separated list of allowed Claude tools for map agents. model: Claude model for map agents (default: sonnet). reduce_model: Claude model for the reduce agent (default: same as model). timeout: Max execution time per agent in seconds (default: 120). max_concurrency: Max map agents running simultaneously (default: 5). system_prompt: System prompt for map agents. reduce_system_prompt: System prompt for the reduce agent. output_schema: JSON schema for structured reduce output. mcps: JSON array of MCP server names for map agents. effort: Effort level for map agents. |
| unwrap | Unwrap an agent result ref: writes the full text to a file and returns the path. All combinators return refs (metadata without text). Use unwrap to extract the text when you need it. The text is written to a .md file alongside the result, so you can Read() it, Grep it, or pass it to other tools without bloating the MCP protocol. Args: ref: A ref string like "run_id/agent_id", or a JSON object with a "ref" field. |
| inspect | Inspect an agent's full execution state: partial output, stream log, files produced. Use after a timeout, crash, or unexpected result to understand what happened. Writes a human-readable debug report to output_dir/inspect.md. Args: ref: A ref string like "run_id/agent_id". |
| filter | Filter refs by type validation, keeping only results that match the declared type. Runs validate on each ref in parallel. Returns only refs with VALID verdict. This is the type-gated composition primitive: ensures only correct results flow downstream. Args: refs: JSON array of ref objects: [{"ref": "run_id/agent_id"}, ...]. declared_type: Type name or description to validate against. model: Model for the validator agents (default: sonnet). timeout: Timeout per validation (default: 120). |
| race | Run multiple approaches in parallel, return the first to succeed. All tasks start simultaneously. As soon as one completes without error, its ref is returned. Remaining tasks are abandoned (their containers are killed). Use for speculative execution or when multiple strategies might work. Args: tasks: JSON array of task objects (same format as par). max_concurrency: Max agents running simultaneously (default: 5). |
| retry | Run a single agent with automatic retries on failure. If declared_type is set, retries until the output validates as that type (not just until exit code 0). Each attempt receives the prior error as context. Args: prompt: The task prompt. max_attempts: Maximum number of attempts (default: 3). sandbox: Named sandbox spec or inline JSON. model: Claude model (default: sonnet). timeout: Timeout per attempt (default: 120). declared_type: If set, validates output and retries if not VALID. mcps: JSON array of MCP server names to attach. |
| beam | Sample N candidates in parallel, score each, commit to the top-1. The simplest search combinator: proposes Evaluator forms: Budget semantics: a hard cap on total proposer spend. If exceeded, the winner's search stamp records Anti-pattern: the Tree Search paper flags evaluator-as-expensive-as-proposer as a non-starter. This combinator hardcodes haiku for scoring; if you need a stronger evaluator, lift that logic into a governor instead. Args: prompt: Task prompt sent to every candidate agent. width: Number of parallel candidates (default: 3). evaluator: Scoring directive. Must start with Returns: JSON with |
| iterate | Iteratively refine an agent's output until an evaluator is satisfied. Runs the agent up to Evaluator forms (pass exactly one of Args: prompt: Base task description sent to the agent on iteration 1; on subsequent iterations it is augmented with a "Prior attempts" section summarising previous outputs, scores, and issues. target_type: Shorthand for Returns: JSON with |
| score_list | Score and rank a list of existing refs against an evaluator. This is the missing piece for "generate N candidates, then pick the best" pipelines where the N candidates already exist as refs from an upstream combinator ( The key infrastructure point: refs are resolved to their artifact text server-side, so callers do not need to pipe full artifact text through tool parameters. This clears the ~4KB param-size wall you would otherwise hit scoring 6+ medium-length artifacts through Evaluator forms are the same as Each ref is tagged with a Args: refs: JSON array of refs, either Returns: JSON with |
| hunt | Run multiple (fetch → map → score) hunt strategies in parallel and merge. Each strategy is an independent (fetch, plan, score) triple with its own seed prompt or explicit seeds, plan template, and rubric. Strategies run concurrently; a failure in one does not halt the others. All scored refs across all strategies are merged into a single globally-ranked leaderboard so you can see which strategy produced the highest-yield finding. This is the compound combinator that makes "run all the hunt framings at once" tractable: rather than sequentially trying one hunt shape at a time, you parallelise across framings (problem-driven, gap-in-field, broken-claims, tool-landscape, connection) and let the scoring sort them. Strategy dict fields: Args: strategies: JSON array of strategy dicts (or a Python list). max_parallel_strategies: Upper bound on strategies running at once (default: 5). Each strategy internally parallelises its map step. top_k_global: Size of the unified leaderboard across all strategies (default: 10). Returns: JSON with |
| guard | Check a monadic condition on a ref. Returns the ref if the guard passes, error if not. Use to enforce constraints before passing refs to downstream combinators. Args: ref: A ref string or JSON object. check: The guard to check, one of: "validated", "budget", "classification", "encrypted", "exists". value: Required for some checks, e.g. the type name for "validated", the classification level for "classification". |
| classify | Set the classification level on a ref. Controls which MCPs can access the data. Use for data sensitivity enforcement, e.g. mark original legal documents as 'confidential' (no WhatsApp MCP), mark synthetic outputs as 'public'. Args: ref: A ref string or JSON object. level: Classification level: public, internal, confidential, restricted. allowed_mcps: JSON array of MCP names allowed to access this ref. denied_mcps: JSON array of MCP names denied access. |
| encrypt | Encrypt a ref's text payload. Returns the ref with a key_id; only callers with the key can decrypt. The ref metadata (provenance, classification, etc.) stays visible; only the text content is encrypted. Pass the key_id to specific agents or features that should be able to read the content. Args: ref: A ref string like "run_id/agent_id", or a JSON object with a "ref" field. |
| decrypt | Decrypt an encrypted ref's text. Writes the plaintext to output.md and returns the path. You need the key_id that was returned when the ref was encrypted. Args: ref: A ref string like "run_id/agent_id", or a JSON object with a "ref" field. key_id: The key ID returned by the encrypt tool. |
| save_governor_spec | Register an LLM-governed governor for use in pipeline control flow. Governors are evaluated at trigger points (on_fail, on_success) to decide the continuation. The spec is a natural language description of what the governor should decide. The governor LLM returns one of: Each continuation also carries a free-form Reference a governor in a pipeline step: "on_fail": {"governor": "Failure"} "on_success": {"governor": "Validation"} Args: name: Unique governor name used to reference it from pipeline steps. spec: Natural language description telling the LLM what to decide. description: One-line summary shown in list_governor_specs. model: Claude model for evaluation (default: haiku). beam_width: Self-consistency beam width. When >1 the harness samples the governor N times in parallel and commits to the confidence-weighted majority decision. Losing candidates are preserved on the winner's |
| list_governor_specs | List all registered LLM-governed governors. Returns each governor's name, description, model, and a preview of its spec. |
| pipeline | Launch a pipeline in the background and return immediately. The pipeline runs asynchronously in a daemon thread. Use pipeline_status(run_id) to poll progress, and pipeline_kill(run_id) to stop it. The definition is a JSON object or a pipeline name (loaded from registered project pipelines/ directories or ~/.claude/pipelines/). Pipeline format: { "name": "optional-name", "sandbox": "optional-default-sandbox", "steps": [ {"id": "step-0", "prompt": "...", "model": "sonnet", "sandbox": "...", ...}, {"id": "test", "prompt": "Run tests", "tools": "Bash", "on_fail": "fix"}, {"id": "fix", "prompt": "Fix failing tests", "tools": "Read,Edit,Bash", "condition": "prev.error", "next": "test", "max_retries": 3} ] } Step fields: prompt (required), plus any sandbox fields (model, tools, system_prompt, etc.). Control flow: on_fail (step id to jump to on error, or {"governor": "name"} for LLM-governed recovery; see save_governor_spec), on_success ({"governor": "name"} for LLM-governed continuation), next (jump after success), condition ("prev.error" = only run if previous failed), max_retries, retry_if ({target_step: keyword}, which jumps if the output contains the keyword). Any unhandled failure terminates the pipeline with status="broken". Args: definition: Pipeline name (loaded from ~/.claude/pipelines/.json) or inline JSON definition. resume: Resume a previous run. Format: "run_id" or "run_id/step_id". Reuses the shared directory from the previous run. If step_id is given, skips to that step. If only run_id, resumes from the step that failed. |
| pipeline_status | Return the current status of a running or completed pipeline. Reads /tmp/swarm-mcp/<run_id>/pipeline-status.json and returns its contents. The status file is written after each step completes. Args: run_id: The pipeline run ID returned by the pipeline() tool. |
| pipeline_artifacts | List artifacts produced by a pipeline run. Without step_id: lists the /shared/ directory contents (inter-step files) plus a summary of each step's output directory. With step_id: lists that specific step's output directory in detail, including file sizes. Use unwrap(ref) or Read() to view file contents. Args: run_id: The pipeline run ID. step_id: Optional step ID to inspect. If omitted, lists shared/ and all steps. |
| pipeline_kill | Kill a running pipeline and all its Docker containers. Sets the pipeline's stop event (so the loop exits cleanly after the current step) and immediately kills all Docker containers associated with the run. Args: run_id: The pipeline run ID to kill. |
| list_pipelines | List recent pipeline runs and their current status. Scans /tmp/swarm-mcp/ for pipeline-status.json files and returns a summary of all known runs, sorted by last_updated descending. Also annotates which runs have live threads in the current process. |
| save_sandbox_spec | Save a reusable sandbox spec to ~/.claude/sandboxes/.json. Args: name: Name for the sandbox spec (e.g. "web-researcher", "code-reviewer"). spec: JSON object with sandbox fields: model, tools, mcps, system_prompt, claude_md, output_schema, effort, max_budget, mounts, workdir, input_files, network, memory, cpus, timeout, env_vars. |
| list_sandbox_specs | List all saved sandbox specs from ~/.claude/sandboxes/. |
| wrap | Wrap a file or directory into the swarm ref system. This is how you bring external objects INTO the monadic context. The wrapped file gets a ref that can be passed to any combinator. Args: path: Absolute path to a file or directory on the host. |
| wrap_project | Register a project directory's pipelines, sandboxes, and types with the swarm. Looks for pipelines/, sandboxes/, types/ subdirectories and adds them to the search paths. After wrapping, named resources from the project are discoverable by all swarm tools (pipeline, run, validate, etc.). Args: project_dir: Absolute path to a project root containing pipelines/, sandboxes/, and/or types/ directories. |
| list_type_registry | List all registered types from ~/.claude/types/. |
| get_type_definition | Get a type definition by name, optionally resolving [references] to other types. Args: name: Type name (e.g. "mcp-server", "tarball", "code-review"). resolve_refs: Whether to inline [referenced] types (default: true). |
| validate | Validate an artifact against a declared type. Runs a type-checker agent that inspects the artifact and reports VALID/PARTIAL/INVALID with per-criterion results. Use this after a pipeline step to verify the output matches expectations. If validation fails, you know which agent to blame and can retry. Args: artifact: Description of what to validate, e.g. the agent's output text, a file path, or a ref {"ref": "run_id/agent_id"}. declared_type: The type to validate against, either a type name (e.g. "mcp-server") or inline natural language description. sandbox: Named sandbox spec or inline JSON for the validator agent. model: Model for the validator (default: sonnet; needs to be good at analysis). timeout: Timeout for the validation agent. |
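The pipeline format described in the table above is compact, so a small sketch may help show how the control-flow fields fit together. The Python below rebuilds the example definition from the pipeline tool's description as a dict and runs a client-side sanity check before the definition would be serialised and passed as the `definition` argument. The `check_pipeline` helper is hypothetical and not part of swarm-mcp; it only assumes what the description states, namely that `prompt` is required and that string values of `on_fail` and `next` name step ids. The elided `"sandbox": "...", ...` fields of the first step are left out rather than guessed.

```python
import json

# The example definition from the pipeline tool's description, as a Python dict.
PIPELINE = {
    "name": "optional-name",
    "sandbox": "optional-default-sandbox",
    "steps": [
        {"id": "step-0", "prompt": "...", "model": "sonnet"},
        {"id": "test", "prompt": "Run tests", "tools": "Bash", "on_fail": "fix"},
        {
            "id": "fix",
            "prompt": "Fix failing tests",
            "tools": "Read,Edit,Bash",
            "condition": "prev.error",
            "next": "test",
            "max_retries": 3,
        },
    ],
}


def check_pipeline(definition):
    """Hypothetical client-side check; returns a list of problem strings."""
    problems = []
    steps = definition.get("steps", [])
    ids = {step.get("id") for step in steps}
    for index, step in enumerate(steps):
        label = step.get("id", f"#{index}")
        # The description marks prompt as the only required step field.
        if not step.get("prompt"):
            problems.append(f"step {label}: missing prompt")
        for field in ("on_fail", "next"):
            target = step.get(field)
            # on_fail may also be {"governor": "name"}; only strings are step ids.
            if isinstance(target, str) and target not in ids:
                problems.append(
                    f"step {label}: {field} points at unknown step {target!r}"
                )
    return problems


# Inline definitions are passed to the tool as JSON text.
definition_json = json.dumps(PIPELINE)
print(check_pipeline(PIPELINE))  # prints []
```

In this definition a failing `test` step jumps to `fix`, which runs only when the previous step failed and then hands control back to `test`, with `max_retries` set to 3 on the `fix` step. A governor reference such as `{"governor": "Failure"}` passes the check untouched, since the sketch has no way to know which governors are registered.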
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/stiege/swarm-mcp'
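For callers who would rather script the lookup than shell out to curl, the same request can be sketched in Python with only the standard library. The helper names `server_url` and `fetch_server` are illustrative, and the assumption that the endpoint answers with a JSON object is not confirmed by this page.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server, matching the curl example."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server record; assumes the endpoint returns a JSON object."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access; fetch_server performs the request.
print(server_url("stiege", "swarm-mcp"))
```

Calling `fetch_server("stiege", "swarm-mcp")` issues the same GET as the curl command and raises `urllib.error.URLError` or `urllib.error.HTTPError` if the request fails.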