Schema | swarm-mcp

swarm-mcp

Overview Schema Related Servers Score Discussions

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description	Default
No arguments

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": false }
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`experimental`	{}

Tools

Functions exposed to the LLM to take actions

Name	Description
runA	Run a single Claude agent in a Docker container. Returns the agent's text output and metadata. Args: prompt: The task prompt for the agent. sandbox: Named sandbox spec (from ~/.claude/sandboxes/) or inline JSON. Overrides below are merged on top. network: Whether the container has network access (default: true — needed for API calls). tools: Comma-separated list of allowed Claude tools (default: Read,Write,Glob,Grep,Bash). mounts: JSON array of mount specs: [{"host_path": "...", "container_path": "...", "readonly": true}]. model: Claude model to use (default: sonnet). Options: haiku, sonnet, opus. timeout: Max execution time in seconds (default: 120). system_prompt: System prompt injected via --system-prompt (role, persona, instructions). claude_md: Project instructions written to workspace CLAUDE.md. output_schema: JSON schema string for structured output (--json-schema). mcps: JSON array of MCP server names to attach: ["database-mcp", "whatsapp"]. effort: Effort level: low, medium, high, max. max_budget: Explicit USD budget cap. env_vars: JSON object of environment variables: {"KEY": "value"}. input_files: JSON object of files to inject: {"/path": "content"}. memory: Docker memory limit (e.g. "2g"). cpus: Docker CPU limit (e.g. 2.0). gpu: Pass --gpus all to Docker for GPU access (default: false). Acquires the "gpu" resource pool (capacity 1). resources: JSON array of named resource pools to acquire before execution (e.g. '["gpu", "database"]'). Agents wait for all resources. Configure capacity via SWARM_RESOURCE_= env vars. input_type: Natural language type describing what the agent receives (e.g. "research notes", "[code-review]"). output_type: Natural language type describing what the agent must produce (e.g. "[mcp-server] with [test-suite]").
parA	Run multiple Claude agents in parallel. Each task can have its own config. Args: tasks: JSON array of task objects. Each supports all sandbox fields (prompt, model, tools, sandbox, system_prompt, claude_md, output_schema, mcps, effort, etc.). max_concurrency: Max agents running simultaneously (default: 5).
mapA	Apply a prompt template to each input in parallel. Use {input} as the placeholder. Args: prompt_template: Prompt template with {input} placeholder(s). inputs: JSON array of input strings: ["input1", "input2", ...]. sandbox: Named sandbox spec or inline JSON. network: Whether containers have network access (default: true — needed for API calls). tools: Comma-separated list of allowed Claude tools. model: Claude model to use (default: sonnet). timeout: Max execution time per agent in seconds (default: 120). max_concurrency: Max agents running simultaneously (default: 5). system_prompt: System prompt injected via --system-prompt. claude_md: Project instructions written to workspace CLAUDE.md. output_schema: JSON schema string for structured output. mcps: JSON array of MCP server names to attach. effort: Effort level: low, medium, high, max.
chainA	Run agents sequentially as a pipeline. Each stage receives the prior stage's output as context. Args: stages: JSON array of stage objects. Each supports all sandbox fields (prompt, model, tools, sandbox, system_prompt, etc.).
reduceA	Synthesise multiple results into one. Accepts plain strings or structured AgentResult objects (auto-extracts .text fields), so you can pipe par/map output directly without manual unwrapping. Args: results: JSON array — either plain strings ["text1", "text2"] or AgentResult objects [{"text": "...", ...}]. synthesis_prompt: Instructions for how to synthesise the results. sandbox: Named sandbox spec or inline JSON. network: Whether the container has network access (default: true — needed for API calls). tools: Comma-separated list of allowed Claude tools. model: Claude model to use (default: sonnet). timeout: Max execution time in seconds (default: 120). mcps: JSON array of MCP server names to attach to the reducer agent. system_prompt: System prompt for the reducer agent.
map_reduceA	Map a prompt over inputs in parallel, then reduce results into one — all in a single call. Fan-out then synthesise: map produces N results, reduce consumes them, no manual plumbing. Args: prompt_template: Prompt template with {input} placeholder(s). inputs: JSON array of input strings: ["input1", "input2", ...]. synthesis_prompt: Instructions for how to synthesise the map results. sandbox: Named sandbox spec or inline JSON (used for map agents). network: Whether containers have network access (default: true — needed for API calls). tools: Comma-separated list of allowed Claude tools for map agents. model: Claude model for map agents (default: sonnet). reduce_model: Claude model for the reduce agent (default: same as model). timeout: Max execution time per agent in seconds (default: 120). max_concurrency: Max map agents running simultaneously (default: 5). system_prompt: System prompt for map agents. reduce_system_prompt: System prompt for the reduce agent. output_schema: JSON schema for structured reduce output. mcps: JSON array of MCP server names for map agents. effort: Effort level for map agents.
unwrapA	Unwrap an agent result ref — writes the full text to a file and returns the path. All combinators return refs (metadata without text). Use unwrap to extract the text when you need it. The text is written to a .md file alongside the result, so you can Read() it, Grep it, or pass it to other tools without bloating the MCP protocol. Args: ref: A ref string like "run_id/agent_id", or a JSON object with a "ref" field.
inspectA	Inspect an agent's full execution state — partial output, stream log, files produced. Use after a timeout, crash, or unexpected result to understand what happened. Writes a human-readable debug report to output_dir/inspect.md. Args: ref: A ref string like "run_id/agent_id".
filterA	Filter refs by type validation — keep only results that match the declared type. Runs validate on each ref in parallel. Returns only refs with VALID verdict. This is the type-gated composition primitive: ensures only correct results flow downstream. Args: refs: JSON array of ref objects: [{"ref": "run_id/agent_id"}, ...]. declared_type: Type name or description to validate against. model: Model for the validator agents (default: sonnet). timeout: Timeout per validation (default: 120).
raceA	Run multiple approaches in parallel, return the first to succeed. All tasks start simultaneously. As soon as one completes without error, its ref is returned. Remaining tasks are abandoned (their containers are killed). Use for speculative execution or when multiple strategies might work. Args: tasks: JSON array of task objects (same format as par). max_concurrency: Max agents running simultaneously (default: 5).
retryA	Run a single agent with automatic retries on failure. If declared_type is set, retries until the output validates as that type (not just until exit code 0). Each attempt receives the prior error as context. Args: prompt: The task prompt. max_attempts: Maximum number of attempts (default: 3). sandbox: Named sandbox spec or inline JSON. model: Claude model (default: sonnet). timeout: Timeout per attempt (default: 120). declared_type: If set, validates output and retries if not VALID. mcps: JSON array of MCP server names to attach.
beamA	Sample N candidates in parallel, score each, commit to the top-1. The simplest search combinator: proposes `width` candidates via `par`, scores each with a cheap haiku evaluator, and returns the highest-scoring ref. Losing candidates are preserved on the winner's `search.alternatives` field (unless `keep_losers` is false). This is self-consistency / majority-vote with arbitrary scoring — the same shape as the governor beam search, but applied to arbitrary agent output. Evaluator forms: `score:<criterion>` — direct haiku call, returns a float in [0, 1] plus a reason string. Use for rubric-style scoring. `validate:<type>` — runs the type validator; VALID=1.0, PARTIAL=0.5, INVALID=0.0. Use when the acceptance criterion is a registered type. Budget semantics: a hard cap on total proposer spend. If exceeded, the winner's search stamp records `prune_reason="budget exhausted"` but the result is still returned — best-effort rather than abort. Evaluator cost is not counted against `budget` for phase 1; evaluators are already constrained to haiku. Anti-pattern: the Tree Search paper flags evaluator-as-expensive-as-proposer as a non-starter. This combinator hardcodes haiku for scoring — if you need a stronger evaluator, lift that logic into a governor instead. Args: prompt: Task prompt sent to every candidate agent. width: Number of parallel candidates (default: 3). evaluator: Scoring directive. Must start with `score:` or `validate:`. sandbox: Named sandbox spec or inline JSON for candidate agents. model: Candidate agent model (default: sonnet — the proposer). timeout: Per-candidate timeout in seconds. mcps: JSON array of MCP server names attached to candidates. keep_losers: Preserve losing candidates on winner search stamp (default: true — useful for inspection + future step-lookahead). budget: Total USD cap on proposer cost. Best-so-far semantics on breach. max_concurrency: Upper bound on concurrent candidate agents. Returns: JSON with `run_id`, `winner` ref (search-stamped), `scores`, and `total_cost`. If all candidates scored 0, `error` is populated.
iterateA	Iteratively refine an agent's output until an evaluator is satisfied. Runs the agent up to `max_iterations` times. Each iteration sees the prior attempts' outputs, scores, and issues injected into its prompt, so the agent can correct the specific shortcomings the evaluator called out on the last pass. Halts when any of these become true: Evaluator score >= `success_threshold` (success). `patience` iterations pass with no improvement to the best score. Total agent cost exceeds `max_budget` (best-so-far). Wall-time exceeds `max_wall_time` seconds (best-so-far). `max_iterations` reached (best-so-far). Evaluator forms (pass exactly one of `target_type` or `evaluator`): `target_type="some-type"` — shorthand for `evaluator="validate:some-type"`. LLM validator against a registered type. Best for text artifacts whose correctness is a matter of shape and content. `evaluator="validate:<type>"` — explicit form of the above. `evaluator="exec:<shell cmd>"` — ground-truth executor. The agent's output is written to a tempfile and the command runs with `$ARTIFACT` set to the path (and `{artifact}` substituted in the template). Exit 0 scores 1.0; non-zero scores 0.0 with stderr parsed into issues. Use for artifacts with compile/build/test semantics (`docker build`, `pytest`, etc.) — this is the only way to get ground-truth feedback. `evaluator="score:<criterion>"` — ad-hoc LLM rubric scoring via haiku. Args: prompt: Base task description sent to the agent on iteration 1; on subsequent iterations it is augmented with a "Prior attempts" section summarising previous outputs, scores, and issues. target_type: Shorthand for `evaluator="validate:<target_type>"`. Also injects the type as `output_type` context on the agent prompt. evaluator: Full evaluator directive (`validate:`, `exec:`, or `score:`). Takes precedence over `target_type` if both given. max_iterations: Hard cap on iteration count (default: 10). success_threshold: Score at or above which we declare success (default: 0.9). Must be in [0.0, 1.0]. patience: Halt if `patience` consecutive iterations fail to improve on the best score so far (default: 3). max_budget: Optional USD cap on total agent (proposer) cost. Evaluator cost is not counted. Halts best-so-far on breach. max_wall_time: Optional wall-time cap in seconds. Halts best-so-far on breach. sandbox: Named sandbox spec or inline JSON for the agent. model: Agent (proposer) model (default: sonnet). timeout: Per-iteration agent timeout in seconds. mcps: JSON array of MCP server names attached to the agent. Returns: JSON with `run_id`, `iterations`, `halted_because`, `best_iteration`, `best_ref` (validated-stamped), `best_score`, `total_cost`, and the full `attempts` trace with per-iteration ref / score / issues / cost.
score_listA	Score and rank a list of existing refs against an evaluator. This is the missing piece for "generate N candidates, then pick the best" pipelines where the N candidates already exist as refs from an upstream combinator (`map`, `par`, etc.) — `beam` is the wrong shape because it fans out width-N proposers of the same prompt, whereas `score_list` takes N different outputs and ranks them. The key infrastructure point: refs are resolved to their artifact text server-side, so callers do not need to pipe full artifact text through tool parameters. This clears the ~4KB param-size wall you would otherwise hit scoring 6+ medium-length artifacts through `map`. Evaluator forms are the same as `iterate` / `beam`: `validate:<type>` — LLM validator against a registered type `score:<criterion>` — ad-hoc haiku rubric `exec:<shell-cmd>` — ground-truth shell command (exit code scoring) Each ref is tagged with a `search` stamp carrying its score and `beam_rank`. Refs outside the top-`k` are marked `pruned=True` with a reason pointing at the beam cut. `top_k=0` means return all without pruning. Args: refs: JSON array of refs — either `["run_id/agent_id", ...]` or `[{"ref": "run_id/agent_id"}, ...]`. Both forms are accepted. evaluator: Scoring directive (`validate:<type>`, `score:<criterion>`, or `exec:<cmd>`). top_k: How many top-scoring refs to surface as winners. `0` means rank-only, no pruning stamp applied. max_concurrency: Upper bound on parallel evaluator calls (default: 5). Returns: JSON with `run_id`, `evaluator`, `total`, `top_k`, `winners` (list of winning ref strings), and `ranked` (the full per-ref trace: rank, ref, score, verdict, issues, reason).
huntA	Run multiple (fetch → map → score) hunt strategies in parallel and merge. Each strategy is an independent (fetch, plan, score) triple with its own seed prompt or explicit seeds, plan template, and rubric. Strategies run concurrently; a failure in one does not halt the others. All scored refs across all strategies are merged into a single globally-ranked leaderboard so you can see which strategy produced the highest-yield finding. This is the compound combinator that makes "run all the hunt framings at once" tractable — rather than sequentially trying one hunt shape at a time, you parallelise across framings (problem-driven, gap-in-field, broken-claims, tool-landscape, connection) and let the scoring sort them. Strategy dict fields: `name` (required) — short label used for tagging and per-strategy reporting in the leaderboard. `seeds` (optional) — pre-supplied list of seed strings (each becomes a map input). Exactly one of `seeds` or `fetcher_prompt` must be set. `fetcher_prompt` (optional) — prompt sent to a single `run` call that must emit a JSON array of seed strings (or a JSON object with a `seeds` key). The agent runs with network enabled. `planner_template` (required) — prompt template for the map step. Use `{input}` as the placeholder for each seed. `rubric` (required) — evaluator directive for the score step. Any form accepted by `_evaluate_node`: `validate:<type>`, `score:<criterion>`, or `exec:<cmd>`. `top_k` (optional, default 3) — how many winners this strategy contributes to the unified leaderboard. `model_planner` (optional, default "haiku") — model for the map step. `model_fetcher` (optional, default "haiku") — model for the fetch step. `timeout` (optional) — per-agent timeout in seconds. Args: strategies: JSON array of strategy dicts (or a Python list). max_parallel_strategies: Upper bound on strategies running at once (default: 5). Each strategy internally parallelises its map step. top_k_global: Size of the unified leaderboard across all strategies (default: 10). Returns: JSON with `run_id`, `strategies` (per-strategy summary including name, error-if-any, total_cost, winner_ref, best_score), and `leaderboard` (globally-ranked list tagged by strategy). The full per-ref scoring trace lives at `strategy_details[i].ranked`.
guardA	Check a monadic condition on a ref. Returns the ref if the guard passes, error if not. Use to enforce constraints before passing refs to downstream combinators. Args: ref: A ref string or JSON object. check: The guard to check — one of: "validated", "budget", "classification", "encrypted", "exists". value: Required for some checks — e.g. the type name for "validated", the classification level for "classification".
classifyA	Set the classification level on a ref. Controls which MCPs can access the data. Use for data sensitivity enforcement — e.g. mark original legal documents as 'confidential' (no WhatsApp MCP), mark synthetic outputs as 'public'. Args: ref: A ref string or JSON object. level: Classification level: public, internal, confidential, restricted. allowed_mcps: JSON array of MCP names allowed to access this ref. denied_mcps: JSON array of MCP names denied access.
encryptA	Encrypt a ref's text payload. Returns the ref with a key_id — only callers with the key can decrypt. The ref metadata (provenance, classification, etc.) stays visible; only the text content is encrypted. Pass the key_id to specific agents or features that should be able to read the content. Args: ref: A ref string like "run_id/agent_id", or a JSON object with a "ref" field.
decryptA	Decrypt an encrypted ref's text. Writes the plaintext to output.md and returns the path. You need the key_id that was returned when the ref was encrypted. Args: ref: A ref string like "run_id/agent_id", or a JSON object with a "ref" field. key_id: The key ID returned by the encrypt tool.
save_governor_specA	Register an LLM-governed governor for use in pipeline control flow. Governors are evaluated at trigger points (on_fail, on_success) to decide the continuation. The spec is a natural language description of what the governor should decide. The governor LLM returns one of: next — proceed normally jump(target) — jump to a named step halt — stop the pipeline cleanly broken(reason) — stop and write broken status (visible via pipeline_status) patch_pipeline — deep-merge patch the pipeline definition and continue Each continuation also carries a free-form `context` dict that accumulates across the pipeline and is written to /shared/governor-context.json, plus a `confidence` score in [0.0, 1.0]. Reference a governor in a pipeline step: "on_fail": {"governor": "Failure"} "on_success": {"governor": "Validation"} Args: name: Unique governor name used to reference it from pipeline steps. spec: Natural language description telling the LLM what to decide. description: One-line summary shown in list_governor_specs. model: Claude model for evaluation (default: haiku). beam_width: Self-consistency beam width. When >1 the harness samples the governor N times in parallel and commits to the confidence-weighted majority decision. Losing candidates are preserved on the winner's `alternatives` field for inspection. Default 1 (no beam).
list_governor_specsA	List all registered LLM-governed governors. Returns each governor's name, description, model, and a preview of its spec.
pipelineA	Launch a pipeline in the background and return immediately. The pipeline runs asynchronously in a daemon thread. Use pipeline_status(run_id) to poll progress, and pipeline_kill(run_id) to stop it. The definition is a JSON object or a pipeline name (loaded from registered project pipelines/ directories or ~/.claude/pipelines/). Pipeline format: { "name": "optional-name", "sandbox": "optional-default-sandbox", "steps": [ {"id": "step-0", "prompt": "...", "model": "sonnet", "sandbox": "...", ...}, {"id": "test", "prompt": "Run tests", "tools": "Bash", "on_fail": "fix"}, {"id": "fix", "prompt": "Fix failing tests", "tools": "Read,Edit,Bash", "condition": "prev.error", "next": "test", "max_retries": 3} ] } Step fields: prompt (required), plus any sandbox fields (model, tools, system_prompt, etc.). Control flow: on_fail (step id to jump to on error, or {"governor": "name"} for LLM-governed recovery — see save_governor_spec), on_success ({"governor": "name"} for LLM-governed continuation), next (jump after success), condition ("prev.error" = only run if previous failed), max_retries, retry_if ({target_step: keyword} — jump if output contains keyword). Any unhandled failure terminates the pipeline with status="broken". Args: definition: Pipeline name (loaded from ~/.claude/pipelines/.json) or inline JSON definition. resume: Resume a previous run. Format: "run_id" or "run_id/step_id". Reuses the shared directory from the previous run. If step_id is given, skips to that step. If only run_id, resumes from the step that failed.
pipeline_statusA	Return the current status of a running or completed pipeline. Reads /tmp/swarm-mcp//pipeline-status.json and returns its contents. The status file is written after each step completes. Args: run_id: The pipeline run ID returned by the pipeline() tool.
pipeline_artifactsA	List artifacts produced by a pipeline run. Without step_id: lists the /shared/ directory contents (inter-step files) plus a summary of each step's output directory. With step_id: lists that specific step's output directory in detail, including file sizes. Use unwrap(ref) or Read() to view file contents. Args: run_id: The pipeline run ID. step_id: Optional step ID to inspect. If omitted, lists shared/ and all steps.
pipeline_killA	Kill a running pipeline and all its Docker containers. Sets the pipeline's stop event (so the loop exits cleanly after the current step) and immediately kills all Docker containers associated with the run. Args: run_id: The pipeline run ID to kill.
list_pipelinesA	List recent pipeline runs and their current status. Scans /tmp/swarm-mcp/ for pipeline-status.json files and returns a summary of all known runs, sorted by last_updated descending. Also annotates which runs have live threads in the current process.
save_sandbox_specA	Save a reusable sandbox spec to ~/.claude/sandboxes/.json. Args: name: Name for the sandbox spec (e.g. "web-researcher", "code-reviewer"). spec: JSON object with sandbox fields: model, tools, mcps, system_prompt, claude_md, output_schema, effort, max_budget, mounts, workdir, input_files, network, memory, cpus, timeout, env_vars.
list_sandbox_specsA	List all saved sandbox specs from ~/.claude/sandboxes/.
wrapA	Wrap a file or directory into the swarm ref system. This is how you bring external objects INTO the monadic context. The wrapped file gets a ref that can be passed to any combinator. Args: path: Absolute path to a file or directory on the host.
wrap_projectA	Register a project directory's pipelines, sandboxes, and types with the swarm. Looks for pipelines/, sandboxes/, types/ subdirectories and adds them to the search paths. After wrapping, named resources from the project are discoverable by all swarm tools (pipeline, run, validate, etc.). Args: project_dir: Absolute path to a project root containing pipelines/, sandboxes/, and/or types/ directories.
list_type_registryA	List all registered types from ~/.claude/types/.
get_type_definitionA	Get a type definition by name, optionally resolving [references] to other types. Args: name: Type name (e.g. "mcp-server", "tarball", "code-review"). resolve_refs: Whether to inline [referenced] types (default: true).
validateA	Validate an artifact against a declared type. Runs a type-checker agent that inspects the artifact and reports VALID/PARTIAL/INVALID with per-criterion results. Use this after a pipeline step to verify the output matches expectations. If validation fails, you know which agent to blame and can retry. Args: artifact: Description of what to validate — e.g. the agent's output text, a file path, or a ref {"ref": "run_id/agent_id"}. declared_type: The type to validate against — either a type name (e.g. "mcp-server") or inline natural language description. sandbox: Named sandbox spec or inline JSON for the validator agent. model: Model for the validator (default: sonnet — needs to be good at analysis). timeout: Timeout for the validation agent.

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

Server Configuration
Capabilities
Tools
Prompts
Resources

Latest Blog Posts

Who's Calling? MCP Hosts Are an Identity Blind Spot (And the Spec Knows It)
By Om-Shree-0709 on July 25, 2026.
mcp
Agent Identity
OAuth 2.1
Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/stiege/swarm-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server