| delegate | Delegate a task to one ACP agent and return its normalized result. cli is an agent id (see capabilities); model is optional (the agent's default otherwise).
safety_mode is read_only | propose | write | yolo; when omitted, the configured default_safety_mode
applies (read_only out of the box). write and yolo also need a trusted workspace (trust_workspace=true
or a configured allowlist). files lists paths to put in scope. role names a persona (see
list_roles) whose system prompt is prepended to prompt. effort (low | medium | high | xhigh) asks
the agent to spend more reasoning where it has a knob (codex/cursor via the model id, cline via
--thinking, junie via env); for an agent without one it is a reported no-op. When omitted, the configured
default_effort (per-agent or global) applies. fallback is an ordered list of alternate targets
(cli / cli:model strings or {cli, model} objects) tried when the primary hits a
re-execution-safe failure (a spawn/handshake failure that never ran the prompt); a benched alternate is
skipped and fallback_chain records the path. A write/yolo delegation never falls back.
allow_model_fallback (default true) first retries the same agent with its configured fallback model, where it
has one, after a model-unavailable failure. persist keeps this run as a durable job under
<jobs_dir>/<run_id>/ (state.json + answer / diff artifacts); None follows default_persistence
(ephemeral out of the box), true / false force it. session_id resumes a prior agent session: pass
the session_id from an earlier delegate result and the agent reloads that conversation (ACP
session/load) instead of starting fresh, so a follow-up turn continues it; agents that do not persist
their own sessions fail with RESUME_FAILED. mode="async" runs the turn as a background job and returns a
job_id (poll with job_status / job_result); mode="sync" awaits it.
|
| continue_job | Continue a completed durable job with a new direction, picking up where the kept run left off. job_id is the id of a kept run under <jobs_dir>/ (the run_dir name a persisted result carries). A
delegate job resumes its one session (else re-injects the prior prompt + answer); a consensus panel
resumes each voice's session and re-aggregates under the recorded strategy; a debate resumes each seat's
session and argues for rounds additional rounds (the rounds argument is ignored for the other kinds). The parent's record
supplies the roster, model, working dir, role, files, and -- for a panel -- the strategy / stances /
per-seat steering, all inherited unless overridden here. A seat whose agent cannot reload its ACP session
is recorded as a failed voice, never silently dropped. The continuation is a fresh run linked to the
parent (continued_from) -- the parent is never mutated. The trust gate is re-applied fresh and defaults
to read_only (panels are read-only deliberation regardless). persist (default true) keeps the
continuation as its own durable child job. mode="async" runs it as a background job and returns a
job_id.
|
| consensus | Ask the same prompt of several ACP agents in parallel and reduce the voices. targets is a list of {cli, model} objects or cli / cli:model strings; each runs as its own ACP
session concurrently. Omit targets, pass an empty list, pass the sentinel "all", or set
expand_all=true to fan out to every registered agent (each at its default model, capped at
max_targets); the result's skipped field explains any agent left out. Or name a saved panel (with
optional panel_overrides) to reuse a stored roster + strategy instead of targets; they are mutually
exclusive (see reload_panels). A {cli, model} target may
also carry per-seat role / label / weight / parity / stance. With a strategy other than
all-voices (unanimous | majority | plurality | weighted | parity-pair | rank, optionally
with a verdict_schema), each voice is asked for a verdict and the panel collapses to one outcome
(StrategyResult) instead of every voice. rank is a two-round protocol (F4b): every voice answers, then
ranks the OTHER answers anonymized and self-excluded, aggregated by Borda mean-rank into a rank
leaderboard with a pairwise agreement matrix and concordance; require_dissent surfaces each non-winning
position on its dissent. discount_correlated=true (F3 vote-math, opt-in) down-weights correlated votes
by model-family lineage (vendor fallback) so a panel of "one model in N CLI costumes" counts as one
effective vote under majority / plurality / weighted (each voice's lineage_weight shows it). Optional
stances (parallel to targets) steer each voice and cannot combine with the auto-expanded panel.
synthesize (defaults to synthesize_default, off
out of the box) adds a server-side combined answer (all-voices only); judge names the seat that
writes it. timeout_s applies to every voice; one failing voice is a failed result, never an aborted
panel. Consensus is read-only deliberation: a safety_mode beyond read_only (propose / write /
yolo) is refused -- there is no coherent merge of edits from several agents into one tree -- so route
write / propose work through delegate (a single agent isolated in a worktree sandbox). role names a
persona (see list_roles) prepended to the prompt every voice
receives. effort (low | medium | high | xhigh) asks every voice to spend more reasoning where it has a
knob. time_budget_s is a wall-clock deadline for the WHOLE panel (distinct from each voice's
timeout_s): at the deadline answered voices are kept, in-flight ones are cut, and the panel aggregates over
the harvest if at least min_quorum usable voices remain (stop_reason="budget", with a rollup); fewer than
min_quorum yields BUDGET_EXHAUSTED. on_budget is harvest | continue | resume (default default_on_budget). persist
keeps the panel as a durable job (F2): a parent state.json linking a child record per voice, plus
voices/voice-N.md artifacts; None follows default_persistence, true / false force it.
mode="async" runs the panel as a background job and returns a job_id (poll with job_status /
job_result); mode="sync" awaits it.
|
| debate | Have several ACP agents argue a question across rounds and return the full transcript. targets is a list of {cli, model} objects or cli / cli:model strings; a debate needs at least
two. Or name a saved panel (with optional panel_overrides) for a stored roster instead of targets;
they are mutually exclusive (rounds / judge stay call args). Each voice keeps ONE persistent ACP
session across all rounds: round one is each voice's
independent answer, and each later round shows a voice the others' latest positions and asks it to
revise -- the agent remembers its own prior reasoning in-session, so only the delta is sent.
carry_forward=true instead re-sends the FULL prior transcript verbatim each round (for agents with weaker session
memory; bounded by time_budget_s). track_convergence=true asks each voice for a one-word verdict each
round and stops early when the panel CONVERGES (a unanimous verdict) or STALLS (the decision holds for the
configured tolerance); the outcome field reports the termination reason (converged / stalled /
unresolved / budget / quorum_lost) and the final decision.
synthesize=true (default) adds a closing summary; judge names a target to write it. A debate is
read-only deliberation: a safety_mode beyond read_only (propose / write / yolo) is refused --
the voices run on persistent sessions in the working directory with no per-turn sandbox -- so route write /
propose work through delegate (a single agent isolated in a worktree sandbox). role names a
persona (see list_roles) prepended to the opening prompt every voice argues from. effort (low |
medium | high | xhigh) asks every voice to spend more reasoning where it has a knob. time_budget_s is a
wall-clock deadline for the WHOLE debate enforced at round boundaries: a round still in flight at the
deadline is cut and the transcript so far is finalized (stop_reason="budget", with a rollup);
on_budget is harvest | continue | resume (default default_on_budget; continue runs every round to
completion). persist keeps the debate as a durable job (F2): a parent state.json plus the full
transcript.md; None follows default_persistence, true / false force it. mode="async" runs the
debate as a background job and returns a job_id (poll with job_status / job_result); mode="sync"
awaits it.
|
| capabilities | List the ACP agents Rutherford can drive (id, display name, launch command, provider). |
| list_roles | List the available role personas (id, name, description) for the role param. A role is a reusable system prompt; pass its id as role="<id>" to delegate / consensus /
debate and the persona is prepended to your prompt. Built-in roles ship with Rutherford; a
role_dirs directory can add or override one. |
| review | Review a diff or a set of files across one or more ACP agents (read-only). Provide diff or paths. A read-only consensus under the principal-reviewer persona: each agent reviews the code and the
panel returns every voice plus a combined verdict. targets is a list of {cli, model} objects (or
cli / cli:model strings); or name a saved panel (with optional panel_overrides) instead -- the
two are mutually exclusive. Provide diff (a unified diff, inlined into the prompt) or paths (files put
in scope for the agents to read). synthesize defaults on (the combined verdict); pass false for the
raw per-voice reviews. Always read-only -- a review never mutates the tree. |
| plan | Ask one ACP agent for an implementation plan for goal under the architect persona (read-only). A read-only delegate with the architect (planner) persona prepended: the agent designs an approach
rather than implementing it. cli is an agent id (see capabilities); model is optional. files
lists paths to put in scope. Always read-only -- planning never mutates the tree; implementing the plan
is delegate in write mode. |
| reload_panels | Re-read saved panels from disk (after editing a panels.toon) and list those now available. Returns {reloaded, count, panels: [{name, description, target_count}]}. Panels are discovered under
~/.rutherford/panels.toon, the project .rutherford/panels.toon, and $RUTHERFORD_CONFIG_DIR, merged
by name (closest scope wins). A malformed panels file raises PANEL_INVALID naming the file and seat. |
| setup | Show where config lives and scaffold a starter config.toml; the first-run helper. scope is project (<cwd>/.rutherford/config.toml) or global (the platform config dir's
config.toml). It returns the proposed starter content (the most useful settings at their effective
defaults) and the resolved path, plus a snapshot of the agents you already have. Pass write=true to
create the file -- it never overwrites an existing one (already_exists=true, written=false).
trust_workspace=true adds the current directory to trusted_workspaces so write/yolo delegations are
permitted there.
The adapters block reports agents whose underlying CLI is installed but whose npm ACP adapter shim is
not (codex needs codex-acp, claude_code needs claude-agent-acp, pi needs pi-acp -- what doctor
flags as not_installed with an install hint). Pass install_adapters=true to run npm i -g <package>
for each of those automatically (an explicit, opt-in machine change; off by default). |
| discover | Find installed ACP agents via the community registry and propose [agents.<id>] config for them. The registry-driven companion to setup/doctor. It fetches the ACP agent registry (cached under
~/.rutherford/acp-registry.json for offline use), detects which registry agents are ALREADY installed
here -- scanning PATH plus curated install dirs (~/.local/bin, ~/.cargo/bin, ~/.<vendor>/bin),
never downloading or running npx -- and (with probe=true, the default) drives each found agent with a
real read-only ACP round trip so the proposal only includes ones that actually answer. Returns the
discovered agents and a proposed [agents.<id>] config block for the new drivers. write=true appends
that block to the config for scope (project -> <cwd>/.rutherford/config.toml, global -> the
platform path), creating the file if needed and never overwriting an existing section. refresh
re-fetches the registry. Use this to adopt an ACP agent (or bridge) Rutherford does not ship as a built-in. |
| doctor | Probe each agent (or one named agent) with a real read-only ACP round trip and report conformance. The trustworthy health check for ACP agents: whether each spawns, handshakes, and answers. Each report
is ok / no_answer / model_unavailable / handshake_failed / not_installed / error. model_unavailable
means spawn + handshake succeeded (the agent is reachable) but the harness/provider rejected the model on
the turn (a model/provider config issue, e.g. Claude Code on AWS Bedrock / Vertex), so it is NOT reported
as a broken agent. doctor is slower than capabilities (it makes a real call per agent); run it to see which
agents on the roster Rutherford can actually drive on this machine. connect_only
runs the lighter handshake-only check (spawn + handshake, no prompt) and reports reachable /
handshake_failed / not_installed plus each agent's advertised models -- it shows whether Rutherford can
talk to and configure an agent even when a model call would fail for a reason outside ACP (an auth /
entitlement / quota issue, e.g. Grok without a SuperGrok subscription). When an agent (codex / claude_code / pi) launches a separate npm ACP adapter shim and that shim is not
installed but its underlying CLI is (you have codex/claude/pi), the report adds an install_hint
with the exact npm i -g <package> command instead of a flat not_installed -- run that, or
setup install_adapters=true, to set the adapter up. |
| list_jobs | List the background jobs Rutherford is tracking (id, tool, status, summary, timestamps), newest first. The light listing -- no heavy result. Fetch a finished job's result with job_result. Jobs are
in-memory: a finished one is evicted after job_ttl_s, and a restart clears them all. |
| analyze | Analyze the kept run corpus (read-only). report="historical_agreement" is the default and only report. historical_agreement scans the consensus panels you kept (persist=true / default_persistence=job) and
reports how often two DISTINCT model lineages reached the same verdict when they co-voted -- an
OBSERVATIONAL signal for your roster choice (e.g. a lineage that never adds a dissent), NOT a vote discount:
agreement is not correctness, so down-weighting agreeing lineages would punish them for being right
together. An empty corpus returns an empty report whose notes explain how to build one.
|
| activity | Show the background jobs IN FLIGHT right now (running + pending), each with a live elapsed time. The focused "what is happening now" snapshot, distinct from list_jobs: where list_jobs enumerates
every tracked job of every status (finished ones included), activity returns only the jobs still in
flight -- {active: [...], count} with each row {job_id, tool, status, summary, started_at, elapsed_s}, longest-running first. Empty ({active: [], count: 0}) when nothing is running. |
| job_status | Report one background job's status and timings (no heavy result); JOB_NOT_FOUND if the id is unknown. status is pending | running | succeeded | failed | cancelled. Poll this, then call job_result once
the job is succeeded (or to read the failure of a failed / cancelled job).
|
| job_result | Return a finished background job's result envelope -- identical to the sync tool's envelope. A succeeded job returns its stored result verbatim; a failed job returns its error; a cancelled
or still-running job returns a structured error (poll job_status and retry); an unknown id is
JOB_NOT_FOUND. |
| cancel_job | Cancel a running background job (killing its work) and return {job_id, status}; JOB_NOT_FOUND if unknown. Cancelling an already-finished job is a no-op that returns its current status. The job's process tree
is torn down at the next cancellation point. |
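
The async flow described in the job_status and job_result rows (start a tool with mode="async", poll job_status, then fetch job_result) can be sketched as a small polling loop. This is a minimal sketch under stated assumptions, not Rutherford's client API: call_tool is a hypothetical callable standing in for however your client invokes a tool by name, and the shape of the job_status response (a dict with a status key) is assumed. Only the tool names and the status values come from the table above.

```python
import time
from typing import Any, Callable, Dict

# Status values as listed in the job_status row.
IN_FLIGHT = {"pending", "running"}
TERMINAL = {"succeeded", "failed", "cancelled"}

# Hypothetical: invokes a tool by name with an argument dict and returns
# the parsed response as a dict. Supply your own client's equivalent.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def wait_for_job(
    call_tool: ToolCaller,
    job_id: str,
    poll_interval_s: float = 2.0,
    max_wait_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll job_status until the job is no longer in flight, then call job_result."""
    waited = 0.0
    while True:
        status = call_tool("job_status", {"job_id": job_id})
        state = status.get("status")  # assumed field name
        if state in TERMINAL:
            break
        if state not in IN_FLIGHT:
            # Covers an error response such as JOB_NOT_FOUND, which carries no known status.
            raise RuntimeError(f"unexpected job_status response: {status!r}")
        if waited >= max_wait_s:
            raise TimeoutError(f"job {job_id} still {state} after {waited:.0f}s")
        sleep(poll_interval_s)
        waited += poll_interval_s
    # For a failed or cancelled job this returns the error rather than an answer.
    return call_tool("job_result", {"job_id": job_id})
```

The local max_wait_s only stops the polling; it does not stop the job itself, so a caller that gives up would still need cancel_job to tear the work down.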