| apilookup | Look up current documentation for an API, library, or SDK using a web-capable CLI. Use cases: verify an SDK's current version and breaking changes before upgrading; find the canonical doc URL for a feature you only half-remember; spot deprecations introduced in the last 12 months.
Default agent is gemini (web search built in via the headless CLI). Other
agents will answer from training data and flag the staleness in their response. |
| challenge | Push back on a claim: an agent with the skeptic persona finds counter-arguments. Use cases: anti-sycophancy (a counterweight to consultations that always seem to agree); sanity-check a design before committing to it; surface failure modes you might have missed.
The persona is fixed to skeptic; passing a different persona is not allowed
(would defeat the tool's purpose). |
| codereview | Cross-agent code review. Pattern: 'Claude implemented → Codex and Gemini review the code'.
Sends file contents to N agents in parallel with the reviewer persona.
Each agent's response is parsed into severity-tagged findings; the raw text is
preserved in response for anything that cannot be parsed. Use cases: validate Claude's code via Codex and Gemini perspectives; surface bugs, edge cases, and security issues; identify regressions before merging.
|
| consensus | Multi-agent verdict with optional stance assignment and optional synthesis. Use cases: force productive disagreement with stances=["for", "against", "neutral"]; lightweight cross-validation when debate is overkill (one round per agent); explicitly named perspectives instead of implicit consensus.
Distinct from consult_parallel because each agent can get a stance-steered
prompt; distinct from debate (IT-003) because it's one round, not round-robin. |
| consult | Send a prompt to one CLI agent (claude/codex/gemini) and return its response. Use cases: quick second opinion on an idea or design; ask a specific model for its take on a problem; get help from an agent specialized via persona.
Available agents: claude, codex, gemini.
Available personas: default, architect, reviewer, researcher, coder. |
| consult_parallel | Fan out the same prompt to multiple CLI agents in parallel. Use cases: cross-validate by asking claude, codex, and gemini the same question; generate variants by passing ["claude", "claude"] for two independent responses; collect diverse perspectives on a design or claim.
Wall time = max(per-agent latency), not sum. Duplicates are NOT deduplicated —
each entry runs independently (intentional, supports variant generation). |
| council | 3-stage council: independent → cross-rank (anonymized) → synthesis. Use cases: atomic deliberation when you don't want to manage poll/cancel state; 3-perspective review with an explicit ranking step (reduces first-mover bias); quick consensus that's stronger than consult_parallel but lighter than debate.
council_id registers a cancellation watermark with
CancellationRegistry. To cancel an in-flight council, the caller must
supply the id upfront and then bump the counter via
CancellationRegistry.request_cancel(id) from another in-process caller.
The MCP-exposed debate_cancel tool does NOT cover council ids — it
looks up DebateStore which has no council entries. Cancellation is
honoured between stages only; running subprocesses are never killed
mid-stage.
ctx is the FastMCP-injected context. When the client supports progress
notifications it sees three events (one per stage); otherwise emit is a
silent no-op.
Returns a dict with status ∈ {success, cancelled, failed}
plus partial: True when results are degraded (some stage1 vote failed,
or stage3 chairman failed). failed is reserved for "all stage1 voters
failed" — the council had nothing to deliberate on. |
| debate_cancel | Request cancellation of a running debate. Returns immediately. If the debate is already finished (completed / cancelled / error), this is a
no-op and returns already_finished=true. The loop polls the cancellation
flag between turns, so cancellation may take up to one turn-latency to land. |
| debate_export | Render an archived or active debate as portable markdown or JSON. format selects the output (markdown or JSON).
truncate_body_chars shortens per-turn body to at most N characters
(suffix "... (truncated)" added). Use to keep response under MCP transport
limits when transcripts are large.
|
| debate_list | List debates, sorted by started_at desc. With active_only=True: only in-memory debates in pending/running state.
Otherwise: merge in-memory + archive (memory wins on id collision), then sort. |
| debate_status | Get current status of a debate (pending/running/completed/cancelled/error). |
| debate_replay | Return full transcript for a debate (memory first, then archive). |
| debate_run | Run an async round-robin debate to completion, streaming progress. Same arguments as debate_start plus the FastMCP-injected ctx. Returns
when the loop terminates (max_turns / done / cancel / all_slots_down /
error). Per-turn progress is emitted via ctx.report_progress if the
client supports it (a silent no-op otherwise). Return shape (IT-008/C-01 taxonomy): status: "success" (debate completed normally);
status: "cancelled" (external debate_cancel honoured between turns);
status: "failed" (agent registry not initialised, all slots down,
zero successful turns, or unhandled exception in the loop).
|
| debate_start | Start an async round-robin debate. Returns debate_id immediately. The loop runs in the background; poll debate_status(debate_id) for progress
and final transcript. Use debate_cancel(debate_id) to abort early. Use cases: long deliberation between 2-6 agents (each round ≈ per-agent latency); stepping away and checking back later; keeping a transcript for replay (auto-archived in SQLite on completion).
|
| diff_review | Review a git diff for risks, regressions, and breaking changes. Pass either diff_text (raw unified diff) or git_ref + cwd
(the tool runs git diff <ref> in cwd for you); the two are mutually exclusive. Use cases: pre-merge sanity check on a PR's diff; spot accidental breaking changes when refactoring; cross-validate a branch against multiple reviewer agents.
|
| implement | Delegate implementation to a CLI agent that can edit files in base_path. Use cases: the 'Claude designs → Codex implements → Claude+Gemini review' workflow; apply a refactor proposal across files without manual edits; hand off boilerplate work to a different model.
Activates the agent's mutation flags (e.g. --permission-mode acceptEdits for
Claude). The tool returns a diff of files the agent created/modified/deleted. base_path MUST be an absolute path. A relative path is resolved against the
MCP server process's cwd (NOT the calling user's terminal cwd), which is rarely
what the caller intends and may land in an unexpected directory.
If base_path is a git repo: full unified diff is returned.
If not: only the list of changed files (mtime-detected, no diff text). |
| planner | Decompose a problem into atomic tasks with optional dependencies. Use cases: break a feature request into ordered work items before implementation; surface parallelizable branches in a multi-step task; get a starting structure that the orchestrator can iterate on.
depth="flat" returns a flat list (depends_on=[] always); depth="tree"
(default) lets the agent model dependencies for parallel execution.
|
| delegate_implementation | Plan → implement → review in one MCP call. The composite tool runs three stages sequentially: (1) Plan via the planner tool (planner_agent). (2) Implement via the implement tool (implementer_agent;
mutation flags enabled). base_path must be allow-listed via
CONSULT_MCP_ALLOWED_BASE exactly as for implement. (3) Review via diff_review or codereview (reviewer_agents,
in parallel). The choice is automatic based on implement's output:
a unified git diff routes to diff_review; otherwise the list of
modified files routes to codereview.
Returns one of: {"status": "success", ...} (all three stages ok);
{"status": "partial_success", ...} (implement ok, ≥1 reviewer
failed or review skipped);
{"status": "partial_error", "stage_failed": "implement", ...}
(implement failed; review run best-effort against any partial diff);
{"status": "error", "stage_failed": "plan", ...} (planner failed,
downstream stages skipped).
The tool never raises — all errors are surfaced as structured fields. |
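
Because delegate_implementation reports every outcome through structured fields rather than exceptions, a caller has to branch on status and stage_failed itself. The sketch below shows one way to do that in Python. It relies only on the two field names and four status values listed in the table; the function name next_action and the action labels it returns are hypothetical, chosen for illustration, and the remaining fields of the result dict are not assumed.

```python
from typing import Any, Mapping


def next_action(result: Mapping[str, Any]) -> str:
    """Map a delegate_implementation result dict to a caller-side action.

    The action labels returned here are hypothetical, not part of the tool.
    """
    status = result.get("status")
    stage_failed = result.get("stage_failed")

    if status == "success":
        # All three stages (plan, implement, review) completed.
        return "accept"
    if status == "partial_success":
        # Implement succeeded, but a reviewer failed or review was skipped.
        return "review_manually"
    if status == "partial_error" and stage_failed == "implement":
        # Implement failed; any review ran best-effort on a partial diff.
        return "inspect_partial_diff"
    if status == "error" and stage_failed == "plan":
        # Planner failed, so implement and review never ran.
        return "retry_plan"
    # Anything else is outside the documented taxonomy.
    return "unknown"


if __name__ == "__main__":
    print(next_action({"status": "partial_error", "stage_failed": "implement"}))
```

Falling through to "unknown" instead of raising keeps the caller consistent with the tool's own never-raises contract.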