How do I use squad-mcp?

1. Click on "Install Server". 2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state. 3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@squad-mcp Review the current changes with the advisory squad" That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

squad-mcp

by ggemba

Overview Schema Related Servers Score Discussions

TypeScript

Local

squad-mcp

npm version license: Apache-2.0 website

Website: https://squad.devthinks.com.br/

MCP server that exposes the squad-dev workflow as deterministic tools, prompts, and resources. It classifies a task, scores its risk, picks an advisory squad of specialist reviewers, slices the changed files per agent, validates the plan, and consolidates the advisory verdicts. The host LLM (Claude Code, Cursor, Warp, Claude Desktop, …) orchestrates; squad-mcp provides the building blocks.

It also ships as a Claude Code plugin that bundles the MCP server, the slash commands (/squad:implement, /squad:review, /squad:question, /squad:debug, /squad:tasks, /squad:next, /squad:task, /squad:grillme, /squad:pipeline, /squad:inventory, /squad:stats, /brainstorm, /commit-suggest, /squad:enable-journaling), and the matching skills behind a single /plugin install.

Install

Claude Code plugin (recommended)

/plugin marketplace add ggemba/squad-mcp
/plugin install squad@gempack

The plugin bundles the MCP server plus the slash commands and skills (/squad:implement, /squad:review, /squad:question, /squad:debug, /squad:tasks, /squad:next, /squad:task, /squad:grillme, /squad:pipeline, /squad:inventory, /squad:stats, /brainstorm, /commit-suggest, /squad:enable-journaling). After install, restart Claude Code to pick up the new commands and the squad MCP server.

npm package (any MCP client)

npx -y @gempack/squad-mcp

The package exposes the squad-mcp binary and works with any MCP-capable client. Examples below.

Claude Desktop

%APPDATA%\Claude\claude_desktop_config.json (Windows) / ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):

{
  "mcpServers": {
    "squad": {
      "command": "npx",
      "args": ["-y", "@gempack/squad-mcp"]
    }
  }
}

Cursor

.cursor/mcp.json (workspace-scoped) or global Cursor settings:

{
  "mcpServers": {
    "squad": {
      "command": "npx",
      "args": ["-y", "@gempack/squad-mcp"]
    }
  }
}

Warp

Settings → MCP servers → add. Command npx, args ["-y", "@gempack/squad-mcp"].

From source (development)

git clone https://github.com/ggemba/squad-mcp.git
cd squad-mcp
npm install
npm run build
node dist/index.js

Related MCP server: WhenLabs/When

Your first `/squad:implement` in 60 seconds

After install, the plugin is silent until you invoke it. Drop into a repo with at least one staged or recently committed change and run:

/squad:implement add a /health endpoint that returns {"status":"ok"}

What happens, in order:

Classification. compose_squad_workflow looks at your prompt + changed files and prints something like work_type: Feature, risk: Low, agents: [developer, qa].
Depth resolution. It also picks an execution depth — quick, normal, or deep — and surfaces it as mode + mode_source on the output. Auto-detect rules: deep on risk == High / Security work / auth-money-migration signals; quick on Low-risk diffs with ≤5 files and no high-risk signals; normal otherwise. Pass --quick / --normal / --deep to override. If you force --quick on a high-risk diff, security is force-included and a structured mode_warning is set so the host can surface it.
Plan. The skill drafts an implementation plan and sends it to tech-lead-planner for review (skipped in quick). You see the plan in chat.
Gate 1. The skill stops and asks you to approve. Reply approved, go, or equivalent to proceed; anything else cancels.
Implementation. After approval, the skill writes code. Never commits or pushes — that's your call.
Advisory squad. In v1.5+ the advisory runs AFTER implementation against the actual diff, not a plan draft. Every selected agent (architect, dba, dev, qa, security, reviewer — depends on the selection; capped at 2 for quick, force-includes architect + security for deep) reviews in parallel and emits a findings list + a Score: NN/100.
Consolidation. tech-lead-consolidator produces a verdict (APPROVED / CHANGES_REQUIRED / REJECTED) plus a scorecard like:
```
SQUAD RUBRIC — weighted 82 / 100 (threshold 75)
Application Code     ████████████████░░░░   82  ×18%  developer
Testing & QA         ███████████████░░░░░   78  ×14%  qa
```
The tech-lead-consolidator persona is skipped in quick mode; apply_consolidation_rules still runs to produce the verdict. If the verdict is CHANGES_REQUIRED / REJECTED, the reject-loop dispatches the implementer again against the delta.

Other commands to try once /squad:implement works:

/squad:review — same agents, but on an existing diff or PR (no implementation).
/squad:question <question> — fast read-only code Q&A. Spawns the code-explorer subagent to grep + excerpt the relevant lines and answers with file:line citations. Use it for "where is X defined?", "what calls Y?", "how does the auth flow work?". No plan, no gates, no implementation.
/squad:debug <issue> — read-only bug investigation. Takes a bug description + optional stack trace + repro steps, orients via code-explorer, then dispatches the debugger persona to emit N ranked hypotheses (1 on --quick, 3 on --normal, 5 with a cross-check pass on --deep) with file:line evidence and verification steps. The missing middle between /squad:question (lookup) and /squad:implement (fix).
/squad:tasks docs/prd.md — decompose a PRD into atomic tasks with confirmation before they land in .squad/tasks.json.
/squad:next — pick the next ready task; /squad:task 3 — work on a specific one.
/squad:grillme <plan> — Socratic plan validation. Grills your plan one question at a time against the project's domain language (CONTEXT.md) and prior decisions (ADRs in docs/adr/), and writes resolved terms back to both as it goes. Run it before /squad:implement to stress-test a plan; pass --no-write for a dry run.
/squad:pipeline <feature> — runs the six squad steps as one guided, human-gated sequence: brainstorm → grillme → tasks → next → implement → review. See Cradle-to-grave with /squad:pipeline below.
/squad:inventory <recipe> — codebase audit/inventory. Scans the repo for a named pattern (defined by a YAML "recipe pack") and emits a structured Markdown report cross-referenced with framework metadata (routes, handlers). Hybrid pipeline: a deterministic rg sweep does the file IO, then tiered LLM enrichment (Haiku tier-1, Sonnet escalation on requires_semantic rules or low confidence) classifies each finding. Bundled pack v1: php-inline-sql (inline SQL in PHP/Laravel). Writes one MD to ./docs/inventory/<recipe>-<date>.md by default; --out <path> overrides (accepts Obsidian vault paths). Reads source; never edits code.
/brainstorm <topic> — exploratory Q&A, no code.
/commit-suggest — generate a Conventional Commits message for staged changes.
/squad:stats — observability dashboard over .squad/runs.jsonl. Bar charts (verdict mix, score buckets), Unicode sparkline trend, per-agent breakdown of avg wall-clock and estimated tokens. Read-only; never writes. Flags: --quick (last 7d), --thorough (full history + health panel), --since <ISO>, --last <N>, --no-color. Token figures are estimates (chars ÷ 3.5).

Stuck? Check INSTALL.md → Troubleshooting. The most common failures (Failed to reconnect to plugin:squad:squad, marketplace cache, SSH key) all have entries.

How it works

squad-mcp is a deterministic server — it makes no LLM calls of its own. The host LLM does all the reasoning; the server hands it building blocks (tools, prompts, resources) and the skills wire them into a workflow.

flowchart LR
  subgraph Host["Host LLM · Claude Code / Cursor / Warp / Claude Desktop"]
    SK["Skills<br/>/squad:implement · /review<br/>/pipeline · /stats · ..."]
    SA["Subagents<br/>architect · developer<br/>security · qa · ..."]
  end
  subgraph Server["squad-mcp server · deterministic, no LLM calls"]
    T["Tools<br/>classify · score_risk<br/>select_squad · consolidate"]
    P["Prompts<br/>orchestration<br/>advisory · consolidator"]
    R["Resources<br/>agent://...<br/>severity://..."]
  end
  SK -->|invoke tools| T
  SK -->|load| P
  SA -->|read role def| R
  T -->|verdict + rubric scorecard| SK

A single /squad:implement run threads two human gates — plan approval and a Blocker halt — so the squad never writes code you did not sign off on:

flowchart TD
  A["/squad:implement <task>"] --> B["classify · score risk · select squad · pick depth"]
  B --> C{"Gate 1<br/>plan approved?"}
  C -->|no| X1["stop — nothing written"]
  C -->|yes| D["advisory squad runs in parallel<br/>each agent emits Score 0-100"]
  D --> E{"Gate 2<br/>any Blocker?"}
  E -->|yes| X2["halt — ask the user"]
  E -->|no| F["implement the approved plan"]
  F --> G["consolidate → verdict + rubric scorecard"]
  G --> H{"verdict"}
  H -->|APPROVED| I["done — committing is your call"]
  H -->|CHANGES_REQUIRED / REJECTED| J["reject loop → re-review the delta"]
  J --> G

Depth (quick / normal / deep) auto-scales the run: quick caps the squad at 2 agents and skips the planner + consolidator personas; deep force-includes architect + security and raises the reject-loop ceiling. See Your first /squad:implement for the auto-detect rules.

Examples in practice

Every example is a single line you type into the host. The squad sizes itself from the prompt + the changed files — you only reach for a flag to override.

Low-risk feature — auto-detected quick:

/squad:implement add a /health endpoint that returns {"status":"ok"}

work_type: Feature · risk: Low · mode: quick (auto) · agents: [developer, qa] Planner skipped, 2-agent advisory, sub-30s feedback. Stops at Gate 1 for your approval.

Auth refactor — auto-detected deep:

/squad:implement refactor src/auth/jwt-validator to rotate signing keys

work_type: Security · risk: High · mode: deep (auto) · agents: [architect, security, developer, qa, reviewer] touches_auth fires → deep. Planner + consolidator personas run, reject-loop ceiling raised to 3.

Forcing --quick on a risky diff — the safety override:

/squad:implement --quick patch the refund amount rounding in src/billing/ledger.ts

mode: quick (user) · mode_warning set — --quick is honoured but security is force-included as one of the 2 agents because touches_money fired. The host surfaces the mode_warning so the downgrade is never silent.

Review an existing PR and post the verdict:

/squad:review #42

Runs the advisory on PR #42's diff, renders the scorecard, then dry-runs the PR post — shows the exact request and the markdown body it would post, and waits for your go.

Fast read-only code Q&A — no plan, no gates:

/squad:question where is the rubric weighted score computed?

Spawns code-explorer, answers with file:line citations. Sub-second on --quick.

See where your runs went:

/squad:stats

Reads .squad/runs.jsonl and renders a cyan ANSI panel — verdict mix, score buckets, sparkline trend, per-agent token + wall-clock breakdown.

Cradle-to-grave with `/squad:pipeline`

Each squad skill runs standalone. /squad:pipeline chains them into one guided sequence for a feature going from idea to verified change, so you never have to remember what comes next or how to wire one step's output into the next:

flowchart LR
  BS["/brainstorm<br/>decide what to build"] --> GM["/squad:grillme<br/>stress-test the plan"]
  GM --> TK["/squad:tasks<br/>decompose into tasks"]
  TK --> NX["/squad:next<br/>pick the next task"]
  NX --> IM["/squad:implement<br/>build it"]
  IM --> RV["/squad:review<br/>review the change"]

/squad:pipeline add multi-currency support to the checkout flow

The pipeline is an executor that auto-invokes each sub-skill and halts only at explicit user-decision gates (proceed / adjust / skip / exit). Each time you invoke it, it:

Reconstructs how far the feature has progressed from the conversation context (or the .squad/pipeline-state.json resume cache).
Dispatches the next sub-skill via the Skill tool with arguments pre-filled and depth flags forwarded.
Stops at each inter-phase gate to explain the decision you are about to make.

Sub-skills' own internal gates (e.g. /squad:implement Gate 1 plan approval, Gate 2 Blocker halt) keep firing as before — those remain in-skill human checkpoints. Reserved interruption commands (pipeline stop, pipeline skip, pipeline redo, pipeline back) break the auto-flow at any gate. The pipeline records no telemetry of its own; each sub-skill still records its own run, so /squad:stats aggregates them normally.

Flag	Purpose
`--from <phase>`	Enter the pipeline mid-sequence (`brainstorm` / `grillme` / `tasks` / `next` / `implement` / `review`). Skip the phases you have already done.
`--quick` / `--normal` / `--deep`	Forwarded as-is to every step the pipeline recommends.

# already brainstormed — jump straight to stress-testing the plan
/squad:pipeline --from grillme add multi-currency support to the checkout flow

# take a small change cradle-to-grave at quick depth
/squad:pipeline --quick fix the timezone bug in the daily report job

What it provides

Tools (deterministic, pure functions)

Tool	Purpose
`detect_changed_files`	Hardened `git diff --name-status --no-renames` for a workspace. Allowlisted refs, 10s timeout, 1MB stdout cap.
`classify_work_type`	Heuristic `WorkType` from prompt + paths (`Feature` / `Bug Fix` / `Refactor` / `Performance` / `Security` / `Business Rule`) with Low/Medium/High confidence.
`score_risk`	Compute Low/Medium/High from boolean signals (auth, money, migration, files_count, new_module, api_change).
`select_squad`	Select advisory agents for a work type. Combines matrix + path hints + content sniff. Returns evidence per file.
`slice_files_for_agent`	Filter a file list to those owned by a single agent. Used to build sliced advisory prompts.
`validate_plan_text`	Advisory check for inviolable-rule violations in a plan (commit/push fences, emojis in code blocks, non-English identifiers, impl-before-approval).
`compose_squad_workflow`	One-call pipeline: `detect_changed_files` → `classify_work_type` → `score_risk` → `select_squad`.
`compose_advisory_bundle`	One-call full bundle: `compose_squad_workflow` + `slice_files_for_agent` per selected agent + `validate_plan_text`.
`apply_consolidation_rules`	Aggregate advisory reports → final verdict (APPROVED / CHANGES_REQUIRED / REJECTED). Returns weighted rubric scorecard when reports carry per-dimension scores.
`score_rubric`	Pure rubric calculator. Takes per-agent scores (0-100) + optional weight overrides, returns weighted score, per-dimension breakdown, and pre-formatted ASCII scorecard.
`read_squad_config`	Read and resolve `.squad.yaml` (or `.squad.yml`) at workspace_root. Returns effective weights, threshold, min_score, skip_paths, disable_agents.
`read_learnings`	Load past accept/reject decisions from `.squad/learnings.jsonl`. Filters by agent / decision / changed-file scope. Returns entries plus a markdown block ready to inject into agent or consolidator prompts.
`record_learning`	Append one accept/reject decision to `.squad/learnings.jsonl`. Side-effecting; the skill (or CLI) is responsible for per-finding user authorisation.
`prune_learnings`	Lifecycle maintenance (v0.11.0+): mark entries older than `max_age_days` as `archived` and entries with ≥ `min_recurrence` accepts on the same canonicalised finding as `promoted`. Atomic rewrite under file lock. Never auto-runs.
`compose_prd_parse`	Build a prompt + JSON schema for the host LLM to decompose a PRD into atomic tasks. Pure-MCP: server does NO LLM calls. Caller (skill) feeds the prompt to its model, then calls `record_tasks` after user confirmation.
`list_tasks`	Read tasks from `.squad/tasks.json`. Filters: status, agent (matches `agent_hints`), changed_files (glob match against task `scope`).
`next_task`	Pick the next ready task: candidate status (default pending), all dependencies done, optional agent / changed_files filter. Tiebreak priority then id. Returns null + reason when none ready.
`record_tasks`	Bulk-create tasks. Allocates ids sequentially, validates dependencies resolve (forward refs in batch ok), rejects duplicates and self-deps. Atomic write.
`update_task_status`	Flip a task or subtask status: pending / in-progress / review / done / blocked / cancelled.
`expand_task`	Append subtasks to an existing task. Mechanical only — caller (skill or LLM) supplies the subtask inputs.
`slice_files_for_task`	Filter a file list to those matching a task's `scope` glob. Same glob primitive as `skip_paths` and learnings scope.
`list_agents`	List configured agents with role, ownership, naming conventions.
`get_agent_definition`	Return the full markdown system prompt for an agent (local override → embedded default).
`init_local_config`	Copy embedded defaults to the local override directory so they can be edited.
`record_run`	Append one `RunRecord` to `.squad/runs.jsonl`. Single-writer contract: only the squad skill calls this (Phase 1 `in_flight` + Phase 10 terminal). Validates against schema_version 1, enforces 4 KB per-record cap, file mode `0o600`.
`list_runs`	Read-only journal read. Folds the two-row pair by id, filters (since / limit / agent / verdict / mode / invocation / work_type), and returns either the folded list or an aggregate bundle (outcomes + health + trend) when `aggregate: true`.

Prompts

squad_orchestration — full Phase 0–12 orchestration guide.
agent_advisory — sliced prompt for one advisory agent.
consolidator — final verdict prompt for TechLead-Consolidator.

Resources

agent://product-owner, agent://tech-lead-planner, agent://tech-lead-consolidator, agent://architect, agent://dba, agent://developer, agent://reviewer, agent://security, agent://qa. (Renamed from PascalCase / po in v0.6.0 — older 0.5.x consumers must use agent://po instead.)
severity://_severity-and-ownership — severity matrix + ownership rules.
severity://skill-squad-dev, severity://skill-squad-review — full skill specs.

Bundled skills

The plugin auto-registers these skills via skills/:

Skill	Trigger	Purpose
`/squad:implement`	implementation workflow	Single skill, two modes. `/squad:implement <task>` builds an approved plan, distributes work to specialist subagents in parallel, implements the change, consolidates via tech-lead. `/squad:review [target]` is the same skill in review mode — never implements, just produces an advisory verdict on an existing diff/branch/PR. Optional `--codex` second-opinion.
`/squad:question`	read-only code Q&A	Spawns the `code-explorer` subagent (Haiku-class, read-only) to grep, glob, and excerpt the codebase, then synthesizes a `file:line`-cited answer. No plan, no gates, no implementation. Designed to be fast — single dispatch on the default `medium` budget, sub-second on `--quick`.
`/squad:debug`	read-only bug investigation	Bridges `/squad:question` (lookup) and `/squad:implement` (fix). Takes a bug description + optional stack trace + repro steps; dispatches `code-explorer` to locate suspect code, then the new `debugger` persona to emit N ranked hypotheses (1 on `--quick`, 3 on `--normal`, 5 with a cross-check pass on `--deep`) with `file:line` evidence and verification steps. Read-only end-to-end.
`/squad:grillme`	Socratic plan validation	Grills your plan one question at a time against the project's domain language (`CONTEXT.md`) and prior decisions (ADRs in `docs/adr/`), updating both inline as terms resolve. Use before `/squad:implement` to stress-test a plan. Flags: `--quick` / `--normal` / `--deep`, `--no-write` for a dry run.
`/brainstorm`	pre-implementation research	Web research in parallel + specialist agent perspectives → options matrix with cited sources and a recommendation. Produces no code. Position: `/brainstorm` decides what to build, `/squad:implement` implements, `/squad:review` reviews.
`/squad:pipeline`	cradle-to-grave orchestration	Chains six squad steps — `brainstorm → grillme → tasks → next → implement → review` — into one guided sequence. Executor that auto-invokes each sub-skill via the `Skill` tool and halts only at explicit user-decision gates (`proceed` / `adjust` / `skip` / `exit`); sub-skills' internal gates (Gate 1 plan approval, Gate 2 Blocker halt) keep firing as before. Resume cache at `.squad/pipeline-state.json` survives context compaction. `--from <phase>` enters mid-sequence; `--quick` / `--normal` / `--deep` are forwarded to each step.
`/squad:inventory`	codebase audit / inventory	Scans the repo for a named pattern (defined by a YAML "recipe pack") and emits a structured Markdown report cross-referenced with framework metadata (routes, handlers). Hybrid pipeline: a deterministic `rg` sweep does the file IO, then tiered LLM enrichment (Haiku tier-1, Sonnet escalation on `requires_semantic` rules or low confidence) classifies findings; a pure renderer emits byte-stable MD. Bundled pack v1: `php-inline-sql`. Reads source; writes one MD report (default `./docs/inventory/<recipe>-<date>.md`, `--out` overrides). Flags: `--quick` / `--normal` / `--deep`, `--out <path>`, `--dry-run`.
`/commit-suggest`	commit message generator	Read-only suggester for Conventional Commits messages. Runs only an allowlist of git commands; never executes mutations; never adds AI co-author trailers. The user runs the commit themselves.
`/squad:stats`	observability dashboard	Read `.squad/runs.jsonl`, render a single-screen ANSI panel: verdict mix, score buckets, sparkline trend (14 days default), per-agent avg wall-clock + estimated tokens. One accent colour (cyan), Unicode block bars at 1/8 granularity. Flags: `--quick`, `--thorough`, `--since <ISO>`, `--last <N>`, `--no-color`. Token figures are estimates (chars ÷ 3.5).
`/squad:enable-journaling`	auto-journaling opt-in	Copies the bundled `PostToolUse` hook scripts into `.squad/hooks/` and prints the `.claude/settings.json` snippet to wire them up. Capture-only — squad behaviour is unchanged until `journaling` is set to `opt-in` in `.squad.yaml`.

Bundled subagents

The plugin's agents/ directory registers eleven native Claude Code subagents you can also dispatch directly via Task(subagent_type=…):

product-owner, architect, dba, developer, reviewer, security, qa, tech-lead-planner, tech-lead-consolidator, plus two utility roles: code-explorer (fast read-only code search; Haiku-class; dispatched by the planner for context gathering or by /squad:question for direct Q&A) and debugger (hypothesis-first bug investigation; Haiku-class; dispatched by /squad:debug to emit ranked root-cause hypotheses with file:line evidence and verification steps). Neither utility role scores the rubric or is auto-selected by the matrix.

The /squad:implement skill orchestrates them. For non-Claude-Code MCP clients (Cursor, Claude Desktop, Warp), the same role markdowns are accessible through the MCP agent://… resources and get_agent_definition tool.

Workflow positioning — each skill is standalone, and /squad:pipeline chains them:

flowchart LR
  BS["/brainstorm<br/>decide what to build"] --> IM["/squad:implement<br/>implement what was decided"]
  IM --> RV["/squad:review<br/>review what was implemented"]
  RV --> CM["/commit-suggest<br/>craft the commit message"]

/squad:pipeline wraps this whole sequence (with /squad:grillme and /squad:tasks in between) as one guided, human-gated flow — see Cradle-to-grave with /squad:pipeline.

See INSTALL.md for trigger examples and the optional commit-msg git hook + permissions.deny snippet that hard-enforce the read-only and no-AI-attribution invariants at the OS / Claude Code layer.

Repo configuration — `.squad.yaml`

Drop a .squad.yaml (or .squad.yml) at the repo root to override defaults per-project. Versioned with the code, picked up automatically by compose_squad_workflow and compose_advisory_bundle.

# .squad.yaml — example for a regulated fintech backend

# Rubric weights (must sum to 100 across the agents you list).
# Agents NOT listed are zeroed out — listing weights is an explicit choice
# of which dimensions count for this repo.
weights:
  security: 30 # PCI compliance — security weighted higher
  dba: 22 # double-entry ledger, money on the line
  developer: 20
  architect: 15
  qa: 13

# Per-dimension flag threshold (default 75). Below this, the dimension is
# marked with ⚠ in the scorecard.
threshold: 80

# Quality floor: APPROVED with weighted score below this becomes
# CHANGES_REQUIRED. Severity rules (Blocker/Major) take precedence.
min_score: 75

# Files excluded from advisory. Glob syntax: ** for any depth, * for one
# segment, ? for one char. Useful for docs-only or generated paths.
skip_paths:
  - "docs/**"
  - "**/*.md"
  - "**/generated/**"
  - "vendor/**"

# Agents not relevant for this repo (e.g. internal tool, no PO involved).
disable_agents:
  - product-owner

All keys are optional; partial files merge with package defaults. force_agents in tool calls still wins over disable_agents (config is a default policy, not a veto over explicit caller intent). Validation is strict: weights that don't sum to 100, unknown agent names, or invalid threshold ranges are rejected with a clear error.

The reader is cached by mtime — long-running MCP servers automatically pick up edits without a restart.

Learnings — persistent accept/reject memory

Each time the team accepts or rejects an advisory finding, the decision can be appended to .squad/learnings.jsonl. Future runs of the squad load recent decisions and inject them into per-agent and consolidator prompts so the squad stops re-raising findings the team has already considered.

{"ts":"2026-04-12T15:02:31Z","pr":42,"agent":"security","severity":"Major","finding":"missing CSRF on POST /api/refund","decision":"reject","reason":"CSRF terminated at API gateway, see infra/edge.tf","scope":"src/api/**"}
{"ts":"2026-04-15T09:18:11Z","pr":47,"agent":"architect","severity":"Major","finding":"cross-module coupling Auth → Billing","decision":"accept","reason":"refactored to event bus"}

The file lives in git. Decisions are auditable in PR diffs.

Recording decisions (v0.11.0+ Phase 12 prompt)

After /squad:review consolidates findings, the skill surfaces a single batched prompt at the end of the report. It groups the findings by agent + severity (Suggestion-level findings excluded) and asks one question:

Save which findings as precedents? Reply: accept N1,N2,N3 / reject N4 / all accept / skip / because <reason> to attach a rationale.

Each affirmative pick fires one record_learning call. Examples:

accept 1,2 because we ship this pattern across services
reject 3 (records the rejection without a reason; the squad will still suppress the same finding next run)
all accept (accepts every Blocker / Major / Minor in the report)
skip or empty response (records nothing)

Per-finding authorisation is required — silence or "thanks" is not authorisation. The skill never invents a reason; the text after because flows verbatim to record_learning.reason.

For non-MCP environments, use the CLI helper:

node tools/record-learning.mjs --reject \
  --agent security \
  --finding "missing CSRF on POST /api/refund" \
  --reason "CSRF terminated at API gateway" \
  --scope "src/api/**" \
  --pr 42

How the squad uses them

In Phase 5 (per-agent advisory) the skill calls read_learnings(workspace_root, agent, changed_files) and injects the rendered ## Past team decisions block into the agent's prompt. In Phase 10 (consolidator) it does the same without an agent filter — the consolidator sees the full picture across agents.

Each agent is told: when a current finding matches a previously rejected decision (similar agent + similar finding text + matching scope), suppress or downgrade severity unless the diff materially changes the rationale. When a finding contradicts a previously accepted decision, flag the contradiction explicitly.

Lifecycle (v0.11.0+): archive + promote

Two new optional fields on each entry let the journal age gracefully without manual surgery:

archived: true — the entry is past the team's age cutoff and is hidden from default read_learnings injection. The row stays on disk for forensics.
promoted: true — the same finding (matched by canonicalised title) has been accepted ≥ N times and now surfaces FIRST in the rendered block as ⭐ PROMOTED. Advisors are instructed to treat promoted entries as team policy, not ordinary precedent.

Both flags are set by the prune_learnings MCP tool:

prune_learnings({
  workspace_root: <repo>,
  max_age_days: 180,    // entries older than this get archived: true
  min_recurrence: 3,    // accept-decisions on the same finding ≥ 3× get promoted: true
  dry_run: false        // set true to inspect counts without mutating
})

prune_learnings never auto-runs. The defaults are max_age_days: 0 (= disabled) and min_recurrence: 3 — invoking with no arguments is a safe no-op. Wire it into a cron or pre-commit hook if you want regular housekeeping. Each non-no-op run produces an atomic rewrite of .squad/learnings.jsonl under the same file lock used by record_learning; concurrent readers either see the pre-rewrite or post-rewrite file in full, never a torn write. A .prev snapshot is kept alongside the file as the rollback point.

The v0.11.0 schema is additive and backward-compatible — a v0.10.x reader strips the unknown archived / promoted fields silently and continues. No schema_version bump.

Configuration

Override defaults via .squad.yaml:

learnings:
  path: .squad/learnings.jsonl # default
  max_recent: 50 # how many recent entries to inject (hard cap 200)
  enabled: true # set false to disable injection without deleting the journal

The store reader is mtime-cached. The journal is append-only by design — the skill never amends or deletes past entries; correcting a stale decision means appending a new one.

Tasks — PRD-decomposed atomic work units

The biggest source of token bloat in a long-running squad session is the squad re-analysing the whole repo for every prompt. The tasks store fixes that by decomposing a PRD into atomic tasks up front, then running the squad on ONE task's narrowed scope at a time.

// .squad/tasks.json (excerpt)
{
  "version": 1,
  "tasks": [
    {
      "id": 1,
      "title": "Add CSRF token to checkout flow",
      "status": "done",
      "dependencies": [],
      "priority": "high",
      "scope": "src/api/checkout/**",
      "agent_hints": ["security", "developer"],
      "test_strategy": "POST without token → 403; POST with token → 200.",
      "subtasks": [],
      "created_at": "2026-05-08T12:00:00Z",
      "updated_at": "2026-05-09T15:30:00Z"
    },
    {
      "id": 2,
      "title": "Wire CSRF middleware into refund endpoint",
      "status": "pending",
      "dependencies": [1],
      "priority": "high",
      "scope": "src/api/refund/**",
      "subtasks": [],
      ...
    }
  ]
}

scope (glob) and agent_hints are squad-mcp-specific additions on top of the claude-task-master shape — they let slice_files_for_task and compose_squad_workflow narrow the advisory automatically.

Decomposing a PRD

Inside Claude Code:

/squad:tasks docs/prd-payments-refactor.md

The skill (Phase 0.5):

Calls compose_prd_parse with the PRD text.
Receives a prompt + JSON schema and runs them through Claude.
Shows you the parsed tasks — title, deps, priority, scope, agent_hints — for review.
Calls record_tasks only after you say "record" / "go" / "yes".

The parse is pure-MCP: the squad-mcp server never makes LLM calls. The host (Claude Code, Cursor, Warp) does the inference. No provider keys in the server, no surprises for non-Claude clients.

Working tasks

/squad:next                # picks the highest-priority ready task
/squad:task 5              # explicit pick by id

For each task:

slice_files_for_task narrows the changed-files list to the task's scope.
compose_squad_workflow runs against that slice; if agent_hints is set, only those agents wake up.
Phase 1 onward proceeds normally, just with much less context.
When done, the skill flips status to done via update_task_status.

Configuration

Override defaults via .squad.yaml:

tasks:
  path: .squad/tasks.json # default
  enabled: true # set false to silence reads without deleting the file

Writes (record_tasks, update_task_status, expand_task) stay open even when reads are disabled — same policy as learnings. Disabling injection should not throw away the journal.

CLI for non-MCP environments

Mirroring the post-review and record-learning helpers:

# decompose offline (you generate the JSON yourself or via another tool)
echo '[{"title":"Add CSRF","scope":"src/api/**"}]' | node tools/record-tasks.mjs

# inspect
node tools/list-tasks.mjs --status pending
node tools/next-task.mjs --json

# flip status from CI
node tools/update-task-status.mjs --task 5 --status done

The CLIs share tools/_tasks-io.mjs for read/write and require only node 18+. Schema validation is lighter than the MCP tool — production use should prefer the MCP path.

Posting reviews to PRs (GitHub + Bitbucket Cloud)

Once the squad runs, you can post the verdict + scorecard as a PR review on GitHub or Bitbucket Cloud. The skill /squad:review #42 runs the advisory and offers to post the result; default behaviour is dry-run + confirmation — Claude shows the exact request and the markdown body, then waits for your "go" before posting.

# auto-detect platform from `git remote get-url origin`
echo '<consolidation JSON>' | node tools/post-review.mjs --pr 42 --dry-run
echo '<consolidation JSON>' | node tools/post-review.mjs --pr 42

# force a platform
echo '<consolidation JSON>' | node tools/post-review.mjs --pr 31 --platform bitbucket-cloud --repo repos_acgsa/some-repo

The CLI maps verdict → review action deterministically:

Verdict	Score signal	GitHub `gh` action	Bitbucket Cloud
`REJECTED`	—	`--request-changes` (blocks merge)	`POST /comments` + `POST /request-changes`
`CHANGES_REQUIRED`	—	`--comment` (advisory)	`POST /comments` only
`APPROVED` + `downgraded_by_score: true`	weighted < `min_score`	`--comment`	`POST /comments` only
`APPROVED` + score < `request_changes_below_score`	(opt-in floor)	`--request-changes`	`POST /comments` + `POST /request-changes`
`APPROVED` otherwise	passes threshold	`--approve`	`POST /comments` + `POST /approve`

Platform auto-detection

--platform auto (default) parses git remote get-url origin:

github.com/<owner>/<repo> → GitHub
bitbucket.org/<workspace>/<repo> → Bitbucket Cloud
Anything else → exit 6 with a clear error. Pass --platform <name> --repo <a>/<b> to override.

Bitbucket Server / Data Center (self-hosted) is not supported — it has a different REST API surface and would need a separate adapter.

Auth

Platform	Mechanism
GitHub	`gh` CLI on PATH + `gh auth login`. Exits 3 if missing.
Bitbucket Cloud	`SQUAD_BITBUCKET_EMAIL` + `SQUAD_BITBUCKET_TOKEN` env vars. Exits 5 if missing.

For Bitbucket Cloud, generate an API Token at https://id.atlassian.com/manage-profile/security/api-tokens with the pullrequest:write scope (App Passwords were deprecated by Atlassian in 2025). Auth is HTTP Basic with email:apiToken.

Severity budget (A.3, May 2026)

Cap how many findings get expanded inline in the PR body before collapsing the surplus into a footnote. Drops happen lowest-severity-first; Blockers are never silently dropped. Useful for big PRs and platforms with tight rate limits (Bitbucket Cloud is 1000 req/h per user).

CLI:

node tools/post-review.mjs --pr 42 --severity-cap 20 --drop-below Minor

Repo default via .squad.yaml:

pr_posting:
  severity_budget:
    per_pr_max: 20 # cap total expanded findings (Blockers exempt)
    drop_below: Minor # hard floor — anything strictly below this drops FIRST

When the budget hides anything, the body carries a footnote like _Severity budget hid 7 findings (4 minor, 3 suggestion). Tune pr_posting.severity_budget in .squad.yaml._ so it's never silent.

SARIF / JSON output (A.2, May 2026)

Emit a SARIF 2.1.0 artefact for CI gating, IDE annotations, and dedup with linters. Three modes:

# markdown only (default — historical behaviour)
node tools/post-review.mjs --pr 42

# SARIF only — writes .squad/last-review.sarif.json, SKIPS the PR post
node tools/post-review.mjs --output-format sarif

# both — SARIF + post to PR
node tools/post-review.mjs --pr 42 --output-format both

Override path with --sarif-path <file>. Each result carries a partialFingerprints.canonicalHash (16-char sha256) built from (agent, severity, normalized title) — stable across rebases, enables future dedup-on-rerun and cross-tool dedup with Sonar / CodeQL.

Repo default via .squad.yaml:

pr_posting:
  output_format: both # markdown | sarif | both (default markdown)

GitHub Code Scanning, GitLab SAST, Sonar, and most ingestion pipelines consume SARIF 2.1.0 directly.

Auto-post (opt-in)

If .squad.yaml has pr_posting.auto_post: true, the skill posts without the second confirmation prompt — but always shows the body first. Auto-post means "skip the second yes/no", not "skip the preview".

pr_posting:
  auto_post: true # default false — always asks
  request_changes_below_score: 50 # below this, post --request-changes instead of --approve
  omit_attribution_footer: false # default false — footer present

Detection strategy (`select_squad` / `slice_files_for_agent`)

Three layers, in order of strength:

Content sniff — reads the first 16 KB of each file, matches token regexes (e.g. class : DbContext, [ApiController], services.AddScoped<>, from 'express', prisma.<model>.findMany, from sqlalchemy, gorm.Open, gin.New). Strong signal, name-agnostic. Patterns can be ext-gated (e.g. only .py for from sqlalchemy) to avoid cross-stack false positives.
Path hint — file path regex (e.g. *Repository.cs, Migrations/, Controller.cs, api/, models/). Cheap, complementary.
Conventions — each agent flags non-conformant naming as a finding so future detections improve over time.

Output of select_squad includes per-file evidence with confidence and low_confidence_files for unclassified files. Override via the force_agents parameter or by editing local agent definitions.

Local override of agent definitions

The loader picks ONE local override directory:

If SQUAD_AGENTS_DIR is set, that path is used exclusively (the platform default is not consulted).
Otherwise: %APPDATA%\squad-mcp\agents on Windows, $XDG_CONFIG_HOME/squad-mcp/agents on Unix (falls back to ~/.config/squad-mcp/agents if XDG_CONFIG_HOME is unset).

Per-file resolution: if the agent's *.md exists in the chosen local directory, it wins. Otherwise, the embedded default bundled in the package is used.

Override files are loaded verbatim and rendered into the LLM's context with full agent authority — treat the directory as code (user-only writable, not on shared volumes, never sourced from untrusted input).

Since v0.4.0, the override directory is validated against an allowlist (HOME, APPDATA, LOCALAPPDATA, XDG_CONFIG_HOME, process.cwd()); paths outside the allowlist are rejected with OVERRIDE_REJECTED. Set SQUAD_AGENTS_ALLOW_UNSAFE=1 to bypass for unusual setups (logs a warn banner). See INSTALL.md for the full security guidance.

Run the init_local_config tool once to seed the local directory with editable defaults.

Repo layout

squad-mcp/
├── .claude-plugin/             # Claude Code plugin manifest + marketplace
├── .github/workflows/          # CI + release workflows
├── agents/                     # Native subagents (one .md per subagent, kebab-case + frontmatter)
├── shared/                     # Severity matrix + skill specs (resources, not subagents — kept outside agents/ for the plugin manifest validator)
├── commands/                   # Slash commands (/squad:implement, /squad:review, /brainstorm, /commit-suggest)
├── skills/                     # Bundled skills
│   ├── squad/                  # single skill, two modes (implement | review)
│   ├── brainstorm/
│   └── commit-suggest/
├── src/
│   ├── index.ts                # stdio entry
│   ├── tools/                  # MCP tools (deterministic functions)
│   ├── resources/              # MCP resources + agent loader
│   ├── prompts/                # MCP prompt templates
│   ├── exec/git.ts             # hardened git execution layer
│   ├── observability/logger.ts # structured stderr JSON logs
│   ├── util/path-safety.ts     # path-traversal-safe resolution
│   └── config/
│       └── ownership-matrix.ts # agents, work types, content/path patterns
├── tests/                      # vitest unit + integration + stdio smoke
├── tools/
│   └── git-hooks/commit-msg    # opt-in hook rejecting AI-attribution trailers
└── dist/                       # compiled JS (gitignored, shipped via npm)

Tests

npm test                # vitest (unit + integration)
node tests/smoke.mjs    # stdio JSON-RPC smoke test (requires npm run build first)

Versioning + release

This project follows SemVer. Releases are tagged vX.Y.Z on main, which triggers the .github/workflows/release.yml workflow to publish @gempack/squad-mcp@X.Y.Z to npm with provenance. See CHANGELOG.md for the version history.

Contributing

Issues and PRs welcome at https://github.com/ggemba/squad-mcp. Run npm test && npm run build before opening a PR. CI runs on Linux + Windows on Node 22 and 24.

License

Apache-2.0. See NOTICE for attribution and third-party dependencies.

Install Server

license - permissive license

quality

maintenance

How are these scores calculated?

Maintenance

–Maintainers

–Response time

–Release cycle

1Releases (12mo)

Commit activity

Resources

Need Help?

Related Servers

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Tools

View all tools

Latest Blog Posts

Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly
Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
OpenAI
open source

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ggemba/squad-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

squad-mcp

Install

Claude Code plugin (recommended)

npm package (any MCP client)

Claude Desktop

Cursor

Warp

From source (development)

Your first /squad:implement in 60 seconds

How it works

Examples in practice

Cradle-to-grave with /squad:pipeline

What it provides

Tools (deterministic, pure functions)

Prompts

Resources

Bundled skills

Bundled subagents

Repo configuration — .squad.yaml

Learnings — persistent accept/reject memory

Recording decisions (v0.11.0+ Phase 12 prompt)

How the squad uses them

Lifecycle (v0.11.0+): archive + promote

Configuration

Tasks — PRD-decomposed atomic work units

Decomposing a PRD

Working tasks

Configuration

CLI for non-MCP environments

Posting reviews to PRs (GitHub + Bitbucket Cloud)

Platform auto-detection

Auth

Severity budget (A.3, May 2026)

SARIF / JSON output (A.2, May 2026)

Auto-post (opt-in)

Detection strategy (select_squad / slice_files_for_agent)

Local override of agent definitions

Repo layout

Tests

Versioning + release

Contributing

License

Maintenance

Resources

Looking for Admin?

Tools

Latest Blog Posts

MCP directory API

Your first `/squad:implement` in 60 seconds

Cradle-to-grave with `/squad:pipeline`

Repo configuration — `.squad.yaml`

Detection strategy (`select_squad` / `slice_files_for_agent`)