ForgeSwarm 🛠️

An MCP server that turns independent AI agents into a coordinated engineering team.

Most MCP servers give agents data (GitHub, databases, web). ForgeSwarm gives them coordination: a shared task board with atomic claiming, a shared context blackboard, a decision log, and an enforced plan → implement → review → iterate loop. It is the same workflow shape that powers orchestration harnesses like CyOps, distilled into an open protocol primitive any MCP client can plug into.

Connect Claude Code, Codex, OpenCode, or a MiniMax M3-powered script to the same ForgeSwarm server, and they instantly become citizens of one swarm: claiming tasks without collisions, briefing each other through shared memory, and reviewing each other's work before anything counts as done.

Built for the CyOps Arena Hackathon: MCP Server Sprint (co-hosted with MiniMax).

Why this exists

Multi-agent coding fails in predictable ways: two agents grab the same task, an agent starts work with no idea what the others decided, "done" means "the model said done", and a crashed agent silently stalls the project. ForgeSwarm fixes each one server-side, so correctness doesn't depend on prompt discipline:

| Failure mode | ForgeSwarm mechanism |
| --- | --- |
| Two agents do the same work | claim_task is a single atomic conditional UPDATE: one winner, always |
| Agent starts cold, repeats settled debates | get_briefing bundles goal, constraints, decisions, dependency summaries, and prior review feedback into one onboarding packet |
| "Done" is just an assertion | submit_for_review → a different agent must post_review; self-review is rejected; request_changes auto-returns the task to its author with feedback attached and bumps the iteration counter |
| "Tests pass, trust me" | run_checks runs allowlisted test/lint commands with a hard timeout and records exit code + output on the task as review evidence |
| Crashed agent stalls the swarm | Claims carry leases; expired leases put tasks back on the board automatically |
| Disagreements evaporate into chat | open_discussion → positions from ≥2 distinct agents (server-enforced) → resolve_discussion auto-records the consensus as a binding decision in every future briefing |
| The swarm never learns | get_retrospective compiles hard evidence (review bounce rates, check pass rates, per-agent stats, hotspot tasks) for the swarm to analyze and act on |
| State lost between sessions | Everything persists in SQLite (WAL): swarms survive restarts and work across both transports |
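
To make the claiming and lease mechanisms listed above concrete, here is a minimal sketch of how a claim and its lease could share one conditional UPDATE in SQLite. The table layout, column names, status values, and LEASE_SECONDS are assumptions made for this illustration; they are not taken from ForgeSwarm's actual schema.

```python
import sqlite3
import time

LEASE_SECONDS = 300  # hypothetical lease length, not ForgeSwarm's real default


def make_board() -> sqlite3.Connection:
    """Create an in-memory task board with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        " id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'ready',"
        " claimed_by TEXT,"
        " lease_expires_at REAL)"
    )
    return conn


def claim_task(conn, task_id, agent_id, now=None) -> bool:
    """Try to claim a task; True only for the single agent whose UPDATE matched."""
    now = time.time() if now is None else now
    cur = conn.execute(
        # The WHERE clause is the whole race guard: the row changes only if the
        # task is still ready, or its previous claim has an expired lease.
        "UPDATE tasks SET status='claimed', claimed_by=?, lease_expires_at=? "
        "WHERE id=? AND (status='ready' "
        "OR (status='claimed' AND lease_expires_at < ?))",
        (agent_id, now + LEASE_SECONDS, task_id, now),
    )
    conn.commit()
    return cur.rowcount == 1


board = make_board()
board.execute("INSERT INTO tasks (id) VALUES ('t1')")
print(claim_task(board, "t1", "agent-a", now=1000.0))  # first claimant wins
print(claim_task(board, "t1", "agent-b", now=1001.0))  # lease still live, so refused
```

Because the check and the write happen in one statement, there is no window between "is it free?" and "mark it mine" for a second agent to slip into. Folding lease expiry into the same WHERE clause is one way to put stalled tasks back on the board without a background sweeper.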

Install

# with uv (recommended)
uvx forgeswarm

# or with pip
pip install forgeswarm
forgeswarm

From source:

git clone https://github.com/H2SO4620/forgeswarm && cd forgeswarm
pip install -e ".[dev]"
pytest   # 20 tests, including end-to-end MCP client sessions

Transports

forgeswarm                            # stdio (local clients spawn it)
forgeswarm --transport http --port 8765   # one shared endpoint for a whole swarm
forgeswarm --db ./myproject.db        # or set FORGESWARM_DB

State is SQLite either way (default ~/.forgeswarm/forgeswarm.db), so stdio clients, which each spawn their own server process, still share one swarm.
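
The sketch below illustrates why a file-backed SQLite database lets separate server processes share state. It is an assumption-laden illustration rather than ForgeSwarm's own code: open_store, the notes table, and the 5000 ms busy timeout are hypothetical, and two connections in one process stand in for two processes.

```python
import os
import sqlite3
import tempfile


def open_store(path, busy_timeout_ms=5000):
    """Open a connection in WAL mode with a busy timeout (values are assumed)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # readers do not block the writer
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")  # wait, don't fail
    return conn


def demo():
    """Write through one connection and read the result through another."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "swarm.db")
        writer = open_store(path)
        reader = open_store(path)
        writer.execute("CREATE TABLE notes (agent TEXT, body TEXT)")
        writer.execute("INSERT INTO notes VALUES ('agent-a', 'use stdlib only')")
        writer.commit()
        rows = reader.execute("SELECT agent, body FROM notes").fetchall()
        mode = reader.execute("PRAGMA journal_mode").fetchone()[0]
        writer.close()
        reader.close()
        return mode, rows


print(demo())
```

Anything committed by one connection is visible to every other connection on the same file, which is the property that lets independently spawned stdio servers behave as one swarm.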

Claude Code

claude mcp add forgeswarm -- uvx forgeswarm

Or in any MCP client config:

{
  "mcpServers": {
    "forgeswarm": { "command": "uvx", "args": ["forgeswarm"] }
  }
}

The loop

flowchart LR
    G[Goal] --> P[create_project<br/>submit_plan]
    P --> B[Task board]
    B -->|claim_task<br/>atomic| W[Agent works<br/>get_briefing · save_context · run_checks]
    W --> S[submit_for_review]
    S --> R{post_review<br/>by a different agent}
    R -->|approve| D[done ✓]
    R -->|request_changes<br/>iteration++| W
    D --> B
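
The review half of the loop above can be sketched as a small state machine. This is an illustration only: the Task fields, the status strings, and the exception types are assumptions, and the real server keeps this state in SQLite rather than in Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    author: str
    status: str = "in_progress"  # assumed status names
    iteration: int = 0           # assumed to start at zero
    feedback: list[str] = field(default_factory=list)


def submit_for_review(task: Task, agent: str) -> None:
    """Move in-progress work into the review queue."""
    if task.status != "in_progress" or agent != task.author:
        raise ValueError("only the author can submit in-progress work")
    task.status = "in_review"


def post_review(task: Task, reviewer: str, verdict: str, comment: str = "") -> None:
    """Apply a verdict; the verdict and the state change happen together."""
    if task.status != "in_review":
        raise ValueError("task is not awaiting review")
    if reviewer == task.author:
        raise PermissionError("self-review is rejected")
    if verdict == "approve":
        task.status = "done"
    elif verdict == "request_changes":
        task.status = "in_progress"   # back to the author
        task.iteration += 1           # the iteration++ edge in the flowchart
        task.feedback.append(comment)
    else:
        raise ValueError(f"unknown verdict: {verdict}")


task = Task(author="impl-1")
submit_for_review(task, "impl-1")
post_review(task, "rev-1", "request_changes", "add tests")
print(task.status, task.iteration, task.feedback)
```

The only path to "done" runs through post_review by someone other than the author, so an agent that forgets or skips review simply leaves the task where it was.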

Tools (24)

Planning: create_project, submit_plan (whole dependency graph in one call), list_projects, register_agent

Task board: list_tasks (with ready_only), claim_task (atomic, leased), update_task (progress + lease renewal), complete_task, get_task_graph

Shared context: save_context, search_context, record_decision, get_briefing ⭐

Review loop: submit_for_review, get_review_queue, post_review

Discussion & consensus: open_discussion, post_to_discussion, resolve_discussion (consensus becomes a recorded decision automatically), list_discussions

Workflow templates: list_workflow_templates, get_workflow_template (ship-feature, refactor-module, debug-issue: dependency-wired task graphs ready for submit_plan)

Verification & reflection: run_checks (allowlisted: pytest, ruff, mypy, npm, cargo, go, …; no shell, hard timeout, evidence recorded), get_retrospective (swarm performance evidence: bounce rates, iterations, per-agent stats)
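
Among the tools above, list_tasks with ready_only implies a dependency check over the graph that submit_plan creates. A minimal sketch of such a check follows; the dictionary shape, the depends_on key, and the "todo"/"done" status names are assumptions for illustration, not the server's actual data model.

```python
def ready_tasks(tasks: dict[str, dict]) -> list[str]:
    """Return ids of unstarted tasks whose dependencies are all done."""
    return [
        task_id
        for task_id, task in tasks.items()
        if task["status"] == "todo"
        and all(tasks[dep]["status"] == "done" for dep in task["depends_on"])
    ]


# Hypothetical three-task plan: scaffold -> timer -> cli
board = {
    "scaffold": {"status": "done", "depends_on": []},
    "timer": {"status": "todo", "depends_on": ["scaffold"]},
    "cli": {"status": "todo", "depends_on": ["timer"]},
}
print(ready_tasks(board))  # → ['timer']
```

Only "timer" is returned because "cli" still waits on it; once "timer" is marked done, "cli" becomes claimable in turn.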

Resources & Prompts

Live swarm state, readable without tool calls: swarm://projects · swarm://agents · swarm://project/{id}/status · swarm://project/{id}/tasks · swarm://project/{id}/decisions · swarm://project/{id}/discussions · swarm://project/{id}/retrospective · swarm://project/{id}/context

Role prompts that make any MCP client swarm-ready in one message: planner · implementer · reviewer · standup_summary (rendered from live board state)

Demo: a MiniMax M3 swarm builds software through ForgeSwarm

examples/minimax_swarm_demo.py runs three MiniMax M3 agents (planner, implementer, reviewer) that coordinate entirely through ForgeSwarm tools over a real MCP stdio session: the planner decomposes a goal into a task graph, the implementer claims tasks and submits work, the reviewer approves or bounces it, and the loop runs until the board is green.

pip install "forgeswarm[demo]"
set MINIMAX_API_KEY=sk-...        # export on macOS/Linux
python examples/minimax_swarm_demo.py "Build a CLI pomodoro timer in Python"

M3 is also available through OpenRouter (same model, smaller minimum top-up):

set MINIMAX_API_KEY=sk-or-...
set MINIMAX_BASE_URL=https://openrouter.ai/api/v1
set MINIMAX_MODEL=minimax/minimax-m3

No API key handy? examples/quickstart_client.py walks the identical workflow with a scripted client, no LLM required:

python examples/quickstart_client.py

Verified run

A real run of the M3 swarm against "Build a CLI pomodoro timer in Python" went from a bare goal to a finished, reviewed project with zero human intervention, using three M3 agents that talked only through ForgeSwarm tools:

  1. m3-planner registered itself, created the project, decomposed the goal into 8 dependency-ordered tasks (scaffold → timer state machine → config → notifier → CLI → tests → docs), and recorded 4 architectural decisions (stdlib-only, foreground blocking timer, XDG config path, stderr UI honoring NO_COLOR).

  2. m3-impl-1 claimed each ready task in dependency order, wrote the source via save_context, and submit_for_review'd every deliverable.

  3. m3-reviewer-1 pulled the review queue, cross-checked each submission against get_briefing (goal, constraints, decisions, prior feedback), and post_review'd a verdict for each.

The board went 8/8 done, and the closing standup_summary prompt (also answered by M3, purely from live board state) correctly reported:

All planned work is complete – 8/8 tasks closed... The project is feature-complete: scaffold, timer FSM, config, notifications, CLI, tests, and docs are all landed.

Single Most Important Next Action: Run a full end-to-end smoke test of the shipped CLI... and, if green, tag v0.1.0 and cut a release. Until we exercise the integrated binary, the "done" labels reflect unit-level completion only.

No agent ever had to be told what another agent decided, claimed, or reviewed; every coordination fact came from ForgeSwarm's shared state.

Architecture

src/forgeswarm/
├── server.py        # FastMCP app + stdio/streamable-HTTP entrypoint
├── store.py         # SQLite (WAL): atomic claims, leases, review state machine
├── models.py        # Pydantic contracts returned by every tool
├── tools/           # planning · tasks · context · review · checks
├── resources.py     # swarm:// live state
└── prompts.py       # planner / implementer / reviewer / standup

Design choices worth knowing:

  • SQLite over in-memory: over stdio every client spawns its own server process, so shared swarm state must live on disk. WAL mode plus a busy timeout keeps concurrent agents safe, and one conditional UPDATE makes claims race-free.

  • The loop is server-enforced: review outcomes mutate task state in the same transaction as the verdict. An agent cannot skip review by prompt injection or forgetfulness; the state machine simply won't move.

  • run_checks is verification, not execution: clients already execute code. The server's job is evidence: allowlisted executables, no shell, hard timeout, output recorded where reviewers can see it.
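
The last design choice (allowlisted executables, no shell, hard timeout, recorded output) can be sketched with the standard library alone. This is a hedged illustration, not the project's implementation: run_check, its result dictionary, and the 120-second default are hypothetical, and the allowlist holds only the executables the tool list names explicitly.

```python
import subprocess
import sys

# Executables named in the tool list; the real allowlist may be longer.
ALLOWED = {"pytest", "ruff", "mypy", "npm", "cargo", "go"}


def run_check(argv, timeout_s=120.0, allowed=ALLOWED):
    """Run an allowlisted command without a shell and return evidence."""
    if not argv or argv[0] not in allowed:
        return {"ok": False, "exit_code": None,
                "output": f"not allowlisted: {argv[:1]}"}
    try:
        proc = subprocess.run(
            argv,
            shell=False,          # argv is executed directly, never parsed by a shell
            capture_output=True,
            text=True,
            timeout=timeout_s,    # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "exit_code": None,
                "output": f"timed out after {timeout_s}s"}
    except FileNotFoundError:
        return {"ok": False, "exit_code": None,
                "output": f"executable not found: {argv[0]}"}
    return {"ok": proc.returncode == 0, "exit_code": proc.returncode,
            "output": proc.stdout + proc.stderr}


# Demo with the current interpreter allowlisted, so it runs anywhere.
result = run_check([sys.executable, "-c", "print('2 passed')"],
                   allowed={sys.executable})
print(result["ok"], result["exit_code"])
```

Taking an argument list instead of a command string means there is nothing for a shell to interpret, so pipes, redirects, and command chaining in agent-supplied input have no effect.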

License

MIT
