Which integrations are available for this server?

Integrates with Ollama to run local language models as part of the model council. Integrates with OpenAI-compatible endpoints (including OpenAI, LM Studio, etc.) to serve as model backends for the council.

How do I use local-fusion MCP server?

1. Click on "Install Server". 2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state. 3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@local-fusion MCP server Provide a multi-model analysis of the best approach to cloud cost optimization." That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

local-fusion MCP server

by kelvincushman

Overview Schema Related Servers Score Discussions

JavaScript

Local

local-fusion

Run an OpenRouter Fusion-style multi-model council locally — no OpenRouter, no hosted router, no billing middleman.

local-fusion sends one prompt to a panel of models, has a judge model compare their answers into structured analysis, then has a synthesizer model write the final answer. It runs against any OpenAI-compatible server (Ollama, LM Studio, llama.cpp, vLLM, MLX, LocalAI, …) and/or against the subscription models you have configured in Pi (ChatGPT, Claude, GLM/ZAI, Kimi) — without you holding any API keys.

It ships four runtimes from one small, dependency-free codebase:

Runtime	Command	What it is
One-shot council	`ask`	Panel → judge → synthesizer, returns a final answer (+ optional full trace).
OpenAI-compatible server	`serve`	Exposes the council at `POST /v1/chat/completions` so any OpenAI client can call it.
Looped PI Fusion	`looped`	A prompt driver: four role-views + a conductor read run state and emit the next agent prompt.
MCP server	`mcp`	Exposes the council and a build/check loop as tools to Claude Code / any MCP client.

There is a Node.js implementation (src/, zero dependencies) and a parallel Python implementation (local_fusion/) with matching core behavior.

Related MCP server: Arena MCP Server

Why

OpenRouter's Fusion router has five useful ideas. local-fusion recreates all of them with no hosted dependency:

A panel of configurable models answers the same prompt.
Panel calls run in parallel when the runtimes can handle it.
A judge compares the answers instead of naively merging them.
The judge returns structured analysis: consensus, contradictions, partial coverage, unique insights, blind spots, notes.
A synthesizer writes the final answer from the raw responses plus the judge analysis.

The payoff is diversity plus a strict judge. Best results come from genuinely different model families; fusing one model under different "perspective" system prompts still helps, but that is mostly extra test-time compute rather than true diversity.

What it deliberately does not do: OpenRouter, web search, web fetch, hosted routing, or your-own-API-key billing.

How it works

                    ┌──────────────────────────── panel (parallel or sequential) ───────────────────────────┐
   prompt  ───────► │  model A (perspective 1)     model B (perspective 2)     model C (perspective 3)  ...   │
                    └───────────────────────────────────────────────┬──────────────────────────────────────┘
                                                                     │ raw independent answers
                                                                     ▼
                                                           ┌──────── judge ────────┐
                                                           │ strict JSON analysis:  │
                                                           │  consensus / contradic │
                                                           │  partial / unique /    │
                                                           │  blind spots / notes   │
                                                           └───────────┬───────────┘
                                                                       │ analysis + raw answers
                                                                       ▼
                                                        ┌──────── synthesizer ────────┐
                                                        │  one final answer, resolving │
                                                        │  contradictions explicitly   │
                                                        └───────────┬─────────────────┘
                                                                    ▼
                                              final_answer  (+ full trace via --json / structuredContent)

The harness is degradation-tolerant:

If some panel models fail, it proceeds with the survivors and records degradation_reasons.
If all panel models fail, it returns status: "error" with an empty answer.
If the judge returns non-JSON, it falls back to a heuristic analysis and notes the degradation.
If the synthesizer fails, it returns the first panel answer rather than nothing.

Requirements

Node.js ≥ 20 (the Node runtime is dependency-free — nothing to npm install).
Python 3.10+ — only if you use the Python implementation.
At least one model source:
- One or more local OpenAI-compatible servers exposing /v1/chat/completions, and/or
- Pi installed and authenticated, to use subscription/OAuth models.

Known-compatible local servers:

Server	Typical base URL
Ollama	`http://localhost:11434/v1`
LM Studio	`http://localhost:1234/v1`
llama.cpp server	`http://localhost:8080/v1`
vLLM / MLX / LocalAI	per their docs

Install

git clone https://github.com/kelvincushman/local-fusion.git
cd local-fusion

# Node runtime needs nothing installed. Verify it runs:
node src/cli.mjs --help

# Optional: link the `local-fusion` bin onto your PATH
npm link            # then: local-fusion ask "..."

Quick start

A) Local models (Ollama example)

ollama pull qwen2.5-coder:14b
ollama serve

cp config.example.json local-fusion.config.json   # then edit baseUrl/model to match your servers
node src/cli.mjs ask --config local-fusion.config.json "Compare ridge, lasso, and elastic-net regression. Where does each shine?"

⚠️ The committed local-fusion.config.json is configured for the Pi backend (subscription models). For pure local servers, base your config on config.example.json instead.

B) Subscription models via Pi (no API keys of your own)

# Pi owns auth; log in once to the providers you want
pi /login

# The shipped config runs GLM-5.2 + GPT-5.4 + Kimi K2 this way
node src/cli.mjs ask "Design a local-first AI writing assistant architecture"

Print the full trace as JSON:

node src/cli.mjs ask --json "Design a local-first AI writing assistant architecture"

Pipe a prompt from stdin:

pbpaste | node src/cli.mjs ask --json

Configuration reference

Config is a single JSON file (default local-fusion.config.json, override with --config). Paths are resolved relative to your current working directory.

Top-level options

Key	Type	Default	Meaning
`parallel`	boolean	`true`	Run panel calls concurrently. Set `false` if your machine can't hold several models at once. All `pi`-backend calls are serialized through one subprocess regardless.
`timeoutMs`	number	—	Per-call timeout in milliseconds.
`panel`	array	—	Model configs for the independent first-pass answers (≥ 1 required).
`judge`	object	—	Model config for the structured comparison JSON (required).
`synthesizer`	object	falls back to `judge`	Model config for final synthesis.
`model_roster` / `role_model_policy`	objects	—	Only for `looped` — see Looped PI Fusion.

Per-model config

Key	Applies to	Meaning
`name`	all	Readable name shown in the trace.
`backend`	all	`"openai"` (default) POSTs to `baseUrl`; `"pi"` drives a headless Pi subprocess.
`baseUrl`	`openai`	OpenAI-compatible base URL ending in `/v1`.
`apiKey`	`openai`	Dummy value is fine for local servers that ignore auth.
`apiKeyEnv`	`openai`	Name of an env var to read the key from instead of inlining it.
`model`	`openai`	Model name the server expects.
`provider` / `modelId`	`pi`	Pi provider + model id (e.g. `zai`/`glm-5.2`, `openai-codex`/`gpt-5.4`, `anthropic`/`claude-opus-4-8`, `kimi-coding`/`kimi-k2-thinking`). Discover with `pi --list-models`.
`temperature`	`openai`	Sampling temperature. (Ignored for `pi`.)
`maxTokens`	`openai`	Max response tokens. (Ignored for `pi`.)
`system`	all	Role/perspective system prompt — this is what shapes each panel member.

Example: pure local servers

See config.example.json — three local panel members (coder / generalist / skeptic), a strict judge, and a synthesizer, all pointed at Ollama / LM Studio / llama.cpp.

Example: Pi subscription council

See local-fusion.config.json — GLM-5.2 (generalist) + GPT-5.4 (builder) + Kimi K2 (critic) panel, Kimi K2 judge, GPT-5.4 synthesizer.

Backends: local OpenAI vs Pi subscriptions

local-fusion mixes two backends freely — even within a single panel.

backend: "openai" (default). A plain HTTP POST to baseUrl/chat/completions. Use for Ollama, LM Studio, llama.cpp, vLLM, MLX, LocalAI, or any OpenAI-compatible endpoint.

backend: "pi". Spawns a headless pi subprocess and uses Pi's stored auth (~/.pi/agent/auth.json), including OAuth subscriptions (ChatGPT Plus/Pro, Claude Pro/Max) and ZAI/Kimi. local-fusion holds no API keys of its own. Use provider + modelId instead of baseUrl/apiKey. All pi calls funnel through one subprocess, so they are serialized even when parallel: true.

💳 Subscription billing note. With backend: "pi", Anthropic subscription access (Claude Pro/Max) through a third-party harness is billed per-token from your extra usage budget, not your plan limits. If Opus errors with 400 ... Add more at claude.ai/settings/usage, add budget there. GLM-5.2 (free) and GPT-5.4 (ChatGPT subscription) are not affected.

🔒 Kimi Code / Kimi K2. The kimi-coding provider is gated to approved coding-agent clients — a direct backend: "openai" call returns access_terminated_error. It works only through the pi backend (which presents the approved client identity). Auth it with pi /login → Kimi For Coding.

Commands

All commands share these options where relevant:

--config <path>   Config file. Default: local-fusion.config.json
--json            (ask) print the full Fusion JSON result
--host <host>     (serve) default 127.0.0.1
--port <port>     (serve) default 8787
--rootDir <path>  (mcp) file-backed run state. Default: ./runs/mcp
--runDir <path>   (looped) artifact trail directory
--step <n>        (looped) current loop step, default 1
--runId <id>      (looped) override the generated run id

`ask` — one-shot council

node src/cli.mjs ask "Your question or instruction"
node src/cli.mjs ask --json "Your question"           # full trace
node src/cli.mjs ask --config my.config.json "..."    # alternate config
echo "prompt from stdin" | node src/cli.mjs ask --json

Without --json it prints just final_answer (and any degradation notes to stderr). With --json it prints the full result object.

`serve` — OpenAI-compatible endpoint

node src/cli.mjs serve --port 8787

Then call it like any OpenAI chat endpoint:

curl http://127.0.0.1:8787/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{
    "model": "local/fusion",
    "messages": [
      { "role": "user", "content": "What are the strongest arguments for and against carbon taxes?" }
    ]
  }'

The assistant answer is at choices[0].message.content.
The full Fusion trace is attached at the top-level local_fusion field of the response.
Health check: GET /health → { "ok": true }.

The serve endpoint is prose-only — it returns a synthesized natural-language answer. It is not intended to back a tool-calling agent; for that, use the MCP server.

`looped` — Looped PI Fusion

A prompt driver layer over an agent loop (implements docs/prd-pi-agent-fusion-loop.md). It does not execute shell, edit files, or run tests — per run it:

Resolves model assignments for four view-roles + a conductor from a roster.
Dispatches Explorer / Builder / Critic / Performance Sentinel views (each a role-specialized council call).
The Loop Conductor reads the views + a run-state snapshot and emits a machine-parseable decision JSON.
Writes an artifact trail under <runDir>/artifacts/pi-fusion/.
Returns { decision, prompt, summary } for the caller to inject as the next agent prompt.

node src/cli.mjs looped "Implement and test feature X" \
  --config looped-fusion.config.json \
  --runDir ./runs/last \
  --step 1 \
  --heartbeat fresh

This command requires model_roster and role_model_policy in the config — see looped-fusion.config.json and docs/using-looped-fusion.md.

Role	Job
Explorer	Map what's actually true: files, symbols, facts vs assumptions, unknowns.
Builder	Identify the smallest safe implementation step and its risks.
Critic	Challenge the direction: hidden assumptions, failure modes, missing tests.
Performance Sentinel	Read loop health (heartbeat, elapsed vs expected, no-progress) and recommend a verdict.
Loop Conductor	Synthesize the views + run state into the next prompt (or `complete`).

`mcp` — MCP server for Claude Code

Lets Claude Code / Opus stay the executor while local-fusion becomes the checking + council layer it calls as tools. The flow is pull-based (MCP tools are client-initiated): Claude Code calls a tool, gets the next instruction, executes, reports evidence, and asks again.

Start the server:

node src/cli.mjs mcp --config local-fusion.config.json --rootDir ./runs/mcp

The server speaks the MCP stdio transport (newline-delimited JSON) that Claude Code uses; legacy LSP-style Content-Length framing is also accepted (auto-detected from the client's first bytes).

Register with Claude Code (user scope = available in every project):

claude mcp add local-fusion --scope user -- \
  node /absolute/path/to/local-fusion/src/cli.mjs mcp \
  --config /absolute/path/to/local-fusion/local-fusion.config.json \
  --rootDir /absolute/path/to/local-fusion/runs/mcp

claude mcp get local-fusion          # expect: ✔ Connected

Use --scope project instead to write a shared .mcp.json at the repo root (Claude Code requires explicit approval for project-scoped servers before first use).

Tools exposed:

Tool	Purpose
`fusion_ask`	One-shot council on a question; returns final answer + disagreement trace.
`looped_start`	Start a build/check loop; freezes objective + acceptance criteria.
`looped_report`	Claude Code reports what changed, tests run, blockers, assumptions, evidence.
`looped_check_work`	Checker gate: decides `done` / `incomplete` / `blocked` / `uncertain`.
`looped_fuse_review`	Conditional full council on the frozen evidence (use only when uncertain/high-risk).
`looped_next`	Conductor returns `complete` or one bounded next prompt.
`looped_status`	Read current run state and artifact path.

The default routing keeps the expensive path optional:

Opus executes → looped_report → checker gate
   ├─ done + high confidence  → looped_next returns complete
   ├─ incomplete + concrete fix → direct retry prompt
   ├─ uncertain / high-risk    → fusion council → conductor prompt
   └─ blocked                  → pause for human

Run artifacts are written under runs/mcp/<run_id>/ (state.json, report-N.json, check-N.json, fusion-review-N.json, next-N.json). runs/ is gitignored. Full details and the recommended Claude Code priming prompt are in docs/using-mcp-connector.md.

Using inside a Pi agent

local-fusion can run as a tool a Pi agent consults for a multi-model second opinion. The Pi skill and project AGENTS.md are already set up — launch pi from this repo and ask it to "get a local-fusion council opinion on X". See docs/using-with-pi.md.

Python implementation

A parallel implementation lives in local_fusion/ with matching core fusion + looped behavior (it does not serve MCP — that is Node-only).

python3 -m local_fusion ask "Compare ridge, lasso, and elastic-net regression."
python3 -m local_fusion ask --json "Design a local-first AI writing assistant architecture"
pbpaste | python3 -m local_fusion ask --json
python3 -m local_fusion serve --port 8787

Output schema

ask --json and the MCP fusion_ask tool return:

{
  "status": "ok",                       // or "error" when all panel models fail
  "final_answer": "…synthesized answer…",
  "analysis": {
    "consensus": ["points most models agreed on"],
    "contradictions": [
      { "topic": "…", "stances": [{ "model": "…", "stance": "…" }] }
    ],
    "partial_coverage": [{ "models": ["…"], "point": "…" }],
    "unique_insights": [{ "model": "…", "insight": "…" }],
    "blind_spots": ["important missing topics"],
    "judge_notes": ["guidance for the synthesizer"]
  },
  "responses": [{ "model": "…", "content": "raw panel answer" }],
  "failed_models": [{ "model": "…", "error": "…" }],
  "degradation_reasons": ["Some panel models failed.", "…"],
  "raw_judge_output": "…optional raw judge text…"
}

When the judge returns non-JSON, analysis is a heuristic fallback and degradation_reasons explains why.

Project layout

local-fusion/
├─ src/                        # Node implementation (zero dependencies)
│  ├─ cli.mjs                  # entrypoint: ask | looped | serve | mcp
│  ├─ fusion.mjs               # panel → judge → synthesizer core
│  ├─ looped.mjs               # Looped PI Fusion prompt driver
│  ├─ mcp.mjs                  # MCP stdio server (ndjson + Content-Length)
│  ├─ mcp-connector.mjs        # build/check loop logic behind the MCP tools
│  ├─ openai-compatible.mjs    # backend: "openai" transport
│  ├─ pi-transport.mjs         # backend: "pi" transport (headless subprocess)
│  └─ config.mjs               # JSON config loader
├─ local_fusion/              # Python implementation (core parity, no MCP)
├─ test/                       # Node tests (node --test)
├─ tests/                      # Python tests (unittest)
├─ docs/                       # PRD + usage guides + blog posts
├─ .pi/                        # Pi skill + extension for in-agent use
├─ config.example.json         # local-server starter config
├─ local-fusion.config.json    # Pi-subscription council config (default)
└─ looped-fusion.config.json   # roster + role policy for `looped`

Testing

# Node (23 tests)
node --test

# Python (15 tests)
python3 -m unittest discover -s tests

The Node suite covers the fusion core, the looped driver, the MCP connector routing, and the MCP wire transport — including framing detection and a full newline-delimited initialize/tools/list handshake, so the Claude Code transport can't silently regress.

Troubleshooting

Symptom	Likely cause / fix
`claude mcp get` shows ✘ Failed to connect	Confirm the `node … mcp` command runs standalone and the paths are absolute. The server now speaks newline-delimited JSON; older builds only spoke `Content-Length`.
MCP server shows ⏸ Pending approval	Project-scoped servers need approval — run `claude` and approve, or register with `--scope user`.
`fusion_ask` / `looped_check_work` return errors with the Pi backend	Pi must be logged in (`pi /login`). The MCP transport being healthy is separate from the models behind it.
Opus errors `400 … Add more at claude.ai/settings/usage`	Subscription extra-usage budget exhausted — add budget.
`access_terminated_error` on Kimi	Don't call Kimi via `backend: "openai"`; use `backend: "pi"`.
`degradation_reasons` populated	Some panel models / the judge failed; check the listed reasons and `failed_models`.
`looped needs model_roster and role_model_policy`	Point `--config` at `looped-fusion.config.json` (or a config with those keys).

Limitations

MCP serving is Node-only. Python has core fusion/looped parity but does not serve MCP stdio.
MCP is pull-based. Claude Code must call the tools; local-fusion cannot push into an already-running conversation.
pi-backend calls are serialized through one subprocess even with parallel: true.
GLM quota can degrade the council — check degradation_reasons in fusion_ask / looped_fuse_review output.

Contributing

Contributions are welcome — see CONTRIBUTING.md for dev setup, the test commands, and the PR flow. main is protected: all CI checks (Node 20/22, Python 3.10/3.12) must pass before merge.

License

This server cannot be installed

license - permissive license

quality - not tested

maintenance

How are these scores calculated?

Maintenance

–Maintainers

–Response time

–Release cycle

–Releases (12mo)

Commit activity

Resources

GitHub Repository

Need Help?

Related Servers

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Related MCP Servers

consult-mcp
Agent Orchestration Autonomous Agents AI & Machine Learning
oblogin
A
license
A
quality
D
maintenance
MCP server orchestrating local CLI agents (Claude Code, OpenAI Codex, Google Gemini) for cross-validation, second opinions, and persona-driven prompting.
Last updated 2026-05-03
18
MIT
Arena MCP Server
tim101010101
A
license
-
quality
A
maintenance
Enables running position-driven adversarial debates and code reviews between AI agents via MCP tools, supporting custom positions, multiple rounds, and local CLI models like Claude, Codex, and Gemini.
Last updated 2026-05-12
123
1
MIT
mcp-flow
Coding Agents Developer Tools
remimenguy
F
license
-
quality
C
maintenance
A safe, local MCP server that lets Claude drive a controlled software-development loop (inspect, read, plan, patch, apply, check, analyze, fix, summarize) on a project, using deterministic tools and real diffs/test runs.
Last updated 2026-06-09
1
LLM Council MCP
AI & Machine Learning Autonomous Agents Developer Tools
salutaris91
A
license
-
quality
A
maintenance
A lightweight, local MCP server for Mac that replicates a 3-stage LLM council workflow using OpenRouter, enabling multi-model deliberation and synthesis directly from AI coding tools.
Last updated 2026-06-30
6
MIT

View all related MCP servers

Related MCP Connectors

Reasoning Commons
AI Reasoning Cache & Consensus Layer with 11 MCP tools via Streamable HTTP.
agentmailrooms-mcp
A paid remote MCP for OpenAI Codex agent coordination MCP, built to return verdicts, receipts, usage
Backtest360
MCP server exposing the Backtest360 engine API as tools for AI agents.

View all MCP Connectors

Latest Blog Posts

Who's Calling? MCP Hosts Are an Identity Blind Spot (And the Spec Knows It)
By Om-Shree-0709 on July 25, 2026.
mcp
Agent Identity
OAuth 2.1
Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kelvincushman/local-fusion'

If you have feedback or need assistance with the MCP directory API, please join our Discord server