How do I use openrouter-subagents?

1. Click on "Install Server". 2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state. 3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@openrouter-subagents why is the sky blue?" That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

openrouter-subagents

by Wally-Ahmed

Overview Schema Related Servers Score Discussions

TypeScript

Remote

openrouter-subagents

An MCP server and CLI that exposes a model-agnostic "subagent" tool backed by OpenRouter — one API key, every model. It defaults to OpenRouter Fusion (openrouter/fusion), which runs a panel of models in parallel and has a judge model synthesize them into a single answer. Sibling to gpt-subagents-api (OpenAI API key) and gpt-subagents-subscription (ChatGPT subscription), and it ships the same orchestration patterns system.

Note: Uses an OpenRouter API key (Authorization: Bearer …) against the OpenAI-compatible Chat Completions endpoint. Not affiliated with or endorsed by OpenRouter.

Tools

Tool	What it does
`ask_openrouter`	Ask any OpenRouter model. `model` defaults to `openrouter/fusion` (multi-model synthesis); pass any OpenRouter id to override (e.g. `anthropic/claude-opus-latest`, `openai/gpt-latest`, or a fast cheap model). Write `instructions` (the system prompt) every call. Reasoning is fully controllable (see below). For Fusion only — `analysis_models` (the panel, 1–8 ids) and `judge_model` (the synthesizer).
`list_patterns` / `get_pattern`	Orchestration patterns for driving the model well (see below).

Fusion cost: a Fusion call bills for every panel model plus the judge. Reach for it when multiple perspectives are worth the spend (research, critique, high cost-of-being-wrong); for quick tactical prompts pass a single model id instead.

Reasoning & sampling controls

Any model's reasoning level can be set — the server exposes OpenRouter's full unified reasoning object, and OpenRouter translates it into whatever the target model natively speaks (OpenAI/Grok effort levels, Anthropic thinking budgets, Gemini thinkingLevel, Qwen thinking budgets, on/off flags for models like DeepSeek/GLM). A level a model doesn't support is mapped to the nearest one it offers.

Param	Meaning
`reasoning_effort`	Named level, lowest → highest: `none`, `minimal`, `low`, `medium`, `high`, `xhigh`, `max`. (`none` disables reasoning. On Anthropic these become budget ratios ≈ 0.1/0.2/0.5/0.8/0.95 of `max_tokens`, clamped to [1024, 128000]; on Gemini they map to `thinkingLevel`.)
`reasoning_max_tokens`	Exact reasoning token budget (Anthropic/Gemini/Qwen-style) for fine-grained control. Mutually exclusive with `reasoning_effort`.
`reasoning_enabled`	Turn default-strength reasoning on/off without picking a level or budget.
`reasoning_exclude`	Model still reasons, but the reasoning tokens aren't returned in the response.
`temperature`	Sampling temperature, 0–2 (lower = more deterministic). Applied when the model supports it; OpenRouter drops it for models that don't.

Note for agents (MCP or CLI): if a call fails or any error occurs — timeout, rate limit, model rejection, provider outage — retry the exact same call first (waiting briefly for transient errors). If it keeps failing, report the error and ask; do not downgrade or change the configuration the user set (model, reasoning level/budget, temperature, Fusion panel/judge) without their direct say-so. This rule is also baked into the server's MCP instructions and the CLI --help.

Related MCP server: agentloop

CLI

Everything the MCP server does is also available as a plain shell command — same client, same patterns library, but the answer comes back as raw text on stdout with zero JSON-RPC framing. For agents that can run shell commands, this is the token-cheap way to delegate: no MCP envelope in either direction, and piped stdin means large inputs (diffs, logs, files) never have to be echoed through the model's context at all.

npm run build        # compiles dist/cli.js
npm link             # optional: puts `openrouter-subagents` on your PATH

# ask (the subcommand is optional); raw answer on stdout
openrouter-subagents "why is the sky blue?"
openrouter-subagents ask -m anthropic/claude-haiku-4.5 -e xhigh "prove sqrt(2) is irrational"

# piped stdin becomes the prompt — or the context when a prompt is given
git diff | openrouter-subagents ask -p "review this diff for bugs" -e high
openrouter-subagents ask -p "summarize" --context-file big-report.md -m openai/gpt-5-mini

# patterns
openrouter-subagents patterns
openrouter-subagents pattern two-layer-cross-model-expert

Flags mirror the MCP tool: -m/--model, -i/--instructions (defaults to a terse general-purpose prompt), -p/--prompt, -c/--context (each with a --*-file variant), -e/--effort (none…max), --reasoning-tokens <n>, --reasoning on|off, --hide-reasoning, -t/--temperature, and Fusion's --analysis-models / --judge. --help shows the full reference. Exit codes: 0 success, 2 usage error, 1 API/network error.

Orchestration patterns

Patterns are reusable playbooks (Markdown in patterns/) that describe how to drive the expert tool — splitting work, bundling context, calling the expert, verifying its output against ground truth, and aggregating. They're exposed via list_patterns (catalog) and get_pattern("<name>") (full text), read from disk at call time (no rebuild to add one), and the server's instructions nudge the agent to consult them before non-trivial expert work.

name	what it does
`two-layer-cross-model-expert`	Wrap the OpenRouter expert in verifying Claude subagents so the orchestrator only ever sees parallel, context-cheap, ground-truth-checked conclusions. (Fusion makes the "cross-model" premise even stronger — the expert is a whole panel of model families.)
`worker-orchestrator`	Fan concrete work out to the OpenRouter worker (`ask_openrouter` with a fast model) through cheap Sonnet wrapper subagents — validated by execution, not a verification gate.

Both patterns ship a rendered diagram under patterns/html/. See patterns/README.md to add your own.

Setup

Requires Node 18+ (uses the global fetch) and an OpenRouter API key.

npm install
npm run build
cp .env.example .env       # then put your key in .env

Get a key at https://openrouter.ai/keys and set OPENROUTER_API_KEY in .env. .env is gitignored and must never be committed — only .env.example is tracked.

Configuring a default Fusion panel + judge (optional)

By default, openrouter/fusion uses OpenRouter's built-in "Quality" preset. You can override that default for every Fusion call from your .env:

# 1–8 panel models that answer in parallel:
OPENROUTER_FUSION_ANALYSIS_MODELS=anthropic/claude-opus-latest,openai/gpt-latest,google/gemini-pro-latest
# the judge that synthesizes them:
OPENROUTER_FUSION_JUDGE_MODEL=anthropic/claude-opus-latest

Precedence is per-call arg > .env default > OpenRouter preset, resolved independently for the panel and the judge: a per-call analysis_models / judge_model on ask_openrouter overrides the matching .env default, and these defaults apply only to openrouter/fusion (they're ignored for any other model).

Register with Claude Code

claude mcp add -s user openrouter-subagents -- node /absolute/path/to/openrouter-subagents/dist/server.js

(Claude Code reads MCP registrations at startup, so a newly added server appears after a full restart.)

How it works

ask_openrouter builds an OpenAI-style Chat Completions request (system + user messages).
For openrouter/fusion, the panel (analysis_models) and judge (judge_model) are resolved with precedence per-call arg > .env default > OpenRouter's preset, then sent as a plugins: [{ id: "fusion", … }] entry. With none of them set, OpenRouter's built-in Quality preset is used.
The request is POSTed to https://openrouter.ai/api/v1/chat/completions with Authorization: Bearer $OPENROUTER_API_KEY; the answer is choices[0].message.content.
Fusion is slow (parallel panel + synthesis), so the client uses a generous request timeout (~280s).

Security

The API key lives in .env (gitignored everywhere); only .env.example (a placeholder) is tracked.
Outbound instructions / prompt / context are run through a best-effort secret redactor (API keys, tokens, private keys) before they leave your machine — not a guarantee; don't paste highly sensitive data.
Data boundary: with the default openrouter/fusion, a single call fans your input out to several third-party providers at once (e.g. Anthropic, OpenAI, Google) via OpenRouter.
Local agent/editor state (.mempalace/, .claude/, CLAUDE.local.md, IDE folders) is gitignored.

License

MIT

Install Server

license - permissive license

quality

maintenance

How are these scores calculated?

Maintenance

–Maintainers

–Response time

–Release cycle

–Releases (12mo)

Commit activity

Resources

GitHub Repository

Need Help?

Related Servers

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Tools

Latest Blog Posts

Who's Calling? MCP Hosts Are an Identity Blind Spot (And the Spec Knows It)
By Om-Shree-0709 on July 25, 2026.
mcp
Agent Identity
OAuth 2.1
Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Wally-Ahmed/openrouter-subagents'

If you have feedback or need assistance with the MCP directory API, please join our Discord server