GLM Subagent MCP

by djerok

GLM-as-Subagent for Claude Code — plug & play

📦 Canonical source: https://github.com/djerok/glm_mcp_claude — created by @djerok. If you found this via a fork, mirror, or an awesome-list, the original lives here. Please ⭐ / file issues / open PRs at the source.

Add GLM (Zhipu / Z.ai) to Claude Code as a cheap, full-capability subagent (~10× cheaper than Opus), with automatic per-task routing between Opus and GLM. Your main agent stays on Opus; GLM does the well-specified, cost-sensitive work — and can read, write, edit, and run your files directly. One command to install.

✅ Works in the Claude Code app with a Claude subscription. Your main agent runs on the plan you already pay for through the Claude Code app (Pro / Max / Team), so no separate pay-per-token Anthropic API key is required. Only GLM needs a (cheap) Z.ai key. Opus orchestrates on your subscription; GLM does the heavy lifting at a fraction of the cost.

Figure: the glm subagent (orchestrated by Haiku 4.5, the cheap layer) reading the repo and offloading the heavy work to GLM via the MCP tools, showing the Opus → Haiku → GLM hybrid in action.

# no clone needed — run straight from GitHub:
npx github:djerok/glm_mcp_claude --key YOUR_ZAI_API_KEY

# or clone the repo and run the installer from its root:
git clone https://github.com/djerok/glm_mcp_claude
cd glm_mcp_claude
node install.mjs --key YOUR_ZAI_API_KEY

…then restart Claude Code. That's it. (Details below.)

🔑 Your key must be from the Z.ai / Zhipu GLM Coding Plan. Get one at https://z.ai → subscribe to the GLM Coding Plan, then create an API key. A generic / free key without coding-plan access will not work for the coding models used here.

What you get

glm subagent — a full-tool subagent (read/write/edit/bash) powered by GLM.
glm_agent tool — GLM as a real file-editing agent with built-in oversight (diff, dry-run, git revert).
glm_delegate / glm_recommend / glm_status — draft-only delegation, a free routing advisor, and a health check.
Auto-delegation hook — when you spawn a subagent, it injects a GLM-vs-Opus verdict so cheap work goes to GLM automatically. Zero token cost when you're not spawning subagents.
If you explicitly name an agent ("use opus", "use the sonnet agent", "use glm"), the hook stays silent and just routes where you asked.
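
As a rough illustration of the explicit-naming rule, the sketch below detects phrases such as "use opus" or "use the sonnet agent" and suppresses the router note when one is found. The regular expression and the names `explicitAgent` and `routerNote` are hypothetical, chosen for this example only; the project's real hook may detect explicit requests differently.

```javascript
// Hypothetical sketch: NOT the project's actual hook code.
// Matches "use opus", "use the sonnet agent", "use glm" (case-insensitive).
const EXPLICIT_AGENT = /\buse\s+(?:the\s+)?(opus|sonnet|glm)(?:\s+agent)?\b/i;

// Returns the agent the user explicitly asked for, or null if none was named.
function explicitAgent(prompt) {
  const match = EXPLICIT_AGENT.exec(prompt);
  return match ? match[1].toLowerCase() : null;
}

// Stays silent (null) when the user already chose an agent;
// otherwise returns the note that would be injected.
function routerNote(prompt, verdict) {
  return explicitAgent(prompt) ? null : `[GLM router] ${verdict}`;
}
```

With this sketch, `routerNote("use glm for the docs", "...")` yields `null`, while a prompt that names no agent gets the `[GLM router]` verdict prepended.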

Prerequisites

The Claude Code app (desktop or CLI), signed in with a Claude subscription (Pro / Max / Team). Your main agent uses this, so no Anthropic API key is needed. The claude CLI should be on your PATH (check with claude --version).
Node.js ≥ 18 (node -v)
A Z.ai / Zhipu API key with GLM Coding Plan access — get one at https://z.ai. ⚠️ It must be on the GLM Coding Plan (the coding-plan subscription); a generic or free key won't have access to the coding models this uses. This is the only paid key required, and GLM is ~10× cheaper than Opus.
Git (optional, but enables glm_agent's one-command revert)

Install (recommended: global, all projects)

# from this folder:
node install.mjs --key YOUR_ZAI_API_KEY

The installer:

copies the server to ~/.claude/glm-mcp/ and runs npm install,
writes your key into ~/.claude/glm-mcp/.env,
installs the glm subagent (~/.claude/agents/glm.md) and the hook (~/.claude/hooks/),
wires the hook into ~/.claude/settings.json (backs it up first),
adds a short delegation policy to ~/.claude/CLAUDE.md,
registers the MCP server with claude mcp add glm -s user.

Then restart Claude Code and run glm_status — you should see "api_key_loaded": true.

Options: --no-register (skip the CLI step), --skip-npm, --claude-dir PATH (custom config dir). Re-running is safe (idempotent). No key on the command line? Run node install.mjs, then edit ~/.claude/glm-mcp/.env and set GLM_API_KEY=....
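
If you want to sanity-check the key file before restarting, something like the following may help. It is an independent check, not the code behind glm_status: it parses .env-style text and reports whether GLM_API_KEY has a non-empty value. `parseEnv` and `apiKeyLoaded` are hypothetical helper names, and the parser only handles simple KEY=VALUE lines.

```javascript
// Minimal .env parser (assumption: simple KEY=VALUE lines, '#' comments,
// optional matching quotes around the value).
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (!line || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq < 1) continue; // no key before '='
    const key = line.slice(0, eq).trim();
    const value = line
      .slice(eq + 1)
      .trim()
      .replace(/^(["'])(.*)\1$/, "$2"); // strip one pair of matching quotes
    vars[key] = value;
  }
  return vars;
}

// True when GLM_API_KEY is present and non-empty.
function apiKeyLoaded(envText) {
  const key = parseEnv(envText).GLM_API_KEY;
  return typeof key === "string" && key.length > 0;
}
```

To run it against the real file, read ~/.claude/glm-mcp/.env with Node's `fs.readFileSync` and pass the text to `apiKeyLoaded`.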

Per-project instead of global

Don't want it everywhere? Skip the installer and set it up by hand:

copy glm-mcp/ into your project,
run cd glm-mcp && npm install,
copy .mcp.json.example to .mcp.json in the project root and set the key,
optionally, copy agents/glm.md to .claude/agents/ and the hook into .claude/, then wire the hook into .claude/settings.json.

How it works (the short version)

You ask for something
  → Opus orchestrates
  → wants to delegate a chunk → spawns a subagent
       → [hook fires] "[GLM router] GLM-suitable repo task → use glm_agent (dry_run first)"
              (or "keep on Opus" for hard/sensitive work)
       → Opus runs glm_agent (GLM edits the files, runs tests) — or keeps it on Opus
       → you get a diff + action log + a one-command revert

The routing rules live in glm-mcp/src/router.js and the hook — not in always-on context — so they cost nothing until a subagent is actually spawned.

Routing in one line: GLM is the default (it's ~10× cheaper); Opus is the exception for work where being wrong is expensive — subtle debugging, architecture, large refactors, security, tool-heavy dependent loops, huge context, vision, or anything you mark sensitive.
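
The one-line rule can be sketched as a small decision function. This is an illustration only, not the contents of glm-mcp/src/router.js: the category labels, the `sensitive` flag, and the `route` function are hypothetical stand-ins for the exceptions listed above, and the sketch ignores cost bias and peak hours.

```javascript
// Hypothetical sketch of "GLM by default, Opus for the exceptions".
// Category labels are illustrative, not the project's real identifiers.
const OPUS_ONLY = new Set([
  "subtle-debugging",
  "architecture",
  "large-refactor",
  "security",
  "tool-heavy-loop",
  "huge-context",
  "vision",
]);

function route(task) {
  // Anything the user marks sensitive stays on Opus.
  if (task.sensitive) return { target: "opus", reason: "marked sensitive" };

  // Any category where being wrong is expensive also stays on Opus.
  const hit = (task.categories ?? []).find((c) => OPUS_ONLY.has(c));
  if (hit) return { target: "opus", reason: `being wrong is expensive: ${hit}` };

  // Everything else defaults to the cheaper model.
  return { target: "glm", reason: "default (cheaper)" };
}
```

For example, `route({ categories: ["docs"] })` picks GLM, while `route({ categories: ["security"] })` or `route({ sensitive: true })` keeps the work on Opus.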

The tools

| Tool | Cost | What it does |
| --- | --- | --- |
| `glm_recommend` | free (local) | GLM-or-Opus decision + model pick + reasons. |
| `glm_status` | free (local) | Peak window, active model, key/config health. |
| `glm_delegate` | GLM tokens | Text in → text out. GLM drafts; you place it. |
| `glm_agent` | GLM tokens | GLM works your repo directly (read/write/edit/bash). Returns a diff + action log + git revert; supports `dry_run` (propose, don't write). |

Example: directly calling the GLM agent

A real run — asking GLM (via glm_agent) to write a file end-to-end on disk:

Prompt: "Using the GLM agent glm_agent, write a 2000-word essay in Shakespearean format about the usefulness of an umbrella, into my Desktop."

GLM did it itself — created the file directly, no round-tripping the content through the main agent:

Output: Umbrella-Essay-Shakespeare.md — ~2,260 words of Early Modern English (thee/thou/thy, doth/hath) with two blank-verse interludes
Work: 18 tool-loop iterations; one file created, nothing existing touched
Cost: ~$0.064 — a fraction of running the same task on Opus

That's the point: the orchestrator stays on Opus while glm_agent does the heavy, file-touching work for cents.

Oversight (how you stay in control of `glm_agent`)

Entry: you/Opus choose when to call it and with which workdir.
`dry_run: true`: GLM proposes a full diff and writes nothing; approve, then apply.
After a real run: you get the unified diff, an action log, and a one-command git revert (git checkout <baseline> -- .).

Note: file/bash ops inside glm_agent run in the MCP server process (not gated per-edit) and are scoped to the workdir you pass. That's intentional (max autonomy) — point it only at repos you're fine letting it modify.
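
To make "scoped to the workdir" concrete, here is one common way to confine file paths to a directory in Node.js. It is a sketch under assumptions, not the server's actual implementation: `resolveInWorkdir` is a hypothetical helper, it only covers file paths (not what a bash command may touch), and it does not resolve symlinks.

```javascript
const path = require("node:path");

// Hypothetical helper: resolve a requested path against workdir and
// refuse anything that would land outside it.
function resolveInWorkdir(workdir, requested) {
  const root = path.resolve(workdir);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);

  // A relative path that is ".." or starts with "../" climbs out of root;
  // an absolute result means a different drive on Windows.
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`Path escapes workdir: ${requested}`);
  }
  return target;
}
```

A call such as `resolveInWorkdir("/repo", "src/a.js")` returns the absolute path inside the repo, whereas `"../etc/passwd"` or an absolute path outside the repo throws.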

Configuration (`~/.claude/glm-mcp/.env`)

| Var | Default | Meaning |
| --- | --- | --- |
| `GLM_API_KEY` | (none) | Your Z.ai key. Required. |
| `GLM_BASE_URL` | `https://api.z.ai/api/anthropic` | Anthropic-compatible endpoint. |
| `GLM_COST_BIAS` | `1.5` | How hard to favor GLM (it's ~10× cheaper). Higher = more GLM; `0` = decide on capability only. |
| `GLM_CAP` | `off` | Output-token cap. Off by default = generous (up to 131072 per call). Set `on` to enforce `GLM_MAX_TOKENS` and rein in spend. |
| `GLM_MAX_TOKENS` | `32768` | The hard per-call limit applied only when `GLM_CAP=on`. (`max_tokens` is a ceiling, not a target; you pay for actual output.) |
| `GLM_MAX_TOKENS_CEILING` | `131072` | The generous default used when the cap is off. |
| `GLM_MAX_CONCURRENT` | `1` | GLM caps in-flight requests; keep at 1. |
| `GLM_OFFPEAK_MODEL` / `GLM_PEAK_MODEL` | `glm-5.2` / `glm-5.2` | Model(s) for `auto`. Each can be a comma-separated list (e.g. `glm-5.2,glm-5-turbo`) and the router auto-picks: most capable for hard tasks, cheapest for easy ones. Peak rule: when `auto` lands on a glm-5.x model (3× surcharge) the router routes less work to GLM at peak; if you include a no-surcharge model (e.g. `GLM_PEAK_MODEL=glm-5.2,glm-4.7`) it's preferred at peak and GLM stays fine to use. |
| `GLM_PEAK_START_CN` / `GLM_PEAK_END_CN` | `14` / `18` | Peak window (China hour, UTC+8). |
| `GLM_AGENT_MAX_ITERS` | `30` | Max tool-loop turns for `glm_agent`. |
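
The interaction between `GLM_CAP`, `GLM_MAX_TOKENS`, and `GLM_MAX_TOKENS_CEILING` in the table above can be read as the following sketch. `effectiveMaxTokens` is a hypothetical function written from the table's description, not taken from the server's source; the defaults are the ones listed above.

```javascript
// Hypothetical sketch: which max_tokens ceiling applies for a given config.
// `env` is a plain object of string values, e.g. process.env.
function effectiveMaxTokens(env) {
  const capOn = String(env.GLM_CAP ?? "off").toLowerCase() === "on";
  const hardLimit = Number.parseInt(env.GLM_MAX_TOKENS ?? "32768", 10);
  const ceiling = Number.parseInt(env.GLM_MAX_TOKENS_CEILING ?? "131072", 10);

  // Cap on: enforce the hard per-call limit. Cap off: use the generous ceiling.
  return capOn ? hardLimit : ceiling;
}
```

So with no configuration the ceiling is 131072, `GLM_CAP=on` alone lowers it to 32768, and `GLM_MAX_TOKENS` has no effect unless the cap is on.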

Full list with comments: glm-mcp/.env.example.
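
For the peak-window settings, a check along these lines shows how a China-hour window maps onto UTC. It is a sketch under assumptions: `chinaHour` and `isPeak` are hypothetical names, and the window is treated as half-open (start hour included, end hour excluded), which the configuration notes do not specify.

```javascript
// China Standard Time is a fixed UTC+8 offset (no daylight saving).
function chinaHour(date) {
  return (date.getUTCHours() + 8) % 24;
}

// Hypothetical sketch: is `date` inside the configured peak window?
// Assumption: the window is [start, end), and may wrap past midnight.
function isPeak(date, env = {}) {
  const start = Number.parseInt(env.GLM_PEAK_START_CN ?? "14", 10);
  const end = Number.parseInt(env.GLM_PEAK_END_CN ?? "18", 10);
  const hour = chinaHour(date);
  return start <= end
    ? hour >= start && hour < end // same-day window
    : hour >= start || hour < end; // window wrapping midnight
}
```

With the defaults, 06:00 UTC is 14:00 in China and counts as peak, while 10:00 UTC (18:00 in China) falls just outside the window under the half-open assumption.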

Uninstall

node uninstall.mjs          # remove agent, hook, settings entry, MCP registration
node uninstall.mjs --purge  # also delete ~/.claude/glm-mcp (and its .env)

Security

Never commit/share your .env or a .mcp.json containing the key. .gitignore excludes them.
GLM routes through servers in China — don't send secrets/regulated code you wouldn't send to a third-party API. (Routing keeps sensitive-flagged work on Opus, but you decide what to delegate.)

Troubleshooting

| Symptom | Fix |
| --- | --- |
| `glm_status` missing / tools absent | Restart Claude Code; `claude mcp get glm` to confirm registration. |
| `api_key_loaded: false` | Set `GLM_API_KEY` in `~/.claude/glm-mcp/.env`. |
| Server fails to start | `cd ~/.claude/glm-mcp && npm run smoke` to see the real error. |
| `Too much concurrency` | Expected under load; it auto-retries. Don't fan out parallel GLM calls. |
| Hook not firing | Check `~/.claude/settings.json` has a `PreToolUse` `Task` matcher pointing at `glm_subagent_router.mjs`. |
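
For the "Hook not firing" row, the check can be automated against a parsed settings.json. The sketch assumes the file follows Claude Code's usual hooks layout (`hooks.PreToolUse` as an array of `{ matcher, hooks: [{ type, command }] }` entries); the exact command string the installer writes is not shown here, so the function only looks for the script name. `hasRouterHook` is a hypothetical helper.

```javascript
// Hypothetical check: does this settings object contain a PreToolUse entry
// whose matcher covers "Task" and whose command mentions the router script?
function hasRouterHook(settings) {
  const entries = settings?.hooks?.PreToolUse ?? [];
  return entries.some(
    (entry) =>
      /\bTask\b/.test(entry.matcher ?? "") &&
      (entry.hooks ?? []).some((hook) =>
        (hook.command ?? "").includes("glm_subagent_router.mjs")
      )
  );
}
```

Feed it the result of `JSON.parse` on the contents of ~/.claude/settings.json; a `false` result suggests re-running the installer so the hook entry is written again.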

More background and the research behind the routing rules: see docs/.

Contributing

PRs and issues welcome — see CONTRIBUTING.md. Good first areas: routing rules (glm-mcp/src/router.js + the hook), provider adapters, and docs. Please never commit secrets/.env.

License

MIT © djerok

Original / canonical repository: https://github.com/djerok/glm_mcp_claude. If you fork, mirror, or redistribute this project, please keep a link back to the source so others can find updates, file issues, and contribute. Built by @djerok.
