GLM Subagent MCP

by djerok

GLM-as-Subagent for Claude Code: plug & play

📦 Canonical source: https://github.com/djerok/glm_mcp_claude, created by @djerok. If you found this via a fork, mirror, or an awesome-list, the original lives here. Please ⭐ / file issues / open PRs at the source.

Add GLM (Zhipu / Z.ai) to Claude Code as a cheap, full-capability subagent (~10× cheaper than Opus), with automatic per-task routing between Opus and GLM. Your main agent stays on Opus; GLM does the well-specified, cost-sensitive work and can read, write, edit, and run your files directly. One command to install.

✅ Works in the Claude Code app on a subscription-based Claude. Your main agent runs on the Claude you already pay for through the Claude Code app (Pro / Max / Team subscription), so no separate pay-per-token Anthropic API key is required. Only GLM needs a (cheap) Z.ai key. Opus orchestrates on your subscription; GLM does the heavy lifting for a fraction of the cost.

Screenshot: the glm subagent (orchestrated by Haiku 4.5, the cheap layer) reading the repo and offloading the heavy work to GLM via the MCP tools. This is the Opus → Haiku → GLM hybrid in action.

# no clone needed, run straight from GitHub:
npx github:djerok/glm_mcp_claude --key YOUR_ZAI_API_KEY

# or clone and run the installer:
node install.mjs --key YOUR_ZAI_API_KEY

…then restart Claude Code. That's it. (Details below.)

🔑 Your key must be from the Z.ai / Zhipu GLM Coding Plan. Get one at https://z.ai: subscribe to the GLM Coding Plan, then create an API key. A generic / free key without coding-plan access will not work for the coding models used here.


What you get

  • glm subagent: a full-tool subagent (read/write/edit/bash) powered by GLM.

  • glm_agent tool: GLM as a real file-editing agent with built-in oversight (diff, dry-run, git revert).

  • glm_delegate / glm_recommend / glm_status: draft-only delegation, a free routing advisor, and a health check.

  • Auto-delegation hook: when you spawn a subagent, it injects a GLM-vs-Opus verdict so cheap work goes to GLM automatically. Zero token cost when you're not spawning subagents.

  • If you explicitly name an agent ("use opus", "use the sonnet agent", "use glm"), the hook stays silent and just routes where you asked.


Prerequisites

  • The Claude Code app (desktop or CLI), signed in with a subscription-based Claude (Pro / Max / Team). Your main agent uses this, so no Anthropic API key is needed. The claude CLI should be on your PATH (claude --version).

  • Node.js ≥ 18 (node -v)

  • A Z.ai / Zhipu API key with GLM Coding Plan access; get one at https://z.ai. ⚠️ It must be on the GLM Coding Plan (the coding-plan subscription); a generic or free key won't have access to the coding models this uses. This is the only paid key required, and GLM is ~10× cheaper than Opus.

  • Git (optional, but enables glm_agent's one-command revert)


# from this folder:
node install.mjs --key YOUR_ZAI_API_KEY

The installer:

  1. copies the server to ~/.claude/glm-mcp/ and runs npm install,

  2. writes your key into ~/.claude/glm-mcp/.env,

  3. installs the glm subagent (~/.claude/agents/glm.md) and the hook (~/.claude/hooks/),

  4. wires the hook into ~/.claude/settings.json (backs it up first),

  5. adds a short delegation policy to ~/.claude/CLAUDE.md,

  6. registers the MCP server with claude mcp add glm -s user.

Then restart Claude Code and run glm_status; you should see "api_key_loaded": true.

Options: --no-register (skip the CLI step), --skip-npm, --claude-dir PATH (custom config dir). Re-running is safe (idempotent). No key on the command line? Run node install.mjs, then edit ~/.claude/glm-mcp/.env and set GLM_API_KEY=....

Per-project instead of global

Don't want it everywhere? Skip the installer. Copy glm-mcp/ into your project, cd glm-mcp && npm install, copy .mcp.json.example to .mcp.json in the project root, set the key, and (optionally) copy agents/glm.md to .claude/agents/ and the hook into .claude/ + .claude/settings.json.


How it works (the short version)

You ask for something
  → Opus orchestrates
  → wants to delegate a chunk → spawns a subagent
       → [hook fires] "[GLM router] GLM-suitable repo task → use glm_agent (dry_run first)"
              (or "keep on Opus" for hard/sensitive work)
       → Opus runs glm_agent (GLM edits the files, runs tests), or keeps it on Opus
       → you get a diff + action log + a one-command revert

The routing rules live in glm-mcp/src/router.js and the hook, not in always-on context, so they cost nothing until a subagent is actually spawned.

Routing in one line: GLM is the default (it's ~10× cheaper); Opus is the exception for work where being wrong is expensive: subtle debugging, architecture, large refactors, security, tool-heavy dependent loops, huge context, vision, or anything you mark sensitive.
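
The one-line rule above can be illustrated with a minimal sketch. This is not the code in glm-mcp/src/router.js: the flag names, the `explicitAgent` field, the `SIGNAL_WEIGHT` constant, and the comparison against the cost bias are all assumptions made up for illustration. It only shows the shape of the decision: GLM by default, Opus when a risk signal outweighs the cost bias, and sensitive work always on Opus.

```javascript
// Hypothetical sketch of the routing idea; NOT the project's router.js.
// Flag names, SIGNAL_WEIGHT, and the scoring rule are illustrative assumptions.
const OPUS_SIGNALS = [
  "subtle_debugging", "architecture", "large_refactor", "security",
  "tool_heavy_loop", "huge_context", "vision",
];
const SIGNAL_WEIGHT = 2; // assumed: one signal outweighs the default bias of 1.5

function recommend(task, costBias = 1.5) {
  // An explicitly named agent wins; the router adds no verdict of its own.
  if (task.explicitAgent) {
    return { target: task.explicitAgent, reasons: ["explicit request"] };
  }
  const flags = task.flags || [];
  // Work marked sensitive stays on Opus regardless of the cost bias.
  if (flags.includes("sensitive")) {
    return { target: "opus", reasons: ["sensitive"] };
  }
  const reasons = flags.filter((f) => OPUS_SIGNALS.includes(f));
  const risk = reasons.length * SIGNAL_WEIGHT;
  // Higher costBias means more work lands on GLM; 0 means any signal picks Opus.
  const target = risk > costBias ? "opus" : "glm";
  return {
    target,
    reasons: reasons.length ? reasons : ["well-specified, cost-sensitive"],
  };
}
```

Under these assumptions, `recommend({ flags: [] })` picks GLM, while `recommend({ flags: ["security"] })` picks Opus at the default bias.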


The tools

| Tool | Cost | What it does |
| --- | --- | --- |
| glm_recommend | free (local) | GLM-or-Opus decision + model pick + reasons. |
| glm_status | free (local) | Peak window, active model, key/config health. |
| glm_delegate | GLM tokens | Text in → text out. GLM drafts; you place it. |
| glm_agent | GLM tokens | GLM works your repo directly (read/write/edit/bash). Returns a diff + action log + git revert; supports dry_run (propose, don't write). |

Example: directly calling the GLM agent

A real run, asking GLM (via glm_agent) to write a file end-to-end on disk:

Screenshot: GLM agent writing a 2000-word Shakespearean essay to disk in 18 iterations for about 6 cents

Prompt: "Using the GLM agent glm_agent, write a 2000-word essay in Shakespearean format about the usefulness of an umbrella, into my Desktop."

GLM did it itself: it created the file directly, with no round-tripping of the content through the main agent:

  • Output: Umbrella-Essay-Shakespeare.md, ~2,260 words of Early Modern English (thee/thou/thy, doth/hath) with two blank-verse interludes

  • Work: 18 tool-loop iterations; one file created, nothing existing touched

  • Cost: ~$0.064, a fraction of running the same task on Opus

That's the point: the orchestrator stays on Opus while glm_agent does the heavy, file-touching work for cents.


Oversight (how you stay in control of glm_agent)

  • Entry: you/Opus choose when to call it and with which workdir.

  • dry_run: true: GLM proposes a full diff and writes nothing; approve, then apply.

  • After a real run: you get the unified diff, an action log, and a one-command git revert (git checkout <baseline> -- .).

Note: file/bash ops inside glm_agent run in the MCP server process (not gated per-edit) and are scoped to the workdir you pass. That's intentional (max autonomy), so point it only at repos you're fine letting it modify.
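
As a rough illustration of the review-then-apply flow described above, the sketch below builds the two glm_agent calls and the revert command. `workdir` and `dry_run` are named in this README; the `task` argument name, the `tool`/`arguments` wrapper shape, and both helper functions are assumptions for illustration, not the server's documented schema.

```javascript
// Hypothetical helpers for a dry-run-first workflow with glm_agent.
// `task` and the { tool, arguments } shape are assumed, not taken from the server.
function planGlmAgentCalls(task, workdir) {
  if (!workdir) {
    throw new Error("workdir is required: glm_agent is scoped to it");
  }
  const base = { task, workdir };
  return [
    // Step 1: GLM proposes a full diff and writes nothing.
    { tool: "glm_agent", arguments: { ...base, dry_run: true } },
    // Step 2: run for real only after the proposed diff has been approved.
    { tool: "glm_agent", arguments: { ...base, dry_run: false } },
  ];
}

// Builds the revert command quoted above for a given baseline commit.
function revertCommand(baseline) {
  return `git checkout ${baseline} -- .`;
}
```

Keeping the two calls identical apart from `dry_run` means the approved proposal and the real run are driven by the same task description.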


Configuration (~/.claude/glm-mcp/.env)

| Var | Default | Meaning |
| --- | --- | --- |
| GLM_API_KEY | (none) | Your Z.ai key. Required. |
| GLM_BASE_URL | https://api.z.ai/api/anthropic | Anthropic-compatible endpoint. |
| GLM_COST_BIAS | 1.5 | How hard to favor GLM (it's ~10× cheaper). Higher = more GLM; 0 = decide on capability only. |
| GLM_CAP | off | Output-token cap. Off by default = generous (up to 131072 per call). Set on to enforce GLM_MAX_TOKENS and rein in spend. |
| GLM_MAX_TOKENS | 32768 | The hard per-call limit applied only when GLM_CAP=on. (max_tokens is a ceiling, not a target: you pay for actual output.) |
| GLM_MAX_TOKENS_CEILING | 131072 | The generous default used when the cap is off. |
| GLM_MAX_CONCURRENT | 1 | GLM caps in-flight requests; keep at 1. |
| GLM_OFFPEAK_MODEL / GLM_PEAK_MODEL | glm-5.2 / glm-5.2 | Model(s) for auto. Each can be a comma-separated list (e.g. glm-5.2,glm-5-turbo) and the router auto-picks: most capable for hard tasks, cheapest for easy ones. Peak rule: when auto lands on a glm-5.x model (3× surcharge) the router routes less work to GLM at peak; if you include a no-surcharge model (e.g. GLM_PEAK_MODEL=glm-5.2,glm-4.7) it's preferred at peak and GLM stays fine to use. |
| GLM_PEAK_START_CN / GLM_PEAK_END_CN | 14 / 18 | Peak window (China hour, UTC+8). |
| GLM_AGENT_MAX_ITERS | 30 | Max tool-loop turns for glm_agent. |

Full list with comments: glm-mcp/.env.example.
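
To make the interaction between the cap variables and the peak window concrete, here is a minimal sketch that reads the documented variables with their documented defaults. The function names are hypothetical, and two details are assumptions rather than facts about the server: that the peak window is half-open (start hour included, end hour excluded), and that an unset variable simply falls back to its default.

```javascript
// Hypothetical sketch of how the documented settings could be interpreted.
// Not the server's implementation; see glm-mcp/.env.example for the real list.
function effectiveMaxTokens(env = {}) {
  const capOn = String(env.GLM_CAP || "off").toLowerCase() === "on";
  const cap = Number(env.GLM_MAX_TOKENS || 32768);
  const ceiling = Number(env.GLM_MAX_TOKENS_CEILING || 131072);
  // Cap on: enforce GLM_MAX_TOKENS. Cap off: use the generous ceiling.
  return capOn ? cap : ceiling;
}

function isPeakHour(date, env = {}) {
  const start = Number(env.GLM_PEAK_START_CN || 14);
  const end = Number(env.GLM_PEAK_END_CN || 18);
  // China time is UTC+8 with no daylight saving.
  const hourCN = (date.getUTCHours() + 8) % 24;
  // Assumption: the window includes the start hour and excludes the end hour.
  return hourCN >= start && hourCN < end;
}
```

With the defaults, 07:00 UTC is 15:00 in China and falls inside the window, so a glm-5.x model would carry the peak surcharge at that time.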


Uninstall

node uninstall.mjs          # remove agent, hook, settings entry, MCP registration
node uninstall.mjs --purge  # also delete ~/.claude/glm-mcp (and its .env)

Security

  • Never commit/share your .env or a .mcp.json containing the key. .gitignore excludes them.

  • GLM routes through servers in China, so don't send secrets/regulated code you wouldn't send to a third-party API. (Routing keeps sensitive-flagged work on Opus, but you decide what to delegate.)


Troubleshooting

| Symptom | Fix |
| --- | --- |
| glm_status missing / tools absent | Restart Claude Code; claude mcp get glm to confirm registration. |
| api_key_loaded: false | Set GLM_API_KEY in ~/.claude/glm-mcp/.env. |
| Server fails to start | cd ~/.claude/glm-mcp && npm run smoke to see the real error. |
| Too much concurrency | Expected under load; it auto-retries. Don't fan out parallel GLM calls. |
| Hook not firing | Check ~/.claude/settings.json has a PreToolUse Task matcher pointing at glm_subagent_router.mjs. |
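
For the "Hook not firing" check, a small sketch like the one below can confirm the wiring once settings.json has been parsed into an object. It assumes the settings file uses the layout `hooks.PreToolUse[].matcher` with a nested `hooks[].command` string; the installer's exact output may differ, and the function name is hypothetical.

```javascript
// Hypothetical check: does a parsed settings.json object wire the GLM router hook?
// Assumes the layout { hooks: { PreToolUse: [{ matcher, hooks: [{ command }] }] } }.
function hasGlmRouterHook(settings) {
  const entries = (settings && settings.hooks && settings.hooks.PreToolUse) || [];
  return entries.some((entry) => {
    // A matcher may list several tools separated by "|"; look for Task among them.
    const matchesTask =
      typeof entry.matcher === "string" &&
      entry.matcher.split("|").includes("Task");
    const pointsAtRouter = (entry.hooks || []).some((h) =>
      String(h.command || "").includes("glm_subagent_router.mjs")
    );
    return matchesTask && pointsAtRouter;
  });
}
```

A false result means either the Task matcher or the path to glm_subagent_router.mjs is missing, in which case re-running the installer (it is idempotent) is the simplest repair.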

More background and the research behind the routing rules: see docs/.


Contributing

PRs and issues welcome; see CONTRIBUTING.md. Good first areas: routing rules (glm-mcp/src/router.js + the hook), provider adapters, and docs. Please never commit secrets/.env.

License

MIT © djerok


Original / canonical repository: https://github.com/djerok/glm_mcp_claude. If you fork, mirror, or redistribute this project, please keep a link back to the source so others can find updates, file issues, and contribute. Built by @djerok.
