token-savior
Token Savior is an MCP server providing structural code navigation, persistent memory, and token-efficient tooling for AI coding agents.
Project & Workspace Management
List, switch, and add workspace projects; trigger reindexing; get high-level project summaries.
Code Navigation & Symbol Lookup
Find symbol definitions, retrieve full source, list all functions/classes/imports, search via regex, get file/project structure summaries.
Dependency & Impact Analysis
Find dependencies and dependents; analyze full transitive change impact; trace call chains; detect cross-project dependencies.
All-in-One Context
Get edit context (source + dependencies + callers) in a single call; find all files related to a feature keyword.
Git & Diff Tools
Structured git status; symbol-oriented summaries of worktree changes or changes since a ref; detect breaking API changes; build compact commit/review summaries.
Safe Editing & Checkpoints
Replace symbol source or insert content before/after a symbol without file-wide patches. Create, list, restore, compare, and prune checkpoints for safe mutations.
Testing & Validation
Find impacted test files; run only impacted tests; replace a symbol and auto-validate with optional rollback on failure.
Code Quality & Framework Analysis
Find dead code; rank complexity hotspots; detect Next.js routes/API endpoints and React components; analyze config files for secrets/duplicates; inspect Dockerfiles; cross-reference environment variable usage.
Persistent Memory Engine
Store and retrieve observations, decisions, and conventions across sessions with hybrid search (BM25 + vector), progressive disclosure, Bayesian validity, contradiction detection, decay/TTL, ROI tracking, and MDL distillation.
Efficiency & Customization
Session token-saving stats; customizable tool profiles (full, code_mode, auto) to minimize manifest size; defer-loading via ts_search.
Synchronizes the codebase index by monitoring git status and diffs, and provides symbol-level impact analysis since specific git references.
Indexes documentation files via heading detection to allow section-based navigation and querying.
Identifies and executes specific tests impacted by symbol changes, providing compact summaries instead of raw logs.
Provides structural indexing and surgical source code retrieval for Python projects, including functions, classes, and dependency mapping.
Indexes Rust projects to extract and query symbols such as functions, structs, traits, and impl blocks.
Enables structural analysis and symbol extraction for TypeScript and JavaScript, covering functions, interfaces, and type aliases.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@token-savior What is the impact of changing the LLMClient class?"
That's it! The server will respond to your query, and you can continue using it as needed.
⚡ Token Savior Recall
The MCP server that turns Claude into the only coding agent hitting 100% on a real benchmark. Structural code navigation + persistent memory. −77% active tokens. −76% wall time. Zero losses.
📖 mibayy.github.io/token-savior: project site + benchmark landing
🧪 github.com/Mibayy/tsbench: benchmark source + fixtures
Benchmark — 96 real coding tasks (tiny+v2 default)
Metric | Plain Claude Code | With Token Savior |
Score | 141 / 180 (78.3%) | 192 / 192 (100.0%) |
Active tokens / task | 17 221 | 3 929 (−77%) |
Wall time / task | 110.6 s | 26.6 s (−76%) |
Wins / Ties / Losses | — | 25 / 65 / 0 (90 paired) |
Perfect (100%) across all 12 categories: audit, bug_fixing,
code_generation, code_review, config_infra, data_analysis,
documentation, explanation, git, navigation, refactoring,
writing_tests. Zero losses against plain Claude: every task is a
win or a tie.
The default config (TS_PROFILE=tiny_plus with 15 tools and a ~2.5 KT manifest, plus
TS_CAPTURE_DISABLED=1, plus the v2 system prompt that bans Agent sub-agent delegation) reproduces 100% on Opus 4.7 with −54% active tokens vs the legacy lean profile.
Also validated on Sonnet 4.6 (ts 170/180 = 94.4% vs base 156/180 = 86.7%).
Model: Claude Opus 4.7 · Methodology + per-task breakdown: mibayy.github.io/token-savior.
What it does
Claude Code reads whole files to answer questions about three lines, and forgets
everything the moment a session ends. Token Savior Recall fixes both. It
indexes your codebase by symbol — functions, classes, imports, call graph — so
the model navigates by pointer instead of by cat. Measured reduction: 97%
fewer chars injected across 170+ real sessions.
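To make "navigates by pointer" concrete, here is a minimal sketch of symbol-level indexing using Python's standard `ast` module. It is not Token Savior's implementation; the names `index_symbols`, `get_symbol_source`, and the `SAMPLE` module are assumptions made up for illustration.

```python
import ast

# Hypothetical source file used only for this illustration.
SAMPLE = '''\
import os
from pathlib import Path

class LLMClient:
    def send(self, prompt):
        return prompt.upper()

def helper(x):
    return x + 1
'''


def index_symbols(source: str) -> dict:
    """Record the line span of every function/class and collect imported modules."""
    tree = ast.parse(source)
    symbols, imports = {}, []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols[node.name] = (node.lineno, node.end_lineno)
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            imports.append(node.module or "")
    return {"symbols": symbols, "imports": imports}


def get_symbol_source(source: str, name: str) -> str:
    """Return only the lines belonging to one symbol instead of the whole file."""
    start, end = index_symbols(source)["symbols"][name]
    return "\n".join(source.splitlines()[start - 1:end])


index = index_symbols(SAMPLE)
snippet = get_symbol_source(SAMPLE, "helper")
print(snippet)
```

The point of the design is that a lookup returns a two-line snippet rather than the full file, which is where the character savings come from.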
On top of that sits a persistent memory engine. Every decision, bugfix, convention, guardrail and session rollup is stored in SQLite WAL + FTS5 + vector embeddings, ranked by Bayesian validity and ROI, and re-injected as a compact delta at the start of the next session. Contradictions are detected at save time; observations decay with explicit TTLs; a 3-layer progressive-disclosure contract keeps lookup cost bounded.
Token savings
Operation | Plain Claude | Token Savior | Reduction |
| 41M chars (full read) | 67 chars | −99.9% |
| grep + cat chain | 4.5K chars | direct |
| impossible | 16K chars (154 direct + 492 transitive) | new capability |
| 130 lines | 12 lines | −92% |
| n/a | ~15 tokens/result | Layer 1 shortlist |
90-task tsbench (Opus base→ts) | 17.2 KT active/task | 3.9 KT active/task | −77% |
tsbench score (Opus, 96 tasks) | 141/180 (78.3%) | 192/192 (100.0%) | +22 pts |
Full benchmark methodology and per-task results: tsbench.
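The "direct + transitive" split in the table above can be pictured as a breadth-first walk over a reversed call graph. The sketch below is an assumption about the general technique, not the project's code; the `CALLS` graph and the `impact` function are hypothetical.

```python
from collections import deque

# Hypothetical call graph: each key calls the functions in its list.
CALLS = {
    "api_handler": ["update_user"],
    "cli_main": ["api_handler"],
    "update_user": ["validate", "save"],
    "report": ["save"],
}


def reverse_graph(calls):
    """Invert caller -> callees into callee -> set of callers."""
    dependents = {}
    for caller, callees in calls.items():
        for callee in callees:
            dependents.setdefault(callee, set()).add(caller)
    return dependents


def impact(symbol, calls):
    """Split everything affected by a change to `symbol` into direct and transitive dependents."""
    dependents = reverse_graph(calls)
    direct = set(dependents.get(symbol, ()))
    seen = set(direct)
    queue = deque(direct)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen and parent != symbol:
                seen.add(parent)
                queue.append(parent)
    return {"direct": sorted(direct), "transitive": sorted(seen - direct)}


result = impact("save", CALLS)
print(result)
```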
Memory engine
Capability | How it works |
Storage | SQLite WAL + FTS5 + |
Hybrid search | BM25 + vector ( |
Progressive disclosure | 3-layer contract: |
Citation URIs | |
Bayesian validity | Each obs carries a validity prior + update rule; stale obs are surfaced, not silently trusted |
Contradiction detection | Triggered at save time against existing index; flagged in hook output |
Decay + TTL | Per-type TTL (command 60d, research 90d, note 60d) + LRU scoring |
Symbol staleness | Obs linked to symbols are invalidated when the symbol's content hash changes |
ROI tracking | Access count × context weight — unused obs age out, high-ROI obs are promoted |
MDL distillation | Minimum Description Length grouping compresses redundant observations into conventions |
Auto-promotion | note ×5 accesses → convention; warning ×5 → guardrail |
Hooks | 8 Claude Code lifecycle hooks (SessionStart/Stop/End, PreCompact, PreToolUse ×2, UserPromptSubmit, PostToolUse) |
Web viewer | |
LLM auto-extraction | Opt-in |
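As a rough illustration of the decay and auto-promotion rows, the sketch below applies the TTLs and the ×5 threshold from the table. Everything else is an assumption: the function names are hypothetical, types without a listed TTL are treated as never expiring, and "×5" is read as "at least five accesses".

```python
from datetime import datetime, timedelta, timezone

# TTLs and promotion pairs are taken from the table above.
TTL_DAYS = {"command": 60, "research": 90, "note": 60}
PROMOTIONS = {"note": "convention", "warning": "guardrail"}
PROMOTION_THRESHOLD = 5  # assumption: promote once access count reaches 5


def is_expired(obs_type, created_at, now):
    """Return True when an observation has outlived its per-type TTL."""
    ttl = TTL_DAYS.get(obs_type)
    if ttl is None:  # assumption: types with no listed TTL never expire
        return False
    return now - created_at > timedelta(days=ttl)


def promoted_type(obs_type, access_count):
    """Return the type an observation should have after auto-promotion."""
    if access_count >= PROMOTION_THRESHOLD and obs_type in PROMOTIONS:
        return PROMOTIONS[obs_type]
    return obs_type


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
old_note = datetime(2025, 3, 1, tzinfo=timezone.utc)  # 92 days before `now`
print(is_expired("note", old_note, now), is_expired("research", old_note, now))
print(promoted_type("note", 5), promoted_type("warning", 2))
```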
vs claude-mem
Two projects share the goal — persistent memory for Claude Code. The axes below are measured, not marketing.
Axis | claude-mem | Token Savior Recall |
Bayesian validity | no | yes |
Contradiction detection at save | no | yes |
Per-type decay + TTL | no | yes |
Symbol staleness (content-hash linked obs) | no | yes |
ROI tracking + auto-promotion | no | yes |
MDL distillation into conventions | no | yes |
Code graph / AST navigation | no | yes (90 tools, cross-language) |
Progressive disclosure contract | no | yes (3 layers, ~15/60/200 tokens) |
Hybrid FTS + vector search (RRF) | no | yes |
Token Savior Recall is a superset: it ships the memory engine plus the structural codebase server that gave the project its name.
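The table names RRF (Reciprocal Rank Fusion) as the way FTS and vector results are combined. A minimal sketch of standard RRF follows; the constant `k=60` is the value commonly used in the literature and is an assumption here, as are the observation IDs and the `rrf_fuse` name.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists: each item scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword (BM25) search and a vector search.
bm25_hits = ["obs-3", "obs-1", "obs-7"]
vector_hits = ["obs-1", "obs-9", "obs-3"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Because RRF only uses ranks, it needs no score normalization between the two retrievers, which is one reason it is a popular choice for hybrid search.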
Install
uvx (no venv, no clone)
uvx token-savior-recall
pip
pip install "token-savior-recall[mcp]"
# Optional hybrid vector search:
pip install "token-savior-recall[mcp,memory-vector]"
Claude Code one-liner
claude mcp add token-savior -- /path/to/venv/bin/token-savior
Development
git clone https://github.com/Mibayy/token-savior
cd token-savior
python3 -m venv .venv
.venv/bin/pip install -e ".[mcp,dev]"
pytest tests/ -q
Configure
{
"mcpServers": {
"token-savior-recall": {
"command": "/path/to/venv/bin/token-savior",
"env": {
"WORKSPACE_ROOTS": "/path/to/project1,/path/to/project2",
"TOKEN_SAVIOR_CLIENT": "claude-code"
}
}
}
}
Optional env: TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID (critical-observation
feed), TS_VIEWER_PORT (web viewer), TS_AUTO_EXTRACT=1 + TS_API_KEY
(LLM auto-extraction), TOKEN_SAVIOR_PROFILE (full / core / nav / lean /
ultra; filters the advertised tool set to shrink the per-turn MCP manifest).
Tools (90)
Category counts — full catalog is served via MCP tools/list.
Category | Count |
Core navigation | 14 |
Dependencies & graph | 9 |
Git & diffs | 5 |
Safe editing | 8 |
Checkpoints | 6 |
Test & run | 6 |
Config & quality | 8 |
Docker & multi-project | 2 |
Advanced context (slicing, packing, RWR, prefetch, verify) | 6 |
Memory engine | 21 |
Reasoning (plan/decision traces) | 3 |
Stats, budget, health | 10 |
Project management | 7 |
Profiles
TOKEN_SAVIOR_PROFILE filters the advertised tools/list payload while
keeping handlers live.
Profile | Advertised | ~Tokens | Use case |
| 15-18 | ~2 500 | Adaptive manifest sized from your real telemetry |
| 68 | ~8 770 | All capabilities, debug, power users |
| 4 | ~1 500 | Multi-tool chains in one |
— | — | Deprecated in v3.4, removed in v4.0 — use |
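The idea of filtering the advertised list while keeping handlers live can be sketched as two separate lookups: one for what is listed, one for what can be called. This is an illustration under assumptions, not the server's code; the `minimal` profile, the handler bodies, and the function names are hypothetical.

```python
# Every handler stays registered regardless of the active profile.
HANDLERS = {
    "find_symbol": lambda name: f"definition of {name}",
    "get_dependents": lambda name: f"dependents of {name}",
    "ts_search": lambda query: f"tools matching {query}",
}

# A profile only controls which tool names are advertised.
PROFILES = {
    "full": set(HANDLERS),
    "minimal": {"find_symbol", "ts_search"},  # hypothetical profile for this sketch
}


def list_tools(profile):
    """Names advertised in the manifest for a profile (this is what costs tokens each turn)."""
    return sorted(PROFILES[profile])


def call_tool(name, arg):
    """Dispatch ignores the profile, so an unadvertised tool still works when called by name."""
    return HANDLERS[name](arg)


advertised = list_tools("minimal")
hidden_result = call_tool("get_dependents", "update_user")
print(advertised, hidden_result)
```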
Bench-mode env vars
For benchmark / cold-start workloads where memory and capture sandboxing add no value, pair the profile with these env vars:
export TOKEN_SAVIOR_PROFILE=lean # or 'tiny' for max trim
export TS_MEMORY_DISABLE=1 # hide memory_* (-300 t)
export TS_CAPTURE_DISABLED=1 # hide capture_*, skip PostToolUse hook
export TS_HOOK_MINIMAL=1 # SessionStart emits Memory Index only
export TS_NO_HINTS=1 # drop _hints / _suggestion (~30-50 t/call)
Measured on tsbench (90 tasks, Claude Opus 4.7):
Configuration | Active tokens / task | Score |
Plain agent (Read/Grep/Bash, no Token Savior) | 17 221 | 78.3 % |
| 8 928 | 100.0 % |
| ~5 500 | 100.0 % |
Defer-loading via ts_search
When the manifest budget is the bottleneck, the new tiny profile
exposes only 6 tools (switch_project, find_symbol,
get_function_source, get_full_context, search_codebase,
ts_search). The other ~60 tools are reachable just-in-time via:
ts_search(query="find dependents of update_user", top_k=5)
# → {"matched_tools": [{"name": "get_dependents", "score": 0.68, ...}, ...]}
Embeddings (Nomic 768d) score every tool description against the query; the top-K candidates come back with their full inputSchema so the next turn can call them directly. This mirrors the Tool Attention paper (47.3k → 2.4k tokens / turn at 120 tools, −95 % prefix).
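The ranking step can be sketched as "embed the query, embed each description, return the top-K by cosine similarity". To stay self-contained, the sketch below swaps the Nomic embeddings for a bag-of-words vector, so its scores will not match the real server. The tool descriptions, the `find_dead_code` tool name, and the function names are assumptions.

```python
import math
from collections import Counter

# Hypothetical tool descriptions; the real catalog is served via MCP tools/list.
TOOLS = {
    "get_dependents": "find dependents callers of a symbol",
    "get_function_source": "return the source code of a function",
    "find_dead_code": "find unused functions and dead code",
}


def embed(text):
    """Stand-in for a real embedding model: a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search_tools(query, top_k=2):
    """Score every tool description against the query and return the best top_k."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(desc)), name) for name, desc in TOOLS.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [{"name": name, "score": round(score, 2)} for score, name in scored[:top_k]]


matches = search_tools("find dependents of update_user")
print(matches)
```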
Code Mode — collapse multi-tool chains into one JS sandbox
TOKEN_SAVIOR_PROFILE=code_mode exposes just 4 tools (ts_execute,
ts_search, switch_project, list_projects) and lets the model write
a JS body that calls 34 internal Token Savior tools through a typed
facade. Replaces the standard find_symbol → get_function_source →
get_dependents 3-round-trip chain with a single tool call.
# Step 1: discover signatures on demand
ts_search(query="locate symbol and find callers", format="ts")
# → matched_tools: [
# {"name":"find_symbol", "signature":"find_symbol: (args?: { name?: string; ... }) => Promise<unknown>"},
# {"name":"get_dependents", "signature":"get_dependents: (args: { name: string; ... }) => Promise<unknown>"},
# ]
# Step 2: chain them in one round-trip
ts_execute(script="""
const sym = await tools.find_symbol({ name: "process_payment" });
const callers = await tools.get_dependents({ name: sym.symbol });
return { sym, callers };
""")
# → {"value": {...}, "logs": [...], "tool_calls": 2, "duration_ms": 52}
Adapted from Cloudflare's Code Mode for MCP.
The sandbox is a Node subprocess with stdio IPC. Each script runs in an
isolated context with a ~50 ms cold spawn and a configurable timeout. Disable
it entirely with TS_CODE_MODE_DISABLE=1.
Anthropic API users — pair with native context management
For long agent loops, combine Token Savior with Anthropic's native context primitives (Claude API ≥ 2025-09-19):
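import anthropic

# Opt in to the context-management betas on every request.
client = anthropic.Anthropic(default_headers={
    "anthropic-beta": "context-management-2025-06-27,clear-tool-uses-2025-09-19",
})
resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,  # required by the Messages API; the value here is illustrative
    context_management={"edits": [{
        "type": "clear_tool_uses_20250919",
        # Start clearing old tool results once the input exceeds 30k tokens.
        "trigger": {"type": "input_tokens", "value": 30_000},
        # Always keep the 4 most recent tool uses.
        "keep": {"type": "tool_uses", "value": 4},
        # Never clear results from the editing tools.
        "exclude_tools": ["replace_symbol_source", "edit_lines_in_symbol"],
    }]},
    tools=[...],
    messages=[...],
)
Anthropic's cookbook measures −48 % peak context with clearing alone on long agent loops.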
Progressive disclosure — memory search
Three layers, increasing cost. Always start at Layer 1. Escalate only if the previous layer paid off. Full contract: docs/progressive-disclosure.md.
Layer | Tool | Tokens/result | When |
1 | | ~15 | Always first |
2 | | ~60 | If Layer 1 matched |
3 | | ~200 | If Layer 2 confirmed |
Each Layer 1 row ends with [ts://obs/{id}] — pass it straight to Layer 3.
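A client-side sketch of that hand-off: pull the observation ID out of the trailing citation on a Layer 1 row and pass it to the Layer 3 fetch. The tool names are not shown in the table above, so `fake_shortlist` and `fake_fetch` below are hypothetical stand-ins, and the row format beyond the trailing `[ts://obs/{id}]` is assumed.

```python
import re

# Matches a trailing citation such as "[ts://obs/42]" at the end of a row.
CITATION = re.compile(r"\[ts://obs/([^\]]+)\]\s*$")


def extract_obs_id(row):
    """Return the observation ID from a Layer 1 row, or None if there is no citation."""
    match = CITATION.search(row)
    return match.group(1) if match else None


def lookup(query, shortlist, fetch_full):
    """Start at the cheap layer and escalate only when a row carries a citation."""
    for row in shortlist(query):       # Layer 1: cheap shortlist
        obs_id = extract_obs_id(row)
        if obs_id is not None:
            return fetch_full(obs_id)  # Layer 3: full observation
    return None                        # nothing matched, so nothing more is spent


# Hypothetical stand-ins for the real memory tools.
def fake_shortlist(query):
    return ["use uv for installs [ts://obs/42]"]


def fake_fetch(obs_id):
    return {"id": obs_id, "body": "full observation text"}


result = lookup("installs", fake_shortlist, fake_fetch)
print(result)
```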
Links
Benchmark — https://github.com/Mibayy/tsbench
Changelog — CHANGELOG.md
Progressive disclosure — docs/progressive-disclosure.md
License
MIT — see LICENSE.
Works with any MCP-compatible AI coding tool. Claude Code · Cursor · Codex CLI · Antigravity · Cline · Continue · Windsurf · Aider · Gemini CLI · Copilot CLI · Zed · any custom MCP client