TokenSaver MCP
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| ANTHROPIC_API_KEY | No | Optional API key that enables higher-quality abstractive compression using Claude Haiku. Not required for core functionality. | |
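Because the key is optional, a client can decide up front which compression mode is available. The snippet below is a minimal sketch of that check, not the server's own logic: the helper name `choose_compression_mode` and the mode strings `"extractive"` and `"abstractive"` are assumptions made for illustration.

```python
import os


def choose_compression_mode(env=None):
    """Pick a compression mode based on whether ANTHROPIC_API_KEY is set.

    Hypothetical helper: the mode strings are illustrative and may not
    match the parameter values the server actually accepts.
    """
    env = os.environ if env is None else env
    # An empty or missing key falls back to the offline extractive mode.
    return "abstractive" if env.get("ANTHROPIC_API_KEY") else "extractive"


if __name__ == "__main__":
    print(choose_compression_mode())
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.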
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {"listChanged": true} |
| logging | {} |
| prompts | {"listChanged": false} |
| resources | {"subscribe": false, "listChanged": false} |
| extensions | {"io.modelcontextprotocol/ui": {}} |
| experimental | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| count_tokens | Estimate the token count for text or a message list before sending it to an API. Use this to decide whether to compress, prune, or skip content. |
| compress_context | Compress long text or conversation history into a dense summary. Use before re-injecting large context on repeated turns. Extractive mode (default): offline, free, uses LSA sentence ranking. Abstractive mode: higher quality, but requires the ANTHROPIC_API_KEY environment variable. |
| cache_store | Store a tool result in the persistent cache with a TTL. Prevents running the same expensive operation twice. Recommended: set key = make_cache_key(tool_name, args). |
| cache_get | Retrieve a cached result. On a hit, skip re-running the original tool. |
| cache_invalidate | Remove a stale cache entry (e.g. after file changes). |
| extract_webpage | Fetch a webpage and return only its main readable content, without HTML, scripts, navigation, ads, or cookie banners. Saves 85–95% of tokens vs. raw HTML. |
| summarize_file | Summarize a file or directory without reading every byte. Agents get a full structural understanding in ~500 tokens instead of 50,000+. |
| prune_conversation | Reduce the token footprint of conversation history by removing filler turns and compressing older, verbose ones. Saves 60–80% on long conversations. |
| optimize_prompt | Shorten a verbose or redundant prompt or system prompt while preserving intent. Typical savings: 30–65%. Run once on system prompts that accumulate over iterations. |
| advise_context_window | Analyze current token usage against the model's context window and recommend what to trim. Use this meta-tool to find out where to apply compress_context, prune_conversation, or other TokenSaver tools for maximum effect. |
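The caching tools above follow a store, get, invalidate pattern keyed on the tool name and its arguments. The sketch below illustrates that pattern locally under stated assumptions: `make_cache_key` is named in the table, but its implementation here (SHA-256 over canonical JSON) is a guess, and `TTLCache` is a hypothetical in-memory stand-in, whereas the server's cache is described as persistent.

```python
import hashlib
import json
import time


def make_cache_key(tool_name, args):
    """Derive a stable key from a tool name and its arguments.

    Assumed implementation: canonical JSON (sorted keys) hashed with
    SHA-256, so argument order does not change the key.
    """
    payload = json.dumps(
        {"tool": tool_name, "args": args},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """Hypothetical in-memory stand-in for cache_store / cache_get / cache_invalidate."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def store(self, key, value, ttl_seconds):
        """Save a value that expires ttl_seconds from now."""
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the expired entry
            return None
        return value

    def invalidate(self, key):
        """Remove an entry; return True if something was removed."""
        return self._entries.pop(key, None) is not None
```

A caller would compute the key once, try `get` first, and only run the expensive tool and `store` its result on a miss. Calling `invalidate` after the underlying data changes, for example when a file is edited, avoids serving stale results.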
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/pozii/tokensaver'
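The same request can be issued from Python with only the standard library. This is a rough sketch: the URL is the one from the curl command, but the helper names `build_request` and `fetch_server_info`, the `Accept` header, and the assumption that the endpoint returns JSON are illustrative and not confirmed by the page.

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/pozii/tokensaver"


def build_request(url=API_URL):
    """Build the GET request without sending it."""
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json"},
        method="GET",
    )


def fetch_server_info(url=API_URL, timeout=10):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires network access; the shape of the response is not documented here.
    print(json.dumps(fetch_server_info(), indent=2))
```

Separating request construction from the network call keeps the first part testable offline.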
If you have feedback or need assistance with the MCP directory API, please join our Discord server.