CodeModeTOON MCP Server


A lightweight Model Context Protocol (MCP) orchestrator designed for efficiency at scale. It features TOON compression (reducing token usage by 30-90%) and Lazy Loading, making it the ideal solution for complex, multi-tool agentic workflows.

The "Context Trap" in Agentic Workflows

Recent articles from Anthropic and Cloudflare highlight a critical bottleneck: AI agents struggle with complex, multi-step workflows because they lack state.

While Code Execution (e.g., TypeScript) allows agents to maintain state and structure workflows effectively, it introduces a new problem: Data Bloat. Real-world operations (like SRE log analysis or database dumps) generate massive JSON payloads that explode the context window, making stateful execution prohibitively expensive.

CodeModeTOON bridges this gap. It enables:

  1. Stateful Execution: Run complex TypeScript workflows to maintain context outside the model.

  2. Context Efficiency: Use TOON Compression to "zip" the results, allowing agents to process massive datasets without blowing their token budget.


How It Works

```mermaid
graph LR
    A[AI Agent<br/>Claude/Cursor] -->|JSON-RPC| B[CodeModeTOON<br/>Server]
    B -->|Lazy Load| C[Perplexity]
    B -->|Lazy Load| D[Context7]
    B -->|Lazy Load| E[Custom Servers]
    C -->|Raw JSON| B
    D -->|Raw JSON| B
    E -->|Raw JSON| B
    B -->|TOON<br/>Compressed| A
    style B fill:#4f46e5,color:#fff
    style A fill:#10b981,color:#fff
```

Data Flow: Requests route through CodeModeTOON → servers are lazy-loaded on demand → responses are TOON-compressed before returning to the agent.

🔥 Key Features

πŸ—œοΈ TOON Compression

Reduces token usage by 30-90% for structured data.

  • Validated: ~92% savings on Kubernetes audits

  • Best for: SRE logs, database dumps, API responses

  • How it works: Schema extraction + value compression
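
To make that concrete, here is a toy sketch of the schema-extraction idea. It is illustrative TypeScript only, not the actual TOON wire format:

```typescript
// Illustrative sketch of "schema extraction + value compression" -- NOT the real TOON format.
// For arrays of uniform objects, the keys are stated once and only the values repeat.
type Row = Record<string, string | number | boolean | null>;

function encodeUniformArray(rows: Row[]): string {
  const keys = Object.keys(rows[0] ?? {});                  // schema extraction
  const header = keys.join(',');
  const body = rows
    .map((row) => keys.map((k) => JSON.stringify(row[k])).join(','))
    .join('\n');                                            // value compression
  return `${header}\n${body}`;
}

// 50 pods with identical fields repeat every key 50x in JSON, but only once here --
// which is why repetitive infrastructure dumps compress so well.
```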

⚡ Lazy Loading

Servers only start when needed. Zero overhead for unused tools.

  • Best for: Multi-tool workflows, resource-constrained environments

  • Performance: Sub-100ms startup for active servers
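
A minimal sketch of how on-demand startup can work; the names (`startMcpServer`, `McpConnection`) are hypothetical, not the server's actual internals:

```typescript
// Hypothetical types for illustration only.
type McpConnection = { call(tool: string, args: unknown): Promise<unknown> };
declare function startMcpServer(name: string): Promise<McpConnection>;

// Lazy-start cache: a downstream server is spawned the first time it is used,
// so unused servers cost nothing.
const running = new Map<string, Promise<McpConnection>>();

function getServer(name: string): Promise<McpConnection> {
  let conn = running.get(name);
  if (!conn) {
    conn = startMcpServer(name); // spawn + MCP handshake happen here, on first use
    running.set(name, conn);
  }
  return conn;
}
```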

🔒 Sandboxed Execution

Secure JS execution with auto-proxied MCP tool access.

  • Best for: Complex stateful workflows, batch operations

  • Security: Uses Node.js vm module (not for multi-tenant use)
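
A rough sketch of the pattern, assuming a `node:vm` context plus a Proxy behind the `servers` object seen in the usage examples; the real wiring may differ:

```typescript
import vm from 'node:vm';

type ToolCaller = (server: string, tool: string, args: unknown) => Promise<unknown>;

// Sketch only: expose MCP tools as servers[name][tool](args) inside the sandbox.
function runAgentCode(code: string, callTool: ToolCaller) {
  const servers = new Proxy({}, {
    get: (_target, serverName) =>
      new Proxy({}, {
        get: (_t, toolName) => (args: unknown) =>
          callTool(String(serverName), String(toolName), args),
      }),
  });
  const context = vm.createContext({ servers, console });
  // node:vm isolates scope, not privileges -- hence "not for multi-tenant use".
  return vm.runInContext(`(async () => { ${code} })()`, context);
}
```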

🤖 Agent-Friendly Features

Designed for programmatic discovery and self-correction.

  • suggest_approach: Meta-tool that recommends the best execution strategy (code vs workflow vs direct call).

  • Efficiency Metrics: execute_code returns operation counts and compression savings to reinforce efficient behavior.

  • Recovery Hints: Error messages include actionable next steps for agents (e.g., "Server not found? Try list_servers").
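
For instance, an agent might ask for a plan before acting. The tool name below comes from this README, but the argument and response shapes are assumptions:

```typescript
// Hypothetical call inside execute_code -- argument and response shapes are assumptions.
const plan = await suggest_approach({ task: 'audit 50 pods and summarize failures' });
// A response like { strategy: 'workflow', workflow: 'k8s-detective' } would steer
// the agent toward a pre-built workflow instead of ad-hoc code.
```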

Table of Contents

  • When to Use CodeModeTOON
  • Installation
  • Claude Skills
  • AI Assistant Prompts
  • Quick Start
  • Usage Examples
  • Workflows
  • Performance Benchmark
  • Troubleshooting
  • Security Note
  • Contributing
  • License

When to Use CodeModeTOON

✅ Perfect for:

  • Multi-step AI workflows requiring state management

  • Processing large structured datasets (logs, DB dumps, K8s manifests)

  • Coordinating multiple MCP servers in parallel

  • Token-constrained environments (reducing API costs)

❌ Not ideal for:

  • Simple single-tool queries

  • Unstructured text-heavy responses (compression <10%)

  • Multi-tenant production servers (vm module security limitation)

Installation

One‑Click (Cursor)

Add to Cursor

Manual Setup

Add this to your ~/.cursor/mcp.json:

{ "mcpServers": { "code-mode-toon": { "type": "stdio", "command": "npx", "args": ["-y", "code-mode-toon"], "env": { "CODE_MODE_TOON_CONFIG": "~/.cursor/mcp.json" } } } }

🧠 Claude Skills

CodeModeTOON includes a pre-built Claude Skill to make your AI assistant an expert at using this orchestrator.

code-mode-toon-workflow-expert

A specialized skill that teaches Claude how to:

  • Decide when to use a workflow vs ad-hoc code.

  • Create new workflows following best practices.

  • Orchestrate multiple tools efficiently.

Installation:

  1. Unzip claude-skills/code-mode-toon-workflow-expert.skill

  2. Place the folder in your .claude/skills/ directory (or import via Claude desktop app).

🤖 AI Assistant Prompts

Copy these prompts into your AI's custom instructions (e.g., .cursorrules or Claude Project instructions) to maximize CodeModeTOON's potential.

1. System Identity & Orchestration (Essential)

Goal: Teaches the AI to act as an orchestrator and prioritize workflows.

```
YOU ARE AN AGENTIC ORCHESTRATOR. You have access to "CodeModeTOON", a high-efficiency MCP bridge.

1. PRIORITIZE WORKFLOWS: Before running single tools, check `list_workflows`. If a workflow exists (e.g., `research`, `k8s-detective`), USE IT. It is faster and saves tokens.
2. HANDLE COMPRESSED DATA: Outputs may be "TOON encoded" (highly compressed JSON). This is normal. Do not complain about "unreadable data" - simply parse it or ask for specific fields if needed.
3. BATCH OPERATIONS: Never run 3+ sequential tool calls if they can be batched. Use `execute_code` to run them in a single block.
```

2. Tool Discovery (Lazy Loading)

Goal: Prevents the AI from giving up if a tool isn't immediately visible.

```
TOOLS ARE LAZY LOADED. If you need a capability (e.g., "search", "kubernetes", "database") and don't see the tool:

1. DO NOT assume it's missing.
2. RUN `search_tools({ query: "..." })` to find it.
3. RUN `get_tool_api({ serverName: "..." })` to learn how to use it.
4. Only then, execute the tool.
```
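
Put together, a discovery pass inside `execute_code` might look like this (the calls are from the prompt above; response shapes are assumptions):

```typescript
// Hypothetical discovery flow -- response shapes are assumptions.
const matches = await search_tools({ query: 'kubernetes' });   // find the capability
const api = await get_tool_api({ serverName: 'kubernetes' });  // learn its tool signatures
// ...then call the tool via servers['kubernetes'], as in the usage examples below.
```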

3. Efficiency & TOON Compression

Goal: Enforces token-saving behaviors for large data operations.

```
OPTIMIZE FOR TOKENS. When fetching large datasets (logs, docs, API responses):

1. ALWAYS wrap the output in `TOON.encode(data)` inside `execute_code`.
2. PREFER structured data (JSON/Objects) over plain text. TOON compresses structure by ~92%, but text by only ~4%.
3. IF synthesizing data, do it server-side (via workflow `synthesize: true`) to avoid pulling raw data into context.
```

Quick Start

After installation, try this 30-second demo in Claude or Cursor:

```typescript
// Ask your AI assistant to run this via execute_code
const api = await get_tool_api({ serverName: 'perplexity' });
const result = await servers['perplexity'].perplexity_ask({
  messages: [{ role: 'user', content: "Explain TOON compression" }]
});
console.log(result); // See compression in action! ~40% token savings
```

What just happened? The response was automatically TOON-encoded, saving tokens.

Usage Examples

```typescript
// Inside execute_code
const api = await get_tool_api({ serverName: 'perplexity' });

// Request large data - automatically compressed!
const result = await servers['perplexity'].perplexity_ask({
  messages: [{ role: 'user', content: "Summarize the history of Rome" }]
});
console.log(result); // Returns TOON-encoded string, saving ~40% tokens
```

```typescript
// Fetch large documentation from Context7
const api = await get_tool_api({ serverName: 'context7' });
const docs = await servers['context7']['get-library-docs']({
  context7CompatibleLibraryID: 'kubernetes/kubernetes'
});
console.log(TOON.encode(docs)); // Massive compression on structured data
```

```typescript
// Run a complex research workflow
const result = await workflows.research({
  goal: "Compare xsync vs sync.Map performance",
  queries: ["xsync vs sync.Map benchmarks"],
  synthesize: true,
  outputFile: "/tmp/research.toon"
});
console.log(result.synthesis); // LLM-synthesized findings
```

Workflows

CodeModeTOON supports Workflows: pre-defined, server-side TypeScript modules that orchestrate multiple MCP tools.

Research Workflow

A powerful research assistant that:

  • Parallelizes data fetching from multiple sources (Context7, Wikipedia, Perplexity).

  • Synthesizes findings using LLMs (optional).

  • Outputs TOON-encoded files for maximum context efficiency.

  • Retries failed requests automatically.

See .workflows/README.md for detailed documentation, usage examples, and AI prompts.
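
As a loose sketch of what such a module might look like, with a guessed context and signature (the real convention is documented in .workflows/README.md):

```typescript
// Hypothetical workflow module shape -- the actual convention lives in .workflows/README.md.
type WorkflowContext = {
  servers: Record<string, Record<string, (args: unknown) => Promise<unknown>>>;
  TOON: { encode(data: unknown): string };
};

export default async function research(
  { servers, TOON }: WorkflowContext,
  params: { goal: string; queries: string[] }
) {
  // Parallel fetches keep wall-clock time down...
  const results = await Promise.all(
    params.queries.map((q) =>
      servers['perplexity'].perplexity_ask({ messages: [{ role: 'user', content: q }] })
    )
  );
  // ...and TOON encoding keeps the returned context small.
  return TOON.encode({ goal: params.goal, results });
}
```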

Performance Benchmark

Why This Matters

Scenario 2 (92% savings) demonstrates CodeModeTOON's strength:

| Metric | Original | TOON | Savings |
| --- | --- | --- | --- |
| Characters | 37,263 | 2,824 | ~92% |
| Estimated Tokens* | ~9,315 | ~706 | ~8,600 tokens |
| Cost (Claude Sonnet)** | $0.028 | $0.002 | $0.026 |

*Assuming 4 chars/token average
**$3/M tokens input pricing
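
The arithmetic behind those rows is easy to reproduce with the stated assumptions (4 chars/token, $3 per million input tokens):

```typescript
// Reproduce the table's estimates: 4 chars/token, $3 per 1M input tokens.
const tokens = (chars: number) => chars / 4;
const costUsd = (chars: number) => (tokens(chars) * 3) / 1_000_000;

console.log(Math.round(tokens(37_263))); // ~9316 tokens (table: ~9,315)
console.log(costUsd(37_263).toFixed(3)); // 0.028
console.log(costUsd(2_824).toFixed(3));  // 0.002
```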

Key Insight: For infrastructure audits, log analysis, or database dumps, TOON compression can reduce token costs by 90%+, making complex agentic workflows feasible within budget.

Scenario 1: Natural Language Query (History of Rome)

Unstructured text compresses poorly, as expected.

  • Original JSON: 11,651 chars

  • TOON Encoded: 11,166 chars

  • Compression Ratio: ~4.16% Savings

Scenario 2: Kubernetes Cluster Audit (50 Pods)

Highly structured, repetitive JSON (infrastructure dumps) compresses extremely well.

  • Original JSON: 37,263 chars

  • TOON Encoded: 2,824 chars

  • Compression Ratio: ~92% Savings 📉

Troubleshooting

"Server not found" error

Cause: CodeModeTOON can't locate your MCP config.
Solution: Ensure `CODE_MODE_TOON_CONFIG` points to your config:

```bash
export CODE_MODE_TOON_CONFIG=~/.cursor/mcp.json
```

TOON encoding not working

Cause: Results aren't being encoded.
Solution: Use `console.log(TOON.encode(data))`, not `console.log(data)`.

Lazy server won't load

Cause: Server name mismatch.
Solution: Verify the server name matches your config. Use `get_tool_api({ serverName: 'name' })` to inspect available servers.

Security Note

⚠️ The vm-based sandbox is suitable for personal AI assistant use (Claude, Cursor) with trusted code. It is not for multi-tenant or public services.

Acknowledgments

Author

Built by Ziad Hassan (Senior SRE/DevOps) - LinkedIn · GitHub

Contributing

Contributions are welcome! 🙌

Ways to Contribute

  1. Report bugs - Open an issue with reproduction steps

  2. Suggest features - Discuss use cases in Issues

  3. Add workflows - See Workflows

  4. Improve docs - Documentation PRs always welcome

Development Setup

```bash
git clone https://github.com/ziad-hsn/code-mode-toon.git
cd code-mode-toon
npm install
npm test
```

License

MIT License - see LICENSE for details.
