CodeModeTOON MCP Server
A lightweight Model Context Protocol (MCP) orchestrator designed for efficiency at scale. It features TOON compression (reducing token usage by 30-90%) and Lazy Loading, making it the ideal solution for complex, multi-tool agentic workflows.
The "Context Trap" in Agentic Workflows
Recent articles from Anthropic and Cloudflare (see Acknowledgments) highlight a critical bottleneck: AI agents struggle with complex, multi-step workflows because they lack state.
While Code Execution (e.g., TypeScript) allows agents to maintain state and structure workflows effectively, it introduces a new problem: Data Bloat. Real-world operations (like SRE log analysis or database dumps) generate massive JSON payloads that explode the context window, making stateful execution prohibitively expensive.
CodeModeTOON bridges this gap. It enables:
Stateful Execution: Run complex TypeScript workflows to maintain context outside the model.
Context Efficiency: Use TOON Compression to "zip" the results, allowing agents to process massive datasets without blowing their token budget.
Related MCP server: workflows-mcp
How It Works
Data Flow: Requests route through CodeModeTOON → servers are lazy-loaded on demand → responses are TOON-compressed before returning to the agent.
🔥 Key Features
🗜️ TOON Compression
Reduces token usage by 30-90% for structured data.
Validated: ~83% savings on Kubernetes audits
Best for: SRE logs, database dumps, API responses
How it works: Schema extraction + value compression
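The schema-extraction idea can be sketched in a few lines. This is an illustrative approximation of the approach, not the actual TOON wire format:

```typescript
// Illustrative sketch of schema extraction: emit the shared keys once,
// then one compact value row per record. Not the real TOON encoder.
type Row = Record<string, string | number>;

function toonishEncode(records: Row[]): string {
  const keys = Object.keys(records[0]);       // shared schema, emitted once
  const header = keys.join(",");
  const rows = records.map((r) => keys.map((k) => String(r[k])).join(","));
  return [header, ...rows].join("\n");
}

// Uniform, repetitive data (like a pod listing) compresses dramatically.
const pods = Array.from({ length: 50 }, (_, i) => ({
  name: `pod-${i}`,
  status: "Running",
  restarts: 0,
}));
const original = JSON.stringify(pods);
const encoded = toonishEncode(pods);
const savings = 1 - encoded.length / original.length;
console.log(`${Math.round(savings * 100)}% smaller`);
```

Because repeated keys are factored out into a single header, savings grow with the number of uniform records, which is why structured dumps compress so much better than free text.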
⚡ Lazy Loading
Servers only start when needed. Zero overhead for unused tools.
Best for: Multi-tool workflows, resource-constrained environments
Performance: Sub-100ms startup for active servers
🔒 Sandboxed Execution
Secure JS execution with auto-proxied MCP tool access.
Best for: Complex stateful workflows, batch operations
Security: Uses Node.js `vm` module (not for multi-tenant use)
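Node's built-in `vm` module is what provides this sandboxing. A minimal sketch of running code against an isolated context (this is the standard Node API, not CodeModeTOON's internal wiring):

```typescript
// Run a snippet inside an isolated V8 context using Node's vm module.
// Note: vm provides context isolation, not a hard security boundary,
// which is why the README warns against multi-tenant use.
import vm from "node:vm";

const context = vm.createContext({ result: null });
vm.runInContext("result = 2 + 2", context);
console.log(context.result); // 4
```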
🤖 Agent-Friendly Features
Designed for programmatic discovery and self-correction.
suggest_approach: Meta-tool that recommends the best execution strategy (code vs. workflow vs. direct call).
Efficiency Metrics: execute_code returns operation counts and compression savings to reinforce efficient behavior.
Recovery Hints: Error messages include actionable next steps for agents (e.g., "Server not found? Try list_servers").
When to Use CodeModeTOON
✅ Perfect for:
Multi-step AI workflows requiring state management
Processing large structured datasets (logs, DB dumps, K8s manifests)
Coordinating multiple MCP servers in parallel
Token-constrained environments (reducing API costs)
❌ Not ideal for:
Simple single-tool queries
Unstructured text-heavy responses (compression <10%)
Multi-tenant production servers (vm module security limitation)
Installation
One-Click (Cursor)
Manual Setup
Add this to your ~/.cursor/mcp.json:
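The exact entry depends on how you installed the server; a hypothetical sketch (the `codemode-toon` package name, the `npx` launch command, and the env value are assumptions, not verified values):

```json
{
  "mcpServers": {
    "codemode-toon": {
      "command": "npx",
      "args": ["-y", "codemode-toon"],
      "env": {
        "CODE_MODE_TOON_CONFIG": "~/.cursor/mcp.json"
      }
    }
  }
}
```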
🧠 Claude Skills
CodeModeTOON includes a pre-built Claude Skill to make your AI assistant an expert at using this orchestrator.
code-mode-toon-workflow-expert
A specialized skill that teaches Claude how to:
Decide when to use a workflow vs ad-hoc code.
Create new workflows following best practices.
Orchestrate multiple tools efficiently.
Installation:
Unzip claude-skills/code-mode-toon-workflow-expert.skill.
Place the folder in your .claude/skills/ directory (or import via the Claude desktop app).
🤖 AI Assistant Prompts
Copy these prompts into your AI's custom instructions (e.g., .cursorrules or Claude Project instructions) to maximize CodeModeTOON's potential.
1. System Identity & Orchestration (Essential)
Goal: Teaches the AI to act as an orchestrator and prioritize workflows.
2. Tool Discovery (Lazy Loading)
Goal: Prevents the AI from giving up if a tool isn't immediately visible.
3. Efficiency & TOON Compression
Goal: Enforces token-saving behaviors for large data operations.
Quick Start
After installation, try this 30-second demo in Claude or Cursor:
What just happened? The response was automatically TOON-encoded, saving tokens.
Usage Examples
Workflows
CodeModeTOON supports Workflows: pre-defined, server-side TypeScript modules that orchestrate multiple MCP tools.
Research Workflow
A powerful research assistant that:
Parallelizes data fetching from multiple sources (Context7, Wikipedia, Perplexity).
Synthesizes findings using LLMs (optional).
Outputs TOON-encoded files for maximum context efficiency.
Retries failed requests automatically.
See .workflows/README.md for detailed documentation, usage examples, and AI prompts.
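The retry and parallel fan-out behavior described above can be sketched as follows (a simplified sketch: the real workflow proxies MCP tools rather than plain functions, and the helper names here are placeholders):

```typescript
// Retry a flaky async operation a fixed number of times before giving up.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      lastErr = e; // remember the failure and retry
    }
  }
  throw lastErr;
}

// Fan out to several sources in parallel, retrying each one independently.
async function research(sources: Array<() => Promise<string>>): Promise<string[]> {
  return Promise.all(sources.map((fetchSource) => withRetry(fetchSource)));
}
```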
Performance Benchmark
Why This Matters
Scenario 2 (~92% token savings) demonstrates CodeModeTOON's strength:
| Metric | Original | TOON | Savings |
| --- | --- | --- | --- |
| Characters | 37,263 | 2,824 | ~83% |
| Estimated Tokens* | ~9,315 | ~706 | ~8,600 tokens |
| Cost (Claude Sonnet)** | $0.028 | $0.002 | $0.026 |

*Assuming 4 chars/token average*
**$3/M input token pricing*
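Those estimates are easy to re-derive. A quick sanity check under the stated assumptions (4 chars/token, $3 per million input tokens):

```typescript
// Re-derive the table's token and cost estimates from character counts.
const toTokens = (chars: number) => Math.round(chars / 4);
const costUsd = (tokens: number) => (tokens / 1_000_000) * 3;

console.log(toTokens(37263), toTokens(2824));       // ≈ 9,316 and 706 tokens
console.log(costUsd(toTokens(37263)).toFixed(3));    // ≈ $0.028 before compression
```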
Key Insight: For infrastructure audits, log analysis, or database dumps, TOON compression can reduce token costs by 90%+, making complex agentic workflows feasible within budget.
Scenario 1: Natural Language Query (History of Rome)
Unstructured text compresses poorly, as expected.
Original JSON: 11,651 chars
TOON Encoded: 11,166 chars
Compression Ratio: ~4.16% Savings
Scenario 2: Kubernetes Cluster Audit (50 Pods)
Highly structured, repetitive JSON (infrastructure dumps) compresses extremely well.
Original JSON: 37,263 chars
TOON Encoded: 2,824 chars
Compression Ratio: ~83% Savings 🎉
Troubleshooting
"Server not found" error
Cause: CodeModeTOON can't locate your MCP config.
Solution: Ensure CODE_MODE_TOON_CONFIG points to your config:
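For example (the path is an assumption; point it at wherever your MCP config actually lives):

```shell
# Assumed location for a Cursor setup; adjust to your actual config path.
export CODE_MODE_TOON_CONFIG="$HOME/.cursor/mcp.json"
```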
TOON encoding not working
Cause: Results aren't being encoded.
Solution: Use console.log(TOON.encode(data)), not console.log(data).
Lazy server won't load
Cause: Server name mismatch.
Solution: Verify server name matches your config. Use get_tool_api({ serverName: 'name' }) to inspect available servers.
Security Note
⚠️ Suitable for personal AI assistant use (Claude, Cursor) with trusted code. Not for multi-tenant or public services.
Acknowledgments
Anthropic: Code execution with MCP
Cloudflare: Code Mode announcement
Author
Built by Ziad Hassan (Senior SRE/DevOps). LinkedIn · GitHub
Contributing
Contributions are welcome! 🎉
Ways to Contribute
Report bugs - Open an issue with reproduction steps
Suggest features - Discuss use cases in Issues
Add workflows - See Workflows
Improve docs - Documentation PRs always welcome
Development Setup
License
MIT License. See LICENSE for details.