# Code Mode Implementation Plan for Claude Code
## Based on Cloudflare's Revolutionary MCP Pattern
**Version:** 1.0.0
**Date:** 2025-11-17
**Status:** Design Phase
---
## Executive Summary
This document outlines the implementation of **Code Mode** - a revolutionary approach to using MCP (Model Context Protocol) where instead of exposing tools directly to the LLM, we:
1. **Convert MCP tools into TypeScript APIs** with full type definitions
2. **Ask the LLM to write code** that calls those APIs
3. **Execute code in a secure sandbox** with access only to specified MCP servers
4. **Return only the final results** to the LLM, not intermediate data
### Key Benefits
- ✅ **Better tool usage**: LLMs excel at writing code (trained on millions of real examples) vs tool calling (trained on synthetic examples)
- ✅ **Massive token reduction**: Intermediate data stays in sandbox, never enters LLM context
- ✅ **Handle complex tools**: Full TypeScript APIs vs simplified tool schemas
- ✅ **Multi-step orchestration**: Chain multiple tool calls in code without context bloat
- ✅ **Security**: No API keys in LLM context, isolated sandbox execution
---
## Architecture Overview
### Standard MCP Pattern (What We're NOT Doing)
```
┌─────────────────────────────────────┐
│ Claude Code (LLM) │
│ Context: 100K tokens │
│ - 50 tool definitions (10K) │
│ - Intermediate results (40K) │
│ - Conversation (50K) │
└──────────────┬──────────────────────┘
│ Tool calls via special tokens
│ <|tool_call|> ... <|end_tool_call|>
▼
┌──────────────────────────┐
│ MCP Server(s) │
│ - Google Drive │
│ - Salesforce │
│ - Weather API │
└──────────────────────────┘
```
**Problems:**
- LLM context fills with tool definitions
- Every intermediate result flows through LLM context
- LLM struggles with many/complex tools
- Token costs scale with workflow complexity
### Code Mode Pattern (What We're Building)
```
┌─────────────────────────────────────┐
│ Claude Code (LLM) │
│ Context: 5K tokens │
│ - ONE tool: "execute_code" │
│ - TypeScript API definitions │
│ - Final results only │
└──────────────┬──────────────────────┘
│ Writes TypeScript code
▼
┌──────────────────────────────────┐
│ Code Execution Sandbox │
│ (Isolated runtime) │
│ - Executes TS code │
│ - No network access │
│ - Only MCP bindings available │
└──────────────┬───────────────────┘
│ RPC calls via bindings
▼
┌──────────────────────────────────┐
│ MCP Orchestrator │
│ (Routes to correct server) │
└──────────────┬───────────────────┘
│
┌─────────┴─────────┬──────────┐
▼ ▼ ▼
┌─────────┐ ┌──────────┐ ┌─────────┐
│ Google │ │Salesforce│ │ Weather │
│ Drive │ │ │ │ API │
│ MCP │ │ MCP │ │ MCP │
└─────────┘ └──────────┘ └─────────┘
```
**Benefits:**
- LLM sees only TypeScript API (familiar from training data)
- Intermediate data never enters LLM context
- Sandbox isolates execution
- API keys hidden in orchestrator layer
---
## Component Architecture
### Component 1: MCP Schema to TypeScript Generator
**Purpose:** Convert MCP tool definitions into TypeScript API files
**Input:** MCP server tool schemas (JSON)
```json
{
"name": "google_drive__get_document",
"description": "Retrieve a document from Google Drive",
"inputSchema": {
"type": "object",
"properties": {
"documentId": {"type": "string"}
},
"required": ["documentId"]
}
}
```
**Output:** TypeScript definition files
```typescript
// generated/servers/google-drive/getDocument.ts
/**
* Retrieve a document from Google Drive
*/
export interface GetDocumentInput {
documentId: string;
}
export interface GetDocumentOutput {
content: string;
metadata: {
title: string;
author: string;
modifiedTime: string;
};
}
export async function getDocument(
input: GetDocumentInput
): Promise<GetDocumentOutput> {
// This calls back to the orchestrator via RPC binding
return __mcp_call('google_drive__get_document', input);
}
```
**Implementation:**
- Read MCP server schemas via `tools/list` protocol
- Convert JSON Schema to TypeScript interfaces
- Generate wrapper functions that call `__mcp_call()`
- Create index files for each server
- Generate complete type definitions
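To make the schema-to-interface step concrete, here is a minimal hand-rolled sketch. It only covers flat object schemas with primitive property types, and the names `SimpleSchema` and `schemaToInterface` are hypothetical; a real generator would more likely delegate to a library such as `json-schema-to-typescript`, as Component 4 does.

```typescript
// Hypothetical, simplified subset of JSON Schema: flat objects with primitive properties.
type PrimitiveType = 'string' | 'number' | 'integer' | 'boolean';

interface SimpleSchema {
  type: 'object';
  properties: Record<string, { type: PrimitiveType; description?: string }>;
  required?: string[];
}

// Map a JSON Schema primitive to its TypeScript counterpart.
function toTsType(t: PrimitiveType): string {
  return t === 'integer' ? 'number' : t;
}

// Render an exported interface; properties not listed in `required` become optional.
function schemaToInterface(interfaceName: string, schema: SimpleSchema): string {
  const required = new Set(schema.required ?? []);
  const lines = Object.entries(schema.properties).map(([key, prop]) => {
    const optional = required.has(key) ? '' : '?';
    return `  ${key}${optional}: ${toTsType(prop.type)};`;
  });
  return [`export interface ${interfaceName} {`, ...lines, '}'].join('\n');
}

const generated = schemaToInterface('GetDocumentInput', {
  type: 'object',
  properties: { documentId: { type: 'string' } },
  required: ['documentId'],
});
console.log(generated);
```

Nested objects, arrays, enums, and `$ref` are deliberately out of scope here, which is the main reason to prefer a maintained converter for the real implementation.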
### Component 2: Code Execution Sandbox
**Purpose:** Securely execute LLM-generated TypeScript code
**Requirements:**
- ✅ Isolated from network (no `fetch()`, no external connections)
- ✅ No filesystem access (except explicitly allowed)
- ✅ TypeScript compilation (on-the-fly or pre-compiled)
- ✅ Access to MCP bindings only
- ✅ Capture `console.log()` output for LLM
- ✅ Error handling and timeouts
- ✅ Resource limits (CPU, memory, time)
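The "capture `console.log()` output" requirement can be met by handing the sandbox a console stand-in that records lines instead of printing them. The sketch below is one possible shape; `createCapturedConsole`, the `[error]` prefix, and the `maxLines` cap are assumptions, not part of any sandbox library.

```typescript
// Hypothetical helper: a console stand-in that records output instead of printing it.
interface CapturedConsole {
  log: (...args: unknown[]) => void;
  error: (...args: unknown[]) => void;
  logs: string[];
}

// Strings pass through unchanged; everything else is JSON-encoded when possible.
function formatArg(arg: unknown): string {
  if (typeof arg === 'string') return arg;
  try {
    const json = JSON.stringify(arg);
    return json === undefined ? String(arg) : json;
  } catch {
    return String(arg); // e.g. circular structures
  }
}

function createCapturedConsole(maxLines = 1000): CapturedConsole {
  const logs: string[] = [];
  const record = (prefix: string) => (...args: unknown[]) => {
    if (logs.length >= maxLines) return; // cap output so logs cannot flood the LLM context
    logs.push(prefix + args.map(formatArg).join(' '));
  };
  return { log: record(''), error: record('[error] '), logs };
}

const captured = createCapturedConsole();
captured.log('Temperature:', 93);
captured.error('lookup failed', { code: 404 });
```

After execution, `captured.logs` is what would be returned to the LLM; the line cap keeps a chatty script from undoing the token savings.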
**Technology Options:**
#### Option A: Node.js VM2 (Deprecated, but concept useful)
```typescript
import { VM } from 'vm2';
const sandbox = new VM({
timeout: 5000,
sandbox: {
// Inject MCP bindings
__mcp_call: async (toolName, args) => {
return orchestrator.callTool(toolName, args);
},
console: capturedConsole
}
});
const result = sandbox.run(llmGeneratedCode);
```
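**Pros:** Easy to implement, useful for illustrating how bindings are injected
**Cons:** `vm2` is deprecated and has had sandbox-escape vulnerabilities, so it should not be used to run untrusted code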
#### Option B: Deno Sandboxing (Recommended for Production)
```typescript
// Worker is a global in Deno, so no import is needed
const worker = new Worker(new URL('./sandbox-worker.ts', import.meta.url), {
type: 'module',
deno: {
permissions: {
net: false, // No network
read: false, // No filesystem
write: false,
env: false,
run: false
}
}
});
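// Send code to worker. postMessage uses structured clone, so `mcpBindings` must be
// plain data (for example, tool names); the tool calls themselves are relayed via messages
worker.postMessage({ code: llmGeneratedCode, bindings: mcpBindings });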
// Receive results
worker.addEventListener('message', (e) => {
const { logs, result, error } = e.data;
// Send logs back to LLM
});
```
**Pros:** Built-in security, TypeScript native, excellent isolation
**Cons:** Requires Deno runtime
#### Option C: Isolated-VM (Node.js, Security-Focused)
```typescript
import ivm from 'isolated-vm';
const isolate = new ivm.Isolate({ memoryLimit: 128 });
const context = await isolate.createContext();
// Inject MCP call binding
const jail = context.global;
await jail.set('__mcp_call', new ivm.Reference(async (toolName, args) => {
return orchestrator.callTool(toolName, args);
}));
// Execute code
const script = await isolate.compileScript(llmGeneratedCode);
await script.run(context, { timeout: 5000 });
```
**Pros:** Production-ready, secure, Node.js compatible
**Cons:** More complex setup
#### Option D: Cloudflare Workers (Future, Most Advanced)
```typescript
// Using Dynamic Worker Loader API
const worker = env.LOADER.get(workerId, async () => {
return {
modules: {
'main.ts': llmGeneratedCode
},
env: {
// MCP bindings as RPC interfaces
google_drive: ctx.exports.GoogleDriveBinding(),
salesforce: ctx.exports.SalesforceBinding()
},
globalOutbound: null // Block all network
};
});
const result = await worker.getEntrypoint().execute();
```
**Pros:** Best isolation, disposable, cheap, Cloudflare infrastructure
**Cons:** Requires Cloudflare Workers deployment, beta access
### Component 3: MCP Orchestrator
**Purpose:** Coordinate between sandbox and multiple MCP servers
**Responsibilities:**
- Connect to multiple MCP servers (stdio, HTTP, WebSocket)
- Route tool calls from sandbox to correct MCP server
- Handle authentication (API keys stored here, not in sandbox)
- Maintain MCP server connections (persistent)
- Aggregate tool schemas from all servers
- Return results to sandbox
**Architecture:**
```typescript
interface MCPConnection {
name: string;
transport: StdioTransport | HttpTransport | WebSocketTransport;
client: Client; // MCP SDK client
tools: MCPTool[]; // Cached tool definitions
}
class MCPOrchestrator {
private connections: Map<string, MCPConnection> = new Map();
async connectServer(config: MCPServerConfig) {
// Create transport (stdio, http, websocket)
const transport = createTransport(config);
const client = new Client(
{ name: 'code-mode-orchestrator', version: '1.0.0' },
{ capabilities: {} }
);
await client.connect(transport);
// Fetch tool list via the SDK's typed helper
const { tools } = await client.listTools();
this.connections.set(config.name, {
name: config.name,
transport,
client,
tools
});
}
async callTool(toolName: string, args: any): Promise<any> {
// Parse tool name: "google_drive__get_document"
const [serverName, ...toolParts] = toolName.split('__');
const actualToolName = toolParts.join('__');
const connection = this.connections.get(serverName);
if (!connection) {
throw new Error(`MCP server not found: ${serverName}`);
}
// Call the tool
const result = await connection.client.callTool({
name: actualToolName,
arguments: args
});
return result;
}
async getAllTools(): Promise<MCPTool[]> {
const allTools = [];
for (const [serverName, connection] of this.connections) {
// Prefix tools with server name
const prefixedTools = connection.tools.map(tool => ({
...tool,
name: `${serverName}__${tool.name}`
}));
allTools.push(...prefixedTools);
}
return allTools;
}
}
```
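One detail the routing above depends on is that the server prefix in a tool name matches the key used for the connection. The configuration in this plan names a server `google-drive`, while tool names use `google_drive__...`, so some normalization is needed. The following is a sketch of one way to handle that; `normalizeServerName`, `qualifyToolName`, and `parseToolName` are hypothetical helpers, and replacing hyphens with underscores is an assumed convention.

```typescript
const SEPARATOR = '__';

// Hypothetical normalization: server keys use underscores so that
// "google-drive" in config and "google_drive__get_document" agree.
function normalizeServerName(serverName: string): string {
  return serverName.replace(/-/g, '_');
}

function qualifyToolName(serverName: string, toolName: string): string {
  return `${normalizeServerName(serverName)}${SEPARATOR}${toolName}`;
}

// Split on the first separator only, so tool names may themselves contain "__".
function parseToolName(qualifiedName: string): { server: string; tool: string } {
  const index = qualifiedName.indexOf(SEPARATOR);
  if (index <= 0 || index + SEPARATOR.length >= qualifiedName.length) {
    throw new Error(`Expected "<server>__<tool>", got: ${qualifiedName}`);
  }
  return {
    server: qualifiedName.slice(0, index),
    tool: qualifiedName.slice(index + SEPARATOR.length),
  };
}

const qualified = qualifyToolName('google-drive', 'get_document');
const parsed = parseToolName(qualified);
console.log(qualified, parsed.server, parsed.tool);
```

With this convention the orchestrator would also store connections under the normalized key. A server name that itself contains `__` would still be ambiguous and would need to be rejected at configuration time.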
### Component 4: TypeScript API Generator
**Purpose:** Generate filesystem structure of TypeScript APIs
**Process:**
1. **Connect to all MCP servers** via orchestrator
2. **Fetch tool schemas** from each server
3. **Generate TypeScript interfaces** from JSON Schema
4. **Create wrapper functions** that call `__mcp_call()`
5. **Organize by server** in directory structure
6. **Generate index files** for easy imports
**Directory Structure:**
```
generated/
├── servers/
│ ├── google-drive/
│ │ ├── getDocument.ts
│ │ ├── uploadFile.ts
│ │ ├── listFiles.ts
│ │ └── index.ts
│ ├── salesforce/
│ │ ├── queryRecords.ts
│ │ ├── updateRecord.ts
│ │ ├── createRecord.ts
│ │ └── index.ts
│ ├── weather/
│ │ ├── getCurrentWeather.ts
│ │ └── index.ts
│ └── index.ts (exports all servers)
└── types/
└── mcp-runtime.d.ts (global __mcp_call declaration)
```
**Implementation:**
```typescript
import { promises as fs } from 'fs';
import { JSONSchema7 } from 'json-schema';
import { compile } from 'json-schema-to-typescript';
class TypeScriptGenerator {
async generateFromMCPServer(
serverName: string,
tools: MCPTool[]
): Promise<void> {
const outputDir = `generated/servers/${serverName}`;
await fs.mkdir(outputDir, { recursive: true });
for (const tool of tools) {
const tsCode = await this.generateToolFile(serverName, tool);
// Tool names are snake_case (e.g. get_document); files are camelCase (getDocument.ts)
const fileName = this.toCamelCase(tool.name);
await fs.writeFile(`${outputDir}/${fileName}.ts`, tsCode);
}
// Generate index.ts
const indexCode = this.generateIndexFile(tools);
await fs.writeFile(`${outputDir}/index.ts`, indexCode);
}
async generateToolFile(serverName: string, tool: MCPTool): Promise<string> {
const functionName = this.toCamelCase(tool.name);
const inputTypeName = `${this.toPascalCase(tool.name)}Input`;
const outputTypeName = `${this.toPascalCase(tool.name)}Output`;
// Convert JSON Schema to TypeScript
const inputInterface = await compile(
tool.inputSchema as JSONSchema7,
inputTypeName,
{ bannerComment: '' }
);
// Output schema (inferred or defined by server)
const outputInterface = tool.outputSchema
? await compile(tool.outputSchema as JSONSchema7, outputTypeName, { bannerComment: '' })
: `export interface ${outputTypeName} {\n [key: string]: any;\n}`;
return `
/**
* ${tool.description || tool.name}
*/
${inputInterface}
${outputInterface}
export async function ${functionName}(
input: ${inputTypeName}
): Promise<${outputTypeName}> {
return __mcp_call('${serverName}__${tool.name}', input);
}
`.trim();
}
generateIndexFile(tools: MCPTool[]): string {
const exports = tools.map(tool => {
// Must match the file name used in generateFromMCPServer
const fileName = this.toCamelCase(tool.name);
const functionName = this.toCamelCase(tool.name);
return `export { ${functionName} } from './${fileName}.js';`;
}).join('\n');
return exports;
}
// Utility functions
toCamelCase(str: string): string {
return str.replace(/_([a-z])/g, (_, letter) => letter.toUpperCase());
}
toPascalCase(str: string): string {
const camel = this.toCamelCase(str);
return camel.charAt(0).toUpperCase() + camel.slice(1);
}
camelToKebab(str: string): string {
return str.replace(/[A-Z]/g, letter => `-${letter.toLowerCase()}`);
}
}
```
### Component 5: Main MCP Server (Code Mode Interface)
**Purpose:** Expose a single MCP server to Claude Code with one tool: `execute_code`
**Architecture:**
```typescript
#!/usr/bin/env node
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { MCPOrchestrator } from './orchestrator.js';
import { TypeScriptGenerator } from './generator.js';
import { CodeSandbox } from './sandbox.js';
// Initialize components
const orchestrator = new MCPOrchestrator();
const generator = new TypeScriptGenerator();
const sandbox = new CodeSandbox(orchestrator);
// Connect to MCP servers from config
const mcpServers = loadMCPConfig(); // Load from .env or config file
for (const serverConfig of mcpServers) {
await orchestrator.connectServer(serverConfig);
}
// Generate TypeScript APIs
const allTools = await orchestrator.getAllTools();
await generator.generateAPIs(allTools);
// Create MCP server for Claude Code
const server = new Server(
{ name: 'code-mode-server', version: '1.0.0' },
{ capabilities: { tools: {} } }
);
// Expose ONE tool: execute_code
server.setRequestHandler(ListToolsRequestSchema, async () => ({
tools: [
{
name: 'execute_code',
description: `Execute TypeScript code with access to MCP tool APIs.
Available APIs:
${generateAPIDocumentation(allTools)}
The code will run in a secure sandbox with:
- No network access
- No filesystem access
- Access only to MCP tool APIs via imports
- console.log() output will be returned to you
Example:
\`\`\`typescript
import { getDocument } from './servers/google-drive';
import { updateRecord } from './servers/salesforce';
const doc = await getDocument({ documentId: 'abc123' });
await updateRecord({
objectType: 'Account',
recordId: 'xyz',
data: { notes: doc.content }
});
console.log('Updated Salesforce with Google Doc content');
\`\`\`
`,
inputSchema: {
type: 'object',
properties: {
code: {
type: 'string',
description: 'TypeScript code to execute'
},
timeout: {
type: 'number',
description: 'Timeout in milliseconds (default: 30000)',
default: 30000
}
},
required: ['code']
}
}
]
}));
// Execute code
server.setRequestHandler(CallToolRequestSchema, async (request) => {
const { name, arguments: args } = request.params;
if (name !== 'execute_code') {
throw new Error(`Unknown tool: ${name}`);
}
try {
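// `arguments` is optional in the protocol, so guard against undefined
const { code, timeout = 30000 } = (args ?? {}) as { code: string; timeout?: number };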
// Execute in sandbox
const { logs, result, error } = await sandbox.execute(code, { timeout });
if (error) {
return {
content: [{
type: 'text',
text: `Execution Error:\n${error}\n\nLogs:\n${logs.join('\n')}`
}],
isError: true
};
}
// Return logs to LLM
const output = [
'=== Execution Logs ===',
...logs,
'',
'=== Result ===',
JSON.stringify(result, null, 2) ?? 'undefined' // stringify yields undefined when the code returns nothing
].join('\n');
return {
content: [{ type: 'text', text: output }]
};
} catch (error) {
return {
content: [{
type: 'text',
text: `Sandbox Error: ${error instanceof Error ? error.message : String(error)}`
}],
isError: true
};
}
});
// Start server
const transport = new StdioServerTransport();
await server.connect(transport);
console.error('Code Mode MCP server started');
console.error(`Connected to ${mcpServers.length} MCP servers`);
console.error(`Generated ${allTools.length} TypeScript APIs`);
function generateAPIDocumentation(tools: MCPTool[]): string {
// Group tools by server
const byServer = new Map<string, MCPTool[]>();
for (const tool of tools) {
const [serverName] = tool.name.split('__');
if (!byServer.has(serverName)) {
byServer.set(serverName, []);
}
byServer.get(serverName)!.push(tool);
}
let doc = '';
for (const [serverName, serverTools] of byServer) {
doc += `\n**${serverName}**:\n`;
for (const tool of serverTools) {
const funcName = tool.name.split('__')[1];
doc += ` - ${funcName}()\n`;
}
}
return doc;
}
```
---
## Implementation Phases
### Phase 1: Foundation (Week 1)
**Goals:**
- ✅ Set up project structure
- ✅ Implement basic MCP orchestrator
- ✅ Connect to one test MCP server
- ✅ Generate TypeScript API from one tool
**Deliverables:**
- Working MCP orchestrator that connects to a server
- TypeScript generator that creates one API file
- Basic project structure
**Tasks:**
1. Initialize npm project with TypeScript
2. Install dependencies (`@modelcontextprotocol/sdk`, `json-schema-to-typescript`)
3. Create `MCPOrchestrator` class with `connectServer()` method
4. Create `TypeScriptGenerator` class with `generateToolFile()` method
5. Test with a simple MCP server (e.g., weather API)
### Phase 2: Sandbox Implementation (Week 2)
**Goals:**
- ✅ Choose sandbox technology (recommend: isolated-vm or Deno)
- ✅ Implement secure code execution
- ✅ Inject `__mcp_call()` binding
- ✅ Capture console output
- ✅ Handle errors and timeouts
**Deliverables:**
- Working `CodeSandbox` class
- Successful execution of simple TypeScript code
- Proper isolation (no network, no filesystem)
**Tasks:**
1. Research and choose sandbox technology
2. Implement `CodeSandbox` class with `execute()` method
3. Set up TypeScript compilation in sandbox
4. Inject MCP call binding
5. Test with sample code that calls MCP tools
### Phase 3: Integration (Week 3)
**Goals:**
- ✅ Connect all components
- ✅ Create main MCP server with `execute_code` tool
- ✅ Generate complete TypeScript APIs
- ✅ Test end-to-end workflow
**Deliverables:**
- Functional Code Mode MCP server
- Complete TypeScript API generation
- Successful multi-tool orchestration
**Tasks:**
1. Implement main MCP server (Component 5)
2. Connect orchestrator, generator, and sandbox
3. Generate APIs for all connected MCP servers
4. Test with complex multi-step workflows
5. Handle edge cases and errors
### Phase 4: Claude Code Integration (Week 4)
**Goals:**
- ✅ Register server with Claude Code
- ✅ Test with real LLM interactions
- ✅ Optimize API documentation for LLM
- ✅ Refine error messages
**Deliverables:**
- Working integration with Claude Code
- Optimized tool descriptions
- User documentation
**Tasks:**
1. Create configuration guide for Claude Code
2. Test with various prompts
3. Refine tool description for better LLM understanding
4. Create troubleshooting guide
5. Document limitations and best practices
### Phase 5: Production Hardening (Week 5-6)
**Goals:**
- ✅ Add comprehensive error handling
- ✅ Implement resource limits
- ✅ Add logging and monitoring
- ✅ Security audit
- ✅ Performance optimization
**Deliverables:**
- Production-ready Code Mode server
- Security documentation
- Performance benchmarks
- Deployment guide
**Tasks:**
1. Implement timeout handling
2. Add memory/CPU limits to sandbox
3. Create structured logging system
4. Security review of sandbox isolation
5. Load testing and optimization
6. Create deployment documentation
---
## Project Structure
```
code2mcp/
├── src/
│ ├── index.ts # Main MCP server entry point
│ ├── orchestrator/
│ │ ├── MCPOrchestrator.ts # Connects to multiple MCP servers
│ │ ├── connections.ts # Transport management
│ │ └── types.ts # Orchestrator types
│ ├── generator/
│ │ ├── TypeScriptGenerator.ts # Schema → TS conversion
│ │ ├── templates/ # Code generation templates
│ │ │ ├── function.ts.hbs
│ │ │ ├── interface.ts.hbs
│ │ │ └── index.ts.hbs
│ │ └── utils/
│ │ ├── jsonSchemaToTS.ts
│ │ └── naming.ts # camelCase, PascalCase utils
│ ├── sandbox/
│ │ ├── CodeSandbox.ts # Main sandbox implementation
│ │ ├── isolate/ # isolated-vm implementation
│ │ │ └── IsolateRunner.ts
│ │ ├── deno/ # Deno implementation (alternative)
│ │ │ └── DenoRunner.ts
│ │ ├── bindings/ # MCP call bindings
│ │ │ └── mcpBinding.ts
│ │ └── types.ts
│ ├── types/
│ │ ├── mcp.ts # MCP protocol types
│ │ └── index.ts
│ └── utils/
│ ├── logger.ts # Structured logging
│ ├── config.ts # Config loader
│ └── errors.ts # Error classes
├── generated/ # Auto-generated TypeScript APIs
│ ├── servers/
│ │ ├── google-drive/
│ │ ├── salesforce/
│ │ └── weather/
│ └── types/
│ └── mcp-runtime.d.ts
├── config/
│ ├── mcp-servers.json # MCP server configurations
│ └── sandbox.json # Sandbox settings
├── tests/
│ ├── unit/
│ │ ├── orchestrator.test.ts
│ │ ├── generator.test.ts
│ │ └── sandbox.test.ts
│ ├── integration/
│ │ └── end-to-end.test.ts
│ └── fixtures/
│ └── sample-mcp-schemas.json
├── scripts/
│ ├── generate-apis.ts # CLI to regenerate APIs
│ └── test-sandbox.ts # Test sandbox in isolation
├── build/ # Compiled output
├── package.json
├── tsconfig.json
├── .env.example
├── .gitignore
└── README.md
```
---
## Configuration
### MCP Servers Configuration
**config/mcp-servers.json:**
```json
{
"servers": [
{
"name": "google-drive",
"transport": "stdio",
"command": "node",
"args": ["/path/to/google-drive-mcp/build/index.js"],
"env": {
"GOOGLE_API_KEY": "${GOOGLE_API_KEY}"
}
},
{
"name": "salesforce",
"transport": "stdio",
"command": "node",
"args": ["/path/to/salesforce-mcp/build/index.js"],
"env": {
"SALESFORCE_API_KEY": "${SALESFORCE_API_KEY}"
}
},
{
"name": "weather",
"transport": "http",
"url": "http://localhost:3000/mcp"
}
]
}
```
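JSON has no variable interpolation, so the `${GOOGLE_API_KEY}` placeholders above have to be expanded by the config loader. A minimal sketch of that step follows; `expandEnv` and `expandServerEnv` are hypothetical names, and failing on a missing variable (rather than substituting an empty string) is an assumed policy.

```typescript
// Replace ${VAR} placeholders with values from `env`; fail loudly on missing variables.
function expandEnv(value: string, env: Record<string, string | undefined>): string {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, varName: string) => {
    const resolved = env[varName];
    if (resolved === undefined) {
      throw new Error(`Missing environment variable: ${varName}`);
    }
    return resolved;
  });
}

// Expand every value of a server's "env" block from mcp-servers.json.
function expandServerEnv(
  serverEnv: Record<string, string>,
  env: Record<string, string | undefined>
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(serverEnv)) {
    out[key] = expandEnv(value, env);
  }
  return out;
}

const expanded = expandServerEnv(
  { GOOGLE_API_KEY: '${GOOGLE_API_KEY}' },
  { GOOGLE_API_KEY: 'example-key' } // in practice this would be process.env
);
```

Passing the environment in as a parameter keeps the function pure and easy to test; the loader would call it with `process.env` before spawning each stdio server.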
### Sandbox Configuration
**config/sandbox.json:**
```json
{
"type": "isolated-vm",
"limits": {
"timeout": 30000,
"memory": 128,
"cpuTime": 10000
},
"permissions": {
"network": false,
"filesystem": false,
"env": false
}
}
```
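The `limits` block is worth validating at startup, since a missing or non-positive value would silently disable a safeguard. The sketch below merges user values over defaults and rejects invalid numbers. `resolveLimits` is a hypothetical helper, and the units (milliseconds for `timeout` and `cpuTime`, megabytes for `memory`) are inferred from the defaults quoted in the security section.

```typescript
// Assumed units: timeout and cpuTime in milliseconds, memory in megabytes.
interface SandboxLimits {
  timeout: number;
  memory: number;
  cpuTime: number;
}

const DEFAULT_LIMITS: SandboxLimits = { timeout: 30000, memory: 128, cpuTime: 10000 };

// Merge user-supplied limits over the defaults and reject non-positive or non-numeric values.
function resolveLimits(raw: Partial<SandboxLimits> = {}): SandboxLimits {
  const limits: SandboxLimits = { ...DEFAULT_LIMITS, ...raw };
  for (const [key, value] of Object.entries(limits)) {
    if (typeof value !== 'number' || !Number.isFinite(value) || value <= 0) {
      throw new Error(`Invalid sandbox limit "${key}": ${String(value)}`);
    }
  }
  return limits;
}

const limits = resolveLimits({ memory: 256 });
console.log(limits);
```

Here `limits` keeps the default `timeout` and `cpuTime` and only overrides `memory`.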
### Claude Code Configuration
**~/.claude.json:**
```json
{
"mcpServers": {
"code-mode": {
"command": "node",
"args": ["/absolute/path/to/code2mcp/build/index.js"],
"env": {
"NODE_ENV": "production",
"GOOGLE_API_KEY": "your_key_here",
"SALESFORCE_API_KEY": "your_key_here"
}
}
}
}
```
---
## Usage Examples
### Example 1: Simple Single Tool Call
**User prompt:**
> "What's the weather in Austin, TX?"
**Claude generates:**
```typescript
import { getCurrentWeather } from './servers/weather';
const weather = await getCurrentWeather({
location: 'Austin, TX, USA'
});
console.log(`Temperature: ${weather.temperature}°F`);
console.log(`Conditions: ${weather.conditions}`);
```
**Execution:**
- Code runs in sandbox
- Calls `__mcp_call('weather__get_current_weather', { location: ... })`
- Orchestrator routes to weather MCP server
- Returns result to sandbox
- Sandbox logs output
- Claude sees logs, not raw API response
**Output to Claude:**
```
=== Execution Logs ===
Temperature: 93°F
Conditions: sunny
=== Result ===
undefined
```
### Example 2: Multi-Step Workflow (Token Savings!)
**User prompt:**
> "Get the meeting transcript from Google Drive (doc ID: abc123) and update the Salesforce lead (ID: xyz789) with the transcript content"
**Claude generates:**
```typescript
import { getDocument } from './servers/google-drive';
import { updateRecord } from './servers/salesforce';
// Fetch transcript (50,000 tokens)
const doc = await getDocument({ documentId: 'abc123' });
// Update Salesforce
await updateRecord({
objectType: 'Lead',
recordId: 'xyz789',
data: {
MeetingNotes__c: doc.content,
LastModifiedDate: new Date().toISOString()
}
});
console.log(`Updated lead xyz789 with ${doc.content.length} characters`);
```
**Key benefit:** The 50,000-token transcript **never enters Claude's context**!
**Standard MCP approach:**
- Call 1: `google_drive__get_document` → 50K tokens returned to Claude
- Claude's context: 50K tokens
- Call 2: `salesforce__update` with 50K tokens in args
- Total context: ~100K tokens
**Code Mode approach:**
- Claude writes code: 2K tokens
- Code executes in sandbox
- Transcript stays in sandbox memory
- Only logs returned to Claude
- Total context: ~2K tokens
**Token reduction: 98%!**
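The 98% figure follows directly from the two illustrative totals above (about 100K tokens versus about 2K tokens). As a quick check of the arithmetic, using a hypothetical `tokenReduction` helper:

```typescript
// Fraction of context tokens saved relative to the baseline workflow.
function tokenReduction(baselineTokens: number, codeModeTokens: number): number {
  if (baselineTokens <= 0) throw new Error('baselineTokens must be positive');
  return (baselineTokens - codeModeTokens) / baselineTokens;
}

// Figures are the illustrative estimates from this example, not measurements.
const reduction = tokenReduction(100000, 2000);
console.log(`${(reduction * 100).toFixed(0)}%`);
```

Real savings depend on how large the intermediate data is relative to the generated code and the returned logs.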
### Example 3: Complex Orchestration
**User prompt:**
> "Find all Google Drive documents modified in the last week, analyze their sentiment, and create a Salesforce report"
**Claude generates:**
```typescript
import { listFiles, getDocument } from './servers/google-drive';
import { createRecord } from './servers/salesforce';
// Get documents modified in last week
const oneWeekAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000);
const files = await listFiles({
modifiedAfter: oneWeekAgo.toISOString()
});
console.log(`Found ${files.length} documents`);
// Analyze sentiment (simple keyword-based)
const sentimentScores = [];
for (const file of files) {
const doc = await getDocument({ documentId: file.id });
// Simple sentiment analysis
const positiveWords = ['great', 'excellent', 'success', 'happy'];
const negativeWords = ['issue', 'problem', 'failure', 'concern'];
const positive = positiveWords.reduce((count, word) =>
count + (doc.content.match(new RegExp(word, 'gi')) || []).length, 0
);
const negative = negativeWords.reduce((count, word) =>
count + (doc.content.match(new RegExp(word, 'gi')) || []).length, 0
);
const sentiment = positive - negative;
sentimentScores.push({
title: file.title,
sentiment,
classification: sentiment > 0 ? 'Positive' : sentiment < 0 ? 'Negative' : 'Neutral'
});
console.log(`${file.title}: ${sentiment > 0 ? '+' : ''}${sentiment}`);
}
// Create Salesforce report
const reportData = {
Name: `Document Sentiment Analysis - ${new Date().toLocaleDateString()}`,
TotalDocuments__c: files.length,
PositiveDocuments__c: sentimentScores.filter(s => s.sentiment > 0).length,
NegativeDocuments__c: sentimentScores.filter(s => s.sentiment < 0).length,
NeutralDocuments__c: sentimentScores.filter(s => s.sentiment === 0).length,
Details__c: JSON.stringify(sentimentScores)
};
const report = await createRecord({
objectType: 'SentimentReport__c',
data: reportData
});
console.log(`Created Salesforce report: ${report.id}`);
```
**This workflow would be impractical with standard MCP:**
- Too many tool calls (N+2 where N = number of documents)
- Massive context bloat (all document contents in context)
- Complex orchestration logic hard to express in tool calls
**With Code Mode:**
- Claude writes natural TypeScript
- All data stays in sandbox
- Only logs returned
- Clean, maintainable code
---
## Security Considerations
### 1. Sandbox Isolation
**Threats:**
- Code escaping sandbox
- Accessing host filesystem
- Making network requests
- Consuming excessive resources
**Mitigations:**
- Use `isolated-vm` or Deno with strict permissions
- No `require()`, `import` from host filesystem
- Block all network primitives (`fetch`, `XMLHttpRequest`, `WebSocket`)
- Set memory limits (128MB default)
- Set CPU time limits (10s default)
- Set timeout limits (30s default)
### 2. API Key Protection
**Threats:**
- LLM-generated code leaking API keys
- Logging sensitive data
**Mitigations:**
- API keys stored in orchestrator, not sandbox
- `__mcp_call()` binding handles auth transparently
- Sandbox cannot access environment variables
- Filter logs for potential secrets before returning to LLM
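The last mitigation, filtering logs before they reach the LLM, can start as an exact-match scrub against the secret values the orchestrator already holds. This is a minimal sketch under that assumption; `redactSecrets`, the `[REDACTED]` placeholder, and the minimum-length threshold are all hypothetical choices.

```typescript
// Replace every occurrence of each known secret value with a placeholder.
function redactSecrets(logs: string[], secrets: string[]): string[] {
  // Skip empty or very short values, which would otherwise redact ordinary text.
  const active = secrets.filter((secret) => secret.length >= 4);
  return logs.map((line) =>
    active.reduce((text, secret) => text.split(secret).join('[REDACTED]'), line)
  );
}

const cleaned = redactSecrets(
  ['Calling API with key=abcd1234', 'Done'],
  ['abcd1234']
);
console.log(cleaned);
```

Exact matching only catches secrets that appear verbatim. Encoded or partially printed values would pass through, so this complements keeping keys out of the sandbox rather than replacing it.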
### 3. Code Injection
**Threats:**
- Malicious user injecting harmful code
- LLM generating dangerous code patterns
**Mitigations:**
- Sandbox prevents harm (no network, no filesystem)
- Static analysis before execution (optional)
- Code review logging for audit trail
- Rate limiting on execution
### 4. Resource Exhaustion
**Threats:**
- Infinite loops
- Memory leaks
- Recursive calls
**Mitigations:**
- Hard timeout (30s default)
- Memory limits (128MB)
- CPU time limits
- Kill switch for runaway processes
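At the host level, the hard timeout can be approximated by racing the execution promise against a timer. The sketch below shows that pattern; `withTimeout` and `ExecutionTimeoutError` are hypothetical names.

```typescript
class ExecutionTimeoutError extends Error {
  constructor(ms: number) {
    super(`Execution exceeded ${ms} ms`);
    this.name = 'ExecutionTimeoutError';
  }
}

// Reject if `work` has not settled within `ms`; always clear the timer afterwards.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new ExecutionTimeoutError(ms)), ms);
  });
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Usage: a slow task is cut off after 20 ms.
const slowTask = new Promise<string>((resolve) => setTimeout(() => resolve('late'), 200));
withTimeout(slowTask, 20).catch((err: unknown) => {
  console.error(err instanceof Error ? err.name : String(err));
});
```

This only stops the host from waiting. It does not cancel the underlying work, and a synchronous infinite loop running on the same thread would prevent the timer from firing at all, which is why the sandbox's own limit (for example the `timeout` option passed to `script.run` in the isolated-vm option) is still required.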
---
## Performance Optimization
### 1. API Generation Caching
Don't regenerate TypeScript APIs on every request:
```typescript
class TypeScriptGenerator {
private cache: Map<string, GeneratedAPI> = new Map();
async generateAPIs(tools: MCPTool[], force = false): Promise<void> {
const hash = this.hashTools(tools);
if (!force && this.cache.has(hash)) {
console.error('Using cached TypeScript APIs');
return;
}
// Generate new APIs
await this.doGenerate(tools);
this.cache.set(hash, { tools, timestamp: Date.now() });
}
}
```
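The cache above relies on a `hashTools` method that is not shown. One way to sketch it is to serialize the tool list deterministically and fingerprint the result. Everything below is an assumption: `ToolLike` stands in for `MCPTool`, and the FNV-1a-style hash (applied to UTF-16 code units) is a dependency-free placeholder that is adequate for change detection only; in Node.js, `crypto.createHash('sha256')` would be the more robust choice.

```typescript
// Stand-in for MCPTool, reduced to the fields that affect generated code.
interface ToolLike {
  name: string;
  description?: string;
  inputSchema?: unknown;
}

// Serialize with sorted object keys so logically equal schemas produce identical text.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .filter((key) => obj[key] !== undefined)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${body.join(',')}}`;
  }
  const json = JSON.stringify(value);
  return json === undefined ? 'null' : json;
}

// 32-bit FNV-1a-style fingerprint: non-cryptographic, for change detection only.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// Sort tools by name first so server ordering does not invalidate the cache.
function hashTools(tools: ToolLike[]): string {
  const sorted = [...tools].sort((a, b) => a.name.localeCompare(b.name));
  return fnv1a(stableStringify(sorted));
}
```

With this shape, two tool lists that differ only in ordering or key order map to the same cache key, so a reconnect that returns tools in a different order does not trigger regeneration.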
### 2. Sandbox Pooling (Optional)
For high-throughput scenarios, reuse sandbox instances:
```typescript
class SandboxPool {
private pool: CodeSandbox[] = [];
private maxSize = 10;
async acquire(): Promise<CodeSandbox> {
if (this.pool.length > 0) {
return this.pool.pop()!;
}
return new CodeSandbox(this.orchestrator);
}
release(sandbox: CodeSandbox): void {
if (this.pool.length < this.maxSize) {
sandbox.reset(); // Clear state
this.pool.push(sandbox);
} else {
sandbox.destroy();
}
}
}
```
### 3. Lazy MCP Connection
Connect to MCP servers only when needed:
```typescript
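async callTool(toolName: string, args: any) {
const [serverName, ...toolParts] = toolName.split('__');
const actualToolName = toolParts.join('__');
let connection = this.connections.get(serverName);
if (!connection) {
// Lazy connect: look up the stored config, connect, then re-read the connection
console.error(`Lazy connecting to ${serverName}`);
// Assumes server configs are kept in a `serverConfigs` Map populated at startup
const config = this.serverConfigs.get(serverName);
if (!config) {
throw new Error(`MCP server not found: ${serverName}`);
}
await this.connectServer(config);
connection = this.connections.get(serverName)!;
}
return connection.client.callTool({ name: actualToolName, arguments: args });
}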
```
---
## Testing Strategy
### Unit Tests
**orchestrator.test.ts:**
```typescript
describe('MCPOrchestrator', () => {
// Shared across tests so the routing test reuses the connection made in the first test
const orchestrator = new MCPOrchestrator();
it('should connect to MCP server', async () => {
await orchestrator.connectServer({
name: 'test-server',
transport: 'stdio',
command: 'node',
args: ['./tests/fixtures/test-mcp-server.js']
});
const tools = await orchestrator.getAllTools();
expect(tools.length).toBeGreaterThan(0);
});
it('should route tool calls correctly', async () => {
const result = await orchestrator.callTool('test-server__echo', { message: 'hello' });
expect(result.content[0].text).toContain('hello');
});
});
```
**generator.test.ts:**
```typescript
describe('TypeScriptGenerator', () => {
const generator = new TypeScriptGenerator();
it('should generate TypeScript from JSON Schema', async () => {
const tool = {
name: 'get_weather',
description: 'Get weather',
inputSchema: {
type: 'object',
properties: {
city: { type: 'string' }
},
required: ['city']
}
};
const code = await generator.generateToolFile('weather', tool);
expect(code).toContain('export interface GetWeatherInput');
expect(code).toContain('city: string');
expect(code).toContain('export async function getWeather');
});
});
```
**sandbox.test.ts:**
```typescript
describe('CodeSandbox', () => {
it('should execute simple code', async () => {
const code = `console.log('Hello, world!');`;
const { logs } = await sandbox.execute(code);
expect(logs).toContain('Hello, world!');
});
it('should block network access', async () => {
const code = `await fetch('https://example.com');`;
const { error } = await sandbox.execute(code);
expect(error).toContain('fetch is not defined');
});
it('should inject MCP binding', async () => {
const code = `
const result = await __mcp_call('test__echo', { message: 'hi' });
console.log(result);
`;
const { logs } = await sandbox.execute(code);
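// logs is an array of lines, so search the joined text rather than for an exact element
expect(logs.join('\n')).toContain('hi');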
});
});
```
### Integration Tests
**end-to-end.test.ts:**
```typescript
describe('Code Mode End-to-End', () => {
it('should handle complete workflow', async () => {
// Start Code Mode server
const server = await startCodeModeServer();
// Simulate Claude Code calling execute_code
const result = await server.callTool('execute_code', {
code: `
import { getWeather } from './servers/weather';
const w = await getWeather({ city: 'Austin' });
console.log('Temperature:', w.temperature);
`
});
expect(result.content[0].text).toContain('Temperature:');
expect(result.content[0].text).toMatch(/\d+/);
});
});
```
---
## Deployment
### Local Development
```bash
# Install dependencies
npm install
# Generate TypeScript APIs from configured MCP servers
npm run generate-apis
# Start in development mode
npm run dev
# Test with MCP Inspector
npx @modelcontextprotocol/inspector build/index.js
```
### Production Deployment
**Option 1: User's Local Machine**
```bash
# Build
npm run build
# Configure Claude Code
code ~/.claude.json
# Add:
{
"mcpServers": {
"code-mode": {
"command": "node",
"args": ["/absolute/path/to/code2mcp/build/index.js"]
}
}
}
# Restart Claude Code
```
**Option 2: Docker Container**
```dockerfile
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY build/ ./build/
COPY config/ ./config/
COPY generated/ ./generated/
CMD ["node", "build/index.js"]
```
**Option 3: Cloudflare Workers (Future)**
Once Dynamic Worker Loader API is in production:
```typescript
// Deploy to Cloudflare Workers
// Each code execution gets a fresh isolate
// Low startup latency; scaling is handled by the platform
```
---
## Monitoring and Observability
### Structured Logging
```typescript
import winston from 'winston';
const logger = winston.createLogger({
format: winston.format.combine(
winston.format.timestamp(),
winston.format.json()
),
transports: [
// Write to stderr so stdout stays free for the MCP stdio protocol
new winston.transports.Stream({ level: 'debug', stream: process.stderr })
]
});
// Usage
logger.info('Code execution started', {
codeLength: code.length,
timeout
});
logger.error('Sandbox error', {
error: error.message,
stack: error.stack
});
```
### Metrics
Track key performance indicators:
```typescript
interface Metrics {
executionCount: number;
executionTimeAvg: number;
executionTimeP95: number;
errorRate: number;
timeoutRate: number;
mcpCallCount: Map<string, number>;
}
class MetricsCollector {
track(event: 'execution_start' | 'execution_end' | 'execution_error', data: any) {
// Log to file, send to monitoring service, etc.
}
}
```
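The `executionTimeAvg` and `executionTimeP95` fields can be derived from a list of recorded durations. The sketch below uses the nearest-rank definition of a percentile, which is one of several common definitions; `percentile` and `summarize` are hypothetical helpers.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Compute the two timing fields of the Metrics interface from raw durations.
function summarize(durationsMs: number[]): { executionTimeAvg: number; executionTimeP95: number } {
  const total = durationsMs.reduce((sum, d) => sum + d, 0);
  return {
    executionTimeAvg: total / durationsMs.length,
    executionTimeP95: percentile(durationsMs, 95),
  };
}

const summary = summarize([10, 20, 30, 40, 50, 60, 70, 80, 90, 100]);
console.log(summary);
```

Keeping every sample in memory is fine for a prototype; a long-running server would more likely use a bounded window or a streaming estimator.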
---
## Limitations and Future Work
### Current Limitations
1. **No streaming execution**: Code must complete before returning results
2. **Single-threaded**: One execution at a time per instance
3. **TypeScript only**: No Python, Go, etc.
4. **No persistent state**: Each execution is isolated
5. **Limited debugging**: No breakpoints, stepping, etc.
### Future Enhancements
1. **Streaming results**: Return logs as code executes
2. **Multi-language support**: Python, Go sandbox runners
3. **Persistent state**: Optional state sharing between executions
4. **Interactive debugging**: REPL-like capabilities
5. **Cloudflare Workers deployment**: Ultimate isolation and scale
6. **API exploration UI**: Browse available APIs in web interface
7. **Code templates**: Pre-built code snippets for common workflows
8. **Optimized API loading**: Load API definitions lazily as needed
---
## References
- [Cloudflare Code Mode Blog Post](https://blog.cloudflare.com/code-mode/)
- [Anthropic MCP Code Execution Blog](https://www.anthropic.com/news/mcp-code-execution)
- [Model Context Protocol Specification](https://modelcontextprotocol.io/)
- [MCP TypeScript SDK](https://github.com/modelcontextprotocol/typescript-sdk)
- [Cloudflare Workers Dynamic Loading](https://developers.cloudflare.com/workers/runtime-apis/bindings/worker-loader/)
- [Isolated-VM](https://github.com/laverdet/isolated-vm)
- [JSON Schema to TypeScript](https://github.com/bcherny/json-schema-to-typescript)
---
## Conclusion
Code Mode represents a paradigm shift in how AI agents interact with tools. By converting MCP tools into TypeScript APIs and having the LLM write code instead of making direct tool calls, we unlock:
- **Better tool understanding**: LLMs trained on millions of TypeScript examples
- **Massive token reduction**: Data stays in sandbox, not LLM context
- **Complex orchestration**: Multi-step workflows with natural code flow
- **Enhanced security**: API keys hidden, sandbox isolation
- **Scalability**: Handle dozens of tools without context bloat
This implementation plan provides a complete roadmap to build a production-ready Code Mode system that integrates with Claude Code, transforming how users interact with MCP servers.
The future of AI agents is **writing code, not calling tools**. Let's build it.
---
**Next Steps:**
1. Review this plan
2. Choose sandbox technology (recommend: isolated-vm for Node.js or Deno for best security)
3. Start with Phase 1: Foundation
4. Iterate and improve
Ready to revolutionize MCP usage? Let's begin. 🚀