Code Executor MCP
Stop hitting the 2-3 MCP server wall. One MCP to orchestrate them all - 98% token savings, unlimited tool access.
Why Use Code Executor MCP?
98% Token Reduction - 141k → 1.6k tokens (load 1 executor vs 50+ tools)
Sandboxed Security - Isolated Deno/Python execution, no internet by default, audit logging
Type-Safe Wrappers - Auto-generated TypeScript/Python SDK with full IntelliSense
Progressive Disclosure - Tools loaded on-demand inside sandbox, not upfront in context
Zero Config Setup - Wizard auto-detects existing MCP servers from Claude Code/Cursor
Production Ready - 606 tests, 95%+ coverage, Docker support, rate limiting
The Problem
You can't use more than 2-3 MCP servers before context exhaustion kills you.
Research confirms: Tool accuracy drops significantly after 2-3 servers
6,490+ MCP servers available, but you can only use 2-3
47 tools = 141k tokens consumed before you write a single word
You're forced to choose: filesystem OR browser OR git OR AI tools. Never all of them.
The Solution
Disable all MCPs. Enable only code-executor-mcp.
Inside the sandbox, access ANY MCP tool on-demand:
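A minimal sketch of what that looks like. The `callMCPTool` name is a stand-in for whatever helper the sandbox actually injects (and the real call is presumably async); it's mocked here so the snippet runs standalone:

```typescript
// Hypothetical helper - mocked so this sketch runs standalone.
// In the real sandbox, this dispatches to the named MCP server.
type ToolCall = (server: string, tool: string, args: Record<string, unknown>) => string;

const callMCPTool: ToolCall = (server, tool, args) =>
  `${server}/${tool}(${JSON.stringify(args)})`;

// Any registered MCP server is reachable on-demand - none of its
// tool schemas were loaded into the model's context up front.
const fileText = callMCPTool("filesystem", "read_file", { path: "README.md" });
const issues = callMCPTool("github", "list_issues", { repo: "my/repo" });

console.log(fileText, issues);
```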
Result: Unlimited MCP access, zero context overhead.
How Progressive Disclosure Works
Traditional MCP exposes all 47 tools upfront (141k tokens). Code Executor exposes 2 tools with outputSchema (1.6k tokens), loading others on-demand inside the sandbox when needed.
Quick Start
Option 1: Interactive Setup Wizard (Recommended)
Don't configure manually. Our wizard does everything:
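The wizard is the `setup` command mentioned later in this README (exact flags may vary by version):

```shell
npx code-executor-mcp setup
```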
What the wizard does:
🔍 Scans for existing MCP configs (Claude Code `~/.claude.json`, Cursor `~/.cursor/mcp.json`, project `.mcp.json`)
⚙️ Configures with smart defaults (or customize interactively)
🤖 NEW: Writes complete MCP configuration (sampling + security + sandbox + performance)
📦 Generates type-safe TypeScript/Python wrappers for autocomplete
📅 Optional: Sets up daily sync to keep wrappers updated
Complete Configuration (all written automatically):
AI Sampling: Multi-provider support (Anthropic, OpenAI, Gemini, Grok, Perplexity)
Security: Audit logging, content filtering, project restrictions
Sandbox: Deno/Python execution with timeouts
Performance: Rate limiting, schema caching, execution timeouts
Smart defaults (just press Enter):
Port: 3333 | Timeout: 120s | Rate limit: 60/min
Audit logs: `~/.code-executor/audit-logs/`
Sampling: Disabled (enable optionally with API key)
Supported AI Tools: Claude Code and Cursor (more coming soon)
First-Run Detection:
If you try to run `code-executor-mcp` without configuration, the missing setup is detected on first run.
What are Wrappers?
The wizard generates TypeScript/Python wrapper functions for your MCP tools:
Before (manual):
After (wrapper):
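A hypothetical illustration of the difference (the tool, wrapper name, and `callMCPTool` helper are invented for this sketch, and the MCP call is mocked so it runs standalone):

```typescript
// Mocked MCP call so the sketch runs standalone; the real helper is
// injected by the sandbox and is presumably async.
const callMCPTool = (server: string, tool: string, args: object): string =>
  JSON.stringify({ server, tool, args });

// Before (manual): remember the exact server + tool name and argument shape.
const manual = callMCPTool("github", "create_issue", {
  title: "Bug: login fails",
});

// After (wrapper): a generated, typed function - IntelliSense shows the
// parameters, and a typo in `title` becomes a compile error.
interface CreateIssueArgs {
  title: string;
}
function github_create_issue(args: CreateIssueArgs): string {
  return callMCPTool("github", "create_issue", args);
}
const wrapped = github_create_issue({ title: "Bug: login fails" });

console.log(manual === wrapped); // both produce the identical tool call
```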
Benefits:
✅ Type-safe with full IntelliSense/autocomplete
✅ Self-documenting JSDoc comments from schemas
✅ No need to remember exact tool names
✅ Matches actual MCP tool APIs (generated from schemas)
Keeping Wrappers Updated:
The wizard can set up daily sync (optional) to automatically regenerate wrappers:
macOS: launchd plist runs at 4-6 AM
Linux: systemd timer runs at 4-6 AM
Windows: Task Scheduler runs at 4-6 AM
Daily sync re-scans your AI tool configs and project config for new/removed MCP servers. You can also manually update anytime with code-executor-mcp setup.
Option 2: Manual Configuration
1. Install
2. Configure
IMPORTANT: Code-executor discovers and merges MCP servers from BOTH locations:
Global: `~/.claude.json` (cross-project MCPs like voice-mode, personal tools)
Project: `.mcp.json` (team-shared MCPs in your project root)
Config Merging: Global MCPs + Project MCPs = All available (project overrides global for duplicate names)
Add to your project .mcp.json or global ~/.claude.json:
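A sketch of the shape such an entry might take; the `mcpServers` key follows the standard MCP client config schema, while the `command`/`args` values and the Deno path are illustrative assumptions:

```json
{
  "mcpServers": {
    "code-executor": {
      "command": "npx",
      "args": ["code-executor-mcp"],
      "env": {
        "DENO_PATH": "/usr/local/bin/deno"
      }
    }
  }
}
```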
Configuration Guide:
`MCP_CONFIG_PATH`: Optional - points to project `.mcp.json` (still discovers global `~/.claude.json`)
`DENO_PATH`: Run `which deno` to find it (required for TypeScript execution)
Global MCPs (`~/.claude.json`): Personal servers available across all projects
Project MCPs (`.mcp.json`): Team-shared servers in version control
Connection Flow: Claude Code → code-executor ONLY, then code-executor → all other MCPs
Quick Setup:
Minimal (Python-only):
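A sketch of a Python-only entry, assuming the same config shape as above; `PYTHON_SANDBOX_READY=true` is the flag documented in the FAQ below, the rest is illustrative:

```json
{
  "mcpServers": {
    "code-executor": {
      "command": "npx",
      "args": ["code-executor-mcp"],
      "env": {
        "PYTHON_SANDBOX_READY": "true"
      }
    }
  }
}
```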
3. Use
Claude can now access any MCP tool through code execution:
That's it. No configuration, no allowlists, no manual tool setup.
Real-World Example
Task: "Review auth.ts for security issues and commit fixes"
Without code-executor (impossible - hit context limit):
With code-executor (single AI message):
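A hypothetical sketch of that single message; `callMCPTool` stands in for the sandbox's injected helper (mocked here so the sketch runs standalone, and synchronous for brevity where the real call would be awaited):

```typescript
// Mocked MCP dispatch so the sketch runs standalone.
const callMCPTool = (server: string, tool: string, args: object): string =>
  `${server}.${tool}:${JSON.stringify(args)}`;

// Step 1: read the file - the contents stay in a local variable,
// not in the model's context window.
const source = callMCPTool("filesystem", "read_file", { path: "auth.ts" });

// Step 2: pretend-scan for an issue (a real run might use llm.ask here).
const finding = source.includes("auth.ts") ? "reviewed auth.ts" : "no file";

// Step 3: commit the fix through a git MCP server.
const commit = callMCPTool("git", "commit", { message: `fix: ${finding}` });

console.log(commit);
```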
All in ONE tool call. Variables persist, no context switching.
Features
| Feature | Description |
| --- | --- |
| 98% Token Savings | 141k → 1.6k tokens (47 tools → 2 tools) |
| Unlimited MCPs | Access 6,490+ MCP servers without context limits |
| Multi-Step Workflows | Chain multiple MCP calls in one execution |
| Auto-Discovery | AI agents find tools on-demand (0 token cost) |
| Deep Validation | AJV schema validation with helpful error messages |
| Security | Sandboxed (Deno/Python), allowlists, audit logs, rate limiting |
| Production Ready | TypeScript, 606 tests, 95%+ coverage, Docker support |
MCP Sampling (Beta) - LLM-in-the-Loop Execution
New in v1.0.0: Enable Claude to call itself during code execution for dynamic reasoning and analysis.
What is Sampling?
MCP Sampling allows TypeScript and Python code running in sandboxed environments to invoke Claude (via Anthropic's API) through a simple interface. Your code can now "ask Claude for help" mid-execution.
Use Cases:
Code Analysis: Read a file, ask Claude to analyze it for security issues
Multi-Step Reasoning: Have Claude break down complex tasks into steps
Data Processing: Process each file/record with Claude's intelligence
Interactive Debugging: Ask Claude to explain errors or suggest fixes
Quick Example
TypeScript:
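A minimal sketch of the TypeScript side. The `llm` object mirrors the documented `llm.ask`/`llm.think` surface but is mocked here so the snippet runs standalone (the real implementation calls out to the sampling bridge):

```typescript
// Mocked `llm` helper - the real one is injected by the sandbox.
const llm = {
  ask: (prompt: string): string => `analysis of: ${prompt}`,
  think: (opts: { messages: { role: string; content: string }[] }): string =>
    `reply to ${opts.messages.length} message(s)`,
};

// Code running in the sandbox can "ask Claude for help" mid-execution.
const review = llm.ask("Does this auth flow have injection risks?");
const plan = llm.think({
  messages: [{ role: "user", content: "Break this task into steps" }],
});

console.log(review, plan);
```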
Python:
API Reference
TypeScript API:
`llm.ask(prompt: string, options?)` - Simple query, returns response text
`llm.think({messages, model?, maxTokens?, systemPrompt?})` - Multi-turn conversation
Python API:
`llm.ask(prompt: str, system_prompt='', max_tokens=1000)` - Simple query
`llm.think(messages, model='', max_tokens=1000, system_prompt='')` - Multi-turn conversation
Security Controls
Sampling includes enterprise-grade security controls:
| Control | Description |
| --- | --- |
| Rate Limiting | Max 10 rounds, 10,000 tokens per execution (configurable) |
| Content Filtering | Auto-redacts secrets (API keys, tokens) and PII (emails, SSNs) |
| System Prompt Allowlist | Only pre-approved prompts accepted (prevents prompt injection) |
| Bearer Token Auth | 256-bit secure token per bridge session |
| Localhost Binding | Bridge server only accessible locally (no external access) |
| Audit Logging | All calls logged with SHA-256 hashes (no plaintext secrets) |
Configuration
Enable Sampling:
Option 1 - Per-Execution (recommended):
Option 2 - Environment Variable:
Option 3 - Config File (~/.code-executor/config.json):
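A sketch of what the config file might contain. The limits match the security table above (10 rounds, 10,000 tokens), but the key names here are invented for illustration; check the generated config for the real schema:

```json
{
  "sampling": {
    "enabled": true,
    "provider": "anthropic",
    "maxRounds": 10,
    "maxTokensPerExecution": 10000
  }
}
```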
Hybrid Architecture
Code Executor automatically detects the best sampling method:
MCP SDK Sampling (free) - used if your MCP client supports `sampling/createMessage`
Direct Anthropic API (paid) - fallback if MCP sampling is unavailable (requires `ANTHROPIC_API_KEY`)
⚠️ Claude Code Limitation (as of November 2025):
Claude Code does not support MCP sampling yet (Issue #1785). When using Claude Code, sampling will fall back to Direct API mode (requires ANTHROPIC_API_KEY).
Compatible clients with MCP sampling:
✅ VS Code (v0.20.0+)
✅ GitHub Copilot
❌ Claude Code (pending Issue #1785)
When Claude Code adds sampling support, no code changes are needed - it will automatically switch to free MCP sampling.
Documentation
See the comprehensive sampling guide: docs/sampling.md
Covers:
What/Why/How with architecture diagrams
Complete API reference for TypeScript & Python
Security model with threat matrix
Configuration guide (env vars, config file, per-execution)
Troubleshooting guide (8 common errors)
Performance benchmarks (<50ms bridge startup)
FAQ (15+ questions)
Security (Enterprise-Grade)
Code Executor doesn't just "run code." It secures it:
| Feature | Implementation |
| --- | --- |
| Sandbox Isolation | Deno (TypeScript) with restricted permissions, Pyodide WebAssembly (Python) |
| File Access Control | Non-root user (UID 1001), read-only root FS, explicit project path allowlist |
| Network Policy | NO internet by default, explicit domain allowlist required |
| Path Validation | Symlink resolution, directory traversal protection |
| Audit Logging | Every execution logged to `~/.code-executor/audit-logs/` |
| Rate Limiting | 30 requests/min default (configurable per MCP) |
| Dangerous Pattern Detection | Blocks `eval`, `exec`, `import`, `pickle.loads` |
| Schema Validation | AJV deep validation before execution |
| SSRF Protection | Blocks AWS metadata, localhost, private IPs |
Example: Block all internet except GitHub API:
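A sketch of the idea. The `ALLOWED_DOMAINS` variable name is invented for illustration; consult the project's security docs for the real knob:

```json
{
  "env": {
    "ALLOWED_DOMAINS": "api.github.com"
  }
}
```

With no allowlist set, the sandbox has no network access at all; listing `api.github.com` opens exactly that one host.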
Sandbox Architecture:
Deno (TypeScript): V8 isolate, explicit permissions (--allow-read=/tmp), no shell access
Pyodide (Python): WebAssembly sandbox, virtual filesystem, network restricted to MCP proxy only
Docker (Optional): Non-root container, read-only root, minimal attack surface
See SECURITY.md for complete threat analysis and security model.
Advanced Usage
Allowlists (Optional Security)
Restrict which tools can be executed:
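A sketch of an allowlist entry. Both the `allowedTools` key and the `server__tool` naming scheme are assumptions made for illustration:

```json
{
  "allowedTools": ["filesystem__read_file", "github__create_issue"]
}
```

Tools not on the list would be rejected before execution.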
Discovery Functions
AI agents can explore available tools:
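A sketch of discovery inside the sandbox. The function names (`listMCPServers`, `listTools`) are illustrative stand-ins for whatever the executor actually injects, backed by a mock registry so this runs standalone:

```typescript
// Mock registry standing in for the executor's live MCP connections.
const registry: Record<string, string[]> = {
  filesystem: ["read_file", "write_file"],
  github: ["create_issue", "list_issues"],
};

// Hypothetical discovery helpers.
const listMCPServers = (): string[] => Object.keys(registry);
const listTools = (server: string): string[] => registry[server] ?? [];

// The agent explores at runtime - none of this costs context tokens.
const servers = listMCPServers();
const githubTools = listTools("github");

console.log(servers, githubTools);
```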
Zero token cost - discovery functions hidden from AI agent's tool list.
MCP Sampling: LLM-in-the-Loop Execution
Enable AI to autonomously call other AIs inside sandboxed code for iterative problem-solving, multi-agent collaboration, and complex workflows.
Key Features:
Multi-Provider Support: Anthropic, OpenAI, Gemini, Grok, Perplexity
Hybrid Mode: Free MCP sampling with automatic fallback to paid API
Simple API: `llm.ask(prompt)` and `llm.think(messages)` helpers
Security: Rate limiting, content filtering, localhost-only bridge
Setup:
See SAMPLING_SETUP.md for complete setup guide.
Basic Usage:
Advanced Example - Multi-Agent Code Review:
5 AI agents collaborate to review, secure, refactor, test, and document code:
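A condensed sketch of the pipeline shape (not the shipped example): each stage is one sampled LLM call whose output feeds the next. `llm.ask` is mocked here so the sketch runs standalone:

```typescript
// Mocked `llm.ask` - in the real example each agent is a separate
// sampled call to an actual model.
const llm = {
  ask: (prompt: string): string => `[${prompt.split(":")[0]}] done`,
};

const stages = ["review", "secure", "refactor", "test", "document"];
let artifact = "function login() { /* ... */ }";

// Each agent transforms the running artifact in turn.
const log: string[] = [];
for (const stage of stages) {
  const result = llm.ask(`${stage}: ${artifact}`);
  log.push(result);
  artifact = `${artifact} // ${stage}ed`;
}

console.log(log.length, "agents ran");
```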
Real-World Results:
5 AI agents, 10 seconds, ~2,600 tokens
Complete code transformation: review → secure → refactor → test → document
See `examples/multi-agent-code-review.ts` for the full working example
Use Cases:
🤖 Multi-agent systems (code review, planning, execution)
🔄 Iterative refinement (generate → validate → improve loop)
🧪 Autonomous testing (generate tests, run them, fix failures)
📚 Auto-documentation (analyze code, write docs, validate examples)
Multi-Action Workflows
Complex automation in a single tool call:
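A sketch of why this matters: the result of one MCP call is an ordinary variable that later calls can loop over, with no model round-trip in between. `callMCPTool` is again a mocked stand-in for the injected helper:

```typescript
// Mocked dispatch so the sketch runs standalone; the real helper is async.
const callMCPTool = (server: string, tool: string, args: object): string[] => {
  if (tool === "list_files") return ["a.ts", "b.ts"];
  return [`${server}.${tool}:${JSON.stringify(args)}`];
};

// Call 1: list files - the array lives in a normal variable.
const files = callMCPTool("filesystem", "list_files", { dir: "src" });

// Calls 2..n: loop over the result inside the same execution.
const processed = files.map((f) =>
  callMCPTool("filesystem", "read_file", { path: f })[0]
);

console.log(processed.length, "files processed");
```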
State persists across calls - no context switching.
Python Execution (Pyodide WebAssembly)
Secure Python execution with Pyodide sandbox:
Enable Python:
Example - Python with MCP tools:
Security guarantees:
✅ WebAssembly sandbox (same security as Deno)
✅ Virtual filesystem (no host file access)
✅ Network restricted to authenticated MCP proxy
✅ No subprocess spawning
✅ Memory limited by V8 heap
Limitations:
Pure Python only (no native C extensions unless WASM-compiled)
~2-3s first load (Pyodide npm package), <100ms cached
No multiprocessing/threading (use async/await)
10-30% slower than native Python (WASM overhead acceptable for security)
See SECURITY.md for complete security model.
Installation Options
npm (Recommended)
Docker (Production)
Quick Start:
With docker-compose (Recommended):
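An illustrative fragment only; `docker-compose.example.yml` in the repo is authoritative. The image name comes from the Docker Hub link below, and the hardening lines mirror the security table (non-root UID 1001, read-only root FS, persisted `/app/config`):

```yaml
services:
  code-executor:
    image: aberemia24/code-executor-mcp
    read_only: true           # read-only root FS
    user: "1001"              # non-root user
    volumes:
      - ./config:/app/config  # persist the auto-generated .mcp.json
```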
First-Run Auto-Configuration: Docker deployment automatically generates complete MCP configuration from environment variables on first run:
✅ All environment variables → comprehensive config
✅ Includes sampling, security, sandbox, and performance settings
✅ Config saved to `/app/config/.mcp.json`
✅ Persistent across container restarts (use volume mount)
See DOCKER_TESTING.md for security details and docker-compose.example.yml for all available configuration options.
Local Development
Configuration
Complete Example (.mcp.json):
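A fuller sketch combining the variables documented below. The `mcpServers` shape follows the standard MCP client config; the concrete values (Deno path, config path) are placeholders for your own:

```json
{
  "mcpServers": {
    "code-executor": {
      "command": "npx",
      "args": ["code-executor-mcp"],
      "env": {
        "MCP_CONFIG_PATH": "./.mcp.json",
        "DENO_PATH": "/usr/local/bin/deno",
        "PYTHON_SANDBOX_READY": "true"
      }
    }
  }
}
```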
Environment Variables:
| Variable | Required | Description | Example |
| --- | --- | --- | --- |
| `MCP_CONFIG_PATH` | ⚠️ Optional | Explicit path to project `.mcp.json` | |
| `DENO_PATH` | ✅ For TypeScript | Path to Deno binary | output of `which deno` |
| | ⚠️ Recommended | Enable security audit logging | |
| | No | Custom audit log location | |
| | ⚠️ Recommended | Restrict file access | |
| `PYTHON_SANDBOX_READY` | No | Enable Python executor | `true` |
Security Note: Store API keys in environment variables, not directly in config files.
Multi-Provider AI Sampling Configuration
NEW: Support for 5 AI providers (Anthropic, OpenAI, Gemini, Grok, Perplexity) with automatic provider-specific model selection.
Quick Setup:
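An illustrative fragment: `ANTHROPIC_API_KEY` is documented above, while the other key names follow the common `<PROVIDER>_API_KEY` convention and may differ; `.env.example` in the repo is authoritative:

```shell
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
```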
Provider Comparison (January 2025):
| Provider | Default Model | Cost (Input/Output per MTok) | Best For |
| --- | --- | --- | --- |
| Gemini ⭐ | | $0.10 / $0.40 | Cheapest + FREE tier |
| Grok | | $0.20 / $0.50 | 2M context, fast |
| OpenAI | | $0.15 / $0.60 | Popular, reliable |
| Perplexity | | $1.00 / $1.00 | Real-time search |
| Anthropic | | $1.00 / $5.00 | Premium quality |
Configuration Options: See .env.example for full list of sampling configuration options including:
API keys for all providers
Model allowlists
Rate limiting & quotas
Content filtering
System prompt controls
Auto-discovery (NEW in v0.7.3): Code-executor automatically discovers and merges:
`~/.claude.json` (global/personal MCPs)
`.mcp.json` (project MCPs)
`MCP_CONFIG_PATH` if set (explicit override, still merges with global)
No configuration needed - just add MCPs to either location and code-executor finds them all!
TypeScript Support
Full type definitions included:
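A hypothetical illustration of what typed usage looks like; this interface is invented for the sketch, and the package ships its own real definitions:

```typescript
// Invented request shape - illustrative only, not the shipped types.
interface ExecuteCodeRequest {
  language: "typescript" | "python";
  code: string;
  timeoutMs?: number;
}

// The compiler catches a wrong language string or a misspelled field.
const req: ExecuteCodeRequest = {
  language: "typescript",
  code: "console.log('hi')",
  timeoutMs: 120_000,
};

console.log(req.language);
```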
Performance
| Metric | Value |
| --- | --- |
| Token savings | 98% (141k → 1.6k) |
| Tool discovery | <5ms (cached), 50-100ms (first call) |
| Validation | <1ms per tool call |
| Sandbox startup | ~200ms (Deno), ~2-3s first / ~100ms cached (Pyodide) |
| Test coverage | 606 tests, 95%+ security, 90%+ overall |
Documentation
AGENTS.md - Repository guidelines for AI agents
CONTRIBUTING.md - Development setup and workflow
SECURITY.md - Security model and threat analysis
DOCKER_TESTING.md - Docker security details
CHANGELOG.md - Version history
FAQ
Q: Do I need to configure each MCP server?
A: No. Code-executor auto-discovers MCPs from ~/.claude.json (global) AND .mcp.json (project). Just add MCPs to either location.
Q: How does global + project config merging work? A: Code-executor finds and merges both:
Global (`~/.claude.json`): Personal MCPs available everywhere
Project (`.mcp.json`): Team MCPs in version control
Result: All MCPs available; project configs override global for duplicate names
Q: How does validation work? A: AJV validates all tool calls against live schemas. On error, you get a detailed message showing expected parameters.
Q: What about Python support?
A: Full Python sandbox via Pyodide WebAssembly. Requires PYTHON_SANDBOX_READY=true environment variable. Same security model as Deno (WASM isolation, virtual FS, network restricted). Pure Python only - no native C extensions unless WASM-compiled. See SECURITY.md for details.
Q: Can I use this in production? A: Yes. 606 tests, 95%+ coverage, Docker support, audit logging, rate limiting.
Q: Does this work with Claude Code only? A: Built for Claude Code. Untested on other MCP clients, but should work per MCP spec.
Contributing
Contributions welcome! See CONTRIBUTING.md for guidelines.
Code Quality Standards:
✅ TypeScript strict mode
✅ 90%+ test coverage on business logic
✅ ESLint + Prettier
✅ All PRs require passing tests
License
MIT - See LICENSE
Links
Docker Hub: https://hub.docker.com/r/aberemia24/code-executor-mcp
Issues: https://github.com/aberemia24/code-executor-MCP/issues
Built with Claude Code | Based on Anthropic's Code Execution with MCP