Skip to main content
Glama

sandbox-mcp CI

Local AI agent sandbox. Run isolated Linux VMs on your Mac in ~60ms. No cloud costs. VM-level isolation via Virtualization.framework. Works with MCP clients that support local stdio servers (Claude Code, Claude Desktop, Cursor).

What this is

An MCP server that gives AI agents a sandboxed Linux environment using Apple Containerization (Virtualization.framework). Each sandbox is a real VM — not a container sharing your kernel — that boots in ~700ms and executes commands in ~60ms via a persistent shell over vsock.

Compared to cloud sandboxes (as of early 2025):

Exec latency

Cost

Isolation

This (local)

~60ms

Local hardware

VM (Virtualization.framework)

E2B

~150ms + network

$0.18/hr

Firecracker microVM

Daytona

~90ms + network

Usage-based

Docker container

Related MCP server: EdgeBox

Quick demo

Once registered, your MCP client can use the sandbox tools directly:

Agent: exec(command="uname -a")
→ Linux mcp-sb-abc123 6.12.6 #1 SMP aarch64 Linux

Agent: install(packages="python3 py3-pip")
→ Installed python3 py3-pip (1230ms)

Agent: exec(command="python3 -c 'print(sum(range(1000)))'")
→ 499500

Agent: bg(command="python3 -m http.server 8000")
→ Started [bg-a1b2c3] PID 42

Agent: expose(port=8000)
→ Forwarding localhost:8000 → 'default':8000
  Open http://localhost:8000

Cold boot is ~700ms, subsequent commands ~60ms each.

Requirements

  • Apple Silicon Mac (M1+)

  • macOS 15 Sequoia+

  • Python 3.11+

  • uv (for packaging)

Setup

1. Install Apple Containers

# Download and install the container CLI
curl -LO https://github.com/apple/containerization/releases/download/v0.9.0/container-v0.9.0.pkg
sudo installer -pkg container-v0.9.0.pkg -target /

# Start the container system (downloads kernel on first run)
container system start

# Verify it works
time container run --rm alpine echo "hello"  # ~700ms cold boot

2. Build the dev image

The included Containerfile.mcp-dev builds an Alpine image pre-loaded with Python, Node.js, Go, Rust, and standard build tools:

cd sandbox-mcp
container build -t mcp-dev -f Containerfile.mcp-dev .

3. Install the MCP server

uv sync

4. Register with your MCP client

Claude Code:

claude mcp add sandbox -- uv --directory /path/to/sandbox-mcp run sandbox-mcp

Manual (~/.claude.json):

{
  "mcpServers": {
    "sandbox": {
      "type": "stdio",
      "command": "/path/to/uv",
      "args": ["--directory", "/path/to/sandbox-mcp", "run", "sandbox-mcp"]
    }
  }
}

5. Building the optimized kernel (optional)

Apple's containerization repo includes a stripped-down Linux kernel config. Compiling it yourself doesn't meaningfully improve exec latency — the ~700ms floor is VM lifecycle overhead (Virtualization.framework + EXT4 + network + vminitd), not kernel boot. The real win is keeping VMs warm and using persistent shell exec (~60ms).

That said, if you want a smaller kernel:

git clone https://github.com/apple/containerization.git
cd containerization/kernel
make                                            # ~3 min on M-series
container system kernel set --binary ./vmlinux
container system stop && container system start

How it works

Agent ──MCP/stdio──▶ sandbox_mcp_server.py (FastMCP)
                           │
                           ├── SandboxManager
                           │     ├── _sandboxes: dict[name, Sandbox]
                           │     ├── _port_forwards: dict[port, PortForward]
                           │     ├── _sync_jobs: dict[id, SyncJob]
                           │     └── _cleanup_loop (idle TTL + child TTL)
                           │
                           ├── SandboxCtlServer (per-parent UDS listener)
                           │     └── NDJSON over /run/sandbox-ctl.sock
                           │           → spawn, list, exec, destroy, run
                           │
                           └── Sandbox (per-VM)
                                 ├── PersistentShell (container exec -i <name> sh)
                                 ├── _bg_processes: dict[id, Process]
                                 └── _audit_log: deque

Latency breakdown:

  • Cold boot: ~700ms (Virtualization.framework + EXT4 + network + vminitd)

  • Warm exec: ~60ms (command piped to persistent shell via vsock)

  • Why warm is fast: Each sandbox holds open a container exec -i <name> sh process. Commands are written to stdin with a unique end-marker, output is read until the marker appears. No process spawn overhead per command.

Port forwarding

Apple Containers v0.9.0 -p port publishing is broken (TCP connects but data never flows), and VM IPs are not routable from the host. Port forwarding works via asyncio TCP proxy:

  1. expose starts a local TCP server on 127.0.0.1:<host_port>

  2. Each incoming connection spawns container exec -i <name> nc 127.0.0.1 <container_port>

  3. Data is piped bidirectionally between the client and the nc process via vsock

Multi-sandbox

Sandboxes are named (default: "default"). Each gets isolated volumes for /workspace and package caches (apk, pip, npm). Caches persist across resets for fast reinstalls. Sandboxes can reach each other by name via /etc/hosts entries auto-injected when networking is available.

Profiles

Configure per-sandbox-name resources in SANDBOX_PROFILES at the top of the server:

SANDBOX_PROFILES = {
    "ml": {"cpus": 4, "memory": "2G"},
    "build": {"cpus": 4, "memory": "1G"},
    "nested": {"cpus": 2, "memory": "1G", "virtualization": True},
}

The virtualization flag enables nested virtualization (--virtualization). GPU/Metal passthrough is not supported by Apple Containers — the kernel has CONFIG_DRM_VIRTIO_GPU disabled and the Swift framework doesn't use VZVirtioGraphicsDeviceConfiguration.

Child sandboxes

Sandboxes can spawn child sandboxes, controlled by SPAWN_POLICIES at the top of the server. Policies define per-parent limits: max concurrent children, lifetime spawn count, CPU/memory budgets, allowed images, and TTL. Unlisted sandbox names cannot spawn.

Children are lightweight — they skip cache volumes and get their own isolated workspace. They're auto-destroyed when their TTL expires or their parent is reset/destroyed.

Setting child_can_spawn: True in a policy allows children to spawn their own children (grandchildren), up to a depth of _MAX_SPAWN_GENERATION (default 2). Grandchild policies are derived automatically — halved concurrency/budget limits, no further sub-spawning. A tree-wide budget check ensures the root sandbox's CPU/memory envelope is never exceeded regardless of spawn depth. This is off by default and not recommended for most use cases.

In-container API (sandbox-ctl)

When a sandbox has a spawn policy with inject_ctl: True (the default), the server mounts a UDS socket and the sandbox-ctl binary into the VM. Set inject_ctl: False to skip injection for sandboxes that don't need in-VM sub-launching. Code running inside the VM can then spawn/manage sibling containers:

sandbox-ctl ping                                          # verify connection
sandbox-ctl spawn --image mcp-dev --cpus 1 --memory 256M  # create child
sandbox-ctl list                                           # show children
sandbox-ctl exec <child> -- echo hello                     # run in child
sandbox-ctl destroy <child>                                # tear down
sandbox-ctl run -- echo test                               # ephemeral: spawn + exec + destroy

Communication uses NDJSON over the mounted socket (/run/sandbox-ctl.sock). The host-side SandboxCtlServer handles requests and delegates to SandboxManager.

State persistence

Sandbox-to-container mappings are saved to ~/.local/state/sandbox-mcp/state.json (schema v2). On restart, the server reconnects to any still-running containers from the previous session. Expired children are cleaned up on reconnect.

Tools (35)

Core

Tool

Description

exec

Run a shell command (~60ms)

python

Execute Python code

write_file

Write a file to the sandbox

read_file

Read a file from the sandbox

batch_write

Write multiple files in one transfer

install

Install packages via apk

env

Manage persistent environment variables

Process management

Tool

Description

bg

Run a command in the background

logs

Read output from a background process

kill

Kill a background process

Sandbox lifecycle

Tool

Description

status

Show pool and sandbox info

health

Quick liveness/disk/memory check across all sandboxes

stats

Show CPU/memory/disk usage for one sandbox

reset

Destroy and recreate (clean state)

list_all

List all active sandboxes

destroy

Permanently kill a sandbox

clone

Clone a running sandbox to a new name

history

Show recent command audit log

File transfer

Tool

Description

upload

Copy files from host into sandbox

download

Copy files from sandbox to host

git_clone

Clone a git repo (with optional auth token)

sync_start

Watch and live-sync a host directory

sync_stop

Stop a running sync job

Snapshots & images

Tool

Description

snapshot

Save sandbox state as a reusable image

restore

Boot from a saved snapshot

list_snapshots

List available snapshots

delete_snapshot

Delete a saved snapshot image

build_image

Build a container image from a Containerfile

images

List all available container images

Networking

Tool

Description

expose

Forward a sandbox port to localhost (TCP proxy)

unexpose

Stop a port forward

network_info

Show IPs and connectivity between sandboxes

Child sandboxes

Tool

Description

spawn

Spawn a child sandbox under a parent

children

List child sandboxes of a parent

destroy_child

Destroy a child sandbox

Files

File

Description

sandbox_mcp_server.py

Sandbox class, SandboxManager, MCP tool definitions

cmd/sandbox-ctl/

In-container CLI for spawning sibling sandboxes (Go)

pyproject.toml

uv/hatchling packaging, entry point sandbox-mcp

Containerfile.mcp-dev

Alpine 3.23 dev image with Python, Node, Go, Rust

tests/

pytest test suite

Customization

Edit constants at the top of sandbox_mcp_server.py:

Constant

Default

Description

DEFAULT_IMAGE

"mcp-dev"

Container image for new sandboxes

SANDBOX_CPUS

2

Default CPU cores per sandbox

SANDBOX_MEMORY

"512M"

Default memory per sandbox

IDLE_TTL

1800

Seconds before auto-destroying idle sandboxes

DEFAULT_TIMEOUT

30

Default command timeout in seconds

MAX_OUTPUT

50000

Max output bytes per command

SPAWN_POLICIES

{"default": {...}}

Per-sandbox child spawn limits and permissions

_MAX_SPAWN_GENERATION

2

Maximum spawn depth (root → child → grandchild)

Testing

uv run pytest tests/ -v

CI

Tests run on Python 3.11, 3.12, and 3.13 via GitHub Actions. Pre-push:

uv run pytest tests/ -q && python3 -m compileall sandbox_mcp_server.py tests
Install Server
A
license - permissive license
A
quality
D
maintenance

Maintenance

Maintainers
Response time
Release cycle
Releases (12mo)
Commit activity

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/bird/sandbox-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server