Sandbox Agent

by eduresser


LangGraph agent with Docker-based sandboxed code execution. Each session runs in an isolated, hardened Docker container with a persistent kernel — IPython for Python, vm.createContext for Node.js, and a dedicated R environment. Supports 3 runtimes, provider-agnostic LLM configuration, and vision (auto-detection of multimodal models). Available as an interactive CLI, MCP server (Cursor, Claude Desktop), REST API (Aegra), and React frontend.

Features

Docker isolation — each session runs in its own container, no ports exposed, no host volumes
Hardened containers — non-root user (UID 65532), PID limits, memory+swap limits, tmpfs-only writable dirs, no-new-privileges
Crash detection — OOM-kill, fork bombs, segfaults are detected and reported clearly to the agent
Persistent state — variables survive between code executions (like Jupyter cells)
Checkpointer PostgreSQL — conversation history persists across restarts (shared with Aegra)
Async support — Promises (Node.js) and coroutines (Python) are automatically awaited
Multi-runtime — Python, Node.js, and R
Rich display outputs — captures matplotlib/ggplot figures, Plotly charts, IPython Audio, HTML widgets, and more; auto-sends images to multimodal LLMs
Provider-agnostic — works with OpenAI, Anthropic, Google Gemini, Ollama, or any compatible provider via langchain init_chat_model
Runtime package install — pip install / npm install / install.packages() at session creation or via terminal
6 tools — create_session, execute_code, execute_terminal, import_files, export_files, stop_session
MCP server — expose the same tools via Model Context Protocol (stdio transport)
REST API — full LangGraph Platform API via Aegra with OpenAPI docs, streaming, thread management
Input validation — Pydantic schemas validate all tool inputs before execution, returning structured errors on failure
React frontend — SPA with chat, tool visualization, file upload/download, settings dialog (React 19 + Vite + Tailwind CSS)
File upload — upload files to the API for import into sandbox sessions (POST /threads/{id}/files/upload)
File export — register files for download (no host copy); download via API or use in cross-session import
File import — import from host paths, inline content, or from another session (files exported in same conversation)
Cross-session transfer — export from session A, import into session B with {session_id, path}
Session garbage collection — idle timeout, max lifetime, thread eviction, orphan container cleanup
Auto-cleanup — all containers are stopped and removed when the agent exits

Prerequisites

Python 3.11+
Docker Engine
API key for your LLM provider (CHAT_MODEL_API_KEY)
PostgreSQL (for API/CLI mode — checkpointer + Aegra)
Node.js 18+ and npm (for the React frontend)

Setup

# Docker — installs (if needed), configures permissions, and builds all 3 images
sudo ./setup-docker.sh

# Install Python dependencies (open a new terminal so the docker group is active)
uv sync

# Install frontend dependencies
cd frontend && npm install && cd ..

# Configure environment
cp .env.example .env
# Edit .env with your CHAT_MODEL_API_KEY, POSTGRES_PASSWORD, and other settings

# Docker images are also built automatically on first use if not already present

PostgreSQL (required for CLI, API, and UI)

PostgreSQL is auto-started via Docker Compose when using localhost. The CLI detects if PostgreSQL is reachable and starts it automatically:

# Manual start (if needed)
docker compose up postgres -d

Or point to an existing PostgreSQL instance via POSTGRES_* env vars in .env.
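
The reachability check described above can be approximated with a plain TCP probe. This is a minimal sketch, not the CLI's actual implementation: `postgres_reachable` is a hypothetical helper, and it only confirms that something accepts connections on the port, not that the listener is PostgreSQL.

```python
import os
import socket


def postgres_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Same variable names as the POSTGRES_* settings in .env.
host = os.environ.get("POSTGRES_HOST", "localhost")
port = int(os.environ.get("POSTGRES_PORT", "5432"))

if postgres_reachable(host, port):
    print(f"Port {port} on {host} accepts connections")
else:
    print("Not reachable; try: docker compose up postgres -d")
```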

Usage

All commands use the unified sandbox-agent entry point:

uv run sandbox-agent cli       # Interactive CLI (default)
uv run sandbox-agent mcp       # MCP server (Cursor, Claude Desktop)
uv run sandbox-agent api       # REST API (Aegra, no reload)
uv run sandbox-agent api dev   # REST API with hot reload
uv run sandbox-agent ui        # React UI (auto-starts API if needed)

CLI

uv run sandbox-agent cli
# or simply
uv run sandbox-agent

The CLI operates as a thin client on top of the Aegra REST API, so the API must be running (uv run sandbox-agent api). Features:

Rich panels with syntax-highlighted tool I/O (per-runtime lexer)
Streaming agent output with Markdown rendering
Persistent thread across restarts (~/.local/state/sandbox-agent/cli-thread.json)
/new command to start a fresh conversation
Passes model, provider, and API key settings to the API via the run's configurable field
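
Persisting a thread across restarts amounts to storing one identifier on disk and reusing it. The sketch below shows the idea under stated assumptions: the JSON layout `{"thread_id": ...}` and both helper names are hypothetical, and only the file path is taken from the list above.

```python
import json
import uuid
from pathlib import Path

# Path quoted in the feature list above.
THREAD_FILE = Path.home() / ".local/state/sandbox-agent/cli-thread.json"


def load_or_create_thread_id(path: Path = THREAD_FILE) -> str:
    """Reuse the stored thread ID, or create and store a new one."""
    if path.exists():
        # Assumed file layout: {"thread_id": "<uuid>"}
        return json.loads(path.read_text())["thread_id"]
    thread_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"thread_id": thread_id}))
    return thread_id


def reset_thread(path: Path = THREAD_FILE) -> str:
    """Rough equivalent of /new: forget the old thread and start another."""
    path.unlink(missing_ok=True)
    return load_or_create_thread_id(path)
```

Calling `load_or_create_thread_id()` twice returns the same ID, while `reset_thread()` yields a fresh one.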

MCP Server

Run the MCP server (stdio transport) for integration with Cursor, Claude Desktop, or any MCP-compatible client:

uv run sandbox-agent mcp

Cursor or Claude Desktop

Add the following MCP config:

{
  "mcpServers": {
    "sandbox-agent": {
      "command": "uv",
      "args": ["--directory", "/path/to/sandbox-agent", "run", "sandbox-agent", "mcp"]
    }
  }
}

The MCP server exposes the same 6 tools as the CLI agent with identical behavior. It stores a persistent thread_id in ~/.local/state/sandbox-agent/mcp-thread.json so that export download URLs stay consistent across restarts.

The import_files tool accepts file content directly (as text or base64 via file_content/encoding keys), host paths (via source/destination), or cross-session references (session_id+path). The export_files tool registers files for download via GET /threads/{thread_id}/files/download?session_id=...&path=....
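
The three input styles of import_files can be told apart by their keys. The sketch below is illustrative only: `classify_import_item` and `inline_binary_item` are hypothetical helpers, and pairing a `destination` key with inline content, as well as the literal value `"base64"` for `encoding`, are assumptions that the text above does not confirm.

```python
import base64


def classify_import_item(item: dict) -> str:
    """Name which of the three import styles an item dict uses."""
    if "file_content" in item:
        return "inline"
    if "session_id" in item and "path" in item:
        return "cross-session"
    if "source" in item:
        return "host-path"
    raise ValueError(f"unrecognized import item keys: {sorted(item)}")


def inline_binary_item(data: bytes, destination: str) -> dict:
    """Build an inline item that carries binary data as base64 text."""
    return {
        "file_content": base64.b64encode(data).decode("ascii"),
        "encoding": "base64",        # assumed value for binary payloads
        "destination": destination,  # assumed key for inline items
    }


items = [
    inline_binary_item(b"\x89PNG", "logo.png"),
    {"source": "/home/user/data.csv", "destination": "data.csv"},
    {"session_id": "abc123", "path": "/workspace/out.csv"},
]
for item in items:
    print(classify_import_item(item))
```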

REST API (Aegra)

Run the agent as a REST API via Aegra (self-hosted LangGraph Platform alternative):

uv run sandbox-agent api       # Production mode (no reload, auto-starts PostgreSQL)
uv run sandbox-agent api dev   # Development mode (hot reload via aegra dev)

The production command auto-starts PostgreSQL via Docker Compose if it's not reachable on localhost. The server runs at http://localhost:8000 with OpenAPI docs at /docs. Use the LangGraph SDK or curl to create assistants, threads, and stream runs. Compatible with Agent Chat UI, LangGraph Studio, and CopilotKit.

Custom endpoints:

GET /threads/{thread_id}/files/download?session_id=...&path=... — streams exported files from containers
POST /threads/{thread_id}/files/upload — uploads files to be available for import into sandbox sessions
DELETE /threads/{thread_id} — also cleans up Docker sessions and storage for that thread (via middleware)
GET /settings — returns persisted frontend settings merged over backend .env defaults
PUT /settings — persist frontend settings to PostgreSQL (encrypted)
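
Because the download endpoint takes a file path as a query parameter, the path needs URL encoding. A small sketch using only the standard library follows; `build_download_url` is a hypothetical helper, and the base URL is the API_BASE_URL default from the Configuration section.

```python
from urllib.parse import quote, urlencode


def build_download_url(base_url: str, thread_id: str, session_id: str, path: str) -> str:
    """Compose the export download URL with properly encoded parameters."""
    query = urlencode({"session_id": session_id, "path": path})
    thread = quote(thread_id, safe="")
    return f"{base_url.rstrip('/')}/threads/{thread}/files/download?{query}"


url = build_download_url(
    "http://127.0.0.1:8000", "t1", "abc123", "/workspace/output.csv"
)
print(url)
# http://127.0.0.1:8000/threads/t1/files/download?session_id=abc123&path=%2Fworkspace%2Foutput.csv
```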

React Frontend

A web UI for chatting with the agent via the Aegra API (React 19 + Vite + Tailwind CSS):

# Install frontend dependencies (if not done during setup)
cd frontend && npm install && cd ..

# Start the UI (auto-starts API + PostgreSQL if needed)
uv run sandbox-agent ui

The frontend runs at http://localhost:5173 (Vite dev server with API proxy to :8000). Features:

Thread management (create, resume, delete conversations) via sidebar
Streaming responses with expandable tool blocks (syntax-highlighted per runtime)
File upload and download support
Thinking block visualization
Settings dialog (model, provider, API key, base URL, vision toggle)
Persistent settings via server-side API (GET/PUT /settings), with backend .env defaults as fallback

Programmatic

from sandbox_agent.sandbox import SandboxManager

manager = SandboxManager()

info = manager.create_session(
    runtime="python",
    dependencies={"pandas": "2.2.3", "matplotlib": ""},
)
sid = info.session_id

r1 = manager.execute_code(sid, """
import pandas as pd
df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
print(df.describe())
""")
print(r1.stdout)

# Variables persist between calls
r2 = manager.execute_code(sid, "df.shape")
print(r2.result)

# Export files from the sandbox (registers for download, no host copy)
manager.execute_code(sid, "df.to_csv('/workspace/output.csv', index=False)")
export = manager.export_files(sid, [{"source": "output.csv"}])
print(export.files[0].session_id, export.files[0].path)

manager.stop_session(sid)
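
If an exception is raised between create_session and stop_session in the example above, the container keeps running until garbage collection or process exit. One way to guarantee cleanup is a context manager. This is a sketch: `sandbox_session` is a hypothetical wrapper that relies only on the `create_session`, `stop_session`, and `session_id` names shown above, and `RecordingManager` is a stand-in so the snippet runs without Docker.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def sandbox_session(manager, **create_kwargs):
    """Create a session, yield its ID, and always stop it afterwards."""
    info = manager.create_session(**create_kwargs)
    try:
        yield info.session_id
    finally:
        manager.stop_session(info.session_id)


class RecordingManager:
    """Stand-in for SandboxManager that only records calls."""

    def __init__(self):
        self.calls = []

    def create_session(self, **kwargs):
        self.calls.append(("create", kwargs))
        return SimpleNamespace(session_id="demo-session")

    def stop_session(self, session_id):
        self.calls.append(("stop", session_id))


manager = RecordingManager()
try:
    with sandbox_session(manager, runtime="python") as sid:
        raise RuntimeError("simulated failure mid-session")
except RuntimeError:
    pass

print(manager.calls[-1])  # ('stop', 'demo-session')
```

With the real manager, the same wrapper would be used as `with sandbox_session(SandboxManager(), runtime="python") as sid:`.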

Exporting Files

export_files registers files for download and cross-session import (no host copy). Files become available via the API (GET /threads/{thread_id}/files/download?session_id=...&path=...) and for import_files in other sessions:

# Export a single file
result = manager.export_files(sid, [{"source": "report.pdf"}])

# Export an entire directory
result = manager.export_files(sid, [{"source": "results/"}])

# Export multiple files at once
result = manager.export_files(sid, [
    {"source": "data.csv"},
    {"source": "chart.png"},
    {"source": "/workspace/logs/"},
])

for f in result.files:
    print(f"{f.session_id}:{f.path} ({'OK' if f.success else f.error})")

Cross-Session File Transfer

Use export_files + import_files to move files between sessions (even across different runtimes):

# Session A (Python): produce data
sid_a = manager.create_session(runtime="python", dependencies={"pandas": ""}).session_id
manager.execute_code(sid_a, """
import pandas as pd
df = pd.DataFrame({'x': [1,2,3], 'y': [4,5,6]})
df.to_csv('/workspace/data.csv', index=False)
""")
export = manager.export_files(sid_a, [{"source": "data.csv"}])
path = export.files[0].path  # /workspace/data.csv

# Session B (R): consume the same data
sid_b = manager.create_session(runtime="r", dependencies={"readr": ""}).session_id
manager.import_files(sid_b, [{"session_id": sid_a, "path": path, "destination": "data.csv"}])
manager.execute_code(sid_b, 'df <- readr::read_csv("/workspace/data.csv"); summary(df)')

Importing Files

import_files copies files into the sandbox from the host or from another session:

# Import from host
result = manager.import_files(sid, [
    {"source": "/home/user/data.csv", "destination": "data.csv"},
    {"source": "/home/user/project/", "destination": "project/"},
])

# Import from another session (file must have been exported first)
result = manager.import_files(sid, [
    {"session_id": "abc123", "path": "/workspace/out.csv", "destination": "out.csv"},
])

Other runtimes work the same way — pass runtime="node" or runtime="r" to create_session.
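
Host-path imports can be restricted with the IMPORT_ALLOWED_DIRS setting listed under Configuration (comma-separated, empty means all). The sketch below shows one plausible way such an allowlist check can work; it is not the project's implementation, and `is_import_allowed` is a hypothetical name. Paths are resolved first so that `..` segments and symlinks cannot escape an allowed directory.

```python
import os


def is_import_allowed(host_path: str, allowed_dirs: str) -> bool:
    """Check host_path against a comma-separated allowlist (empty = allow all)."""
    roots = [d.strip() for d in allowed_dirs.split(",") if d.strip()]
    if not roots:
        return True
    real_path = os.path.realpath(host_path)
    for root in roots:
        real_root = os.path.realpath(root)
        # commonpath compares whole path components, so /srv/database
        # does not count as being inside /srv/data.
        if os.path.commonpath([real_path, real_root]) == real_root:
            return True
    return False


print(is_import_allowed("/srv/data/in.csv", "/srv/data"))
print(is_import_allowed("/srv/data/../secret.txt", "/srv/data"))
```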

Async Code

Node.js — if the last expression returns a Promise, the kernel awaits it before collecting output. Top-level await is also supported (falls back to an async IIFE wrapper when needed).

const axios = require('axios');
async function fetchData() {
    const resp = await axios.get('https://api.example.com/data');
    console.log(resp.data);
}
fetchData(); // Promise is awaited automatically

Python — IPython's autoawait handles top-level await. If a cell returns an unawaited coroutine, the kernel detects it and runs it with asyncio.run().

import aiohttp

async def fetch_data():
    async with aiohttp.ClientSession() as session:
        resp = await session.get('https://api.example.com/data')
        print(await resp.text())

fetch_data()  # coroutine is detected and executed automatically
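
The coroutine detection described for Python can be illustrated outside the kernel. This is a simplified sketch, not the kernel's code: `resolve_result` is a hypothetical name, and it assumes no event loop is already running in the calling thread, since `asyncio.run()` raises a RuntimeError otherwise.

```python
import asyncio
import inspect


def resolve_result(value):
    """Run the value to completion if it is an unawaited coroutine."""
    if inspect.iscoroutine(value):
        return asyncio.run(value)
    return value


async def fetch_answer():
    await asyncio.sleep(0)  # stands in for real I/O
    return 42


print(resolve_result(fetch_answer()))  # 42
print(resolve_result("plain value"))   # plain value
```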

Container Security

Each container is created with the following protections:

Protection	Setting	Effect
Memory limit	`2048m` (no swap)	OOM-kill on overflow, host unaffected
PID limit	`512`	Fork bombs are contained and killed
CPU quota	`2` cores	Prevents CPU starvation on host
Writable dirs	tmpfs (`/workspace`, `/tmp`, `/home/sandbox`)	tmpfs dirs never touch host disk
tmpfs size	`200m` per mount	Limits in-container disk usage
User	`sandbox` (UID 65532)	No root inside container
Privileges	`no-new-privileges`	Cannot escalate via setuid/setgid
Network	Configurable (enabled by default)	Can be disabled per session

When a container crashes, the agent receives a clear CONTAINER_DIED error with the reason (OOM-killed, SIGKILL, segfault, etc.) and a hint to recreate the session.
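
For readers who want to reproduce similar limits themselves, the table maps fairly directly onto keyword arguments of `containers.run()` in the Docker SDK for Python. The mapping below is an assumption about how such limits can be expressed, not a copy of the project's code; `hardened_run_kwargs` is a hypothetical helper, and `read_only` is inferred from the "tmpfs-only writable dirs" feature rather than stated in the table.

```python
def hardened_run_kwargs(network_enabled: bool = True) -> dict:
    """Build Docker SDK run() keyword arguments matching the table above."""
    memory = "2048m"
    tmpfs_opts = "size=200m"
    return {
        "mem_limit": memory,
        "memswap_limit": memory,   # equal to mem_limit, so no swap is usable
        "pids_limit": 512,
        "cpu_period": 100_000,
        "cpu_quota": 200_000,      # quota / period = 2 cores
        "read_only": True,         # assumption: root filesystem is read-only
        "tmpfs": {
            "/workspace": tmpfs_opts,
            "/tmp": tmpfs_opts,
            "/home/sandbox": tmpfs_opts,
        },
        "user": "65532",
        "security_opt": ["no-new-privileges"],
        "network_disabled": not network_enabled,
    }


kwargs = hardened_run_kwargs(network_enabled=False)
print(kwargs["cpu_quota"] / kwargs["cpu_period"])  # 2.0
```

With the `docker` package installed, these could be passed as `docker.from_env().containers.run(image, detach=True, **kwargs)`.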

Session Lifecycle

Sessions are automatically managed with garbage collection:

Behavior	Default	Setting
Idle timeout	30 min	`SESSION_IDLE_TTL_SECONDS`
Max lifetime	2 hours	`SESSION_MAX_LIFETIME_SECONDS`
GC interval	60 sec	`SESSION_GC_INTERVAL_SECONDS`
Max active threads	10	`SESSION_MAX_ACTIVE_THREADS`
Max sessions (global)	5	`CONTAINER_MAX_SESSIONS`
Max sessions per thread	3	`CONTAINER_MAX_SESSIONS_PER_THREAD`
Orphan cleanup age	5 min	`CONTAINER_ORPHAN_MIN_AGE_SECONDS`

On startup, the manager removes orphan containers older than the minimum age. On exit, all containers are stopped and removed via atexit and signal handlers (SIGTERM/SIGINT).
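
The idle timeout and max lifetime rules reduce to two timestamp comparisons per session on each GC pass. A minimal sketch under assumed names follows (`SessionRecord` and `expiry_reason` are hypothetical, and the actual collector may track more state); the constants are the defaults from the table.

```python
from dataclasses import dataclass

IDLE_TTL_SECONDS = 1800       # SESSION_IDLE_TTL_SECONDS default
MAX_LIFETIME_SECONDS = 7200   # SESSION_MAX_LIFETIME_SECONDS default


@dataclass
class SessionRecord:
    session_id: str
    created_at: float    # seconds, e.g. from time.monotonic()
    last_used_at: float


def expiry_reason(session: SessionRecord, now: float):
    """Return why a session should be collected, or None to keep it."""
    if now - session.created_at >= MAX_LIFETIME_SECONDS:
        return "max_lifetime"
    if now - session.last_used_at >= IDLE_TTL_SECONDS:
        return "idle"
    return None


session = SessionRecord("s1", created_at=0.0, last_used_at=100.0)
for now in (1000.0, 1900.0, 7200.0):
    print(now, expiry_reason(session, now))
```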

Configuration

All settings can be overridden via environment variables or .env. The values below are the defaults from settings.py unless noted otherwise:

# ── LLM (provider-agnostic) ──
CHAT_MODEL=gpt-4o                    # Model name
CHAT_MODEL_PROVIDER=openai           # Provider: openai, anthropic, google_genai, ollama
CHAT_MODEL_API_KEY=sk-...            # API key (required)
CHAT_MODEL_BASE_URL=                 # Custom API base URL (optional)
CHAT_MODEL_SUPPORTS_VISION=          # Override vision detection (true/false, empty = auto)

# ── Container Limits ──
CONTAINER_MEMORY_LIMIT=2048m         # Docker memory limit (no swap)
CONTAINER_CPU_QUOTA=200000           # CPU quota (100000 = 1 core)
CONTAINER_PIDS_LIMIT=512             # Max PIDs per container
CONTAINER_TMPFS_SIZE=200m            # tmpfs size for writable dirs
CONTAINER_EXECUTION_TIMEOUT_SECONDS=30  # Default code execution timeout
CONTAINER_MAX_SESSIONS=5             # Max concurrent sessions (global)
CONTAINER_MAX_SESSIONS_PER_THREAD=3  # Max sessions per conversation
CONTAINER_EXECUTE_AS_ROOT=False      # Run terminal commands as root
CONTAINER_NETWORK_ENABLED=True       # Enable container networking (can be disabled per session)
CONTAINER_ORPHAN_MIN_AGE_SECONDS=300 # Min age before orphan cleanup (5 min)

# ── Session Lifecycle / GC ──
SESSION_IDLE_TTL_SECONDS=1800        # Idle timeout (30 min)
SESSION_MAX_LIFETIME_SECONDS=7200    # Hard lifetime cap (2 hours)
SESSION_GC_INTERVAL_SECONDS=60       # GC check interval
SESSION_MAX_ACTIVE_THREADS=10        # Max active threads before eviction

# ── Output Truncation (characters) ──
MAX_STDOUT_CHARS=50000
MAX_STDERR_CHARS=120000
MAX_RESULT_CHARS=30000
MAX_TRACEBACK_CHARS=8000

# ── Encryption ──
ENCRYPTION_KEY=                      # Fernet key for settings encryption (optional)

# ── Storage ──
STORAGE_DIR=./storage                # Base dir for uploads
IMPORT_ALLOWED_DIRS=                 # Comma-separated host dirs allowed for import (empty = all)

# ── API ──
API_BASE_URL=http://127.0.0.1:8000   # API URL (for export download URLs)

# ── Agent ──
MAX_ITERATIONS=25                    # Max LangGraph iterations (recursion limit)

# ── PostgreSQL (checkpointer + Aegra): all required; the values below are examples, not defaults ──
POSTGRES_USER=sandbox_agent
POSTGRES_PASSWORD=sandbox_agent_secret
POSTGRES_DB=sandbox_agent
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
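
Two of the settings above need a little interpretation: CONTAINER_CPU_QUOTA is expressed in units where 100000 equals one core, and CHAT_MODEL_SUPPORTS_VISION is tri-state (true, false, or empty for auto-detection). The sketch below shows how such values can be parsed; the function names are hypothetical and the project's settings.py may handle this differently.

```python
import os


def parse_optional_bool(raw: str):
    """Map 'true'/'false' to a bool and an empty value to None (auto)."""
    value = raw.strip().lower()
    if value == "":
        return None
    if value in ("true", "false"):
        return value == "true"
    raise ValueError(f"expected true, false, or empty, got {raw!r}")


def cpu_cores(quota: int, period: int = 100_000) -> float:
    """Convert a CPU quota to a core count (100000 = 1 core)."""
    return quota / period


def load_settings(env=os.environ) -> dict:
    """Read a few settings, falling back to the documented defaults."""
    return {
        "cpu_cores": cpu_cores(int(env.get("CONTAINER_CPU_QUOTA", "200000"))),
        "supports_vision": parse_optional_bool(
            env.get("CHAT_MODEL_SUPPORTS_VISION", "")
        ),
        "idle_ttl_minutes": int(env.get("SESSION_IDLE_TTL_SECONDS", "1800")) / 60,
    }


print(load_settings({}))
```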

Runtimes

Runtime	Base Image	Kernel	IPC	Pre-installed
Python	`python:3.12-slim`	IPython shell	UNIX socket	IPython + system libs
Node.js	`node:22-slim`	`vm.createContext`	UNIX socket	Bare runtime
R	`rocker/r-ver:4`	Dedicated R env	TCP `:8765`	jsonlite, base64enc, tidyverse, data.table, readxl, haven, httr2, DBI, RSQLite, rmarkdown, knitr, devtools, glmnet, randomForest

The R container uses a compiled C client binary for IPC, while Python and Node.js use native clients.

Architecture

flowchart TB
    CLI["CLI · Rich REPL"]
    MCP["MCP Server · FastMCP (stdio)"]
    UI["React · Frontend"]

    CLI --> API["Aegra REST API<br/>(LangGraph Platform)"]
    UI --> API
    API --> Agent["LangGraph ReAct Agent"]
    Agent --> Tools["LangChain Tools"]
    MCP --> Core["Core Tool Functions"]

    Tools --> Core
    Core --> SM["SandboxManager<br/>Docker SDK"]

    SM -->|"docker exec -i + JSON pipe"| Docker

    subgraph Docker ["Docker Containers<br/>isolated, hardened"]
        direction LR
        PY["Python<br/>IPython · UNIX socket"]
        JS["Node.js<br/>vm.createContext · UNIX socket"]
        R["R<br/>R env · TCP :8765"]
    end

    subgraph Storage ["Persistence"]
        PG["PostgreSQL<br/>checkpoints, exports"]
    end

    API --> PG
    SM --> PG

Inside each container, a persistent kernel (PID 1) holds execution state, and an ephemeral client connects to it via UNIX socket (Python/Node.js) or TCP (R) for each docker exec call:

flowchart TB
    SM["SandboxManager"] -->|"docker exec -i"| Client["Client (ephemeral)"]

    subgraph container ["Container"]
        Client -->|"UNIX socket / TCP"| Kernel["Kernel (PID 1, persistent)"]
        Kernel --- State["State<br/>variables, imports, data"]
    end
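
The persistent-kernel idea can be illustrated in a few lines of Python: one long-lived object owns the namespace, and each request arrives as JSON, runs against that namespace, and returns JSON. This is a toy model of the pattern only. `MiniKernel` and the request and response fields are hypothetical, there is no socket transport, and it is not sandboxed in any way.

```python
import ast
import json


class MiniKernel:
    """Toy kernel: keeps one namespace alive across JSON requests."""

    def __init__(self):
        self.namespace = {}

    def handle(self, request_json: str) -> str:
        request = json.loads(request_json)
        tree = ast.parse(request["code"])
        last_expr = None
        # Like a notebook cell: if the last statement is an expression,
        # evaluate it separately so its value can be returned.
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            last_expr = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<cell>", "exec"), self.namespace)
        result = None
        if last_expr is not None:
            result = eval(compile(last_expr, "<cell>", "eval"), self.namespace)
        return json.dumps({"result": None if result is None else repr(result)})


kernel = MiniKernel()
kernel.handle(json.dumps({"code": "x = 2"}))          # state is stored
reply = kernel.handle(json.dumps({"code": "x * 21"}))  # state is reused
print(reply)  # {"result": "42"}
```

In the architecture above, the ephemeral client would play the role of the caller here, forwarding each request to the kernel process over the UNIX socket or TCP connection.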

Testing

# Unit tests (no Docker required)
uv run pytest tests/test_cli.py tests/test_http_app.py -v

# Integration tests (requires Docker)
uv run pytest tests/test_manager.py tests/test_tools.py tests/test_export_files.py tests/test_mcp.py -v

# LangGraph debug trace (requires Docker + LLM API key)
uv run pytest tests/test_langgraph_debug.py -v -s

# API integration tests (requires Docker + running API: uv run sandbox-agent api dev)
uv run pytest tests/test_api.py -v -s

# Full suite
uv run pytest tests/ -v

Production Deployment

A production Dockerfile and docker-compose.yml are included:

# Start PostgreSQL + API
docker compose up -d

# Or build and run manually
docker build -t sandbox-agent-api .
docker run -p 8000:8000 --env-file .env sandbox-agent-api

The production image runs aegra serve as a non-root app user.

License

MIT — Eduardo Ramon Resser
