MCP Server Code Execution Mode

run_python

Execute Python code in a persistent sandbox environment to run computations, analyze data, or build multi-tool workflows while maintaining state across executions.

Instructions

The Code Execution MCP engine. Executes Python code in a stateful, persistent rootless sandbox environment similar to a Jupyter notebook. Variables, functions, and imports are preserved across calls. Use this tool for general code execution, data analysis, or when the user asks to 'run code'. Supports loading additional MCP servers via the 'servers' array.
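The notebook-like persistence described above can be illustrated with a small local simulation. This is only a sketch of the idea, assuming a single shared namespace that survives between calls; `PersistentSession` is a hypothetical name, and nothing here reflects how the server actually implements or isolates its sandbox.

```python
class PersistentSession:
    """Toy stand-in for a stateful executor (hypothetical, not the server's code)."""

    def __init__(self):
        # One namespace shared by every call, so definitions accumulate.
        self.namespace = {}

    def run(self, code):
        # Execute the source in the shared namespace. No sandboxing is done here.
        exec(code, self.namespace)


session = PersistentSession()

# First call: an import and a variable are defined.
session.run("import math\nradius = 2.0")

# Second call: both the import and the variable are still available.
session.run("area = math.pi * radius ** 2")

print(session.namespace["area"])
```

The second call succeeds only because the first call's import and variable were kept, which is the behavior the description attributes to the tool.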

Input Schema

Name	Required	Description
`code`	Yes	Python source code to execute. Call runtime.capability_summary() inside the sandbox for this digest. Persistent Python Sandbox (state retained between tool calls). 1. DISCOVER: `runtime.discovered_servers()`, `runtime.search_tool_docs('query')`. Use `discovered_servers(detailed=True)` for descriptions. 2. CALL: `await mcp_server.tool()`. 3. PERSIST: `save_tool(func)` for functions, `save_memory(key, value)` for data. 4. MEMORY: `load_memory(key)`, `list_memories()`, `update_memory(key, fn)`. Run `print(runtime.capability_summary())` for the full manual.
`servers`	No	Optional list of MCP servers to make available as mcp_<name> proxies
`timeout`	No	Execution timeout in seconds
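As a rough illustration of how the three parameters fit together, the sketch below assembles an MCP `tools/call` request for `run_python`. The helper name `build_run_python_request`, the server name `filesystem`, and the timeout value are assumptions made for the example; only `code`, `servers`, `timeout`, and `runtime.discovered_servers()` come from the schema above.

```python
import json


def build_run_python_request(code, servers=None, timeout=None, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' message for run_python (hypothetical helper)."""
    if not isinstance(code, str) or not code.strip():
        raise ValueError("'code' is required and must be non-empty Python source")

    arguments = {"code": code}
    # Optional parameters are included only when the caller supplies them.
    if servers is not None:
        arguments["servers"] = list(servers)
    if timeout is not None:
        arguments["timeout"] = timeout

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "run_python", "arguments": arguments},
    }


request = build_run_python_request(
    code="print(runtime.discovered_servers())",
    servers=["filesystem"],  # hypothetical server name, for illustration only
    timeout=30,              # assumed value; the schema does not state a default
)
print(json.dumps(request, indent=2))
```

Leaving out `servers` and `timeout` produces a request that carries only the required `code` argument.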

Tool Definition Quality

Grade A: 4.1/5.0

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure and does so effectively. It describes key behavioral traits: stateful/persistent execution environment, sandboxed nature, preservation of variables/functions/imports across calls, and support for loading MCP servers. It doesn't mention rate limits, authentication needs, or error handling, but provides substantial operational context beyond basic functionality.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded with the core purpose in the first sentence. Each subsequent sentence adds valuable information about persistence, usage scenarios, and MCP server support. There's minimal redundancy, though the final sentence about MCP servers could be integrated more smoothly with the preceding content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (stateful code execution with persistence) and lack of annotations/output schema, the description provides substantial context about the execution environment, persistence model, and MCP server integration. It doesn't describe return values or error formats, but covers the operational model well. For a tool with this complexity and no structured behavioral annotations, it's quite comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description mentions the 'servers' array parameter but doesn't add significant semantic meaning beyond what's in the schema descriptions. It provides context about MCP server availability but doesn't enhance understanding of parameter usage beyond the comprehensive schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('executes Python code') and resources ('stateful, persistent rootless sandbox environment similar to a Jupyter notebook'). It distinguishes the tool's unique capabilities by mentioning variable persistence across calls and MCP server loading, which would differentiate it from any hypothetical siblings. The description goes beyond just restating the name to explain the execution environment.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('for general code execution, data analysis, or when the user asks to run code'), which gives clear context for its application. However, since there are no sibling tools mentioned, it cannot provide guidance on when to use alternatives. The guidance is comprehensive within the given context but lacks sibling differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Other Tools

run_python (grade A)

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/elusznik/mcp-server-code-execution-mode'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.