MCP Hub

ShallowCodeResearch_code_runner_wrapper

Execute code in a managed sandbox environment with async execution and warm pool support. Provides user-friendly error handling for code execution results.

Instructions

Wrapper for CodeRunnerAgent that uses async execution with a warm pool. Ensures a sandbox is spawned if not already present, waits for readiness, and then executes the code. Provides user-friendly error messages.

Returns: The execution result or a user-friendly error message.

Input Schema

Name	Required	Description	Default
`code_or_obj`	No	The code string or object to be executed
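
A minimal sketch of how an MCP client might pass `code_or_obj` to this tool, assuming the standard MCP `tools/call` JSON-RPC request shape. The helper name `build_tool_call`, the request id, and the sample code string are illustrative and do not come from the project.

```python
import json


def build_tool_call(code: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the code runner tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            # Tool name as shown in this listing.
            "name": "ShallowCodeResearch_code_runner_wrapper",
            # The only documented parameter; the value here is sample source text.
            "arguments": {"code_or_obj": code},
        },
    }


request = build_tool_call("print(sum(range(10)))")
encoded = json.dumps(request)
```

Because the request travels as JSON, a remote client can only send source text; the compiled code object variant is reachable only from Python code calling the agent directly.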

Implementation Reference

app.py:789-829 (handler)

The main handler function for the code runner tool. It wraps the CodeRunnerAgent's async execution, ensures the sandbox pool is ready, handles warmup delays, and provides user-friendly error messages.

def code_runner_wrapper(code_or_obj) -> str:
    """
    Wrapper for CodeRunnerAgent that uses async execution with warm pool.

    Ensures a sandbox is spawned if not already present, waits for readiness,
    and then executes the code. Provides user-friendly error messages.

    Args:
        code_or_obj: The code string or object to be executed

    Returns:
        str: The execution result or user-friendly error message
    """
    try:
        import asyncio

        async def ensure_and_run():
            # Ensure the sandbox pool is initialized and ready
            await code_runner._ensure_pool_initialized()
            # Wait for at least one sandbox to be available
            pool_status = await get_sandbox_pool_status()
            user_message = pool_status.get("user_message", "")
            if pool_status.get("status") == "warming_up":
                return f"{user_message}\n\nPlease try again in a moment once the environment is ready."
            # Run the code in the sandbox
            return await code_runner.run_code_async(code_or_obj)

        return asyncio.run(ensure_and_run())

    except CodeExecutionError as e:
        error_msg = str(e)
        if "Failed to get sandbox" in error_msg or "timeout" in error_msg.lower():
            return (
                "🔄 The code execution environment is still starting up. Please wait a moment and try again.\n\n"
                "This is normal for the first execution after startup (can take 1-2 minutes)."
            )
        return error_msg
    except Exception as e:
        logger.error(f"Code runner wrapper error: {e}")
        return f"Error: {str(e)}"
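
The warm-up check in the handler can be tried in isolation. The following is a sketch only: `FakePool` and `run_with_warmup_check` are hypothetical stand-ins for the real sandbox pool and wrapper, and the status dictionary keys mirror those read in the excerpt above.

```python
import asyncio


class FakePool:
    """Hypothetical stand-in for the real sandbox pool; not part of the project."""

    def __init__(self, ready: bool):
        self.ready = ready

    async def status(self) -> dict:
        if self.ready:
            return {"status": "ready", "user_message": ""}
        return {"status": "warming_up", "user_message": "Sandbox pool is warming up."}

    async def run(self, code: str) -> str:
        # A real pool would execute the code remotely; this stub just echoes it.
        return f"ran: {code}"


def run_with_warmup_check(pool: FakePool, code: str) -> str:
    async def ensure_and_run() -> str:
        pool_status = await pool.status()
        if pool_status.get("status") == "warming_up":
            message = pool_status.get("user_message", "")
            return f"{message}\n\nPlease try again in a moment once the environment is ready."
        return await pool.run(code)

    # asyncio.run starts a fresh event loop, so call this from synchronous code only.
    return asyncio.run(ensure_and_run())


cold_reply = run_with_warmup_check(FakePool(ready=False), "print(1)")
warm_reply = run_with_warmup_check(FakePool(ready=True), "print(1)")
```

One design consequence worth noting: `asyncio.run` raises `RuntimeError` when called from a thread that already has a running event loop, so a synchronous wrapper of this shape is only safe when the caller is itself synchronous.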

app.py:1028-1036 (registration)

Gradio Interface registration that exposes code_runner_wrapper as an MCP tool via api_name="agent_code_runner_service". In the MCP context the tool likely appears under the name ShallowCodeResearch_code_runner_wrapper.

with gr.Tab("Agent: Code Runner", scale=1):
    gr.Interface(
        fn=code_runner_wrapper,
        inputs=[gr.Textbox(label="Code to Execute", lines=12, placeholder="Enter Python code to run…")],
        outputs=gr.Textbox(label="Execution Output", lines=12),
        title="Code Runner Agent",
        description="Executes Python code in a secure environment and returns the output.",
        api_name="agent_code_runner_service",
    )

mcp_hub/agents/code_runner.py:198-302 (helper)

Core asynchronous code execution logic in CodeRunnerAgent. Handles code preparation, package installation, safety shims, sandbox execution via Modal, and retry logic on sandbox failure.

@track_performance(operation_name="async_code_execution")
@rate_limited("modal")
async def run_code_async(self, code_or_obj) -> str:
    """
    Execute Python code or a code object in a Modal sandbox asynchronously.
    This method supports both string code and compiled code objects, ensuring
    that the code is executed in a secure, isolated environment with safety checks.
    Args:
        code_or_obj (str or types.CodeType): The Python code to execute, either as a string
                                             or a compiled code object
    Returns:
        str: The output of the executed code, including any print statements
    """
    await self._ensure_pool_initialized()

    if isinstance(code_or_obj, str):
        payload = code_or_obj
    elif isinstance(code_or_obj, types.CodeType):
        b64 = base64.b64encode(marshal.dumps(code_or_obj)).decode()
        payload = textwrap.dedent(f"""
            import base64, marshal, types, traceback
            code = marshal.loads(base64.b64decode({b64!r}))
            try:
                exec(code, {{'__name__': '__main__'}})
            except Exception:
                traceback.print_exc()
        """).lstrip()
    else:
        raise CodeExecutionError("Input must be str or types.CodeType")

    # Analyze code for required packages
    start_analysis = time.time()
    required_packages = self._analyze_code_dependencies(payload)
    analysis_time = time.time() - start_analysis
    if analysis_time > 0.1:  # Only log if analysis takes significant time
        logger.info(f"Code dependency analysis took {analysis_time:.2f}s")

    # Add safety shim
    safe_code = self._add_safety_shim(payload)
    filename = "temp_user_code.py"
    write_cmd = f"cat > {filename} <<'EOF'\n{safe_code}\nEOF"

    try:
        async with self.sandbox_pool.get_sandbox() as sb:
            try:
                # Install additional packages if needed
                if required_packages:
                    install_start = time.time()
                    await self._install_packages_in_sandbox(sb, required_packages)
                    install_time = time.time() - install_start
                    logger.info(f"Package installation took {install_time:.2f}s")

                logger.info(f"Writing code to sandbox file: {filename}")
                sb.exec("bash", "-c", write_cmd)
                logger.info(f"Executing code from file: {filename}")
                exec_start = time.time()
                proc = sb.exec("python", filename)
                exec_time = time.time() - exec_start
                logger.info(f"Code execution took {exec_time:.2f}s")

                output = ""
                if hasattr(proc, "stdout") and hasattr(proc.stdout, "read"):
                    output = proc.stdout.read()
                    if hasattr(proc, "stderr") and hasattr(proc.stderr, "read"):
                        output += proc.stderr.read()
                else:
                    output = str(proc)
                logger.info("Async code execution completed successfully (warm pool)")
                return output
            except Exception as e:
                if "finished" in str(e) or "NOT_FOUND" in str(e):
                    logger.warning(f"Sandbox died during use, terminating: {e}")
                    try:
                        result = sb.terminate()
                        if asyncio.iscoroutine(result):
                            await result
                    except Exception as term_e:
                        logger.warning(f"Failed to terminate sandbox after error: {term_e}")
                    async with self.sandbox_pool.get_sandbox() as new_sb:
                        # Re-install packages if needed for retry
                        if required_packages:
                            await self._install_packages_in_sandbox(new_sb, required_packages)
                        new_sb.exec("bash", "-c", write_cmd)
                        proc = new_sb.exec("python", filename)
                        output = ""
                        if hasattr(proc, "stdout") and hasattr(proc.stdout, "read"):
                            output = proc.stdout.read()
                            if hasattr(proc, "stderr") and hasattr(proc.stderr, "read"):
                                output += proc.stderr.read()
                        else:
                            output = str(proc)
                    logger.info("Async code execution completed successfully on retry")
                    return output
                else:
                    logger.error(f"Async code execution failed: {e}")
                    raise CodeExecutionError(f"Error executing code in Modal sandbox: {str(e)}")
    except CodeExecutionError:
        raise
    except asyncio.TimeoutError:
        logger.error("Async code execution timed out")
        raise CodeExecutionError("Code execution timed out after 30 seconds")
    except Exception as e:
        logger.error(f"Async code execution failed: {str(e)}")
        raise CodeExecutionError(f"Error executing code in Modal sandbox: {str(e)}")
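
The code-object branch above serializes the object with `marshal`, base64-encodes it, and embeds it in a small bootstrap script. The sketch below reproduces that step locally so the round trip can be observed without a sandbox. `build_payload` is a hypothetical helper name, and the bootstrap is simplified (no traceback handling).

```python
import base64
import contextlib
import io
import marshal
import textwrap
import types


def build_payload(code_or_obj) -> str:
    """Return Python source that reproduces the given code when run."""
    if isinstance(code_or_obj, str):
        return code_or_obj
    if isinstance(code_or_obj, types.CodeType):
        b64 = base64.b64encode(marshal.dumps(code_or_obj)).decode()
        return textwrap.dedent(f"""
            import base64, marshal
            code = marshal.loads(base64.b64decode({b64!r}))
            exec(code, {{'__name__': '__main__'}})
        """).lstrip()
    raise TypeError("Input must be str or types.CodeType")


# Compile a tiny program, wrap it, then run the wrapper in this interpreter.
compiled = compile("print(6 * 7)", "<example>", "exec")
payload = build_payload(compiled)

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(payload, {"__name__": "__main__"})
captured = buffer.getvalue()
```

The `marshal` format is specific to the Python version, so this approach assumes the interpreter that compiles the code object and the one inside the sandbox are compatible.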

app.py:22-152 (helper)

Import and global instantiation of the CodeRunnerAgent used by the wrapper.

from mcp_hub.agents import (
    QuestionEnhancerAgent,
    WebSearchAgent,
    LLMProcessorAgent,
    CitationFormatterAgent,
    CodeGeneratorAgent,
    CodeRunnerAgent,
    OrchestratorAgent,
)

# Import advanced features with graceful fallback
ADVANCED_FEATURES_AVAILABLE = False
try:
    from mcp_hub.performance_monitoring import metrics_collector, track_performance, track_api_call
    from mcp_hub.health_monitoring import health_monitor
    ADVANCED_FEATURES_AVAILABLE = True
    logger.info("Advanced features loaded successfully")
    
except ImportError as e:
    logger.info(f"Advanced features not available: {e}")
    logger.info("Running with basic features only")
    
    # Create dummy decorators for backward compatibility
    def track_performance(operation_name: str = None):
        def decorator(func): 
            return func
        return decorator
    
    def track_api_call(service_name: str):
        def decorator(func): 
            return func
        return decorator
    
    def rate_limited(service: str = "default", timeout: float = 10.0):
        def decorator(func): 
            return func
        return decorator
    
    def circuit_protected(service: str = "default"):
        def decorator(func): 
            return func
        return decorator
    
    def cached(ttl: int = 300):
        def decorator(func): 
            return func
        return decorator

# Performance tracking wrapper
def with_performance_tracking(operation_name: str):
    """
    Add performance tracking and metrics collection to any function (sync or async).

    This decorator wraps both synchronous and asynchronous functions to collect
    execution time, success/failure metrics, and error counts. It integrates with
    the advanced monitoring system when available.

    Args:
        operation_name (str): The name of the operation to track in metrics

    Returns:
        function: A decorator function that can wrap sync or async functions
    """
    def decorator(func):
        if asyncio.iscoroutinefunction(func):
            @wraps(func)
            async def async_wrapper(*args, **kwargs):
                start_time = time.time()
                try:
                    result = await func(*args, **kwargs)
                    success = True
                    error = None
                except Exception as e:
                    success = False
                    error = str(e)
                    raise
                finally:
                    duration = time.time() - start_time
                    if ADVANCED_FEATURES_AVAILABLE:
                        metrics_collector.record_metric(f"{operation_name}_duration", duration, 
                                                        {"success": str(success), "operation": operation_name})
                        if not success:
                            metrics_collector.increment_counter(f"{operation_name}_errors", 1, 
                                                              {"operation": operation_name, "error": error})
                    logger.info(f"Operation {operation_name} completed in {duration:.2f}s (success: {success})")
                return result
            return async_wrapper
        else:
            @wraps(func)
            def wrapper(*args, **kwargs):
                start_time = time.time()
                try:
                    result = func(*args, **kwargs)
                    success = True
                    error = None
                except Exception as e:
                    success = False
                    error = str(e)
                    raise
                finally:
                    duration = time.time() - start_time
                    if ADVANCED_FEATURES_AVAILABLE:
                        metrics_collector.record_metric(f"{operation_name}_duration", duration, 
                                                        {"success": str(success), "operation": operation_name})
                        if not success:
                            metrics_collector.increment_counter(f"{operation_name}_errors", 1, 
                                                              {"operation": operation_name, "error": error})
                    logger.info(f"Operation {operation_name} completed in {duration:.2f}s (success: {success})")
                return result
            return wrapper
    return decorator


# Import all agents from the new modular structure
from mcp_hub.agents import (
    QuestionEnhancerAgent,
    WebSearchAgent,
    LLMProcessorAgent,
    CitationFormatterAgent,
    CodeGeneratorAgent,
    CodeRunnerAgent,
    OrchestratorAgent
)

# Initialize individual agents
question_enhancer = QuestionEnhancerAgent()
web_search = WebSearchAgent()
llm_processor = LLMProcessorAgent()
citation_formatter = CitationFormatterAgent()
code_generator = CodeGeneratorAgent()
code_runner = CodeRunnerAgent()
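
The graceful-fallback pattern in this excerpt (try the optional import, otherwise define a no-op decorator with the same call shape) can be sketched on its own. The module name `hypothetical_monitoring_backend` is invented for illustration and is expected not to exist, so the fallback branch runs.

```python
import functools

try:
    # Hypothetical optional module; the import is expected to fail in this sketch.
    from hypothetical_monitoring_backend import track_performance  # type: ignore
    MONITORING_AVAILABLE = True
except ImportError:
    MONITORING_AVAILABLE = False

    def track_performance(operation_name: str = None):
        """No-op replacement that accepts the same arguments as the real decorator."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                # Nothing is recorded; the call is passed straight through.
                return func(*args, **kwargs)
            return wrapper
        return decorator


@track_performance(operation_name="add")
def add(a: int, b: int) -> int:
    return a + b


total = add(2, 3)
```

Keeping the fallback signature identical to the real decorator means decorated functions need no changes whether or not the monitoring module is installed.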

Tool Definition Quality

Grade B (3.2/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It adds useful context: it ensures sandbox spawning and readiness, uses async execution with warm pool, and provides user-friendly error messages. However, it lacks details on permissions, rate limits, or what happens in edge cases like timeouts or resource constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded and efficient, using three sentences to cover purpose, process, and return value. Each sentence adds value, with no redundant information, though it could be slightly more structured for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is moderately complete. It explains the execution process and return values (result or error message), but for a code execution tool, it lacks details on sandbox environment, security implications, or output format specifics, which are important for contextual understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage, documenting the single parameter 'code_or_obj' as a string for code or object execution. The description doesn't add any semantic details beyond this, such as examples or constraints on the code format. Baseline 3 is appropriate since the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: it's a wrapper for CodeRunnerAgent that executes code using async execution with warm pool. It specifies the action (executes code) and resource (CodeRunnerAgent wrapper), though it doesn't explicitly differentiate from sibling tools like the code generator or LLM processor, which might have overlapping functions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions it's a wrapper for CodeRunnerAgent but doesn't explain when this wrapper is preferred over direct execution or other code-related tools in the sibling list, such as the code generator or LLM processor.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/CodeHalwell/gradio-mcp-agent-hack'
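
The same request can be issued from Python. This is a sketch under assumptions: the helper names are illustrative, and the endpoint is assumed to return a JSON object, which the page does not state explicitly.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_url(owner: str, repo: str) -> str:
    """Compose the directory URL for one server record."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_server_info(owner: str, repo: str, timeout: float = 10.0) -> dict:
    """GET the server record; assumes the response body is a JSON object."""
    request = urllib.request.Request(build_server_url(owner, repo), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access; call fetch_server_info to send the request.
url = build_server_url("CodeHalwell", "gradio-mcp-agent-hack")
```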

If you have feedback or need assistance with the MCP directory API, please join our Discord server.