How do I use nano-vm-mcp?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@nano-vm-mcp run a program to fetch weather data".

That's it! The server will respond to your query, and you can continue using it as needed.

nano-vm-mcp

by Ale007XD


Python

Hybrid

What nano-vm-mcp Is

nano-vm-mcp is an MCP gateway that turns the Model Context Protocol into a governance-bound execution environment. It wraps the llm-nano-vm execution kernel and exposes it to any MCP client — Claude Desktop, Claude Code, custom agents, or API callers — through stdio or SSE transport.

Most MCP servers expose stateless tools. nano-vm-mcp exposes stateful, governed, auditable workflows.

Capability	Typical MCP Server	nano-vm-mcp
Tool execution	✅	✅
Stateful workflows	❌	✅
Deterministic FSM	❌	✅
Replayable traces	❌	✅
Suspend / resume	❌	✅
LLM output enforcement	❌	✅
Capability enforcement (double gate)	❌	✅
Append-only audit trail	❌	✅
GDPR tombstoning	❌	✅
Evaluator blindness by design	❌	✅
Inter-session idempotency	❌	✅

Core invariant: the gateway does not own execution logic — the FSM kernel does.

δ(S, E) → S'

  S  — current execution state
  E  — validated event
  S' — next deterministic state
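The transition function above can be illustrated with a minimal sketch. It is not the llm-nano-vm kernel: `State`, `Event`, `delta`, and `replay` are hypothetical names chosen for this example, and the three-step program is an assumption. The point it shows is that a pure transition function makes replay a simple fold over the recorded events.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class State:
    step: int      # index of the next step to run
    status: str    # "RUNNING" or "SUCCESS" in this sketch


@dataclass(frozen=True)
class Event:
    kind: str      # "step_done" is the only validated event kind here


def delta(state: State, event: Event, total_steps: int = 3) -> State:
    """Pure transition: the same (state, event) pair always gives the same next state."""
    if state.status != "RUNNING" or event.kind != "step_done":
        return state  # finished runs and unknown events leave the state unchanged
    nxt = state.step + 1
    return State(nxt, "SUCCESS" if nxt >= total_steps else "RUNNING")


def replay(events, start=State(0, "RUNNING")):
    """Fold a recorded event log through delta to rebuild the final state."""
    return reduce(delta, events, start)
```

Because `delta` has no side effects, replaying the same event log twice yields equal states, which is the property replayable traces rely on.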


Architecture

MCP Client / Claude Code
        ↓
  nano-vm-mcp (Gateway)    ← decides how execution is allowed to proceed
      → GovernedRunProgramHandler   ← PolicySnapshot, idempotency_key, CapabilityRef
          → llm-nano-vm (Kernel)    ← deterministic FSM, ASTEngine, ProjectionLayer
      → GovernanceEnvelope store    ← SQLite WAL, append-only audit log
      → idempotency_keys store      ← idempotent re-execution across restarts
        ↓
  deterministic FSM        ← guarantees correctness
        ↓
  GovernanceEnvelope       ← proves it happened

Strict isolation: the gateway never touches execution logic. The kernel never touches persistence or policy. Each layer has a single responsibility and cannot cross the boundary.

Install

pip install nano-vm-mcp
pip install 'nano-vm-mcp[litellm]'   # for llm steps

MCP Tools

Tool	Description
`run_program`	Execute a `Program` dict → returns `trace_id`, status, step count, cost
`get_trace`	Retrieve full `Trace` JSON by `trace_id`
`list_programs`	List saved programs (`id`, `name`, `created_at`)
`get_program`	Retrieve saved `Program` JSON by `program_id`
`delete_program`	Delete a program and all its traces

Quick Start

stdio — Claude Desktop / local MCP client

nano-vm-mcp --transport stdio

claude_desktop_config.json or .mcp.json:

{
  "mcpServers": {
    "nano-vm-mcp": {
      "command": "nano-vm-mcp",
      "args": ["--transport", "stdio"]
    }
  }
}

SSE — VPS / remote clients

NANO_VM_MCP_API_KEY=your-secret-token nano-vm-mcp --transport sse --port 8080

MCP client URL: http://<host>:8080/sse
Auth header: Authorization: Bearer your-secret-token

Docker Compose

services:
  nano-vm-mcp:
    image: ghcr.io/ale007xd/nano-vm-mcp:latest
    ports:
      - "8080:8080"
    volumes:
      - ./data:/data
    environment:
      NANO_VM_MCP_DB: /data/nano_vm_mcp.db
      NANO_VM_MCP_PORT: 8080
      NANO_VM_MCP_API_KEY: your-secret-token
    command: ["nano-vm-mcp", "--transport", "sse"]

Claude Code Dynamic Workflows

Claude Code decides what to do. nano-vm-mcp decides how execution is allowed to proceed.

Claude Code Dynamic Workflows give you parallel subagents and dynamic orchestration. They don't give you deterministic step execution, replayable audit trails per step, or idempotent re-execution across restarts. nano-vm-mcp closes exactly that gap.

Claude Code          ← decides what to do
    ↓
nano-vm-mcp          ← enforces how execution proceeds
    ↓
deterministic FSM    ← guarantees correctness
    ↓
GovernanceEnvelope   ← proves it happened

Capability	Claude Code Dynamic Workflows	+ nano-vm-mcp
Parallel subagents	✅	✅
Dynamic orchestration	✅	✅
Deterministic step execution	❌	✅
Replayable audit trail per step	❌	✅
LLM output enforcement	❌	✅
Inter-session idempotency	❌	✅
GDPR tombstoning	❌	✅
Evaluator blindness	❌	✅

Use this combination when a workflow subagent must execute a governed process — payment pipeline, approval chain, compliance check — where correctness and auditability matter beyond the LLM layer.

Example: governed payment step inside a Claude Code workflow

# Claude Code subagent calls this tool directly
result = await session.call_tool(
    "run_program",
    {
        "program": {
            "name": "payment_pipeline",
            "steps": [
                {"id": "validate",  "type": "tool", "tool": "validate_amount"},
                {"id": "reserve",   "type": "tool", "tool": "reserve_funds"},
                {"id": "capture",   "type": "tool", "tool": "capture_payment"},
                {"id": "receipt",   "type": "tool", "tool": "send_receipt",
                 "is_terminal": True},
            ]
        },
        "idempotency_key": "order-abc-123",
    }
)
# Returns: trace_id, status, step count, cost
# Every step: GovernanceEnvelope in SQLite — tamper-evident, append-only

The subagent cannot skip steps, reorder execution, or bypass capability checks — regardless of what the LLM decides at the orchestration layer.

Retrieve the audit trail

trace = await session.call_tool("get_trace", {"trace_id": result["trace_id"]})
# Returns: per-step status, duration_ms, usage, state_snapshots

Traces persist across sessions in SQLite (WAL mode). Each trace_id is a UUID4 that stays stable for the lifetime of the trace, so it can be propagated as an OTel correlation identifier.

Idempotency — Inter-session Re-execution Safety

Pass idempotency_key to run_program to guarantee that a program completes successfully at most once per key, even across process restarts:

# First call — executes normally, result cached
result = await session.call_tool("run_program", {
    "program": program,
    "idempotency_key": "payment-order-xyz-001",
})

# Second call with same key — returns cached result immediately, no re-execution
result = await session.call_tool("run_program", {
    "program": program,
    "idempotency_key": "payment-order-xyz-001",
})

Crash recovery: if the process crashes after program start but before completion (status=pending), the next call with the same key overwrites the pending entry and re-executes. Once the result is written as status=success, it is immutable for that key.

Note on "exactly-once": the FSM guarantees idempotent re-execution — the same key never triggers a second run after success. External side effects (payment capture, webhook delivery) are only as idempotent as the tools you register. This is the same contract Temporal and Cadence operate under.
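The pending/success behaviour described above can be sketched with the standard library's `sqlite3`. This is an illustration under stated assumptions, not the gateway's implementation: the table schema, `open_store`, and `run_once` are hypothetical; only the store name `idempotency_keys` and the two status values come from this document.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small key store. The schema is an assumption for this sketch."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS idempotency_keys ("
        "key TEXT PRIMARY KEY, status TEXT NOT NULL, result TEXT)"
    )
    return db


def run_once(db, key, execute):
    """Run `execute` unless `key` already has a stored success."""
    row = db.execute(
        "SELECT status, result FROM idempotency_keys WHERE key = ?", (key,)
    ).fetchone()
    if row and row[0] == "success":
        return json.loads(row[1])  # cached result, no re-execution
    # No entry yet, or a stale 'pending' entry left by a crash: (re)claim the key.
    db.execute(
        "INSERT OR REPLACE INTO idempotency_keys VALUES (?, 'pending', NULL)", (key,)
    )
    db.commit()
    result = execute()
    db.execute(
        "UPDATE idempotency_keys SET status = 'success', result = ? WHERE key = ?",
        (json.dumps(result), key),
    )
    db.commit()
    return result
```

Note that `execute` can still run twice if a crash happens between the tool's side effect and the final update, which is why the tools themselves need to be idempotent.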

Governance Layer

GovernanceEnvelope

Each successful execution step produces an immutable GovernanceEnvelope stored in the governance_envelopes table. Envelopes are written only when a step finishes with error=None, so the table is a tamper-evident, append-only audit trail of successful transitions only.

Field	Type	Description
`execution_id`	`str`	Session / trace identifier
`step_id`	`int`	Step index within the execution
`policy_hash`	`str`	SHA-256 of the active `PolicySnapshot`
`canonical_snapshot_hash`	`str`	Merkle/delta hash of `CanonicalState` at this step
`payload`	`dict | list`	Projected (sanitized) step output
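To make "tamper-evident" concrete, here is a generic hash-chain sketch. It is not the Merkle/delta scheme nano-vm-mcp uses for canonical_snapshot_hash, whose details are not given here; `chain_hash`, `build_chain`, `verify_chain`, and the all-zero genesis value are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain


def chain_hash(prev_hash: str, payload) -> str:
    """Hash the previous link together with a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads, genesis=GENESIS):
    """Return one hash per payload, each depending on every earlier payload."""
    hashes, prev = [], genesis
    for payload in payloads:
        prev = chain_hash(prev, payload)
        hashes.append(prev)
    return hashes


def verify_chain(payloads, hashes, genesis=GENESIS):
    """Recompute the chain and compare it with the stored hashes."""
    return build_chain(payloads, genesis) == hashes
```

Changing any stored payload changes its hash and every hash after it, so a later edit to an earlier step is detectable by recomputation.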

PolicySnapshot and CapabilityRef

PolicySnapshot is a frozen Pydantic model created once per session. It carries the set of allowed tool names and is hashed (SHA-256) before execution starts. Every GovernanceEnvelope records this hash — post-hoc modification of the policy is detectable.

from nano_vm.contracts import PolicySnapshot, CapabilityRef

policy = PolicySnapshot(
    tool_capabilities={"reserve_funds", "capture_payment", "send_receipt"},
)
# policy.hash() → SHA-256 hex, stored in every GovernanceEnvelope.policy_hash
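One way a set of allowed tool names could be hashed reproducibly is shown below. This is a sketch, not the PolicySnapshot implementation: the function name `policy_hash` and the canonical JSON encoding are assumptions. The idea it illustrates is that the input must be put in a fixed order before hashing, because Python sets have no stable iteration order.

```python
import hashlib
import json


def policy_hash(tool_capabilities) -> str:
    """SHA-256 over a canonical (sorted, compact) JSON encoding of the tool names."""
    canonical = json.dumps(sorted(tool_capabilities), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this encoding, two policies hash equal exactly when they allow the same tools, so adding or removing a tool after the fact changes the hash.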

CapabilityRef wraps sensitive values as opaque tokens (vault://secret/<id>) rather than storing raw plaintext in CanonicalState. The token is resolved JIT during tool execution and never written to the audit log.

ref = CapabilityRef(ref_id="card-4242", value="4242424242424242")
# Stored in state as: vault://secret/card-4242
# GovernanceEnvelope.payload contains the token, not the card number

On a GDPR erasure event:

- The target ref is tombstoned (is_tombstone=True).
- All subsequent projections return [REDACTED_TOMBSTONE].
- The canonical_snapshot_hash chain remains valid, so forensic auditability is preserved.
- The secret is permanently gone.

vm.erase(ref_id="card-4242")
# Hash chain remains intact — the erasure itself is auditable
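The token-and-tombstone behaviour can be sketched as a small in-memory store. The `Vault` class and its methods are hypothetical and stand in for whatever CapabilityRef uses internally; only the `vault://secret/<id>` token format and the `[REDACTED_TOMBSTONE]` marker come from this document.

```python
TOMBSTONE = "[REDACTED_TOMBSTONE]"
PREFIX = "vault://secret/"


class Vault:
    """Hypothetical in-memory secret store keyed by ref_id."""

    def __init__(self):
        self._secrets = {}
        self._tombstones = set()

    def put(self, ref_id, value):
        self._secrets[ref_id] = value
        return PREFIX + ref_id  # opaque token kept in state instead of the value

    def resolve(self, token):
        """Just-in-time lookup, intended to be called only while a tool runs."""
        ref_id = token[len(PREFIX):]
        if ref_id in self._tombstones:
            raise KeyError(f"{ref_id} was erased")
        return self._secrets[ref_id]

    def erase(self, ref_id):
        self._secrets.pop(ref_id, None)  # the secret itself is dropped
        self._tombstones.add(ref_id)

    def project(self, state):
        """Copy state, replacing tokens of erased refs with the tombstone marker."""
        out = {}
        for key, value in state.items():
            is_token = isinstance(value, str) and value.startswith(PREFIX)
            erased = is_token and value[len(PREFIX):] in self._tombstones
            out[key] = TOMBSTONE if erased else value
        return out
```

In this sketch the stored state keeps the token after erasure and only projections change, which is one way a hash computed over stored state can stay valid while the secret is gone.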

Execution Traces

Every step writes a TRACE projection to the execution_traces table — a sanitized snapshot of state with sensitive values replaced by CapabilityRef tokens, protected by a unique index on (execution_id, step_index):

steps = store.get_trace_steps(execution_id="exec-abc-123")
# [
#   {"step_index": 0, "step_id": "validate", "projected_json": "...", "canonical_hash": "..."},
#   {"step_index": 1, "step_id": "reserve",  "projected_json": "...", "canonical_hash": "..."},
# ]
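The effect of a unique index on (execution_id, step_index) can be demonstrated with `sqlite3`. The column list and the `write_step` helper are assumptions for this sketch, and the real execution_traces schema may differ.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE execution_traces ("
    "execution_id TEXT, step_index INTEGER, step_id TEXT, projected_json TEXT)"
)
db.execute(
    "CREATE UNIQUE INDEX ux_trace_step ON execution_traces (execution_id, step_index)"
)


def write_step(execution_id, step_index, step_id, projected_json):
    """Insert one trace row; return False if that step was already recorded."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO execution_traces VALUES (?, ?, ?, ?)",
                (execution_id, step_index, step_id, projected_json),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the unique index rejected a duplicate (execution_id, step_index)
```

The database itself refuses a second row for the same step, so a retried write cannot silently duplicate or overwrite trace history.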

Evaluator Blindness by Design

Benchmark research across 10 frontier models found that every model changes behavior when it detects it's being monitored. The best model scored 84% on "acts the same whether watched or not."

The ProjectionLayer gives the LLM only a target=LLM projection of state. Governance metadata — GovernanceEnvelope, canonical_hash, policy_hash, retry counters — never reaches the prompt. The model cannot observe its own audit trail.

Evaluator blindness is structural, not configured. It is a consequence of the strict isolation between the kernel and gateway layers.
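A projection of this kind can be as simple as filtering keys before the prompt is built. The sketch below is illustrative only: `project_for_llm` and the key names in `GOVERNANCE_KEYS` are assumptions, not the ProjectionLayer API.

```python
# Hypothetical names for governance metadata that must not reach the prompt.
GOVERNANCE_KEYS = frozenset(
    {"governance_envelope", "canonical_hash", "policy_hash", "retry_count"}
)


def project_for_llm(state: dict) -> dict:
    """Return a copy of state with governance metadata removed."""
    return {k: v for k, v in state.items() if k not in GOVERNANCE_KEYS}
```

The prompt builder only ever receives the projected copy, so the model has no field from which to infer that it is being audited.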

Determinism and LLM Steps

State determinism — the FSM kernel guarantees execution order, no step skipping, and reproducible trace structure regardless of LLM output. The graph of transitions is fixed at program definition time. This is unconditional.

Semantic determinism — the text produced by an LLM step may differ across runs even at temperature=0.0. nano-vm does not guarantee semantic determinism and does not try to.

These are orthogonal concerns. The runtime enforces state determinism; you control semantic determinism through prompt engineering and allowed_outputs.

LLM output enforcement at the runtime level

allowed_outputs (v0.8.0) validates the model's raw output against an explicit enum before it enters the FSM context. This isn't a prompt hint — it's a runtime gate.

{
    "id": "classify",
    "type": "llm",
    "prompt": "Is this a valid refund request? Reply ONLY with: yes or no",
    "output_key": "decision",
    "allowed_outputs": ["yes", "no"],   # runtime enforcement — not a prompt hint
    "on_error": "skip",                 # output → "yes" (first element) on mismatch
}
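The gate described by this step can be sketched as a small validation function. This is an assumption-laden illustration rather than the runtime's code: `enforce_allowed_outputs` is a hypothetical name, stripping whitespace before comparison is an assumption, and the "fail" mode raising `ValueError` is a stand-in for whatever error the runtime raises.

```python
def enforce_allowed_outputs(raw: str, allowed: list, on_error: str = "fail") -> str:
    """Accept the model output only if it is one of the allowed values."""
    value = raw.strip()  # assumption: surrounding whitespace is ignored
    if value in allowed:
        return value
    if on_error == "skip":
        # Mirrors the step comment above: fall back to the first allowed value.
        return allowed[0]
    raise ValueError(f"output {value!r} is not one of {allowed}")
```

Whatever the model writes, only a member of `allowed` can enter the FSM context, so downstream conditions never see free-form text.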

Security

ASTEngine — sandboxed condition evaluation

Conditions are evaluated by the ASTEngine — a deterministic sandboxed interpreter with no access to Python builtins, attribute access, or callable invocation. eval() is not used anywhere in the production execution path.

Rules for safe use:

Condition logic must be authored by you, not generated from untrusted input at runtime.
LLM output may appear as a value being tested ('yes' in '$decision'), never as the condition expression itself.
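The general technique of evaluating conditions without `eval()` can be sketched with Python's `ast` module and a whitelist of node types. This is not the ASTEngine: the `evaluate` function, the supported operators, and the convention that a string literal starting with `$` resolves to a context value are all assumptions for this example.

```python
import ast
import operator

_COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.Gt: operator.gt,
    ast.In: lambda a, b: a in b, ast.NotIn: lambda a, b: a not in b,
}


def _eval(node, ctx):
    if isinstance(node, ast.Constant):
        value = node.value
        if isinstance(value, str) and value.startswith("$"):
            return ctx[value[1:]]  # '$name' resolves to a context value
        return value
    if isinstance(node, ast.BoolOp):  # 'and' / 'or'
        values = [_eval(v, ctx) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not _eval(node.operand, ctx)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = _COMPARE.get(type(node.ops[0]))
        if op is not None:
            return op(_eval(node.left, ctx), _eval(node.comparators[0], ctx))
    # Names, calls, attribute access and everything else are rejected.
    raise ValueError(f"disallowed syntax: {type(node).__name__}")


def evaluate(expr: str, context: dict) -> bool:
    """Evaluate a boolean condition over literals and $variables only."""
    tree = ast.parse(expr, mode="eval")
    return bool(_eval(tree.body, context))
```

Because the interpreter walks the tree itself and never executes it, an expression such as `__import__('os')` is refused as a disallowed call instead of being run.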

Capability enforcement — double gate

Tool execution passes through two independent enforcement layers:

Layer	Mechanism
`GovernedToolExecutor`	Verifies tool name against `PolicySnapshot.tool_capabilities`; raises `CapabilityDeniedError` on violation
`ExecutionVM` (kernel)	Rejects any tool name not registered in the tool registry with `VMError`

Neither gate can be bypassed by LLM output.
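The two gates can be pictured as two independent membership checks in front of every tool call. The sketch below defines stand-in exception classes that reuse the names CapabilityDeniedError and VMError from the table; `execute_tool` and its parameters are hypothetical and do not reflect the real GovernedToolExecutor or ExecutionVM signatures.

```python
class CapabilityDeniedError(Exception):
    """Stand-in for the gateway-level policy violation."""


class VMError(Exception):
    """Stand-in for the kernel-level unknown-tool error."""


def execute_tool(name, args, policy_tools, registry):
    """Run a tool only if both the policy and the registry know it."""
    # Gate 1: the session policy must allow this tool name.
    if name not in policy_tools:
        raise CapabilityDeniedError(name)
    # Gate 2: the kernel must have a registered implementation.
    if name not in registry:
        raise VMError(f"unknown tool: {name}")
    return registry[name](**args)
```

Since the checks use different sources of truth, a tool that is registered but not in the policy is still refused, and so is one that is allowed by policy but was never registered.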

SSE transport and auth

Set NANO_VM_MCP_API_KEY to enable bearer token authentication (secrets.compare_digest — timing-safe). If unset, a warning is logged and all requests are allowed — suitable for localhost only.

Do not expose the SSE endpoint to the public internet without NANO_VM_MCP_API_KEY set.
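A timing-safe bearer check of the kind described above can be written with the standard library's `secrets.compare_digest`. The function name `is_authorized` and its argument names are assumptions for this sketch; the allow-all branch mirrors the documented behaviour when NANO_VM_MCP_API_KEY is unset.

```python
import secrets


def is_authorized(header_value, expected_token):
    """Check an Authorization header value against the configured token."""
    if not expected_token:
        return True  # no key configured: all requests allowed (localhost use only)
    prefix = "Bearer "
    if not header_value or not header_value.startswith(prefix):
        return False
    presented = header_value[len(prefix):]
    # compare_digest runs in time independent of where the strings differ.
    return secrets.compare_digest(presented.encode("utf-8"), expected_token.encode("utf-8"))
```

A plain `==` comparison can return earlier for tokens that differ in their first characters, which is the timing leak `compare_digest` is designed to avoid.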

Configuration

Variable	Default	Description
`NANO_VM_MCP_DB`	`nano_vm_mcp.db`	SQLite WAL database path
`NANO_VM_MCP_HOST`	`0.0.0.0`	SSE bind host
`NANO_VM_MCP_PORT`	`8080`	SSE bind port
`NANO_VM_MCP_API_KEY`	(unset)	Bearer token for SSE auth
`NANO_VM_MCP_LLM_MODEL`	(unset)	LiteLLM model string for `llm` steps

Endpoints

Path	Auth	Description
`GET /health`	none	Liveness probe — always returns `{"status": "ok"}`
`GET /sse`	bearer	SSE transport entry point
`POST /messages`	bearer	MCP message endpoint

Performance

The FSM runtime adds little overhead of its own. In real workloads the bottleneck is typically the LLM API or external I/O.

Sequential execution (single FSM instance): steps run one at a time per execution_id. This is a deliberate design choice that keeps traces deterministic and replayable.

Parallel execution across independent workflows: fan out across multiple execution_id instances. In WAL mode SQLite lets readers proceed while a write is in progress; writes themselves are still serialized, one writer at a time.
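The fan-out pattern can be sketched with `asyncio.gather`. The `run_workflow` coroutine is a hypothetical stand-in for a governed run, with `asyncio.sleep(0)` in place of real tool or LLM I/O.

```python
import asyncio


async def run_workflow(execution_id, steps):
    """Run the steps of one execution strictly in order."""
    done = []
    for step in steps:
        await asyncio.sleep(0)  # stand-in for tool or LLM I/O
        done.append(f"{execution_id}:{step}")
    return done


async def fan_out(execution_ids, steps):
    """Run independent executions concurrently; each stays sequential internally."""
    return await asyncio.gather(*(run_workflow(i, steps) for i in execution_ids))
```

Calling `asyncio.run(fan_out(["a", "b"], ["validate", "capture"]))` interleaves the two executions at the await points while each one still sees its own steps in order, and `gather` returns results in the order the ids were given.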

Benchmarks (v0.7.3, Mock adapter, QEMU/KVM · Intel Xeon E5-2697A v4 · 2 cores · Python 3.12)

Scenario	Mean TPS	p95
Refund pipeline (sequential)	2,300/s	0.66 ms
MCP store round-trip	3,000/s	0.42 ms
GovernanceEnvelope write	1,300/s	171 ms
Parallel throughput (`asyncio.gather`)	436/s	542 ms
Replay equivalence	1,300/s	1.30 ms
Long-horizon (30-step program)	30/s	3,606 ms

Observability

trace.trace_id          # UUID4 — stable for OTel propagation
trace.status            # SUCCESS | FAILED | SUSPENDED | BUDGET_EXCEEDED | STALLED
trace.final_output
trace.steps             # per-step: step_id, status, duration_ms, usage
trace.state_snapshots   # list[(step_index, sha256_hex)]

Traces are persisted to SQLite and retrievable by trace_id across sessions via get_trace.

Execution State Model

CREATED
  ↓
RUNNING ──── tool returns "PENDING" ──→ SUSPENDED
  │                                          │
  │                                    resume_with_program()
  │                                          │
  └──────────────────────────────────────────┘
  │
  ├── no more steps ──→ SUCCESS
  ├── tool error (on_error=fail) ──→ FAILED
  ├── max_steps / max_tokens exceeded ──→ BUDGET_EXCEEDED
  └── max_stalled_steps exceeded ──→ STALLED

Terminal states: SUCCESS, FAILED, BUDGET_EXCEEDED, STALLED. All are immutable.
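The diagram above can be encoded as a transition table, which makes the immutability of terminal states checkable. This is a sketch of the diagram only: `Status`, `TRANSITIONS`, and `transition` are names chosen for this example, not kernel types.

```python
from enum import Enum


class Status(Enum):
    CREATED = "CREATED"
    RUNNING = "RUNNING"
    SUSPENDED = "SUSPENDED"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"
    BUDGET_EXCEEDED = "BUDGET_EXCEEDED"
    STALLED = "STALLED"


TERMINAL = {Status.SUCCESS, Status.FAILED, Status.BUDGET_EXCEEDED, Status.STALLED}

# Allowed moves, read off the diagram; terminal states have no outgoing edges.
TRANSITIONS = {
    Status.CREATED: {Status.RUNNING},
    Status.RUNNING: {Status.SUSPENDED} | TERMINAL,
    Status.SUSPENDED: {Status.RUNNING},
}


def transition(current: Status, target: Status) -> Status:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Because terminal states have no entry in the table, any attempt to move out of SUCCESS, FAILED, BUDGET_EXCEEDED, or STALLED is rejected.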

Relationship to llm-nano-vm

Layer	Responsibility
`llm-nano-vm` (kernel)	Deterministic FSM execution, ASTEngine, ProjectionLayer, step lifecycle
`nano-vm-mcp` (gateway)	MCP transport, persistence, governance, idempotency, capability enforcement

The gateway never owns transition logic. The FSM kernel does.

The kernel is MIT-licensed, independently versioned on PyPI (llm-nano-vm), and fully documented. Either layer can be used standalone or replaced — the boundary between them is a stable Python interface.

Diagnostic Integration — Agent Debugger


debug_trace is an opt-in MCP tool that sends a completed Trace to an external Agent Debugger service for automated failure diagnosis. It does not run by default — no token, no call.

Auto-diagnostic on FAILED: when a run_program execution ends with status=FAILED, GovernedRunProgramHandler automatically forwards the trace for diagnosis if AGENT_DEBUGGER_TOKEN is set. No extra call needed from the MCP client.

export AGENT_DEBUGGER_TOKEN=your-token
export AGENT_DEBUGGER_URL=https://agent-debugger-production.up.railway.app

Variable	Default	Description
`AGENT_DEBUGGER_TOKEN`	(unset)	Enables diagnostic calls; absent = no-op
`AGENT_DEBUGGER_URL`	(unset)	Agent Debugger service endpoint

# Manual call — diagnose any stored trace on demand
diagnosis = await session.call_tool("debug_trace", {"trace_id": result["trace_id"]})
# Returns: failure classification + suggested root cause from Agent Debugger

Without AGENT_DEBUGGER_TOKEN set: the diagnostic call is silently skipped — execution is never blocked by an unavailable or unconfigured debugger.
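The opt-in, never-blocking behaviour can be sketched as an environment-gated helper. `auto_diagnose`, the `send` callback, and the dict-shaped trace are assumptions for this example; the real handler and the debugger's request format are not described here.

```python
import os


def auto_diagnose(trace, send, env=os.environ):
    """Forward a FAILED trace for diagnosis only when a token is configured."""
    token = env.get("AGENT_DEBUGGER_TOKEN")
    if not token:
        return None  # unconfigured: silently skipped
    if trace.get("status") != "FAILED":
        return None  # automatic forwarding applies to failed runs only
    try:
        return send(trace, token)
    except Exception:
        return None  # an unavailable debugger must not block execution
```

Passing `send` in as a callback keeps the network call replaceable in tests, and swallowing its errors is what keeps the diagnostic path from affecting the run's own result.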

Contact & Support

Author: @ale007xd on Telegram · @ale007xd on X

USDT (TON)

USDT (TON): UQCakyytrEGBikOi3eYMpveGHXDB1-fd6lcuQC9VvKqMrI-9

License

MIT License.
