super-loop-mcp
Server Configuration
Describes the environment variables used to configure the server. All three are optional.
| Name | Required | Description | Default |
|---|---|---|---|
| SUPER_LOOP_HOME | No | Path to the directory where state is stored. | '<package>/.super-loop' |
| SUPER_LOOP_HOST | No | Host identifier or alias from the host registry (e.g., 'claude', 'codex', 'zcode', 'cursor', 'opencode'). Set this so the run hands the agent a host-correct setup checklist at start. | |
| SUPER_LOOP_ALLOW_EXEC | No | Set to '1' to enable live execution of frontier workers. | not set (off) |
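As a rough illustration of how the three variables in the table could be resolved, here is a minimal Python sketch. The `SuperLoopConfig` class, the `load_config` function, and the `package_dir` parameter are hypothetical names introduced for this example; they are not part of the server. The sketch also assumes that any value of `SUPER_LOOP_ALLOW_EXEC` other than the exact string `'1'` leaves execution off, which the table implies but does not state.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class SuperLoopConfig:
    """Hypothetical container for the three documented settings."""
    home: Path
    host: Optional[str]
    allow_exec: bool


def load_config(env: Mapping[str, str], package_dir: Path) -> SuperLoopConfig:
    """Resolve the documented variables from an environment mapping."""
    # SUPER_LOOP_HOME falls back to '<package>/.super-loop' (default from the table).
    raw_home = env.get("SUPER_LOOP_HOME")
    home = Path(raw_home) if raw_home else package_dir / ".super-loop"
    # SUPER_LOOP_HOST is optional; None stands for "not set".
    host = env.get("SUPER_LOOP_HOST") or None
    # Assumption: only the exact string '1' turns live execution on.
    allow_exec = env.get("SUPER_LOOP_ALLOW_EXEC") == "1"
    return SuperLoopConfig(home=home, host=host, allow_exec=allow_exec)
```

Calling `load_config(os.environ, Path.cwd())` would read the real process environment. Passing a plain dictionary instead keeps the function easy to check in isolation.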
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": false } |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| initialize_loop_run | Ask-once gate. Confirms the task before any loop runs. If the task is underspecified, returns one brief explanation plus a few short questions once (goal; PATH: improve an existing loop / discover-or-find a loop, optionally scouting a public loop library / mine your whole history; the loop or domain to start from; corpus scope: whole history or a set number of loops, and best-first vs in-order; what "better" means; any task-specific hard limit; and a deeper-explanation offer); call again with { answers } to begin. It never asks the operator to choose the model, promotion mode, benchmark policy, deterministic-vs-subjective routing, or the standing guarantees; the supervisor decides those from the task. Stores every user message locally with a sha256 hash. After initialization it does not ask again or mark the campaign complete; the operator remains the stop condition and the dashboard stays available. |
| loop_start | Begin phase-gated streaming of a bundled or custom local loop. Opens/activates the supervisor lane for that loop. Use "strip-miner" (The Strip Miner Loop / cross-agent source miner, 345 lines), "loop-de-loop" (Loop 2, the improvement loop, 75 lines), or any id registered with loop_register. Returns ONLY section 0; the full loop stays inside the supervisor. |
| request_next_phase | Stream the next loop section. BLOCKED (PHASE_SKIP) unless the current section already has recorded evidence. Prevents 300+ lines from collapsing into the model before real decisions are made. |
| loop_next | Alias of request_next_phase. |
| observation_record | Record lightweight evidence for the current phase (what you actually did/observed). Attach { loop, phase } to satisfy the phase gate and unlock the next section. |
| artifact_record | Persist a raw artifact (run log, baseline copy) with a sha256 hash. role:"baseline" hash-locks the baseline (write-once; tampering refused). Pass measurement:{tokenCost,quality} so the artifact can serve as a tool-measured, reverifiable measurementRef. sourcePath reads are disabled; pass explicit content. |
| benchmark_propose | Propose one or more benchmark scorecards built from real prior uses/failures. Each needs ≥1 task-value dimension, ≥1 resource/cost dimension, and ≥1 concrete case, or it is rejected as a hand-waved benchmark. |
| benchmark_select | Freeze ONE proposed benchmark as the immutable scorecard for this cycle. Requires the baseline to be hash-locked first. Changing a frozen benchmark needs a new epoch + rationale. |
| benchmark_freeze_maker | Bench-maker only: freeze a benchmark directly (benchSource:maker) without worker benchmark_propose. Defaults benchPartition to gate (held-out). Worker benchmark_propose becomes a no-op while this scorecard is frozen. |
| benchmark_run | Record a tool-measured run of an arm through the frozen benchmark. arm:"baseline" sets the bar challengers must beat. Requires a measurementRef → a recorded raw artifact; model self-report never sets the bar. |
| register_hypotheses | Register 3–5 challenger hypotheses, each on a frontier route. Requires baseline hash-lock + frozen benchmark + measured baseline bar (benchmark-first). Rejects <3 or >5, and any haiku/mini/nano/lite/prior-gen route. |
| test_hypothesis | Record ONE full test of a hypothesis = 3–5 frontier agents that actually ran the loop end-to-end. Every agent run must carry a measurementRef (tool-measured). Aggregates vs the frozen baseline bar; a no-improvement run is NO_IMPROVEMENT, never "perfect", and bumps the failure counter. |
| execute_full_test | SUPERVISOR-EXECUTED full test (off by default; opt in with env SUPER_LOOP_ALLOW_EXEC=1). Sling itself LAUNCHES 3-5 allowlisted frontier workers (claude/codex/glm/gemini binaries on PATH) via execFile (never a shell), captures each output, and feeds the tool-captured bytes through the same gate as test_hypothesis, so there is no model-supplied run-log to fabricate. A failed/timed-out/non-allowlisted launch is an invalid batch and does not count toward retirement. Without the opt-in this returns BLOCKED (EXEC_DISABLED) and you record run-logs via artifact_record + test_hypothesis instead. |
| reverify_run | Deep re-verification: re-hash every raw artifact behind a full test and confirm the claimed metrics reproduce. Promotion is blocked until this passes (anti benchmark-gaming / baseline-tampering). |
| promotion_request | Request promotion of a hypothesis to internal champion. Requires a tool-measured, reverified full test on the frozen benchmark that moves the quality/cost frontier past threshold. Old green unit tests without a score matrix, model-reported metrics, or below-threshold results are BLOCKED. Never overwrites the operator’s canonical loop file. |
| cycle_decision_request | The supervisor decision hook. A worker proposes a transition packet; only a supervisor-accepted transition counts as progress. Reasoning alone is never proof. Allowed transition intents: promote, advance_phase, change_baseline, change_benchmark, saturate. Completion/stop-style intents are refused (the operator is the only stop condition). |
| run_campaign | AUTONOMOUS SUPERVISOR (opt-in: SUPER_LOOP_ALLOW_EXEC=1). One call drives the whole campaign itself: intake → work the target queue (mine → improve) → for each improve target: hash-lock baseline → freeze benchmark → measure the bar on a real worker → FullTestBatches (3-5 frontier workers, each output VALIDATED through the enforcement boundary) → supervisor delta → reverify → promote (bank a Stone) → advance/retire → re-mine, and keeps going until the operator stop-file. Worker output is never trusted: summary-only / early-stop / fake-metric / self-promote / phase-skip / copied-public are rejected and re-entered, and invalid batches do not count toward retirement. maxBatches bounds the in-call MCP run (a safety cap, NOT completion); the standalone
| report_saturation | Tell the supervisor the current lane (e.g. the Strip Miner) has reached evidence-backed saturation. The supervisor AUTO-TRANSITIONS to the next lane (Strip Miner → Loop-de-loop, or the next improvement branch). It never pauses, awaits the operator, or treats "no re-mining warranted" as terminal; saturation is a pivot. The operator is the only stop condition. |
| campaign_status | Read-only supervisor status: the lane/target queue, auto-transitions, branch-retirement accounting (30 valid no-improvement batches), the 10-15 risk advisory band, and how many dashboard review items are pending. Pending review never blocks the campaign. |
| continue_run | Record the next runnable improvement lane and first concrete action after reports, dashboards, saturation findings, no-improvement advisories, or refused terminal/checkpoint intents. This never asks the user and never marks the campaign complete. It does not clear the continuation obligation by itself; a real progress tool must run next. |
| human_review_request | Queue a change for the operator’s Approve/Sludge dashboard or list pending items. This tool CANNOT resolve human review; approval/sludge is dashboard-only. Never blocks deterministic lanes; the loop keeps running. |
| update_dashboard | Render the always-available local dashboard.html (score matrix, phase progress, failure patience, Approve/Sludge, and the stop-condition notice). Human review happens only here; deterministic lanes do not wait on it. |
| report_export | Write a reproducible markdown report (baseline lock, frozen benchmark, score matrix, promotions, failure patience, campaign state) to the run dir. |
| export_trajectories | Export a run's recorded tool trajectory as Hermes-format JSONL (one assistant/tool_call line per action, with supervisor verdict labels from recorded gate results). Read-only over the store. Refuses gate-partitioned (held-out) runs. |
| loop_register | Add YOUR OWN loop to this machine's local MCP, or register a skill (retrievable knowledge file) with role:"skill". Pass the full text as
| loop_library | List every loop and skill available to this local MCP: mandated hash-locked loops, custom loops you registered, and skills (metadata only: id, title, provenance, partition, section count). No full bodies. |
| skill_fetch | Retrieve skill knowledge for the current task. Two modes: 'plan' returns the index (titles, purposes, token estimates) of skills matching your query, which you read to decide which sections to fetch; it never loads full skill bodies. 'section' fetches ONE section body by (skill_id, section_id) so you pull only the knowledge you need. Default partition is 'working'; 'reference' is opt-in only (held-out skills). Pass runId to pin the exact skill version this run used. |
| host_capability_preflight | Local capability report: which known frontier-agent CLIs (claude, codex, gemini, opencode, optional glm) are installed on PATH, PLUS the resolved host profile (driverFamily, tier, setupHint, and the host matrix when SUPER_LOOP_HOST is unknown). Filesystem stat only: NEVER executes a command, NEVER probes arbitrary binaries, and is NOT web/SOTA research. Presence on PATH is not proof of working auth. |
| host_runtime_detect | Advisory guess of which host runtime the agent is in, from which MCP config files exist on disk (per the host registry). READ-ONLY existence check: never reads file contents, never mutates config. SUPER_LOOP_HOST, if set, is authoritative. Returns a guess, the candidate hosts with evidence, and the CLI fallback; nothing is auto-applied. |
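Two of the gates described in the table, the phase gate behind request_next_phase and the write-once baseline lock behind artifact_record, can be modelled with a small in-memory toy. This is a sketch under stated assumptions, not the server's implementation: the `ToyLoopStore` class, its method signatures, the dictionary return shapes, and the `BASELINE_LOCKED` code are hypothetical. Only the tool names, `BLOCKED`, `PHASE_SKIP`, and the use of sha256 come from the descriptions above.

```python
import hashlib


class ToyLoopStore:
    """Hypothetical in-memory model of the phase gate and baseline hash-lock."""

    def __init__(self, sections):
        self.sections = list(sections)  # loop text, already split into sections
        self.current = 0                # index of the section streamed most recently
        self.evidence = {}              # phase index -> list of recorded notes
        self.baseline_hash = None       # set by the first baseline artifact

    def observation_record(self, phase, note):
        # Evidence is attached to a phase; this is what unlocks the next section.
        self.evidence.setdefault(phase, []).append(note)

    def request_next_phase(self):
        # Gate: refuse to stream further until the current section has evidence.
        if not self.evidence.get(self.current):
            return {"status": "BLOCKED", "code": "PHASE_SKIP"}
        if self.current + 1 >= len(self.sections):
            return {"status": "OK", "section": None}  # nothing left to stream
        self.current += 1
        return {"status": "OK", "section": self.sections[self.current]}

    def artifact_record(self, content, role="run-log"):
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if role == "baseline":
            # Write-once: a later baseline with a different hash is refused.
            # "BASELINE_LOCKED" is an invented code for this sketch.
            if self.baseline_hash is not None and self.baseline_hash != digest:
                return {"status": "BLOCKED", "code": "BASELINE_LOCKED"}
            self.baseline_hash = digest
        return {"status": "OK", "sha256": digest}
```

In this toy, a call to `request_next_phase()` on a fresh store returns the `PHASE_SKIP` block. It succeeds only after `observation_record(0, ...)` has stored at least one note for section 0.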
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/alexalexalex222/Loop-Factory-mcp-public'