execute_full_test
Runs a full test by spawning 3-5 allowlisted frontier workers to execute a prompt, capturing outputs for validation. Requires SUPER_LOOP_ALLOW_EXEC=1 to enable.
Instructions
Supervisor-executed full test. Off by default; opt in by setting the environment variable SUPER_LOOP_ALLOW_EXEC=1. Sling itself launches 3-5 allowlisted frontier workers (claude/codex/glm/gemini binaries on PATH) via execFile, never through a shell, captures each worker's output, and feeds the tool-captured bytes through the same gate as test_hypothesis, so there is no model-supplied run-log to fabricate. If any launch fails, times out, or targets a non-allowlisted binary, the batch is invalid and does not count toward retirement. Without the opt-in, the tool returns BLOCKED (EXEC_DISABLED), and you record run-logs via artifact_record + test_hypothesis instead.
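
The launch rules above can be sketched roughly as follows. This is an illustrative sketch, not Sling's actual implementation. The names `preflight`, `launchWorker`, `batchIsValid`, and the codes `BAD_ROUTE_COUNT` and `NOT_ALLOWLISTED` are hypothetical. It also assumes that a route name equals its binary name and that the prompt is passed as a single argument, neither of which the description confirms.

```typescript
import { execFile } from "node:child_process";

// Binaries named in the description; treating route name == binary name is an assumption.
const ALLOWLIST = new Set(["claude", "codex", "glm", "gemini"]);

type WorkerResult = { route: string; ok: boolean; output: string };

// Pure pre-flight check: opt-in flag, route count (3-5), allowlist membership.
// Returns null when the batch may launch, otherwise a reason code.
// Only EXEC_DISABLED comes from the description; the other codes are hypothetical.
function preflight(
  routes: string[],
  env: Record<string, string | undefined>,
): string | null {
  if (env.SUPER_LOOP_ALLOW_EXEC !== "1") return "EXEC_DISABLED";
  if (routes.length < 3 || routes.length > 5) return "BAD_ROUTE_COUNT";
  if (routes.some((r) => !ALLOWLIST.has(r))) return "NOT_ALLOWLISTED";
  return null;
}

// Launch one worker with execFile: arguments go in an array, so no shell
// ever parses the prompt. Passing the prompt as argv[1] is an assumption.
function launchWorker(
  route: string,
  prompt: string,
  timeoutMs: number,
): Promise<WorkerResult> {
  return new Promise((resolve) => {
    execFile(route, [prompt], { timeout: timeoutMs }, (err, stdout) => {
      // A spawn failure, non-zero exit, or timeout all surface as a non-null err.
      resolve({ route, ok: err === null, output: String(stdout) });
    });
  });
}

// One failed or timed-out launch invalidates the whole batch.
function batchIsValid(results: WorkerResult[]): boolean {
  return results.length > 0 && results.every((r) => r.ok);
}
```

Keeping the pre-flight check separate from the launch step means the BLOCKED path can be decided before any process is spawned.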
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| runId | Yes | | |
| prompt | Yes | the loop + task the launched worker should actually run | |
| routes | Yes | 3-5 frontier worker routes to launch (each must map to an allowlisted binary) | |
| timeoutMs | No | per-worker hard timeout, in milliseconds | 600000 |
| hypothesisId | Yes | | |
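
As a rough illustration of the schema above, the input could be typed and defaulted like this. The `ExecuteFullTestInput` interface, the `withDefaults` helper, and all example values are hypothetical. The table does not state field types, so treating the IDs as strings is an assumption.

```typescript
// Hypothetical shape for the tool input; field names come from the schema table,
// but the string types for runId and hypothesisId are assumed.
interface ExecuteFullTestInput {
  runId: string;
  hypothesisId: string;
  prompt: string;      // the loop + task the launched worker should run
  routes: string[];    // 3-5 routes, each mapping to an allowlisted binary
  timeoutMs?: number;  // per-worker hard timeout
}

// Default taken from the schema description (600000 ms, i.e. 10 minutes).
const DEFAULT_TIMEOUT_MS = 600_000;

// Fill in the optional timeout when the caller omits it.
function withDefaults(
  input: ExecuteFullTestInput,
): Required<ExecuteFullTestInput> {
  return { ...input, timeoutMs: input.timeoutMs ?? DEFAULT_TIMEOUT_MS };
}

// Example values are placeholders, not real run or hypothesis IDs.
const example = withDefaults({
  runId: "run-001",
  hypothesisId: "hyp-001",
  prompt: "Run the loop on the sample task and report the result.",
  routes: ["claude", "codex", "gemini"],
});
```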