trw_probe
Resolve disputed plan assumptions by running a bounded, sandboxed command experiment. Returns a verdict of supports, refutes, or inconclusive.
Instructions
Run a bounded, sandboxed experiment to resolve a disputed plan assumption.
Use it during the PLAN phase when two plan branches disagree on a
load-bearing, empirically resolvable claim that a rubric cannot adjudicate
(e.g. "this parser handles a 50MB JSONL stream without OOM"). The
command runs inside the shared SAFE-001 sandbox (subprocess +
seccomp, no network by default), bounded by timeout_s and
memory_mb, and the tool returns a typed ProbeResult whose verdict is
one of supports, refutes, or inconclusive.
Each planning_mode has a probe budget (DIRECT=0, DUAL_DRAFT=1,
TRIANGULATED=2, TRIANGULATED_WITH_PROBE=3); once the budget is exhausted,
the tool returns a typed budget error. Identical probes within a run are
served from cache.
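The budget and cache rules above can be sketched roughly as follows. This is an illustration only, not the tool's implementation: the ProbeLedger class, the (command, hypothesis) cache key, the shape of the error dict, and the choice that cache hits do not consume budget are all assumptions. Only the four budget numbers come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable

# Budgets as listed in the text; everything else below is hypothetical.
PROBE_BUDGET = {
    "DIRECT": 0,
    "DUAL_DRAFT": 1,
    "TRIANGULATED": 2,
    "TRIANGULATED_WITH_PROBE": 3,
}


@dataclass
class ProbeLedger:
    """Tracks probes spent within one run (illustrative sketch)."""

    planning_mode: str
    used: int = 0
    cache: dict = field(default_factory=dict)

    def run(self, command: str, hypothesis: str,
            execute: Callable[[str], dict]) -> dict:
        key = (command, hypothesis)  # assumed definition of "identical probe"
        if key in self.cache:
            # Assumption: a cache hit is free and does not touch the budget.
            return self.cache[key]
        if self.used >= PROBE_BUDGET[self.planning_mode]:
            # Assumed error shape; the real typed budget error may differ.
            return {"error": "budget_exhausted"}
        self.used += 1
        result = execute(command)
        self.cache[key] = result
        return result
```

Under these assumptions, a DIRECT run can never execute a probe, and a DUAL_DRAFT run can execute one distinct probe and then repeat it from cache as often as needed.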
Returns: dict serialization of ProbeResult, or a typed error dict
on validation failure, budget exhaustion, or a disabled feature flag.
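A caller might branch on the returned dict along these lines. This is a hedged sketch: it assumes the serialized ProbeResult exposes its verdict under a "verdict" key, treats any dict without a recognized verdict as a typed error, and uses made-up action labels (keep_assumption and so on) that are not part of the tool.

```python
VERDICTS = {"supports", "refutes", "inconclusive"}


def interpret_probe(result: dict) -> str:
    """Map a trw_probe return value to a planning action (hypothetical labels)."""
    verdict = result.get("verdict")  # assumed key name in the serialized result
    if verdict == "supports":
        return "keep_assumption"
    if verdict == "refutes":
        return "drop_assumption"
    if verdict == "inconclusive":
        return "escalate_to_review"
    # Anything else is treated as a typed error dict (validation failure,
    # budget exhaustion, or feature flag disabled).
    return "probe_unavailable"
```

Treating inconclusive separately from errors keeps "the experiment ran but did not decide" distinct from "the experiment never ran".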
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| run_id | No | unknown | |
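| command | Yes | Command to run inside the SAFE-001 sandbox | |
| memory_mb | No | Memory bound for the experiment, in MB | |
| timeout_s | No | Time bound for the experiment, in seconds | |
| hypothesis | Yes | The disputed plan assumption being tested | |
| allow_network | No | Enables network access; the sandbox has no network by default | |
| hypothesis_id | No | ||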
| planning_mode | No | | TRIANGULATED_WITH_PROBE |
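As a rough illustration of the schema, the helper below assembles an argument dict and checks it against the parameter names and required flags in the table. The helper name build_probe_args, the example command, and the timeout and memory values are hypothetical; the tool's own validation may be stricter.

```python
# Parameter names and required flags taken from the Input Schema table.
ALLOWED = {
    "run_id", "command", "memory_mb", "timeout_s",
    "hypothesis", "allow_network", "hypothesis_id", "planning_mode",
}
REQUIRED = {"command", "hypothesis"}


def build_probe_args(**kwargs) -> dict:
    """Return a trw_probe argument dict, rejecting unknown or missing names."""
    unknown = set(kwargs) - ALLOWED
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    missing = REQUIRED - set(kwargs)
    if missing:
        raise ValueError(f"missing required arguments: {sorted(missing)}")
    return dict(kwargs)


# Illustrative values only; the script and file names are made up.
args = build_probe_args(
    command="python parse_stream.py sample_50mb.jsonl",
    hypothesis="this parser handles a 50MB JSONL stream without OOM",
    timeout_s=60,
    memory_mb=512,
)
```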
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |