run_autoresearch
Automates a bounded metric-improvement loop by measuring baselines, testing hypotheses, and validating changes with primary and holdout checks.
Instructions
Run a bounded metric-improvement loop: measure a baseline, test a hypothesis, require the primary check and every holdout check to pass, then keep or discard the candidate mutation, with proof of the outcome.
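The keep-or-discard decision described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the `Verdict` type, the `evaluate_candidate` function, and the callback parameters are all hypothetical names, and the sketch assumes the primary metric is a single number where higher is better.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Verdict:
    """Hypothetical record of one iteration's outcome (the 'proof')."""
    kept: bool
    baseline: float
    candidate: float
    reason: str


def evaluate_candidate(
    measure: Callable[[], float],
    apply_mutation: Callable[[], None],
    revert_mutation: Callable[[], None],
    holdout_checks: Sequence[Callable[[], bool]],
) -> Verdict:
    """Measure a baseline, try one mutation, and keep it only if every check passes.

    Assumption: a higher metric value is better. The real tool may score differently.
    """
    baseline = measure()      # 1. measure the baseline
    apply_mutation()          # 2. apply the hypothesis under test
    candidate = measure()     # 3. primary check: re-measure the metric

    if candidate <= baseline:
        revert_mutation()
        return Verdict(False, baseline, candidate, "primary metric did not improve")

    # 4. holdout checks: all must pass before the candidate can be kept
    for check in holdout_checks:
        if not check():
            revert_mutation()
            return Verdict(False, baseline, candidate, "holdout check failed")

    return Verdict(True, baseline, candidate, "primary and holdout checks passed")
```

Reverting on any failure keeps the workspace at the last known-good state, so a later iteration always starts from a measured baseline.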
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| iterations | No | Number of iterations to run. Capped at 5 per call. | 1 |
| targetName | No | Optional evolution target to mutate. | |
| nextValue | No | Optional explicit candidate value for the target. | |
| testCommand | No | Primary metric command. | npm test |
| holdoutCommands | No | Additional checks required before a candidate can be kept. | |
| timeoutMs | No | Per-command timeout in milliseconds. Capped at 600000. | 120000 |
| cwd | No | Optional workspace directory for the evaluation commands. | |
| researchQuery | No | Optional research query used to build an autoresearch context brief. | |
| paperLimit | No | Maximum research papers to ingest when researchQuery is set. Capped at 10. | 5 |
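As an illustration of the schema above, the sketch below builds an arguments payload and applies the documented caps on the client side. The helper `build_arguments` and its snake_case parameters are hypothetical; only the payload keys, defaults, and caps come from the table. It assumes `holdoutCommands` is a list of command strings, which the schema does not state explicitly, and it does not show how the payload is sent to the tool.

```python
from typing import Any, Dict, List, Optional

# Caps taken from the parameter table.
MAX_ITERATIONS = 5
MAX_TIMEOUT_MS = 600_000
MAX_PAPER_LIMIT = 10


def build_arguments(
    iterations: int = 1,
    test_command: str = "npm test",
    holdout_commands: Optional[List[str]] = None,
    timeout_ms: int = 120_000,
    research_query: Optional[str] = None,
    paper_limit: int = 5,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble run_autoresearch arguments within the documented caps."""
    args: Dict[str, Any] = {
        "iterations": min(iterations, MAX_ITERATIONS),
        "testCommand": test_command,
        "timeoutMs": min(timeout_ms, MAX_TIMEOUT_MS),
    }
    if holdout_commands:
        # Assumption: holdout checks are passed as a list of shell commands.
        args["holdoutCommands"] = list(holdout_commands)
    if research_query is not None:
        # paperLimit only applies when researchQuery is set.
        args["researchQuery"] = research_query
        args["paperLimit"] = min(paper_limit, MAX_PAPER_LIMIT)
    return args


example = build_arguments(
    iterations=9,
    holdout_commands=["npm run lint"],
    timeout_ms=900_000,
    research_query="test flakiness reduction",
    paper_limit=50,
)
```

Here the requested values exceed the caps, so `example` carries `iterations` of 5, `timeoutMs` of 600000, and `paperLimit` of 10. The tool presumably enforces the same caps itself; clamping beforehand only makes the effective values visible to the caller.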