check_success
Evaluate a campaign's success contract by comparing the latest metric value to the threshold, returning whether the contract is met along with iterations used.
Instructions
Evaluate whether a campaign has met its typed success contract.
A campaign's success contract (success_metric,
benchmark_command, scope, max_iterations) is set at
creation time by templates like autoresearch. This tool gives
the planner / coordinator a single scalar decision:
met=True— contract satisfied; coordinator should mark the campaigndone.met=False— keep iterating, or escalate if budget exhausted.iterations_used— count of DONE atomic steps on this campaign so far. Useful for comparing againstmax_iterations.metric_value— the most recently recorded metric fromcampaign_notestaggedmetric:<success_metric>(best-effort parse; see below).
How the metric is discovered. The runner / worker agents emit a note of the form::
add_note(campaign_id, f"metric={value}", tags=["metric:<name>"])on every benchmark run. check_success scans the most recent
such note, extracts the float after =, and compares it against
success_metric_threshold if set via steer_campaign(strategy=...).
In v0.3 we only surface the most-recent value — the planner
decides whether it's "good enough". A future revision may wire a
numeric threshold column.
Args: id: Campaign UUID.
Returns:
{met, metric_value, iterations_used, max_iterations, success_metric, notes_checked, reason}
``reason`` is a short string explaining the decision — useful
both for humans looking at logs and for the planner's own
chain-of-thought.A campaign without a success contract returns
{met: False, reason: "no_success_metric_configured"}.
Next: if met is True, call cancel_campaign(id) or
steer_campaign(id, "wrap up"). If False and
iterations_used >= max_iterations, escalate via the
notification channel.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
No arguments | |||