register_probe
Register a self-test probe with a prompt and grader (regex, exact, or manual) to verify AI responses on known weak-spot tasks.
Instructions
Register a self-test probe: a known weak-spot task paired with a verifier.
grader: 'regex' (expected_pattern is matched as a pattern in the response),
'exact' (expected_pattern must appear as a substring of the response), or
'manual' (Claude self-grades; the attempt always counts as a failure unless
the caller explicitly confirms success via record_attempt). expected_pattern
is optional for 'manual'.
Categories should be Claude-shaped: 'count_long_context', 'date_arithmetic', 'recall_verbatim_block', 'detect_contradiction', 'follow_negative_instruction', 'preserve_list_order', 'respect_length_limit', 'needle_mid_context', 'fact_vs_inference', 'notice_absence', 'strict_format_compliance', 'uncertainty_acknowledgment'.
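The three grader modes described above can be sketched as a single function. This is only an illustration of the stated semantics, not the tool's actual implementation: `grade_response` and its `confirmed` flag are hypothetical names, and treating a missing pattern as an error for 'regex' and 'exact' is an assumption.

```python
import re


def grade_response(grader, response, expected_pattern=None, confirmed=False):
    """Return True if the response passes under the given grader mode.

    Hypothetical helper: mirrors the documented semantics only.
    """
    if grader in ("regex", "exact") and expected_pattern is None:
        # Assumption: a pattern is needed for the two automatic graders.
        raise ValueError(f"expected_pattern is required for grader {grader!r}")
    if grader == "regex":
        # Pattern match anywhere in the response.
        return re.search(expected_pattern, response) is not None
    if grader == "exact":
        # Plain substring check, no regex interpretation.
        return expected_pattern in response
    if grader == "manual":
        # Counts as a failure unless the caller explicitly confirms success.
        return confirmed
    raise ValueError(f"unknown grader: {grader!r}")
```

Note that under 'exact' a pattern such as `1.5` matches only the literal text `1.5`, whereas under 'regex' the dot would match any character.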
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
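| category | Yes | Weak-spot category for the probe, e.g. 'date_arithmetic' | |
| prompt | Yes | Prompt text for the weak-spot task | |
| expected_pattern | No | Pattern used by the 'regex' and 'exact' graders; optional for 'manual' | |
| grader | No | 'regex', 'exact', or 'manual' | regex |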
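As a rough sketch of how a caller might assemble arguments matching the input schema above, the snippet below builds and checks an argument dictionary. `build_probe_args` is a hypothetical client-side helper, not part of the tool, and requiring expected_pattern for non-manual graders is an assumption inferred from the note that it is optional for 'manual'.

```python
VALID_GRADERS = {"regex", "exact", "manual"}


def build_probe_args(category, prompt, expected_pattern=None, grader="regex"):
    """Build an argument dict for register_probe (hypothetical helper)."""
    if not category or not prompt:
        raise ValueError("category and prompt are required")
    if grader not in VALID_GRADERS:
        raise ValueError(f"grader must be one of {sorted(VALID_GRADERS)}")
    if grader != "manual" and expected_pattern is None:
        # Assumption: only 'manual' probes may omit the pattern.
        raise ValueError(f"expected_pattern is required for grader {grader!r}")
    args = {"category": category, "prompt": prompt, "grader": grader}
    if expected_pattern is not None:
        args["expected_pattern"] = expected_pattern
    return args


# Example: a date-arithmetic probe graded by the default 'regex' grader.
probe = build_probe_args(
    category="date_arithmetic",
    prompt="What date is 45 days after 2024-01-31? Answer as YYYY-MM-DD.",
    expected_pattern=r"\b2024-03-16\b",
)
```

How the resulting dictionary is sent to the tool depends on the client in use, which this sketch leaves out.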
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |