judge_receipt
Initiates an auditable judgment of a receipt's output by creating an Ed25519-signed pending judgment receipt and returning a structured evaluation prompt. Supports weighted rubric assessment beyond pass/fail, with partial verdicts and confidence scores.
Instructions
Start an AI judgment evaluation for a receipt by creating a pending judgment receipt and returning a structured evaluation prompt. The host model (you) evaluates the receipt's output against the provided rubric criteria, then calls complete_judgment with the results. Use this tool to assess output quality beyond simple pass/fail constraints: it supports weighted criteria, partial verdicts, and confidence scores. Judgment receipts are themselves Ed25519-signed for auditability.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| receipt_id | Yes | ID of the receipt to evaluate (the original action receipt) | |
| rubric | Yes | Evaluation rubric containing a criteria array. Each criterion needs: name (string), description (string), weight (0.0-1.0), and an optional passing_threshold (0.0-1.0, default 0.7). Rubric-level settings: passing_threshold (overall, default 0.7) and require_all (boolean, default false) | |
| output_summary_for_review | No | The actual output content to evaluate. Provide it if the output_summary on the receipt is insufficient for evaluation | |
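
The schema above can be illustrated with a minimal sketch that assembles a judge_receipt argument payload and fills in the documented defaults. The helper name `build_judge_receipt_args`, the `criteria` key inside the rubric, the criterion names, and the receipt ID `rcpt_example` are assumptions made for illustration; only the field names, ranges, and defaults listed in the table come from the tool description.

```python
from typing import Any, Optional

DEFAULT_THRESHOLD = 0.7  # default stated in the Input Schema table


def _check_unit_range(label: str, value: float) -> float:
    """Reject values outside the documented 0.0-1.0 range."""
    value = float(value)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{label} must be between 0.0 and 1.0, got {value}")
    return value


def build_judge_receipt_args(
    receipt_id: str,
    criteria: list[dict[str, Any]],
    passing_threshold: float = DEFAULT_THRESHOLD,
    require_all: bool = False,
    output_summary_for_review: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build the arguments for a judge_receipt call."""
    normalized = []
    for criterion in criteria:
        normalized.append({
            "name": str(criterion["name"]),
            "description": str(criterion["description"]),
            "weight": _check_unit_range("weight", criterion["weight"]),
            # Per-criterion threshold is optional and defaults to 0.7.
            "passing_threshold": _check_unit_range(
                "passing_threshold",
                criterion.get("passing_threshold", DEFAULT_THRESHOLD),
            ),
        })

    args: dict[str, Any] = {
        "receipt_id": receipt_id,
        "rubric": {
            "criteria": normalized,  # key name assumed from "criteria array"
            "passing_threshold": _check_unit_range(
                "passing_threshold", passing_threshold
            ),
            "require_all": bool(require_all),
        },
    }
    # Only include the optional field when the caller supplies it.
    if output_summary_for_review is not None:
        args["output_summary_for_review"] = output_summary_for_review
    return args


example_args = build_judge_receipt_args(
    "rcpt_example",  # placeholder receipt ID
    [
        {"name": "accuracy", "description": "Claims match the source data", "weight": 0.6},
        {"name": "clarity", "description": "Output is easy to follow", "weight": 0.4,
         "passing_threshold": 0.5},
    ],
)
```

Validating ranges before the call keeps malformed rubrics out of the pending judgment receipt; how the server itself treats out-of-range values is not described here, so this client-side check is a precaution rather than documented behavior.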