evaluate_model_inference
Evaluates AI model inference accuracy by comparing predictions to ground-truth labels. Computes precision, recall, F1, an AUC approximation, and a pass/fail verdict based on enterprise thresholds.
Instructions
Evaluates the accuracy of AI model inferences against ground-truth labels. Computes precision, recall, F1, AUC approximation, and a pass/fail verdict against enterprise performance thresholds.
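The metrics named above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the function names `evaluate` and `pairwise_auc`, the `>=` comparison against the threshold, the pairwise (Mann-Whitney style) AUC estimate, and the rule that an unset minimum is not enforced are all assumptions.

```python
from typing import Optional


def pairwise_auc(records: list) -> Optional[float]:
    """Assumed AUC approximation: fraction of (positive, negative) pairs
    where the positive record scores higher; ties count as half."""
    pos = [r["predicted"] for r in records if r["actual"] == 1]
    neg = [r["predicted"] for r in records if r["actual"] == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(records: list, threshold: float = 0.5,
             min_precision: Optional[float] = None,
             min_recall: Optional[float] = None) -> dict:
    tp = fp = fn = 0
    for r in records:
        # Assumption: scores at or above the threshold count as positive.
        label = 1 if r["predicted"] >= threshold else 0
        if label == 1 and r["actual"] == 1:
            tp += 1
        elif label == 1 and r["actual"] == 0:
            fp += 1
        elif label == 0 and r["actual"] == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Assumption: a minimum that is not supplied is simply not enforced.
    passed = ((min_precision is None or precision >= min_precision)
              and (min_recall is None or recall >= min_recall))
    return {"precision": precision, "recall": recall, "f1": f1,
            "auc": pairwise_auc(records), "passed": passed}


sample = [
    {"id": "a", "predicted": 0.9, "actual": 1},
    {"id": "b", "predicted": 0.4, "actual": 1},
    {"id": "c", "predicted": 0.7, "actual": 0},
    {"id": "d", "predicted": 0.1, "actual": 0},
]
result = evaluate(sample, threshold=0.5, min_precision=0.6)
print(result)
```

Because binary 0/1 predictions pass through the same threshold comparison, a default of 0.5 leaves them unchanged in this sketch.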
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| threshold | No | Classification threshold for converting probabilities to binary labels | 0.5 |
| min_recall | No | Minimum acceptable recall | |
| predictions | Yes | Array of {id, predicted (0/1 or probability), actual (0/1)} records | |
| min_precision | No | Minimum acceptable precision | |
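As an illustration of the schema above, the following builds an arguments object with all four fields. The record values and the `min_precision` and `min_recall` numbers are hypothetical, chosen only to show the shape; how the arguments are transmitted to the tool is not specified here.

```python
import json

# Hypothetical arguments for evaluate_model_inference; values are illustrative.
arguments = {
    "predictions": [
        {"id": "rec-1", "predicted": 0.92, "actual": 1},  # probability score
        {"id": "rec-2", "predicted": 0, "actual": 0},      # already-binary prediction
        {"id": "rec-3", "predicted": 0.35, "actual": 1},
    ],
    "threshold": 0.5,       # optional; 0.5 is the documented default
    "min_precision": 0.8,   # optional; hypothetical value
    "min_recall": 0.7,      # optional; hypothetical value
}

print(json.dumps(arguments, indent=2))
```

Only `predictions` is required; the other three keys can be omitted.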