delimit_prompt_drift
Record and compare prompt results across models to detect drift and find reliable performance per task type.
Instructions
Detect prompt drift: cases where the same task behaves differently across models.
Track how prompts perform across Claude, Codex, and Gemini. Find which model is best for each task type on YOUR codebase.
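The tool's internal logic is not documented here, so the following is only a rough sketch of what "drift" and "rank" could mean in practice. It assumes each recorded result is a dictionary with `prompt`, `model`, `task_type`, and a boolean `success`. The function names `has_drift` and `rank_models` are hypothetical and are not part of the tool.

```python
from collections import defaultdict


def has_drift(records, prompt):
    """Return True if the same prompt succeeded on one model and failed on another.

    Assumption: when a model has several records for the prompt,
    the last record in the list wins.
    """
    by_model = {r["model"]: r["success"] for r in records if r["prompt"] == prompt}
    return len(set(by_model.values())) > 1


def rank_models(records):
    """Return {task_type: [(model, success_rate), ...]}, best model first."""
    # (task_type, model) -> [successes, total]
    counts = defaultdict(lambda: [0, 0])
    for r in records:
        key = (r["task_type"], r["model"])
        counts[key][0] += 1 if r["success"] else 0
        counts[key][1] += 1

    ranking = defaultdict(list)
    for (task_type, model), (ok, total) in counts.items():
        ranking[task_type].append((model, ok / total))
    for task_type in ranking:
        ranking[task_type].sort(key=lambda pair: pair[1], reverse=True)
    return dict(ranking)


# Illustrative records only; these are not real measurements.
records = [
    {"prompt": "extract helper", "model": "claude", "task_type": "refactoring", "success": True},
    {"prompt": "extract helper", "model": "gemini", "task_type": "refactoring", "success": False},
    {"prompt": "add unit tests", "model": "codex", "task_type": "testing", "success": True},
]
```

With these sample records, `has_drift(records, "extract helper")` is true because the two models disagree, while `rank_models(records)` lists `claude` ahead of `gemini` for `refactoring`.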
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| action | No | "record", "check", or "rank". | check |
| prompt | No | The prompt text (for record/check). | |
| model | No | AI model name (for record). | |
| result_summary | No | Brief description of the result (for record). | |
| success | No | Whether the result was good ("true"/"false"). | true |
| task_type | No | Task category (refactoring/testing/debugging/docs). | |
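As an illustration of the schema above, the sketch below assembles argument payloads for the three actions. The helper `build_args` is hypothetical (it is not part of the tool), and the prompt and summary strings are made-up sample values. How the payload is actually sent depends on your MCP client and is not shown.

```python
# Field names and action values are taken from the Input Schema table.
VALID_ACTIONS = {"record", "check", "rank"}
ALLOWED_FIELDS = {"prompt", "model", "result_summary", "success", "task_type"}


def build_args(action="check", **fields):
    """Build an argument dict for delimit_prompt_drift (hypothetical helper).

    Only fields that are passed explicitly are included; omitted fields
    are left to the tool's own defaults.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return {"action": action, **fields}


# Record one result. Note that "success" is a string, per the table.
record_args = build_args(
    "record",
    prompt="Extract the retry logic into a helper function",
    model="claude",
    result_summary="Clean extraction, existing tests still pass",
    success="true",
    task_type="refactoring",
)

# Check a prompt for drift ("check" is the default action).
check_args = build_args(prompt="Extract the retry logic into a helper function")

# Ask for a ranking of models.
rank_args = build_args("rank")
```

Validating field names on the client side is a design choice for this sketch: it catches typos such as `tasktype` before the call is made.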
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |