kopern_grade_prompt
Evaluate system prompts with inline test cases. Checks output, schema, tool usage, safety, custom scripts, and LLM judgment. Returns a score from 0 to 1.
Instructions
Grades a system prompt against inline test cases using six criteria types (output_match, schema_validation, tool_usage, safety_check, custom_script, llm_judge). Returns a score from 0 to 1. LLM calls are made with your own API keys.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model ID | provider default |
| provider | No | LLM provider | anthropic |
| test_cases | Yes | Test cases, each of the form { name, input, expected } | |
| system_prompt | Yes | The system prompt to evaluate | |
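
As a rough illustration of the schema above, the following Python sketch assembles an arguments object for `kopern_grade_prompt`. The helper `build_grade_arguments` is hypothetical and not part of the tool. The sample prompt and test case are made up. The source does not specify the types of `input` and `expected`, so treating them as plain strings is an assumption.

```python
import json

REQUIRED_CASE_FIELDS = {"name", "input", "expected"}


def build_grade_arguments(system_prompt, test_cases, provider=None, model=None):
    """Hypothetical helper: build the arguments dict described by the input schema."""
    if not system_prompt:
        raise ValueError("system_prompt is required")
    if not test_cases:
        raise ValueError("test_cases is required")
    for case in test_cases:
        # Each test case is expected to carry name, input, and expected
        missing = REQUIRED_CASE_FIELDS - set(case)
        if missing:
            raise ValueError(f"test case missing fields: {sorted(missing)}")

    args = {"system_prompt": system_prompt, "test_cases": list(test_cases)}
    # Optional fields are left out when unset so the tool applies its own defaults
    if provider is not None:
        args["provider"] = provider
    if model is not None:
        args["model"] = model
    return args


# Made-up example values; string-typed input/expected is an assumption
example_args = build_grade_arguments(
    system_prompt="You are a concise assistant. Answer in one sentence.",
    test_cases=[
        {
            "name": "capital_of_france",
            "input": "What is the capital of France?",
            "expected": "Paris",
        }
    ],
)
print(json.dumps(example_args, indent=2))
```

Omitting `provider` and `model` relies on the defaults listed in the table (anthropic, and that provider's default model). How the resulting object is sent depends on the client used to call the tool.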