eval_tool_call_accuracy
Assess if an AI agent invoked the correct tool and supplied the expected arguments. Outputs a score, pass/fail status, and explanation.
Instructions
Evaluate whether an agent called the right tool with the right arguments.
Purely deterministic: no LLM judge is needed. The actual tool name and arguments are compared directly against the expected values.
Args:
expected_tool: Tool name the agent should have called.
actual_tool: Tool name the agent actually called.
expected_arguments: Dict of expected argument values (optional).
actual_arguments: Dict of argument values the agent passed (optional).
Returns:
{"score": 0.0 or 1.0, "passed": bool, "reason": str}.
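The comparison described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation. The matching rules are assumptions, since the description does not specify them: tool names are compared exactly, argument checking is skipped when `expected_arguments` is omitted, and extra keys in `actual_arguments` are ignored.

```python
from typing import Any, Dict, Optional


def eval_tool_call_accuracy(
    expected_tool: str,
    actual_tool: str,
    expected_arguments: Optional[Dict[str, Any]] = None,
    actual_arguments: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return {"score": 0.0 or 1.0, "passed": bool, "reason": str}."""
    # Assumption: tool names must match exactly (case-sensitive).
    if expected_tool != actual_tool:
        return {
            "score": 0.0,
            "passed": False,
            "reason": f"Expected tool '{expected_tool}', got '{actual_tool}'.",
        }

    # Assumption: with no expected arguments, only the tool name is checked.
    if expected_arguments is None:
        return {
            "score": 1.0,
            "passed": True,
            "reason": "Tool name matches; no expected arguments to compare.",
        }

    actual = actual_arguments or {}
    # Assumption: every expected key must be present with an equal value;
    # extra keys in the actual arguments are ignored.
    mismatched = sorted(
        key
        for key, value in expected_arguments.items()
        if key not in actual or actual[key] != value
    )
    if mismatched:
        return {
            "score": 0.0,
            "passed": False,
            "reason": "Argument mismatch for: " + ", ".join(mismatched),
        }

    return {
        "score": 1.0,
        "passed": True,
        "reason": "Tool name and expected arguments match.",
    }


# Hypothetical usage: the tool name and argument values are made up.
result = eval_tool_call_accuracy(
    expected_tool="get_weather",
    actual_tool="get_weather",
    expected_arguments={"city": "Paris"},
    actual_arguments={"city": "Paris", "units": "metric"},
)
```

A stricter variant could require `actual_arguments == expected_arguments` so that unexpected extra arguments also fail. Which behavior the real tool uses is not stated here.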
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
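| expected_tool | Yes | Tool name the agent should have called. | |
| actual_tool | Yes | Tool name the agent actually called. | |
| expected_arguments | No | Dict of expected argument values. | |
| actual_arguments | No | Dict of argument values the agent passed. | |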
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |