Glama
temurkhan13

openclaw-output-vetter-mcp

verify_response_grounding

Check that every claim in an AI agent's response is supported by provided context, flagging unsupported or fabricated statements before they reach users. Returns per-claim verdicts and overall grounding score.

Instructions

Check that every claim in answer has support in context. Returns per-claim grounded/ungrounded + an overall verdict (CLEAN / PARTIALLY_GROUNDED / FABRICATED). Use inline during an agent conversation to flag hallucinated responses before they become user-facing facts. Sub-second, local, no API key.
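As a concrete sketch of what a call and response might look like, using the documented parameters: the response field names below are assumptions, since the page shows no output schema.

```python
# Hypothetical request/response shapes for verify_response_grounding.
# The request keys match the documented input schema; the response
# structure (claims list, verdict string) is an illustrative guess.
request = {
    "question": "When was the Eiffel Tower completed?",
    "context": "The Eiffel Tower was completed in 1889 for the World's Fair.",
    "answer": "The Eiffel Tower was completed in 1889 and is 500 meters tall.",
    "threshold": 0.30,
}

# One claim is supported by the context, one is not, so the overall
# verdict in this sketch would be PARTIALLY_GROUNDED.
response = {
    "claims": [
        {"text": "The Eiffel Tower was completed in 1889", "grounded": True},
        {"text": "It is 500 meters tall", "grounded": False},
    ],
    "verdict": "PARTIALLY_GROUNDED",
}

assert any(not c["grounded"] for c in response["claims"])
```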

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| question | Yes | The user question (used for context-binding; v1.0 stores but doesn't use) | (none) |
| context | Yes | Retrieval / source context the answer should be grounded in | (none) |
| answer | Yes | The agent's response to verify | (none) |
| threshold | No | Jaccard overlap threshold for `grounded` (0.0–1.0). Lower = more permissive, higher = stricter. | 0.30 |
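A minimal sketch of how a Jaccard-style overlap check could gate the `grounded` verdict per claim. The tokenization (lowercased whitespace split) and the choice to measure overlap relative to the claim's own tokens are assumptions; the server's actual implementation may differ.

```python
def jaccard(a: set, b: set) -> float:
    # Jaccard similarity: |intersection| / |union|.
    return len(a & b) / len(a | b) if a | b else 0.0

def claim_grounded(claim: str, context: str, threshold: float = 0.30) -> bool:
    # A claim counts as grounded when its token overlap with the
    # context meets the threshold; a lower threshold is more
    # permissive, a higher one stricter, matching the schema note.
    claim_tokens = set(claim.lower().split())
    ctx_tokens = set(context.lower().split())
    if not claim_tokens:
        return False
    # Overlap relative to the claim's tokens is one plausible variant;
    # plain Jaccard over both sets (see jaccard above) is another.
    overlap = len(claim_tokens & ctx_tokens) / len(claim_tokens)
    return overlap >= threshold
```

With the default threshold of 0.30, a claim restating the context passes, while a claim with no shared tokens does not.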
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries the full burden. It discloses that the tool is sub-second, local, and needs no API key; that it returns per-claim verdicts plus an overall verdict; and it notes a quirk of the question parameter (stored but unused in v1.0). It lacks detail on error handling and edge cases but provides substantial behavioral insight.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each earning its place: first defines action and output, second gives usage context, third adds performance and privacy traits. No fluff, no repetition.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema available, the description explains the return format (per-claim verdicts plus an overall verdict). It covers usage, parameters, behavior, and performance. For a verification tool with simple inputs and outputs, this is fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds meaningful context beyond the schema: it explains the threshold parameter (Jaccard overlap, default 0.30, effect of lower/higher values) and the question parameter's current behavior (stored but unused). This fully compensates for any schema gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states a clear verb ('Check') and resource ('every claim in answer has support in context'). It explicitly names the return values (per-claim grounded/ungrounded plus an overall verdict) and distinguishes itself from sibling tools such as find_swallowed_exceptions or review_transcript by focusing on factual grounding.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises using the tool 'inline during an agent conversation to flag hallucinated responses before they become user-facing facts', a clear when-to-use scenario. It does not explicitly list when not to use it or what alternatives exist, but the context is sufficient for an agent to understand its role.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/temurkhan13/openclaw-output-vetter-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.