xai-toolkit

record_feedback

Collect expert evaluations on AI-generated explanations to assess quality and improve narrative clarity for specific audiences and business contexts.

Instructions

Record expert feedback on a toolkit narrative.

Call this when an expert evaluates the quality of an explanation.
The narrative is hashed to create an auditable link between the
feedback and the exact output that was evaluated.

Valid ratings: excellent, useful, too_technical, too_vague, missing_context

Args:
    model_id: Which model produced the output being evaluated.
    tool_name: Which tool produced it (e.g., "explain_prediction").
    narrative: The exact narrative text being evaluated (will be hashed).
    rating: Expert's assessment (excellent/useful/too_technical/too_vague/missing_context).
    audience_role: Evaluator's role (e.g., "reliability_engineer").
    business_line: Business context (e.g., "lubricants").
    expert_comment: Optional free-text elaboration.
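The hashing step described above can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: the description does not say which hash algorithm is used or how records are stored, so SHA-256, the helper names `narrative_fingerprint` and `build_feedback_record`, the `narrative_hash` key, and all example values are assumptions.

```python
import hashlib
from typing import Optional


def narrative_fingerprint(narrative: str) -> str:
    """Hash the exact narrative text. SHA-256 is an assumption, not confirmed by the source."""
    return hashlib.sha256(narrative.encode("utf-8")).hexdigest()


def build_feedback_record(
    model_id: str,
    tool_name: str,
    narrative: str,
    rating: str,
    audience_role: str,
    business_line: str,
    expert_comment: Optional[str] = None,
) -> dict:
    """Assemble a feedback record that links the rating to a hash of the narrative.

    The record layout here is hypothetical; only the parameter names come from
    the tool description.
    """
    return {
        "model_id": model_id,
        "tool_name": tool_name,
        "narrative_hash": narrative_fingerprint(narrative),  # hypothetical key name
        "rating": rating,
        "audience_role": audience_role,
        "business_line": business_line,
        "expert_comment": expert_comment,
    }


# Example values are made up for illustration.
record = build_feedback_record(
    model_id="example_model",
    tool_name="explain_prediction",
    narrative="Vibration level was the main driver of the predicted failure risk.",
    rating="useful",
    audience_role="reliability_engineer",
    business_line="lubricants",
)
```

Because the hash is computed over the exact text, even a one-character change to the narrative yields a different fingerprint, which is what makes the link between feedback and output auditable.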

Input Schema

Name	Required	Description	Default
`model_id`	Yes
`tool_name`	Yes
`narrative`	Yes
`rating`	Yes
`audience_role`	Yes
`business_line`	Yes
`expert_comment`	No
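A caller may want to check arguments against the table above before invoking the tool. The sketch below is one possible client-side check, assuming all values are plain strings (the schema shown here does not state types); the function name `validate_feedback_args` is hypothetical, and the server may apply different or additional rules.

```python
REQUIRED_FIELDS = (
    "model_id",
    "tool_name",
    "narrative",
    "rating",
    "audience_role",
    "business_line",
)
OPTIONAL_FIELDS = ("expert_comment",)
VALID_RATINGS = {"excellent", "useful", "too_technical", "too_vague", "missing_context"}


def validate_feedback_args(args: dict) -> list:
    """Return a list of human-readable problems; an empty list means the args look valid."""
    problems = []
    # Every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not args.get(name):
            problems.append(f"missing required field: {name}")
    # Flag anything that is neither required nor optional.
    unknown = set(args) - set(REQUIRED_FIELDS) - set(OPTIONAL_FIELDS)
    for name in sorted(unknown):
        problems.append(f"unknown field: {name}")
    # The rating must be one of the values listed in the description.
    rating = args.get("rating")
    if rating and rating not in VALID_RATINGS:
        problems.append(f"invalid rating: {rating}")
    return problems


# Example values are made up for illustration.
example_args = {
    "model_id": "example_model",
    "tool_name": "explain_prediction",
    "narrative": "Vibration level was the main driver of the predicted failure risk.",
    "rating": "too_technical",
    "audience_role": "reliability_engineer",
    "business_line": "lubricants",
}
problems = validate_feedback_args(example_args)
```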

Tool Definition Quality

Grade A: 4.4/5.0

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses that narratives are hashed for auditability (a behavioral trait) and lists valid rating values, but doesn't mention whether this is a write operation (implied by 'Record'), permission requirements, rate limits, or what happens after recording. It adds some context but leaves significant behavioral aspects unspecified.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with the core purpose in the first sentence. Each subsequent sentence adds necessary information about hashing, valid ratings, and parameter explanations. The parameter section could be slightly more structured (e.g., bullet points), but it remains clear and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters with no schema descriptions and no annotations, the description does an excellent job explaining parameter semantics and usage context. However, it lacks information about what happens after recording (no output schema) and doesn't address potential error conditions or system behavior. For a tool with this complexity, it's nearly complete but has minor gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description fully compensates by providing detailed parameter explanations. It explains what each parameter represents (e.g., 'model_id: Which model produced the output'), lists valid rating values, specifies which parameters are optional, and clarifies that narrative text will be hashed. This adds substantial meaning beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Record expert feedback') on a specific resource ('toolkit narrative'), distinguishing it from sibling tools like 'explain_prediction' or 'list_models' which serve different purposes. It provides a concrete verb-resource pair with additional context about hashing for auditability.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'Call this when an expert evaluates the quality of an explanation.' This provides clear context for invocation, though it doesn't specify when NOT to use it or name alternatives among siblings. The guidance is direct and sufficient for proper usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/florenciakabas/xai-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.