Glama

verify_and_repair

Verifies LLM-generated code for hallucinated identifiers and suggests repairs using real APIs from the context.

Instructions

Verify LLM-generated code and suggest repairs for hallucinations.

Combines BIPT verification with rejection analysis to identify hallucinated identifiers and suggest which real APIs/symbols from the context should be used instead.
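The general idea of flagging identifiers that do not appear in the supplied context, and proposing similar real ones, can be illustrated with a small standard-library sketch. This is not BIPT and not entroly's implementation: the function name find_unknown_identifiers, the regex-based notion of a "known" symbol, and the difflib similarity cutoff are all assumptions made for illustration.

```python
import ast
import builtins
import difflib
import re


def find_unknown_identifiers(code: str, context: str) -> dict:
    """Map each name used in `code` but absent from `context` to close matches.

    Hypothetical helper: a toy stand-in for hallucination detection,
    not the verification the tool actually performs.
    """
    # Treat every identifier-like token in the context, plus Python
    # builtins, as a "real" symbol.
    known = set(re.findall(r"[A-Za-z_]\w*", context)) | set(dir(builtins))

    defined, used = set(), set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)          # names the snippet defines itself
        elif isinstance(node, ast.arg):
            defined.add(node.arg)           # function parameters
        elif isinstance(node, ast.alias):
            defined.add((node.asname or node.name).split(".")[0])  # imports
        elif isinstance(node, ast.Name):
            target = defined if isinstance(node.ctx, ast.Store) else used
            target.add(node.id)
        elif isinstance(node, ast.Attribute):
            used.add(node.attr)             # attribute / method names

    unknown = sorted(used - defined - known)
    candidates = sorted(known)
    # Suggest up to three similarly spelled known symbols per unknown name.
    return {
        name: difflib.get_close_matches(name, candidates, n=3, cutoff=0.6)
        for name in unknown
    }


context = "def load_config(path): ...\ndef save_config(path, data): ...\n"
code = "cfg = load_configuration('app.toml')\nsave_config('out.toml', cfg)\n"
print(find_unknown_identifiers(code, context))
```

Here load_configuration is not defined in the snippet or present in the context, so it is reported along with similarly spelled context symbols. A real verifier would need scope analysis and language awareness that this sketch deliberately omits.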

This is a single-shot verification and feedback tool: it does NOT call an LLM. For the full repair loop (FORGE), use the Python SDK: from entroly.verifiers import forge_loop

Args:
    prompt: The original user request that generated the code
    code: The LLM-generated code to verify
    context: The repository context provided to the LLM

Input Schema

Name | Required | Description | Default
code | Yes | - | -
prompt | Yes | - | -
context | No | - | -
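As a rough sketch of how a client might invoke this tool, the snippet below builds a JSON-RPC 2.0 request using the MCP "tools/call" method with the three arguments listed above. The helper name build_tool_call and the sample argument values are hypothetical; how the request is transported to the server (stdio, HTTP) depends on the client and is not shown.

```python
import json
from typing import Optional


def build_tool_call(prompt: str, code: str,
                    context: Optional[str] = None,
                    request_id: int = 1) -> dict:
    """Build an MCP tools/call request for verify_and_repair (hypothetical helper)."""
    # `code` and `prompt` are required by the input schema; `context` is optional.
    arguments = {"prompt": prompt, "code": code}
    if context is not None:
        arguments["context"] = context
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "verify_and_repair", "arguments": arguments},
    }


request = build_tool_call(
    prompt="Load the app configuration",                 # illustrative value
    code="cfg = load_configuration('app.toml')",         # illustrative value
    context="def load_config(path): ...",                # illustrative value
)
print(json.dumps(request, indent=2))
```

Omitting context is allowed by the schema, but the description suggests repair suggestions are drawn from it, so a call without context presumably has less to suggest from.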

Output Schema

Name | Required | Description | Default
result | Yes | - | -
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It transparently states the tool combines BIPT verification and rejection analysis, identifies hallucinated identifiers, and suggests real APIs—without calling an LLM or performing direct modifications. No side effects or destructive actions are implied, but permissions or limits are not discussed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with three clear paragraphs: purpose, technical details, and a note on alternatives. Every sentence adds value—no fluff, no repetition of schema fields. Front-loaded with the core verb and resource.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (verification with 3 parameters; an output schema exists but is not shown), the description adequately explains the verification process and output types (hallucinated identifiers, suggested real APIs). It does not detail the return format, but the output schema already covers that. Minor gap: no mention of error conditions or prerequisites.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage, so the description adds complete meaning for all three parameters: prompt as 'original user request', code as 'LLM-generated code', and context as 'repository context'. This fully compensates for the schema's lack of descriptions, making parameter roles unambiguous.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool verifies LLM-generated code and suggests repairs for hallucinations, specifying the verb ('verify and suggest') and resource ('LLM-generated code'). It distinguishes itself by noting it is single-shot and does not call an LLM, contrasting with sibling tools like 'eicv_suppress_hallucinations' or the full FORGE loop.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises when not to use this tool (for the full repair loop) and provides an alternative (the Python SDK). However, it does not differentiate the tool from other verification siblings such as 'verify_beliefs' or 'verify_response', which could cause confusion despite the specific focus on hallucinations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/juyterman1000/entroly'
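The same request can be sketched in Python with only the standard library. The URL is the one from the curl command; the helper names, the Accept header, and the assumption that the endpoint returns a JSON body are illustrative and not confirmed by this page.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/juyterman1000/entroly"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Prepare the GET request without sending it (hypothetical helper)."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_server_info() performs the network request; urllib raises urllib.error.HTTPError or urllib.error.URLError on failure, which callers may want to catch.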