Reality Graph Verification Tools

Server Details

Free read-only AI coding verification tools: verification-debt calculator, task-spec lint, search.

Status: Healthy
Last Tested:
Transport: Streamable HTTP
URL:

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Diagram: MCP client -> Glama -> MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.5/5 across 7 of 7 tools scored.

Server Coherence: A
Disambiguation: 5/5

Each tool has a distinct purpose: compute metrics, fetch documents, retrieve templates, lint tasks, search, and validate contracts. No overlap; even the two template tools are clearly differentiated by their target artifact.

Naming Consistency: 5/5

All tool names follow a consistent lowercase_with_underscores pattern, predominantly verb_noun structure (e.g., check_verification_debt, validate_task_contract). Exceptions like 'fetch' and 'search' are single verbs but fit the pattern and are not confusing.

Tool Count: 5/5

7 tools is an ideal size for this domain. Each tool addresses a specific need in the verification workflow without redundancy. The scope is narrow enough to avoid feature bloat but broad enough to be useful.

Completeness: 5/5

The set covers the full verification lifecycle: retrieving templates, linting task specs, validating contracts, computing verification debt, and accessing the knowledge base. There are no obvious gaps; the tools feel complete for their stated purpose.

Available Tools

7 tools
check_verification_debt: Check verification debt (grade A)
Read-only

Estimate a software team's verification debt from team parameters. Computes the four published metrics (generation-to-verification ratio, review depth, unverified-merge rate, two-week churn) and an annual cost estimate, with the full calculation path, labeled assumptions, thresholds, and sources (GitClear, Sonar, Faros, Veracode). Deterministic arithmetic from published models — no benchmark claims. Only team_size is required; every additional parameter refines the estimate. Set lang='de' for a German report.
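
As a rough illustration of the kind of deterministic arithmetic described above, the sketch below computes an unverified-merge rate from two of the tool's inputs. The formula (share of AI-assisted merges without recorded validation evidence) is an assumption inferred from the parameter names, not the server's published model, and the function name and sample numbers are hypothetical.

```python
def unverified_merge_rate(ai_merges_per_month: int,
                          ai_merges_with_evidence_per_month: int) -> float:
    """Assumed definition: share of AI-assisted merges lacking validation evidence."""
    if ai_merges_per_month <= 0:
        raise ValueError("ai_merges_per_month must be positive")
    if not 0 <= ai_merges_with_evidence_per_month <= ai_merges_per_month:
        raise ValueError("evidence count must be between 0 and total AI merges")
    unverified = ai_merges_per_month - ai_merges_with_evidence_per_month
    return unverified / ai_merges_per_month


# Hypothetical team: 40 AI-assisted merges per month, 30 with recorded evidence.
rate = unverified_merge_rate(40, 30)
print(f"{rate:.0%}")  # → 25%
```

Both inputs are needed for the rate to be defined, which matches the listing's note that each of them "enables the unverified-merge rate".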

Parameters

Name | Required | Description
lang | No | Report language (default: en)
team_size | Yes | Number of developers on the team (required)
prs_per_month | No | Total merged PRs per month (default: derived from team size)
hourly_rate_eur | No | Loaded cost per engineer hour in EUR (default: 75, assumption)
ai_share_percent | No | Share of merges that are AI-assisted, in percent (default: 60, assumption)
ai_merges_per_month | No | AI-assisted merges per month (enables the unverified-merge rate)
merged_loc_per_week | No | Merged changed lines of code per week (enables the GVR and review-depth metrics)
two_week_churn_percent | No | Share of new lines revised or reverted within 14 days, in percent (default: published GitClear trend delta as assumption)
reviewer_hours_per_week | No | Reviewer hours actually spent per week (enables the GVR metric)
hours_per_reworked_change | No | Average hours per reworked change (default: 6, assumption)
incident_allowance_eur_per_year | No | Annual incident allowance in EUR (default: 20000, widest error bar)
ai_merges_with_evidence_per_month | No | AI-assisted merges per month with recorded validation evidence (enables the unverified-merge rate)
review_reconstruction_hours_per_pr | No | Average reviewer hours spent reconstructing intent per AI-assisted PR (default: 0.5, assumption)
substantive_review_comments_per_week | No | Substantive review comments per week, excluding bots and nitpicks (enables the review-depth metric)
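
A call to this tool could be assembled roughly as follows. The envelope follows the Model Context Protocol's JSON-RPC `tools/call` shape; the helper name, the request id, the minimum-size check, and the sample values are illustrative assumptions, and the server URL is left out because the listing does not show it.

```python
import json


def build_debt_request(team_size: int, request_id: int = 1, **optional) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` message for check_verification_debt."""
    if team_size < 1:
        raise ValueError("team_size is required and must be at least 1")
    # Drop unset optional parameters so the server can apply its own defaults.
    arguments = {"team_size": team_size}
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "check_verification_debt", "arguments": arguments},
    }


# Sample values are made up for illustration.
request = build_debt_request(8, prs_per_month=120, ai_share_percent=60, lang="de")
print(json.dumps(request, indent=2))
```

Leaving optional parameters out, rather than sending guesses, keeps the labeled default assumptions visible in the report.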
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, and the description adds significant behavioral context: 'Deterministic arithmetic from published models — no benchmark claims,' and mentions the full calculation path, labeled assumptions, thresholds, and sources. This goes well beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured. It starts with the main action, lists outputs, clarifies determinism, and ends with usage guidance. Every sentence adds value, with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

There is no output schema, but the description explains what the output contains (the metrics and a cost estimate) and how it is presented (full calculation path, labeled assumptions). It does not specify the exact format (e.g., JSON vs. textual report), though the mention of a 'report' and a language option implies a human-readable format, which may be sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds overarching context ('every additional parameter refines the estimate') and the lang parameter hint, but no further semantic enrichment for individual parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Estimate a software team's verification debt from team parameters.' It lists the four metrics computed and the annual cost estimate, distinguishing it from siblings like get_verification_report_template, which returns a template rather than performing a computation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains that only team_size is required and additional parameters refine the estimate, guiding usage. It also mentions setting lang='de' for a German report. However, it does not explicitly contrast with alternatives or state when not to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fetch: Fetch a knowledge base document (grade A)
Read-only

Fetch a document from the Reality Graph knowledge base by id (as returned by search, e.g. '/verification-debt') or by full realitygraph.dev URL. Returns the document's summary, definitions, key facts, FAQ, and sources as text, plus the canonical URL.

Parameters

Name | Required | Description
id | Yes | Document id from search results, or a realitygraph.dev URL

Output Schema

Name | Required
id | Yes
url | Yes
text | Yes
title | Yes
metadata | No
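
A client could sanity-check a fetch result against the output schema above before using it. This is a minimal sketch: the helper name is hypothetical, and the sample result uses placeholder values rather than real server output.

```python
REQUIRED_FIELDS = ("id", "url", "text", "title")  # marked "Yes" in the output schema


def check_fetch_result(result: dict) -> list[str]:
    """Return the required output fields that are missing from a fetch result."""
    return [field for field in REQUIRED_FIELDS if field not in result]


# Hypothetical result; the values are placeholders, not real server output.
sample = {
    "id": "/verification-debt",
    "url": "https://realitygraph.dev/verification-debt",
    "title": "Verification debt",
    "text": "...",
}
missing = check_fetch_result(sample)
print(missing)  # → []
```

The optional metadata field is deliberately not checked, since the schema does not require it.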
Behavior: 4/5

Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds value by detailing the returned content (summary, definitions, key facts, FAQ, sources, canonical URL).

Conciseness: 5/5

Two sentences, no wasted words, front-loaded with verb and resource.

Completeness: 5/5

An output schema exists for the return values, and the description explains what is returned. Complete for a retrieval tool with good annotations and a single parameter.

Parameters: 4/5

Schema description coverage is 100% for the single parameter. The description adds an example of the id format ('/verification-debt') and clarifies that a full URL is also accepted, adding meaning beyond the schema.

Purpose: 5/5

Clearly states that the tool fetches a document from the Reality Graph knowledge base by id or by full URL, with an example id format, distinguishing it from search and other siblings.

Usage Guidelines: 3/5

Implies use after search (the id is 'as returned by search') but does not explicitly state when not to use the tool or name alternatives.

get_task_contract_template: Get the verifiable task contract template (grade A)
Read-only

Returns Reality Graph's free fill-in template (v0) for a verifiable task contract: goal, non-goals, boundaries (may change / must not change / forbidden), 3-7 yes/no acceptance criteria, validation plan, expected evidence, assumptions, open questions — with a filled example and fill-in guidance. Write the contract before an AI agent runs; verify the result against it after. format='json' returns a machine-fillable JSON structure; default is a compact markdown skeleton. Set lang='de' for German. Static content, nothing stored.

Parameters

Name | Required | Description
lang | No | Language (default: en)
format | No | Template format (default: markdown)
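
To make the contract's building blocks concrete, here is a hedged sketch of a filled contract and a shape check on it. The key names, the example task, and the check itself are assumptions for illustration; the actual JSON structure returned with format='json' is not shown in this listing. Only the 3-7 range for acceptance criteria comes from the description above.

```python
def check_contract_shape(contract: dict) -> list[str]:
    """Return a list of shape problems; an empty list means this sketch's checks pass."""
    problems = []
    for key in ("goal", "non_goals", "boundaries", "acceptance_criteria",
                "validation_plan", "expected_evidence"):
        if not contract.get(key):
            problems.append(f"missing or empty: {key}")
    criteria = contract.get("acceptance_criteria") or []
    if criteria and not 3 <= len(criteria) <= 7:
        problems.append(f"expected 3-7 acceptance criteria, got {len(criteria)}")
    return problems


# Hypothetical filled contract; all key names are assumed, not the server's schema.
contract = {
    "goal": "Add retry with backoff to the HTTP client",
    "non_goals": ["Changing the public API"],
    "boundaries": {
        "may_change": ["src/http/client.py"],
        "must_not_change": ["src/http/api.py"],
        "forbidden": ["adding new dependencies"],
    },
    "acceptance_criteria": [
        "Does a 503 response trigger at most 3 retries?",
        "Is the delay between retries at least doubled each time?",
        "Do all existing tests still pass?",
    ],
    "validation_plan": ["pytest tests/http"],
    "expected_evidence": ["pytest output"],
    "assumptions": [],
    "open_questions": [],
}
print(check_contract_shape(contract))  # → []
```

Phrasing each criterion as a yes/no question keeps it decidable after the agent has run.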
Behavior: 5/5

Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds that the content is static and nothing is stored, which is consistent with the annotations and gives extra context about the tool's non-destructive, read-only behavior.

Conciseness: 5/5

The description is concise and contains no redundant information. It front-loads the main purpose, and each sentence contributes useful information.

Completeness: 5/5

There is no output schema, but the description details the template contents (goal, non-goals, boundaries, etc.) and mentions the version (v0). It is sufficiently complete for a template retrieval tool with clear annotations.

Parameters: 4/5

Schema coverage is 100%, with parameter descriptions for lang and format. The description adds value by explaining the effect of each parameter (e.g., 'format='json' returns a machine-fillable JSON structure; default is a compact markdown skeleton').

Purpose: 5/5

The description clearly states that the tool returns a fill-in template for a verifiable task contract, lists its contents (goal, non-goals, etc.), and distinguishes it from sibling tools like validate_task_contract. It uses a specific verb and resource.

Usage Guidelines: 4/5

Provides clear usage guidance: 'Write the contract before an AI agent runs; verify the result against it after.' It explains the format and language options, though it does not explicitly name alternative tools.

get_verification_report_template: Get the verification report template (grade A)
Read-only

Returns the free fill-in template (v0) for a verification report — the artifact you write right after an AI-assisted run: task recap, files changed AND files confirmed untouched, validation results per acceptance criterion (not authored by the generating model), what was skipped, limitations, and the explicit decision. format='json' for a machine-fillable structure; default is a compact markdown file. Static content, nothing stored. lang='de' for German.

Parameters

Name | Required | Description
lang | No | Language (default: en)
format | No | Template format (default: markdown)
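
The report's parts can be pictured as a small record type. The sketch below is an assumption-laden illustration: the field names, the example values, and the two consistency rules are invented here and are not the template's actual structure.

```python
from dataclasses import dataclass, field


@dataclass
class VerificationReport:
    """Hypothetical in-memory shape for a filled verification report."""
    task_recap: str
    files_changed: list[str]
    files_confirmed_untouched: list[str]
    validation_results: dict[str, bool]  # acceptance criterion -> passed?
    skipped: list[str] = field(default_factory=list)
    limitations: list[str] = field(default_factory=list)
    decision: str = ""

    def problems(self) -> list[str]:
        """Basic consistency checks; these rules are the sketch's own."""
        found = []
        overlap = set(self.files_changed) & set(self.files_confirmed_untouched)
        if overlap:
            found.append(f"listed as both changed and untouched: {sorted(overlap)}")
        if not self.decision:
            found.append("no explicit decision recorded")
        return found


report = VerificationReport(
    task_recap="Add retry with backoff to the HTTP client",
    files_changed=["src/http/client.py"],
    files_confirmed_untouched=["src/http/api.py"],
    validation_results={"503 triggers at most 3 retries": True},
    decision="accept",
)
print(report.problems())  # → []
```

Recording untouched files alongside changed ones lets a reviewer confirm that the contract's "must not change" boundary held.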
Behavior: 4/5

Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds 'Static content, nothing stored', reinforcing safe behavior, and explains the format options. No contradictions.

Conciseness: 5/5

The description front-loads the purpose, then adds detail. Every sentence adds value: purpose, contents, format, language. No redundancy.

Completeness: 5/5

For a simple static template retrieval with two parameters and no output schema, the description fully explains what is returned, the usage context, and the parameter options. No gaps.

Parameters: 4/5

The schema provides enum values with basic descriptions. The description adds meaning with 'format='json' for a machine-fillable structure; default is a compact markdown file' and 'lang='de' for German'.

Purpose: 5/5

Clearly states 'Returns the free fill-in template (v0) for a verification report' with specific contents. Distinguishes it from the sibling tool get_task_contract_template, which returns the contract template.

Usage Guidelines: 3/5

Describes the usage context ('right after an AI-assisted run') and the template content, but does not explicitly state when not to use the tool or mention alternative tools like check_verification_debt.

lint_task_spec: Lint a task specification (grade A)
Read-only

Check whether a free-text work order for an AI coding agent is verifiable BEFORE handing it over. Heuristic, deterministic lint of the task's form against the four building blocks of a checkable task (goal, boundaries, acceptance criteria, validation plan) plus rule checks (vague adjectives without numbers, unnamed unhappy paths, missing file anchors). Returns a status table with evidence, the concrete questions that close each gap, and a fill-in skeleton. It checks form, not content — no LLM, nothing stored. Set lang='de' for a German report.

Parameters

Name | Required | Description
lang | No | Report language (default: en)
task | Yes | The work order / task text you intend to give an AI coding agent (English or German)
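
One of the rule checks named above, vague adjectives without numbers, could look something like the following heuristic. This is only a sketch of the idea: the word list, the sentence splitting, and the function name are assumptions, and the server's actual rules are not published in this listing.

```python
import re

# Illustrative word list; not the server's actual rule set.
VAGUE_WORDS = ("fast", "robust", "scalable", "clean", "user-friendly", "efficient")


def vague_sentences(task: str) -> list[str]:
    """Return sentences that use a vague adjective but contain no digit."""
    sentences = re.split(r"(?<=[.!?])\s+", task.strip())
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, VAGUE_WORDS)) + r")\b",
                         re.IGNORECASE)
    return [s for s in sentences if pattern.search(s) and not re.search(r"\d", s)]


task = "Make the export fast. The export must finish in under 2 seconds and stay efficient."
print(vague_sentences(task))  # → ['Make the export fast.']
```

Like the tool it imitates, the check looks only at form: the second sentence passes because it contains a number, not because the number is a sensible target.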
Behavior: 5/5

Annotations already indicate readOnlyHint=true, and the description adds value by disclosing that the tool is deterministic, uses no LLM, and stores nothing. This goes beyond the annotations and makes the tool's behavior and lack of side effects transparent.

Conciseness: 5/5

The description is a single paragraph of about 100 words, front-loading the core purpose. Each sentence adds essential information: the function, what it checks, what it returns, and key qualifiers (no LLM, nothing stored). No wasted words.

Completeness: 4/5

The tool has no output schema, but the description explicitly details the return value: 'a status table with evidence, the concrete questions that close each gap, and a fill-in skeleton.' This compensates well. The sibling tools are related but distinct, and the description leaves no major gaps for a lint operation.

Parameters: 4/5

The input schema has 100% description coverage, setting a baseline of 3. The description adds meaning by specifying that task is the work order for an AI agent and that setting lang='de' produces a German report. This provides helpful context beyond the schema descriptions.

Purpose: 5/5

The description clearly states the tool's purpose: 'Check whether a free-text work order for an AI coding agent is verifiable BEFORE handing it over.' It uses specific verbs ('Check') and resources ('task specification'), and distinguishes itself from sibling tools by emphasizing form-checking vs. content validation or debt checking.

Usage Guidelines: 4/5

The description explicitly says to use the tool 'BEFORE handing [the task] over' to an AI agent, providing clear context. It also notes what it checks (form, not content) and that it is deterministic with no LLM, implicitly excluding scenarios requiring content verification. However, it does not explicitly name alternatives or conditions under which to avoid this tool.

validate_task_contract: Validate a filled task contract (grade A)
Read-only

Deterministically validates a FILLED task contract (the JSON structure from get_task_contract_template): completeness of goal/non-goals/boundaries, decidability of each acceptance criterion (vague words, missing measurable markers), automated checks in the validation plan, expected evidence, and leftover placeholders. Returns a verdict (PASS / PASS WITH WARNINGS / FAIL), four dimension scores, and a concrete fix per finding. Validates form and completeness, not correctness. No LLM, nothing stored. lang='de' for German.

Parameters

Name | Required | Description
lang | No | Report language (default: en)
contract | Yes | The filled task contract as a JSON string (structure from get_task_contract_template, format='json')
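
Two of the ideas above, leftover placeholders and the three-way verdict, can be sketched as follows. The placeholder patterns, the mapping from finding counts to a verdict, and the contract keys are all assumptions made for illustration; the listing does not document how the server decides.

```python
import json
import re

PLACEHOLDER = re.compile(r"<[^<>]+>|\bTODO\b|\.\.\.")  # assumed placeholder styles


def leftover_placeholders(value, path="contract"):
    """Recursively collect paths of string fields that still look unfilled."""
    hits = []
    if isinstance(value, dict):
        for key, item in value.items():
            hits += leftover_placeholders(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            hits += leftover_placeholders(item, f"{path}[{index}]")
    elif isinstance(value, str) and PLACEHOLDER.search(value):
        hits.append(path)
    return hits


def verdict(errors: int, warnings: int) -> str:
    """Assumed mapping from finding counts to the three verdicts named above."""
    if errors:
        return "FAIL"
    return "PASS WITH WARNINGS" if warnings else "PASS"


# Hypothetical, partly filled contract with assumed key names.
contract = {"goal": "<describe the goal>", "non_goals": ["No API changes"]}
hits = leftover_placeholders(contract)
print(hits, verdict(errors=len(hits), warnings=0))  # → ['contract.goal'] FAIL

# The tool expects the contract as a JSON string, so serialize it before calling.
arguments = {"contract": json.dumps(contract), "lang": "en"}
```

Note that the contract parameter is a JSON string, not a nested object, so the serialization step at the end is needed in any client.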
Behavior: 5/5

Discloses behavior beyond the annotations: deterministic, no LLM, nothing stored. Also describes the return values (verdict, scores, fix). No contradictions with readOnlyHint=true.

Conciseness: 4/5

A single paragraph with all key information; it could be slightly more structured for quick scanning. Still concise, with no fluff.

Completeness: 5/5

Describes the return values despite the absence of an output schema and covers all aspects of the tool's behavior. No missing context.

Parameters: 4/5

Schema coverage is 100%, but the description adds meaning: the contract must follow the structure from get_task_contract_template with format='json', and lang='de' selects German.

Purpose: 5/5

Clearly states that it validates a filled task contract, specifies what it checks (goal/non-goals, acceptance criteria, etc.), and distinguishes it from siblings such as lint_task_spec, which lints free-text work orders rather than filled contracts.

Usage Guidelines: 4/5

Explicitly says it validates a FILLED contract and that it validates form and completeness, not correctness, but does not state when to use it instead of alternatives or which scenarios to avoid.
