xProof

Name: xProof
Author: jasonxkensei

by io.github.jasonxkensei

Server Details

Proof primitive for AI agents on MultiversX. Anchor file hashes on-chain as verifiable proofs.

Status: Healthy
Last Tested: 2026-05-25 16:30
Transport: Streamable HTTP
Repository: jasonxkensei/xProof
GitHub Stars: 1

Tool Definition Quality

A (4/5.0)

Tool Descriptions: A

Average 4.1/5 across 8 of 8 tools scored. Lowest: 3.4/5.

Server Coherence: A

Disambiguation: 4/5

Most tools have distinct purposes, such as certify_file for file certification and check_attestations for verifying agent credentials. However, audit_agent_session and certify_with_confidence could be confused, as both certify agent actions, though the latter adds a staged confidence feature.

Naming Consistency: 4/5

Tool names follow a consistent verb_noun pattern throughout, such as audit_agent_session and verify_proof, with clear and descriptive naming. Minor deviations include discover_services, which uses a verb_noun format but is less action-oriented, and investigate_proof, which is slightly longer but still fits the pattern.

Tool Count: 5/5

With 8 tools, the server is well-scoped for blockchain certification and audit services, covering key operations like certification, verification, discovery, and investigation. Each tool serves a specific function without redundancy, making the count appropriate for the domain.

Completeness: 5/5

The tool set provides comprehensive coverage for blockchain certification workflows, including creation (certify_file, certify_with_confidence), verification (verify_proof, check_attestations), retrieval (get_proof), investigation (investigate_proof), and discovery (discover_services). It supports full lifecycle management from initial audit to post-execution analysis, with no obvious gaps.

Available Tools

10 tools

audit_agent_session (grade A)

Certify an AI agent's work session on the MultiversX blockchain BEFORE executing any critical action (trade, deploy, data access). Returns a proof_id compliance certificate. The agent MUST call this tool before acting — certification failure blocks execution. Governance: set reversibility_class='irreversible' for actions that cannot be undone — a policy violation is flagged if confidence_level < 0.95 on irreversible decisions. Schema: https://xproof.app/.well-known/agent-audit-schema.json

Parameters

Name	Required	Description
`context`	No	Optional additional context (model version, environment, tool chain, etc.)
`agent_id`	Yes	Identifier of the agent making the decision
`decision`	Yes	Agent's decision about whether to proceed
`timestamp`	Yes	ISO 8601 timestamp of when the decision was made
`risk_level`	Yes	Assessed risk level of the action
`session_id`	Yes	Unique session identifier (UUID or timestamp-based)
`action_type`	Yes	Category of the action being certified
`inputs_hash`	Yes	SHA-256 of all inputs analyzed before making the decision
`risk_summary`	No	Optional brief risk analysis justifying the decision
`action_description`	Yes	Human-readable description of the specific action
`reversibility_class`	No	Governance: how reversible is this action? 'reversible' = can be undone at low cost, 'costly' = reversible but expensive (fees, slippage, delay), 'irreversible' = cannot be undone (on-chain settlement, data deletion, email sent). When 'irreversible', a confidence_level >= 0.95 is required to be policy-compliant.
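As a rough illustration of how a caller might assemble the arguments listed above, the sketch below builds an audit_agent_session payload. The helper name build_audit_payload, the canonical-JSON hashing of inputs, and the example values for decision, risk_level, and action_type are assumptions made for illustration; the permitted values are defined by the schema linked in the tool description, not by this sketch.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def build_audit_payload(agent_id, inputs, action_type, action_description,
                        decision, risk_level, reversibility_class=None):
    """Assemble arguments for audit_agent_session (hypothetical helper)."""
    # Assumption: inputs are hashed as canonical JSON (sorted keys, compact form)
    # so that the same inputs always yield the same SHA-256 digest.
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    payload = {
        "agent_id": agent_id,
        "session_id": str(uuid.uuid4()),                      # UUID-based session id
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
        "inputs_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "action_type": action_type,
        "action_description": action_description,
        "decision": decision,
        "risk_level": risk_level,
    }
    # reversibility_class is optional, so it is only sent when provided.
    if reversibility_class is not None:
        payload["reversibility_class"] = reversibility_class
    return payload


payload = build_audit_payload(
    agent_id="agent-001",
    inputs={"pair": "EGLD/USDC", "amount": 10},
    action_type="trade",                 # illustrative value, check the schema
    action_description="Swap 10 EGLD for USDC",
    decision="approved",                 # illustrative value, check the schema
    risk_level="medium",                 # illustrative value, check the schema
    reversibility_class="irreversible",
)
```

The payload would then be passed as the tool arguments by whatever MCP client is in use; how the call is transported is outside the scope of this sketch.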

Tool Definition Quality

A (3.8/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses return value (proof_id compliance certificate) and blocking behavior on failure, but omits mutation nature (writes to blockchain?), permission requirements, or reversibility that would be critical for a mandatory audit tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences that front-load purpose, specify return value, mandate usage, and provide schema reference. No tautology or redundant phrasing, though schema URL could be excluded from natural language description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given high complexity (10 params, blockchain domain, no output schema, no annotations), description covers basic operation and failure modes but lacks detail on certificate lifecycle, compliance standards, or integration depth with MultiversX chain that would aid agent decision-making.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage; description provides no parameter-specific guidance beyond the schema URL reference. Baseline 3 is appropriate as schema sufficiently documents the 10 parameters including enums and validation patterns.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verb 'Certify' with clear resource 'AI agent's work session on the MultiversX blockchain' and distinguishes from siblings by specifying session-level pre-action certification versus file-level (certify_file) or proof verification (verify_proof).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states temporal requirement 'BEFORE executing any critical action' with enumerated examples (trade, deploy, data access), and mandates 'MUST call this tool before acting' with consequence 'certification failure blocks execution'. Lacks explicit sibling differentiation though the pre-action timing is distinctive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

certify_file (grade A)

Create a blockchain certification for a file. Records the SHA-256 hash on MultiversX blockchain as immutable proof of existence and ownership. Cost: $0.05 per certification, paid in EGLD.

Parameters

Name	Required	Description
`filename`	Yes	Original filename with extension (max 255 chars)
`file_hash`	Yes	SHA-256 hash of the file (64 hex characters)
`author_name`	No	Name of the certifier (default: AI Agent, max 128 chars)
`webhook_url`	No	Optional HTTPS URL for on-chain confirmation callback
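The file_hash parameter has to be computed before the tool is called. A minimal sketch of that preparation step follows, using the limits stated in the table above (255-character filename, 64 hex characters, 128-character author name, HTTPS webhook). The helper names sha256_file and build_certify_file_args are hypothetical, and the client-side checks are a convenience rather than the server's own validation.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=65536):
    """Return the SHA-256 digest of a file as 64 lowercase hex characters."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_certify_file_args(path, author_name=None, webhook_url=None):
    """Build certify_file arguments (hypothetical helper, client-side checks only)."""
    filename = os.path.basename(path)
    if len(filename) > 255:
        raise ValueError("filename exceeds 255 characters")
    args = {"filename": filename, "file_hash": sha256_file(path)}
    if author_name is not None:
        if len(author_name) > 128:
            raise ValueError("author_name exceeds 128 characters")
        args["author_name"] = author_name
    if webhook_url is not None:
        if not webhook_url.startswith("https://"):
            raise ValueError("webhook_url must be an HTTPS URL")
        args["webhook_url"] = webhook_url
    return args


# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "report.txt")
    with open(demo_path, "wb") as fh:
        fh.write(b"hello")
    args = build_certify_file_args(demo_path, author_name="AI Agent")
```

Only the hash and filename leave the machine in this sketch; the file contents themselves are never part of the arguments.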

Tool Definition Quality

A (4.2/5.0)

Behavior: 5/5

No annotations provided, so description carries full burden. Excellently discloses critical behavioral traits: financial cost ($0.05/EGLD), immutability guarantee, specific blockchain (MultiversX), and mechanism (SHA-256 hash recording). This financial side effect is essential for an agent to know before invocation.

Conciseness: 5/5

Three information-dense clauses, optimally ordered: action (create certification), mechanism (SHA-256/MultiversX/immutable), and cost constraint ($0.05/EGLD). No redundant words. The cost disclosure at the end appropriately weights the financial implication.

Completeness: 4/5

Strong coverage for a mutation tool with no output schema: covers cost, blockchain specifics, and immutability. Minor gap: doesn't explicitly mention the async nature implied by webhook_url parameter or failure modes (e.g., insufficient EGLD balance), though cost disclosure partially covers economic failure modes.

Parameters: 3/5

Schema coverage is 100%, establishing baseline 3. The description mentions 'SHA-256 hash' which reinforces the file_hash parameter description, but doesn't add syntax details, validation rules, or semantic context beyond what the schema already provides for filename, author_name, or webhook_url.

Purpose: 5/5

Excellent specificity: 'Create a blockchain certification' clearly identifies the action and resource. Implicitly distinguishes from verification-focused siblings (verify_proof, check_attestations, investigate_proof) by emphasizing 'Create' and 'Records the hash' which signals this is a write operation, not a read/check operation.

Usage Guidelines: 3/5

Provides implied usage through clear action naming (create vs verify), but lacks explicit guidance on when to choose this over siblings like verify_proof or when certification is unnecessary. No mention of prerequisites (e.g., file must exist, hash must be pre-calculated).

certify_with_confidence (grade A)

Create a staged blockchain certification with a confidence score. Use this when your decision builds progressively — certify at 60% (initial assessment), 80% (pre-commitment), and 100% (final decision). Each stage shares the same decision_id, creating an on-chain audit trail of the decision process. Governance: set reversibility_class='irreversible' for actions that cannot be undone — xproof will flag a policy violation if confidence_level < 0.95. Cost: $0.05 per certification.

Parameters

Name	Required	Description
`who`	No	Agent identity (wallet address, name, or agent ID)
`why`	No	Reason or instruction hash driving this decision
`filename`	Yes	Original filename with extension (e.g. decision.json, max 255 chars)
`file_hash`	Yes	SHA-256 hash of the decision or output file (64 hex characters)
`author_name`	No	Name of the certifying agent (default: AI Agent, max 128 chars)
`decision_id`	Yes	Shared UUID linking all confidence stages for the same decision. Generate once and reuse across all stages.
`threshold_stage`	Yes	Named stage of the decision: initial (first assessment), partial (gathering info), pre-commitment (almost certain), final (committed).
`confidence_level`	Yes	Confidence score from 0.0 to 1.0. Typical values: 0.6 (initial), 0.8 (pre-commitment), 1.0 (final).
`reversibility_class`	No	Governance: how reversible is this action? 'reversible' = can be undone, 'costly' = reversible but expensive, 'irreversible' = cannot be undone (on-chain settlement, data deletion, sent email). When 'irreversible', confidence_level must be >= 0.95 or xproof flags a policy violation.
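The staging and governance rules described above can be sketched as follows: one decision_id is generated once and reused for each stage, and an 'irreversible' action with confidence_level below 0.95 is treated as a policy violation. The function names, the per-stage placeholder hashes, and the local violates_policy check are assumptions for illustration; the actual flagging is done by xproof.

```python
import hashlib
import uuid

# Stage names as listed for threshold_stage.
STAGES = ("initial", "partial", "pre-commitment", "final")


def violates_policy(confidence_level, reversibility_class):
    """Local mirror of the stated rule: irreversible actions need >= 0.95."""
    return reversibility_class == "irreversible" and confidence_level < 0.95


def stage_args(decision_id, file_hash, stage, confidence_level,
               reversibility_class=None, filename="decision.json"):
    """Build certify_with_confidence arguments for one stage (hypothetical helper)."""
    if stage not in STAGES:
        raise ValueError(f"unknown threshold_stage: {stage}")
    if not 0.0 <= confidence_level <= 1.0:
        raise ValueError("confidence_level must be between 0.0 and 1.0")
    args = {
        "filename": filename,
        "file_hash": file_hash,
        "decision_id": decision_id,
        "threshold_stage": stage,
        "confidence_level": confidence_level,
    }
    if reversibility_class is not None:
        args["reversibility_class"] = reversibility_class
    return args


# Generate the decision_id once and reuse it for every stage.
decision_id = str(uuid.uuid4())
plan = [("initial", 0.6), ("pre-commitment", 0.8), ("final", 1.0)]
calls = [
    stage_args(
        decision_id,
        # Placeholder hash; in practice this is the hash of the stage's decision file.
        hashlib.sha256(stage.encode("utf-8")).hexdigest(),
        stage,
        confidence,
        reversibility_class="irreversible",
    )
    for stage, confidence in plan
]
flags = [violates_policy(c["confidence_level"], c["reversibility_class"]) for c in calls]
```

In this example the first two stages would be flagged and only the final stage is policy-compliant, which suggests setting reversibility_class deliberately per stage rather than by default.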

Tool Definition Quality

A (4.4/5.0)

Behavior: 4/5

With no annotations provided, the description carries full burden and does well by disclosing key behavioral traits: it creates an on-chain audit trail, mentions cost ($0.05 per certification), and explains the staging process. However, it doesn't mention error conditions, rate limits, or authentication requirements.

Conciseness: 5/5

The description is efficiently structured with three sentences that each add value: purpose statement, usage guidance with specific percentages, and behavioral details about audit trail and cost. No wasted words or redundancy.

Completeness: 4/5

For a complex certification tool with 8 parameters and no annotations or output schema, the description does well by explaining the staging concept, typical usage patterns, and cost. However, it doesn't describe the return value or what happens after certification, which would be helpful given no output schema exists.

Parameters: 3/5

Schema description coverage is 100%, so the baseline is 3. The description adds some context about typical confidence values (0.6, 0.8, 1.0) and the decision_id reuse across stages, but doesn't provide additional parameter semantics beyond what's already well-documented in the schema.

Purpose: 5/5

The description clearly states the tool creates a staged blockchain certification with a confidence score, specifying the verb 'create' and resource 'staged blockchain certification'. It distinguishes from siblings like certify_file by emphasizing the progressive staging aspect with confidence levels.

Usage Guidelines: 5/5

The description explicitly states 'Use this when your decision builds progressively' and provides specific usage scenarios with confidence percentages (60%, 80%, 100%). It distinguishes from alternatives by focusing on staged certification rather than one-time certification.

check_attestations (grade A)

Check domain-specific attestations for an AI agent wallet on xproof. Returns active attestations issued by third-party certifying bodies (healthcare, finance, legal, security, research). Each active attestation adds +50 to the agent's trust score (max +150 from 3 attestations). Use this to verify an agent's credentials before delegating a sensitive task.

Parameters

Name	Required	Description	Default
`wallet`	Yes	MultiversX wallet address (erd1...) of the agent to check
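The trust-score arithmetic in the description (+50 per active attestation, capped at +150) can be written out as a small function. This is a sketch of the stated rule only; the function name and the idea of passing a plain count are assumptions, and the response format of the tool is not documented here.

```python
def attestation_bonus(active_attestations, per_attestation=50, cap=150):
    """Trust-score bonus from active attestations: +50 each, at most +150."""
    if active_attestations < 0:
        raise ValueError("active_attestations cannot be negative")
    return min(active_attestations * per_attestation, cap)


# Bonus for zero to five active attestations.
bonuses = [attestation_bonus(n) for n in range(6)]
```

Under this rule a fourth or fifth attestation adds nothing to the score, so an agent comparing candidates may want to look at which domains are attested rather than only at the bonus.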

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

With no annotations provided, the description carries full disclosure burden and delivers exceptionally: it reveals the trust scoring mechanics (+50 per attestation, max +150), filters (returns only 'active' attestations), data sources (third-party certifying bodies), and specific domain categories covered.

Conciseness: 5/5

Four tightly crafted sentences with zero waste. Front-loaded with purpose ('Check domain-specific attestations...'), followed by return value specification, behavioral mechanics (scoring), and usage guidance. Every sentence earns its place with no redundant or generic filler.

Completeness: 5/5

For a single-parameter tool lacking output schema and annotations, the description is comprehensive. It compensates for missing output specification by detailing what gets returned (active attestations from specific domains) and the practical impact (trust score calculation), providing sufficient context for invocation.

Parameters: 4/5

Schema coverage is 100% establishing a baseline of 3. The description adds semantic value by specifying this is for an 'AI agent wallet' (not just any wallet) and contextualizing it 'on xproof', providing meaningful scope beyond the schema's technical address format description.

Purpose: 5/5

The description states a specific verb ('Check') and resource ('domain-specific attestations for an AI agent wallet on xproof'), clearly distinguishing it from proof-verification siblings through terminology emphasizing third-party certifying bodies and specific domains (healthcare, finance, legal, security, research).

Usage Guidelines: 4/5

Provides clear when-to-use guidance ('before delegating a sensitive task') and explains the verification purpose, but lacks explicit alternatives or when-not-to-use clauses that would guide selection against similar tools like verify_proof or audit_agent_session.

discover_services (grade A)

Discover available xproof certification services, pricing, and capabilities. No authentication required.

Parameters

No parameters

Tool Definition Quality

A (4/5.0)

Behavior: 3/5

No annotations provided, so description carries full disclosure burden. It successfully communicates the authentication requirements (none needed), but omits other behavioral traits like rate limits, caching behavior, or pagination for the returned service list.

Conciseness: 5/5

Two sentences, zero waste. First sentence establishes purpose and scope, second provides critical auth context. Every sentence earns its place with no filler.

Completeness: 4/5

Absence of output schema requires description to hint at return values, which it does by listing 'services, pricing, and capabilities'. Does not explain response structure, but sufficiently complete for a simple discovery tool with no parameters.

Parameters: 4/5

Input schema contains zero parameters. Per calibration rules, 0 params warrants baseline score of 4. Description appropriately does not invent parameters where none exist.

Purpose: 5/5

Specific verb 'Discover' paired with resource 'xproof certification services' and supplements with 'pricing, and capabilities'. Clearly distinguishes from action-oriented siblings (certify_file, verify_proof, etc.) by indicating this is a metadata/discovery operation rather than a proof operation.

Usage Guidelines: 3/5

Provides one concrete guideline ('No authentication required'), indicating prerequisites. However, lacks explicit when-to-use guidance relative to siblings (e.g., does not state 'use this before certify_file to select a service'). Usage relative to alternatives is only implied.

get_calibration (grade A)

Query an agent's calibration quality over time: mean confidence gap, variance, bias label (overconfident / underconfident / calibrated), and per-decision time series. Fully public — use this to evaluate another agent before trusting it. agentId accepts a MultiversX wallet address (erd1...) or internal user id.

Parameters

Name	Required	Description	Default
`n`	No	Number of recent outcomes to include (default 50, max 200)
`agent_id`	Yes	Agent wallet address (erd1...) or internal user id
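To make the reported quantities concrete, the sketch below computes a mean confidence gap, its variance, and a bias label from a list of (anchored confidence, actual outcome) pairs. It uses the gap definition given in the submit_outcome description (anchored minus actual). The function name, the use of population variance, and the 0.05 tolerance for the 'calibrated' label are assumptions; the server's own thresholds are not stated on this page.

```python
from statistics import mean, pvariance


def summarize_calibration(outcomes, n=50, tolerance=0.05):
    """Summarize the last n (anchored, actual) pairs (hypothetical helper).

    A positive gap means confidence exceeded the actual outcome.
    """
    if not outcomes:
        raise ValueError("no outcomes to summarize")
    n = max(1, min(n, 200))          # table above: default 50, max 200
    recent = outcomes[-n:]           # assumes most recent outcomes come last
    gaps = [anchored - actual for anchored, actual in recent]
    mean_gap = mean(gaps)
    if mean_gap > tolerance:
        label = "overconfident"
    elif mean_gap < -tolerance:
        label = "underconfident"
    else:
        label = "calibrated"
    return {
        "mean_gap": mean_gap,
        "variance": pvariance(gaps),  # assumption: population variance
        "bias": label,
        "series": gaps,               # per-decision time series of gaps
    }


summary = summarize_calibration([(0.9, 0.5), (0.8, 0.6), (0.7, 0.7)])
```

With these three sample outcomes the mean gap is about 0.2, so the sketch labels the agent overconfident under the assumed tolerance.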

Tool Definition Quality

A (4/5.0)

Behavior: 3/5

No annotations provided, so description carries full burden. States 'Fully public' implying no auth needed, and mentions parameter format. However, lacks details on read-only nature, rate limits, or data freshness.

Conciseness: 5/5

Two sentences cover purpose, outputs, and usage. Efficient with no wasted words.

Completeness: 4/5

Explains outputs (mean confidence gap, etc.) despite no output schema. Parameters fully described. Missing error handling or response format, but adequate for a query tool.

Parameters: 3/5

Schema coverage is 100% with descriptions for both parameters. Description adds no new meaning beyond schema; just restates that agentId accepts wallet or internal id. Baseline 3.

Purpose: 5/5

Description clearly states the tool queries calibration quality over time, listing specific outputs (confidence gap, variance, bias label, time series). This distinguishes it from siblings like audit_agent_session or certify_file.

Usage Guidelines: 4/5

Explicitly says 'use this to evaluate another agent before trusting it', giving clear context. Lacks explicit when-not or alternative tools but is still helpful.

get_proof (grade B)

Retrieve a proof in structured JSON or Markdown format. Use JSON for machine processing, Markdown for LLM consumption.

Parameters

Name	Required	Description	Default
`format`	No	Output format: json or md	json
`proof_id`	Yes	UUID of the certification

Tool Definition Quality

B (3.4/5.0)

Behavior: 3/5

With no annotations provided, the description carries the full disclosure burden. It explains the format options (behavioral trait) but fails to confirm idempotency, safety (read-only nature), error behaviors, or content structure of the proof.

Conciseness: 5/5

Extremely efficient two-sentence structure. Main action stated immediately, followed by parameter guidance. No redundancy or wasted words; every phrase earns its place.

Completeness: 3/5

Adequate but incomplete given the tool ecosystem. With no output schema, the description mentions formats but not proof content structure. Critically, it fails to clarify this tool's role relative to certify_file and other proof-related siblings in the attestation workflow.

Parameters: 4/5

Schema coverage is 100%, establishing a baseline of 3. The description adds significant value by explaining the semantic rationale for format selection (machine processing vs LLM consumption), which aids agent decision-making beyond the schema's basic type definitions.

Purpose: 4/5

Clear verb ('Retrieve') and resource ('proof'), and specifies output formats. However, it does not explicitly differentiate from similar sibling tools like verify_proof or investigate_proof, which could cause selection confusion.

Usage Guidelines: 2/5

Provides excellent guidance on format parameter selection (JSON for machines, Markdown for LLMs), but completely omits when to use this retrieval tool versus siblings like verify_proof or investigate_proof.

investigate_proof (grade A)

Reconstruct the full 4W audit trail for a contested agent action. Returns WHO (agent identity + SIGIL), WHAT (SHA-256 hash on-chain), WHEN (MultiversX block timestamp), WHY (decision chain anchored before acting). Includes verification summary with intent_preceded_execution flag, chronological timeline of WHY/WHAT proofs, and session heartbeat anchor. Requires x402 payment ($0.05 USDC on Base via X-PAYMENT header) or API key authentication. Without payment, returns payment requirements with USDC address and amount.

Parameters

Name	Required	Description	Default
`wallet`	Yes	Agent wallet address (erd1...) that owns the proof
`proof_id`	Yes	UUID of any proof in the action pair — WHY (reasoning), WHAT (action), or heartbeat session proof
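The intent_preceded_execution flag and the chronological timeline mentioned in the description can be illustrated with a small local check. This is a sketch under stated assumptions: the record fields kind and block_timestamp are hypothetical, timestamps are taken to be Unix seconds, and the rule "the latest WHY proof is strictly earlier than the earliest WHAT proof" is one plausible reading, not the server's documented logic.

```python
def build_timeline(proofs):
    """Order proofs chronologically by their block timestamp."""
    return sorted(proofs, key=lambda p: p["block_timestamp"])


def intent_preceded_execution(proofs):
    """True when every WHY proof was anchored before any WHAT proof."""
    why = [p["block_timestamp"] for p in proofs if p["kind"] == "WHY"]
    what = [p["block_timestamp"] for p in proofs if p["kind"] == "WHAT"]
    if not why or not what:
        # Without both halves of the action pair the claim cannot be supported.
        return False
    return max(why) < min(what)


# Hypothetical records for one action pair plus a session heartbeat.
proofs = [
    {"kind": "WHAT", "block_timestamp": 1_700_000_120},
    {"kind": "WHY", "block_timestamp": 1_700_000_060},
    {"kind": "HEARTBEAT", "block_timestamp": 1_700_000_000},
]
timeline = [p["kind"] for p in build_timeline(proofs)]
ok = intent_preceded_execution(proofs)
```

Here the reasoning proof is anchored a minute before the action proof, so the check passes; swapping the two timestamps would make it fail.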

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden and excels: it details the return structure (4W breakdown, intent_preceded_execution flag, chronological timeline, session heartbeat), explains authentication options (x402 $0.05 USDC payment OR API key), and specifies fallback behavior when payment is missing (returns requirements with address/amount). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Despite high complexity (payment systems, blockchain anchoring, 4W framework), the description maintains tight structure across four sentences: (1) core purpose, (2) return structure details, (3) auth/payment requirements, (4) fallback behavior. Every sentence conveys essential information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description comprehensively compensates by detailing exactly what gets returned (WHO/WHAT/WHEN/WHY components, verification summary, timeline, heartbeat anchor) and the dual authentication paths. The payment fallback behavior is also fully specified. Complete for a complex audit tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% (both wallet and proof_id have detailed descriptions in the schema). The description focuses on return values and payment logic rather than repeating parameter documentation. With complete schema coverage, a baseline of 3 is appropriate: the description adds no parameter semantics, and it does not need to.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with the specific verb 'Reconstruct' and clearly identifies the resource as the 'full 4W audit trail' for 'contested agent actions.' This distinguishes it from siblings like get_proof (simple retrieval) and verify_proof (verification) by emphasizing the comprehensive 4W reconstruction (WHO/WHAT/WHEN/WHY) aspect.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The phrase 'for a contested agent action' provides clear contextual guidance on when to use this tool (dispute resolution scenarios). While it doesn't explicitly name sibling alternatives or state 'when not to use,' the 'contested' qualifier effectively signals the specific use case. Payment requirements also guide usage (requires x402 payment or API key).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

submit_outcome (A)

Submit the actual outcome for a decision previously anchored with metadata.confidence_level. Computes the confidence gap (anchored − actual) and stores it for calibration tracking. Operator-only — you must own the proof. Each proof can only have one outcome.

Parameters

Name	Required	Description	Default
`proof_id`	Yes	UUID of the certification that was anchored with metadata.confidence_level
`visibility`	No	Whether this outcome is publicly visible (default: public)	public
`outcome_score`	Yes	Actual outcome quality (0.0 = complete failure, 1.0 = fully successful)
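
The gap arithmetic and the shape of a call can be sketched as follows. This is an illustration under stated assumptions: the server computes the gap itself, the function names are hypothetical, and the request uses the standard MCP `tools/call` JSON-RPC envelope rather than anything specific to xProof. The sketch also assumes the anchored confidence is on the same 0.0 to 1.0 scale as `outcome_score`, which the listing does not state.

```python
import json


def confidence_gap(anchored: float, actual: float) -> float:
    """Gap as the tool text describes it: anchored confidence minus actual outcome.

    A positive value means the agent was overconfident (assuming both share a 0-1 scale).
    """
    return anchored - actual


def build_submit_outcome_request(proof_id: str, outcome_score: float,
                                 visibility: str = "public", request_id: int = 1) -> str:
    """Serialize an MCP tools/call request for submit_outcome."""
    # The schema defines 0.0 as complete failure and 1.0 as fully successful.
    if not 0.0 <= outcome_score <= 1.0:
        raise ValueError("outcome_score must lie in [0.0, 1.0]")
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "submit_outcome",
            "arguments": {
                "proof_id": proof_id,
                "outcome_score": outcome_score,
                "visibility": visibility,  # "public" is the documented default
            },
        },
    }
    return json.dumps(payload)
```

Since each proof accepts only one outcome, a client would probably want to validate the score locally, as above, before sending.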

Tool Definition Quality

A 4.2/5.0

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description discloses key behaviors: it computes a confidence gap, stores it for calibration, and enforces a single outcome per proof. It implies a write operation with a write-once constraint, though it does not detail error states or other side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four short sentences with no redundancy; the most critical information (what it does, constraints) is front-loaded. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with three parameters and no output schema, the description covers the main process and the key constraint (a single outcome per proof). It could mention the expected response or error conditions, but the given information is sufficient for basic usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All three parameters have schema descriptions, so the baseline is 3. The description adds minimal extra clarity about how outcome_score relates to the anchored confidence, but does not elaborate on visibility or proof_id beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action: submitting the actual outcome for a decision previously anchored with confidence_level. It uses a specific verb ('submit') and resource ('outcome'), and distinguishes it from sibling tools by referencing the preceding anchoring step and calibration tracking.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It states operator-only access, ownership requirement, and the one-outcome-per-proof constraint. While it doesn't explicitly list when not to use it or compare to siblings, the context is sufficient for an agent to understand the prerequisite (anchored decision) and limitation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_proof (A)

Verify an existing xproof certification. Returns proof details including file hash, timestamp, blockchain transaction, and verification status.

Parameters

Name	Required	Description	Default
`proof_id`	Yes	UUID of the certification to verify
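
A call to this tool takes a single argument, so the request is small. The sketch below assumes the standard MCP `tools/call` JSON-RPC envelope; the helper name is hypothetical, and the response fields are not modeled because the listing gives no output schema.

```python
import uuid


def build_verify_proof_request(proof_id: str, request_id: int = 1) -> dict:
    """Build an MCP tools/call request body for verify_proof."""
    # Fail early on malformed IDs: uuid.UUID raises ValueError if proof_id is not a UUID.
    uuid.UUID(proof_id)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "verify_proof",
            "arguments": {"proof_id": proof_id},
        },
    }
```

The resulting dict can be serialized with `json.dumps` and sent over whichever transport the client uses.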

Tool Definition Quality

A 3.6/5.0

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It adds valuable context by disclosing blockchain involvement ('blockchain transaction') and specific return fields (file hash, timestamp, status). However, it omits operational details such as rate limits, caching behavior, and what the 'verification status' values mean.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, zero waste. First sentence declares operation; second sentence declares return payload. Efficiently front-loaded with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool without an output schema, the description adequately compensates by enumerating the return fields (hash, timestamp, blockchain data, status). It lacks error-case documentation but is sufficient for tool selection given the low complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% ('proof_id' is fully described as a UUID). The description implies the input by referencing an 'existing certification' but adds no syntax or semantics beyond the schema. A baseline of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific verb ('Verify') and resource ('xproof certification') clearly. Implies distinction from sibling 'get_proof' (retrieval) and 'investigate_proof' (analysis) through the verification semantic, though explicit contrast is absent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Usage is implied by the verb 'Verify' — suggests use when validating proof authenticity rather than just retrieving it. However, lacks explicit when-to-use guidance or contrast with 'get_proof'/'investigate_proof' alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
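
One way to generate the file is a short script. This is a sketch only: the function name and the `web_root` argument are hypothetical, and it assumes your server serves static files from a local directory. The JSON content mirrors the structure shown above.

```python
import json
from pathlib import Path


def write_glama_json(web_root: str, email: str) -> Path:
    """Write <web_root>/.well-known/glama.json and return its path."""
    target = Path(web_root) / ".well-known" / "glama.json"
    # Create the .well-known directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    document = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target.write_text(json.dumps(document, indent=2) + "\n", encoding="utf-8")
    return target
```

Pass the email associated with your Glama account, for example `write_glama_json("/var/www/html", "your-email@example.com")`, where the directory path is a placeholder.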
