Server Details

Proof primitive for AI agents on MultiversX. Anchor file hashes on-chain as verifiable proofs.

Status: Healthy
Last Tested
Transport: Streamable HTTP
URL
Repository: jasonxkensei/xProof
GitHub Stars: 1

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4/5 across 10 of 10 tools scored. Lowest: 3.3/5.

Server Coherence: A

Disambiguation: 4/5

Most tools have distinct purposes, but there is some overlap among certification tools (certify_file vs. certify_with_confidence) and proof retrieval/verification/investigation tools. Descriptions help differentiate them, so overall ambiguity is low.

Naming Consistency: 5/5

Tool names predominantly follow a verb_noun pattern (e.g., audit_agent_session, certify_file, verify_proof). Even 'certify_with_confidence' is a clear extension. No inconsistent styles like camelCase mixed with snake_case.

Tool Count: 5/5

With 10 tools, the count falls well within the ideal 3-15 range. Each tool addresses a specific aspect of blockchain certification and auditing, with no obvious redundancies, making the set well-scoped.
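
The naming and count checks described above can be approximated in a few lines. This is only a sketch: the regular expression, the helper names, and the idea that the scores come from checks like these are assumptions made for illustration, since the page does not say how the ratings are computed.

```python
import re

# Tool names as they appear in the evaluation text on this page.
TOOL_NAMES = [
    "audit_agent_session", "certify_file", "certify_with_confidence",
    "check_attestations", "discover_services", "get_calibration",
    "get_proof", "investigate_proof", "submit_outcome", "verify_proof",
]

# Assumed pattern: lowercase words joined by underscores, at least two words.
SNAKE_CASE = re.compile(r"[a-z]+(?:_[a-z]+)+")


def is_snake_case(name: str) -> bool:
    """Return True when the whole name matches the assumed snake_case shape."""
    return SNAKE_CASE.fullmatch(name) is not None


def count_in_ideal_range(names, low=3, high=15) -> bool:
    """Check the tool count against the 3 to 15 range quoted above."""
    return low <= len(names) <= high
```

With the ten names listed here, both checks pass, while a camelCase name such as `certifyFile` would fail the first one.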

Completeness: 4/5

The tool set covers the essential lifecycle: creation (certify_file, certify_with_confidence, audit_agent_session), verification (verify_proof), retrieval (get_proof), investigation (investigate_proof), calibration, and attestations. Minor gaps remain, such as the lack of a tool for listing all proofs, but they are not critical for the core functionality.

Available Tools

10 tools
audit_agent_session: A

Certify an AI agent's work session on the MultiversX blockchain BEFORE executing any critical action (trade, deploy, data access). Returns a proof_id compliance certificate. The agent MUST call this tool before acting — certification failure blocks execution. Governance: set reversibility_class='irreversible' for actions that cannot be undone — a policy violation is flagged if confidence_level < 0.95 on irreversible decisions. Schema: https://xproof.app/.well-known/agent-audit-schema.json
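
To make the call shape concrete, here is a minimal sketch of assembling the arguments for this tool from the parameter names in its table. The helper `build_audit_args` is hypothetical, the canonical-JSON hashing of inputs is an assumption (the schema only says SHA-256), and the example values for `decision`, `risk_level`, and `action_type` are placeholders, because the allowed values are not shown on this page.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone

REVERSIBILITY_CLASSES = ("reversible", "costly", "irreversible")


def build_audit_args(agent_id, action_type, action_description, inputs,
                     decision, risk_level, reversibility_class=None):
    """Assemble an arguments dict for audit_agent_session (hypothetical helper)."""
    # Assumption: inputs are serialized as canonical JSON before hashing.
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    args = {
        "agent_id": agent_id,
        "session_id": str(uuid.uuid4()),  # UUID-based session identifier
        "action_type": action_type,
        "action_description": action_description,
        "inputs_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "decision": decision,
        "risk_level": risk_level,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601
    }
    if reversibility_class is not None:
        if reversibility_class not in REVERSIBILITY_CLASSES:
            raise ValueError("unknown reversibility_class")
        args["reversibility_class"] = reversibility_class
    return args
```

The resulting dict would be passed as the tool arguments by whatever MCP client is in use; the response format is not documented here, so the sketch stops before the call.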

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| context | No | Optional additional context (model version, environment, tool chain, etc.) | |
| agent_id | Yes | Identifier of the agent making the decision | |
| decision | Yes | Agent's decision about whether to proceed | |
| timestamp | Yes | ISO 8601 timestamp of when the decision was made | |
| risk_level | Yes | Assessed risk level of the action | |
| session_id | Yes | Unique session identifier (UUID or timestamp-based) | |
| action_type | Yes | Category of the action being certified | |
| inputs_hash | Yes | SHA-256 of all inputs analyzed before making the decision | |
| risk_summary | No | Optional brief risk analysis justifying the decision | |
| action_description | Yes | Human-readable description of the specific action | |
| reversibility_class | No | Governance: how reversible is this action? 'reversible' = can be undone at low cost, 'costly' = reversible but expensive (fees, slippage, delay), 'irreversible' = cannot be undone (on-chain settlement, data deletion, email sent). When 'irreversible', a confidence_level >= 0.95 is required to be policy-compliant. | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses that certification failure blocks execution, governance flags policy violations for low confidence on irreversible actions, and returns a proof_id certificate. This adds significant behavioral context beyond the input schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph of five sentences, front-loading the core purpose. It is concise with no wasted words, though it could be broken into sections for readability. Efficient overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 11 parameters, nested objects, and no output schema, the description covers the core use case, governance, and references an external schema. It lacks explicit error handling or details on the proof_id response, but is largely complete for the intended workflow.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so baseline is 3. The description adds governance context for reversibility_class and notes that inputs_hash is SHA-256, but does not elaborate on most parameters. Marginal additional value over schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool's purpose: to certify an AI agent's session before executing critical actions like trades, deploys, or data access. It uses specific verbs and resources, and the context of MultiversX blockchain differentiates it from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description clearly mandates when to use the tool ('BEFORE executing any critical action') and that failure blocks execution. It includes governance rules for reversibility_class and confidence_level. However, it does not explicitly exclude alternatives or mention when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

certify_file: B

Create a blockchain certification for a file. Records the SHA-256 hash on MultiversX blockchain as immutable proof of existence and ownership. Cost: $0.01 per certification, paid in EGLD.
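
As an illustration of preparing inputs for this tool, the sketch below hashes a file with SHA-256 and applies the constraints stated in the parameter descriptions (64 hex characters, 255-character filename, 128-character author name, HTTPS webhook). The helper names are hypothetical, and whether the server performs the same validation is an assumption.

```python
import hashlib
import re
from urllib.parse import urlparse

HEX64 = re.compile(r"[0-9a-f]{64}")


def sha256_of_file(path: str) -> str:
    """Stream a file through SHA-256 and return the 64-character hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_certify_args(filename, file_hash, author_name=None, webhook_url=None):
    """Validate and assemble certify_file arguments (hypothetical helper)."""
    if not filename or len(filename) > 255:
        raise ValueError("filename must be 1-255 characters")
    file_hash = file_hash.lower()
    if HEX64.fullmatch(file_hash) is None:
        raise ValueError("file_hash must be 64 hex characters")
    args = {"filename": filename, "file_hash": file_hash}
    if author_name is not None:
        if len(author_name) > 128:
            raise ValueError("author_name must be at most 128 characters")
        args["author_name"] = author_name
    if webhook_url is not None:
        if urlparse(webhook_url).scheme != "https":
            raise ValueError("webhook_url must use HTTPS")
        args["webhook_url"] = webhook_url
    return args
```

Only the hash leaves the machine in this flow; the file contents themselves are never part of the arguments.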

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| filename | Yes | Original filename with extension (max 255 chars) | |
| file_hash | Yes | SHA-256 hash of the file (64 hex characters) | |
| author_name | No | Name of the certifier (default: AI Agent, max 128 chars) | |
| webhook_url | No | Optional HTTPS URL for on-chain confirmation callback | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the burden. It discloses the cost, immutability, and blockchain recording, which is helpful. However, it does not mention any potential side effects, authorization needs, or failure modes.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: three sentences with no redundant information and the core purpose front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description should explain what the tool returns (e.g., certification ID, transaction hash) or confirmation behavior. It does not provide this, leaving the agent unsure about the output.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the parameters are already documented. The description adds some context (e.g., 'Cost: $0.01 per certification') but does not significantly enhance meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Create' and resource 'blockchain certification for a file', and explains the purpose (recording SHA-256 hash on MultiversX blockchain). However, it does not explicitly differentiate from sibling tools like 'certify_with_confidence' or 'verify_proof'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions cost and payment in EGLD, but provides no guidance on when to use this tool versus alternatives, nor when not to use it. It lacks explicit usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

certify_with_confidence: A

Create a staged blockchain certification with a confidence score. Use this when your decision builds progressively — certify at 60% (initial assessment), 80% (pre-commitment), and 100% (final decision). Each stage shares the same decision_id, creating an on-chain audit trail of the decision process. Governance: set reversibility_class='irreversible' for actions that cannot be undone — xproof will flag a policy violation if confidence_level < 0.95. Cost: $0.01 per certification.
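
The staging and governance rule described above can be sketched as follows. The stage names and typical confidence values come from the parameter descriptions for this tool; the helper names are hypothetical, and the policy check is a local approximation of the rule, not the server's implementation.

```python
import uuid

# Stage name and typical confidence value, per the parameter descriptions.
STAGES = [("initial", 0.6), ("pre-commitment", 0.8), ("final", 1.0)]


def violates_policy(confidence_level: float, reversibility_class: str) -> bool:
    """Local approximation of the governance rule: irreversible needs >= 0.95."""
    return reversibility_class == "irreversible" and confidence_level < 0.95


def staged_calls(file_hash: str, filename: str, reversibility_class: str):
    """Build one argument dict per stage, all sharing a single decision_id."""
    decision_id = str(uuid.uuid4())  # generate once, reuse across stages
    return [
        {
            "filename": filename,
            "file_hash": file_hash,
            "decision_id": decision_id,
            "threshold_stage": stage,
            "confidence_level": confidence,
            "reversibility_class": reversibility_class,
        }
        for stage, confidence in STAGES
    ]
```

Under this reading, an irreversible decision certified at 0.6 or 0.8 would be flagged, and only the final stage at 1.0 would be policy-compliant.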

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| who | No | Agent identity (wallet address, name, or agent ID) | |
| why | No | Reason or instruction hash driving this decision | |
| filename | Yes | Original filename with extension (e.g. decision.json, max 255 chars) | |
| file_hash | Yes | SHA-256 hash of the decision or output file (64 hex characters) | |
| author_name | No | Name of the certifying agent (default: AI Agent, max 128 chars) | |
| decision_id | Yes | Shared UUID linking all confidence stages for the same decision. Generate once and reuse across all stages. | |
| threshold_stage | Yes | Named stage of the decision: initial (first assessment), partial (gathering info), pre-commitment (almost certain), final (committed). | |
| confidence_level | Yes | Confidence score from 0.0 to 1.0. Typical values: 0.6 (initial), 0.8 (pre-commitment), 1.0 (final). | |
| reversibility_class | No | Governance: how reversible is this action? 'reversible' = can be undone, 'costly' = reversible but expensive, 'irreversible' = cannot be undone (on-chain settlement, data deletion, sent email). When 'irreversible', confidence_level must be >= 0.95 or xproof flags a policy violation. | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description discloses cost ($0.01 per certification), staging behavior (decision_id reuse), and governance rules (reversibility_class conditions). This adds significant behavioral context beyond the input schema. However, it omits potential side effects like blockchain delays or error conditions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, front-loading the purpose and usage context, followed by governance and cost. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite having 9 parameters and no output schema, the description does not mention what the tool returns (e.g., certification ID or confirmation). This leaves a significant gap for an agent to understand the complete interaction.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema documentation covers 100% of parameters with meaningful descriptions. The description adds narrative context (e.g., typical confidence values) but does not provide new semantic information beyond what the schema already offers, resulting in a baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: creating staged blockchain certifications with confidence scores. It distinguishes itself from sibling tools like 'certify_file' by emphasizing progressive staging and threshold stages.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use guidance: 'Use this when your decision builds progressively' with concrete confidence percentages. It also includes governance instructions for irreversibility. However, it does not explicitly state when not to use this tool or name alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_attestations: A

Check domain-specific attestations for an AI agent wallet on xproof. Returns active attestations issued by third-party certifying bodies (healthcare, finance, legal, security, research). Each active attestation adds +50 to the agent's trust score (max +150 from 3 attestations). Use this to verify an agent's credentials before delegating a sensitive task.
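
The trust-score arithmetic stated above (+50 per active attestation, capped at +150) works out as in this small sketch. The function names are hypothetical, and the wallet check only tests the `erd1` prefix mentioned in the parameter description; it is not a full address validation.

```python
ATTESTATION_BONUS = 50   # points per active attestation, per the description
MAX_BONUS = 150          # cap reached at 3 attestations


def attestation_bonus(active_attestations: int) -> int:
    """Trust-score bonus contributed by active attestations."""
    if active_attestations < 0:
        raise ValueError("attestation count cannot be negative")
    return min(active_attestations * ATTESTATION_BONUS, MAX_BONUS)


def looks_like_wallet(address: str) -> bool:
    """Prefix-only sanity check; assumes nothing beyond the erd1 prefix."""
    return address.startswith("erd1")
```

For example, two active attestations contribute +100, and a fourth or fifth attestation adds nothing beyond the +150 cap.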

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| wallet | Yes | MultiversX wallet address (erd1...) of the agent to check | |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden and delivers exceptionally: it reveals the trust scoring mechanics (+50 per attestation, max +150), filters (returns only 'active' attestations), data sources (third-party certifying bodies), and specific domain categories covered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four tightly crafted sentences with zero waste. Front-loaded with purpose ('Check domain-specific attestations...'), followed by return value specification, behavioral mechanics (scoring), and usage guidance. Every sentence earns its place with no redundant or generic filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool lacking output schema and annotations, the description is comprehensive. It compensates for missing output specification by detailing what gets returned (active attestations from specific domains) and the practical impact (trust score calculation), providing sufficient context for invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% establishing a baseline of 3. The description adds semantic value by specifying this is for an 'AI agent wallet' (not just any wallet) and contextualizing it 'on xproof', providing meaningful scope beyond the schema's technical address format description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states a specific verb ('Check') and resource ('domain-specific attestations for an AI agent wallet on xproof'), clearly distinguishing it from proof-verification siblings through terminology emphasizing third-party certifying bodies and specific domains (healthcare, finance, legal, security, research).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear when-to-use guidance ('before delegating a sensitive task') and explains the verification purpose, but lacks explicit alternatives or when-not-to-use clauses that would guide selection against similar tools like verify_proof or audit_agent_session.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_services: A

Discover available xproof certification services, pricing, and capabilities. No authentication required.

Parameters

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. It successfully communicates the authentication requirements (none needed), but omits other behavioral traits like rate limits, caching behavior, or pagination for the returned service list.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, zero waste. First sentence establishes purpose and scope, second provides critical auth context. Every sentence earns its place with no filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Absence of output schema requires description to hint at return values, which it does by listing 'services, pricing, and capabilities'. Does not explain response structure, but sufficiently complete for a simple discovery tool with no parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema contains zero parameters. Per calibration rules, 0 params warrants baseline score of 4. Description appropriately does not invent parameters where none exist.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description pairs the specific verb 'Discover' with the resource 'xproof certification services' and supplements it with 'pricing, and capabilities'. It clearly distinguishes the tool from action-oriented siblings (certify_file, verify_proof, etc.) by indicating that this is a metadata/discovery operation rather than a proof operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides one concrete guideline ('No authentication required'), indicating prerequisites. However, lacks explicit when-to-use guidance relative to siblings (e.g., does not state 'use this before certify_file to select a service'). Usage relative to alternatives is only implied.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_calibration: A

Query an agent's calibration quality over time: mean confidence gap, variance, bias label (overconfident / underconfident / calibrated), and per-decision time series. Fully public — use this to evaluate another agent before trusting it. agentId accepts a MultiversX wallet address (erd1...) or internal user id.
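
To show how the quantities named above relate, here is a rough sketch that derives a mean confidence gap, a variance, and a bias label from (anchored confidence, actual outcome) pairs. The gap is taken as anchored minus actual, as in the submit_outcome description; the 0.05 tolerance for the label and the use of population variance are assumptions, since the page does not state how the service computes them.

```python
from statistics import fmean, pvariance


def summarize_calibration(pairs, n=50, tolerance=0.05):
    """Summarize the most recent n (anchored, actual) pairs.

    n mirrors the tool's parameter (default 50, max 200); tolerance is an
    assumed cutoff for calling an agent 'calibrated'.
    """
    n = max(1, min(n, 200))
    recent = pairs[-n:]
    if not recent:
        raise ValueError("no outcomes to summarize")
    gaps = [anchored - actual for anchored, actual in recent]
    mean_gap = fmean(gaps)
    if mean_gap > tolerance:
        bias = "overconfident"    # confidence exceeded outcomes on average
    elif mean_gap < -tolerance:
        bias = "underconfident"   # outcomes exceeded confidence on average
    else:
        bias = "calibrated"
    return {
        "mean_gap": mean_gap,
        "variance": pvariance(gaps),  # assumption: population variance
        "bias": bias,
        "series": gaps,               # per-decision time series of gaps
    }
```

An agent that anchors 0.9 and 0.8 but scores 0.7 and 0.6 has a mean gap of about 0.2 and would be labeled overconfident under these assumptions.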

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| n | No | Number of recent outcomes to include (default 50, max 200) | |
| agent_id | Yes | Agent wallet address (erd1...) or internal user id | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. States 'Fully public' implying no auth needed, and mentions parameter format. However, lacks details on read-only nature, rate limits, or data freshness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences cover purpose, outputs, and usage. Efficient with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Explains outputs (mean confidence gap, etc.) despite no output schema. Parameters fully described. Missing error handling or response format, but adequate for a query tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. Description adds no new meaning beyond schema; just restates that agentId accepts wallet or internal id. Baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool queries calibration quality over time, listing specific outputs (confidence gap, variance, bias label, time series). This distinguishes it from siblings like audit_agent_session or certify_file.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'use this to evaluate another agent before trusting it', giving clear context. Lacks explicit when-not or alternative tools but is still helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_proof: B

Retrieve a proof in structured JSON or Markdown format. Use JSON for machine processing, Markdown for LLM consumption.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| format | No | Output format: json or md | json |
| proof_id | Yes | UUID of the certification | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden. It explains the format options (behavioral trait) but fails to confirm idempotency, safety (read-only nature), error behaviors, or content structure of the proof.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely efficient two-sentence structure. Main action stated immediately, followed by parameter guidance. No redundancy or wasted words; every phrase earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate but incomplete given the tool ecosystem. With no output schema, the description mentions formats but not proof content structure. Critically, it fails to clarify this tool's role relative to certify_file and other proof-related siblings in the attestation workflow.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, establishing a baseline of 3. The description adds significant value by explaining the semantic rationale for format selection (machine processing vs LLM consumption), which aids agent decision-making beyond the schema's basic type definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear verb ('Retrieve') and resource ('proof'), and specifies output formats. However, it does not explicitly differentiate from similar sibling tools like verify_proof or investigate_proof, which could cause selection confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides excellent guidance on format parameter selection (JSON for machines, Markdown for LLMs), but completely omits when to use this retrieval tool versus siblings like verify_proof or investigate_proof.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

investigate_proof: A

Reconstruct the full 4W audit trail for a contested agent action. Returns WHO (agent identity + SIGIL), WHAT (SHA-256 hash on-chain), WHEN (MultiversX block timestamp), WHY (decision chain anchored before acting). Includes verification summary with intent_preceded_execution flag, chronological timeline of WHY/WHAT proofs, and session heartbeat anchor. Requires x402 payment ($0.01 USDC on Base via X-PAYMENT header) or API key authentication. Without payment, returns payment requirements with USDC address and amount.
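
One way to read the intent_preceded_execution flag is as a timestamp comparison between the WHY proof and the WHAT proof. The sketch below illustrates that reading only; the record layout (`kind`, `timestamp`) and both helper names are hypothetical, and the actual response structure of this tool is not documented on this page.

```python
from datetime import datetime


def intent_preceded_execution(why_timestamp: str, what_timestamp: str) -> bool:
    """True when the reasoning (WHY) was anchored strictly before the action (WHAT)."""
    why = datetime.fromisoformat(why_timestamp)
    what = datetime.fromisoformat(what_timestamp)
    return why < what


def build_timeline(proofs):
    """Order hypothetical proof records chronologically by their timestamp."""
    return sorted(proofs, key=lambda proof: datetime.fromisoformat(proof["timestamp"]))
```

Timestamps here are ISO 8601 strings with an explicit offset such as `+00:00`, which `datetime.fromisoformat` accepts across supported Python versions.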

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| wallet | Yes | Agent wallet address (erd1...) that owns the proof | |
| proof_id | Yes | UUID of any proof in the action pair — WHY (reasoning), WHAT (action), or heartbeat session proof | |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It fully discloses authentication/payment behavior, the fallback response (payment requirements), and the output contents (verification summary, timeline, heartbeat anchor). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five sentences, each adding unique value. Front-loaded with purpose. No redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers key aspects: purpose, output (despite no output schema), authentication/payment. Lacks mention of rate limits or error handling, but given the moderate complexity and full schema description coverage, it is largely complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds minimal extra meaning beyond the schema – it contextualizes proof_id and wallet within the 4W trail but does not explain formats or constraints further.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Reconstruct the full 4W audit trail for a contested agent action' and lists the specific components (WHO, WHAT, WHEN, WHY). This is a specific verb+resource that distinguishes it from siblings like get_proof and verify_proof.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains authentication/payment requirements and what happens without payment. However, it does not explicitly contrast with sibling tools or state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

submit_outcome: A

Submit the actual outcome for a decision previously anchored with metadata.confidence_level. Computes the confidence gap (anchored − actual) and stores it for calibration tracking. Operator-only — you must own the proof. Each proof can only have one outcome.
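
The two rules stated above, the gap formula (anchored − actual) and the one-outcome-per-proof constraint, can be modeled locally as in this sketch. `OutcomeLedger` is a hypothetical in-memory stand-in used only to illustrate the behavior; it is not the service's storage or API.

```python
class OutcomeLedger:
    """In-memory illustration of confidence-gap tracking (hypothetical)."""

    def __init__(self):
        self._gaps = {}

    def submit(self, proof_id: str, anchored_confidence: float,
               outcome_score: float) -> float:
        """Record one outcome per proof and return the confidence gap."""
        if not 0.0 <= outcome_score <= 1.0:
            raise ValueError("outcome_score must be between 0.0 and 1.0")
        if proof_id in self._gaps:
            raise ValueError("an outcome was already submitted for this proof")
        gap = anchored_confidence - outcome_score  # anchored minus actual
        self._gaps[proof_id] = gap
        return gap
```

A positive gap means the agent was more confident than the outcome justified; a negative gap means the outcome exceeded the anchored confidence.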

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| proof_id | Yes | UUID of the certification that was anchored with metadata.confidence_level | |
| visibility | No | Whether this outcome is publicly visible (default: public) | public |
| outcome_score | Yes | Actual outcome quality (0.0 = complete failure, 1.0 = fully successful) | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description discloses key behaviors: it computes a confidence gap, stores it for calibration, and enforces a single outcome per proof. It implies a write operation with a write-once constraint per proof, though it does not detail error states or other side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with no redundancy; the most critical information (what it does, constraints) is front-loaded. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with three parameters and no output schema, the description covers the main process and the key constraint (a single outcome per proof). It could mention the expected response or error conditions, but the given information is sufficient for basic usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All three parameters have schema descriptions, so the baseline is 3. The description adds minimal extra clarity about how outcome_score relates to the anchored confidence, but does not elaborate on visibility or proof_id beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action: submitting the actual outcome for a decision previously anchored with confidence_level. It uses a specific verb ('submit') and resource ('outcome'), and distinguishes it from sibling tools by referencing the preceding anchoring step and calibration tracking.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It states operator-only access, ownership requirement, and the one-outcome-per-proof constraint. While it doesn't explicitly list when not to use it or compare to siblings, the context is sufficient for an agent to understand the prerequisite (anchored decision) and limitation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_proof (grade: A)

Verify an existing xproof certification. Returns proof details including file hash, timestamp, blockchain transaction, and verification status.
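As a rough sketch of how a client might invoke this tool, the snippet below builds a Model Context Protocol `tools/call` request for `verify_proof`. The helper name `build_verify_proof_request` and the sample UUID are illustrative placeholders, and the client-side UUID check is an added convenience rather than something the server is documented to require. The shape of the response is not shown because the listing provides no output schema.

```python
import json
import uuid


def build_verify_proof_request(proof_id: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 tools/call payload for the verify_proof tool."""
    # Fail early on a malformed identifier; uuid.UUID raises ValueError.
    uuid.UUID(proof_id)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "verify_proof",
            "arguments": {"proof_id": proof_id},
        },
    }


# Placeholder UUID for illustration only; it does not refer to a real proof.
request = build_verify_proof_request("123e4567-e89b-12d3-a456-426614174000")
print(json.dumps(request, indent=2))
```

The resulting dictionary can be serialized and sent over whichever transport the MCP client uses.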

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| proof_id | Yes | UUID of the certification to verify | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It adds valuable context by disclosing blockchain involvement ('blockchain transaction') and the specific return fields (file hash, timestamp, status). However, it omits operational details such as rate limits, caching behavior, and what the 'verification status' values mean.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, zero waste. The first declares the operation; the second declares the return payload. The description is efficiently front-loaded with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool without an output schema, the description adequately compensates by enumerating the return fields (hash, timestamp, blockchain data, status). It lacks error-case documentation but is sufficient for tool selection given the low complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% ('proof_id' is fully described as a UUID). The description implies the input by referencing an 'existing' certification but adds no syntax or semantics beyond the schema. A baseline of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states a specific verb ('Verify') and resource ('xproof certification'). The verification semantics imply a distinction from the siblings 'get_proof' (retrieval) and 'investigate_proof' (analysis), though an explicit contrast is absent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Usage is implied by the verb 'Verify', which suggests using the tool to validate a proof's authenticity rather than just retrieve it. However, the description lacks explicit when-to-use guidance and does not contrast the tool with the 'get_proof' and 'investigate_proof' alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
