VaultCrux Platform
Server Details
VaultCrux Platform — 63 tools: retrieval, proof, intel, economy, watch, org
| Field | Value |
|---|---|
| Status | Healthy |
| Last Tested | |
| Transport | Streamable HTTP |
| URL | |
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 3.1/5 across 63 of 63 tools scored.
Every tool's description includes a directive to use 'cuecrux_session' instead, making all tools effectively indistinguishable from one another. They are all backward-compatibility wrappers with no distinct purpose in the intended workflow.
Tool names are diverse (e.g., 'accept_handoff_package', 'browse_bundles', 'query_vault') and follow a rough verb_noun pattern, but the repeated deprecation boilerplate undermines the naming, since the tools are not meant to be used directly. The mismatch between the direct use the names imply and the single tool actually recommended makes the naming misleading.
63 tools is far too many for a server that instructs agents to use only one ('cuecrux_session'). The vast majority are redundant wrappers that clutter the surface, which is sharply at odds with the server's stated purpose.
While the tool list covers a wide range of operations, the directive to prefer 'cuecrux_session' suggests that these tools should not be used. The actual capability surface is unknown and likely incomplete since only one tool is recommended, creating dead ends for agents.
Available Tools
63 tools

accept_handoff_package (Accept Handoff Package), grade C
Accept an incoming handoff package.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| package_id | Yes | The handoff package ID to accept. | |
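For illustration, a minimal sketch of calling this tool directly over the Streamable HTTP transport with the official `mcp` Python SDK; the endpoint URL and package ID are placeholders, and per the note above a production agent would route through `cuecrux_session` instead. The shorter sketches further down this page reuse a `session` opened this way.

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    # Placeholder endpoint; substitute the server's actual URL.
    async with streamablehttp_client("https://vaultcrux.example/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Single required argument, per the schema above.
            result = await session.call_tool(
                "accept_handoff_package",
                {"package_id": "pkg_123"},  # hypothetical package ID
            )
            print(result.content)

asyncio.run(main())
```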
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden but provides minimal behavioral insight. It implies a mutation (accepting changes state) but doesn't disclose permissions needed, side effects, whether it's reversible, or what happens to the package post-acceptance. This leaves critical gaps for an agent to understand the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core action and resource, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is inadequate. It doesn't explain what 'accept' means operationally, what the expected outcome is, or any error conditions. Given the complexity implied by handoff workflows, more context is needed for the agent to use this effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 100% description coverage, with the single parameter 'package_id' clearly documented. The description adds no additional parameter context beyond what the schema provides, so it meets the baseline of 3 for high schema coverage without compensating value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('accept') and resource ('incoming handoff package'), making the tool's purpose immediately understandable. However, it doesn't differentiate from sibling tools like 'create_handoff_package' or explain what acceptance entails versus other handoff-related operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. There's no mention of prerequisites (e.g., needing an existing handoff package), what happens after acceptance, or how it differs from other handoff-related tools like 'create_handoff_package' in the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
action_journal_query (Query Action Journal Receipts), grade A
Search action journal receipts with filters for time range, outcome, and tool name.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of receipts to return. | |
| since | No | ISO 8601 datetime for the start of the query window. | |
| until | No | ISO 8601 datetime for the end of the query window. | |
| offset | No | Number of receipts to skip for pagination. | |
| outcome | No | Filter by outcome status. | |
| tool_name | No | Filter by the name of the tool that produced the receipt. | |
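Reusing a `session` opened as in the first sketch, a hedged example of a filtered, paginated query; the `outcome` value is an assumption, since the schema does not enumerate valid statuses.

```python
from mcp import ClientSession

async def recent_failures(session: ClientSession):
    # One-day ISO 8601 window, second page of 50; 'failure' is an
    # assumed outcome value, as the schema lists no statuses.
    return await session.call_tool(
        "action_journal_query",
        {
            "since": "2025-01-01T00:00:00Z",
            "until": "2025-01-02T00:00:00Z",
            "outcome": "failure",
            "tool_name": "query_vault",
            "limit": 50,
            "offset": 50,
        },
    )
```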
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description does not disclose behavioral traits such as side effects, authentication requirements, rate limits, or any other important behavioral details. It only describes the query function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively concise, with the first sentence stating the core functionality. The second paragraph about cuecrux_session is somewhat lengthy but relevant for guidance. No superfluous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description lacks details about return values, pagination behavior, or what a receipt looks like. Without an output schema, the agent is left uninformed about what the tool returns, making the tool definition incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema covers 100% of parameters with descriptions, so the description adds no extra meaning. The description merely mentions filters without additional semantic value, resulting in a baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches action journal receipts with specific filters (time range, outcome, tool name). It is specific enough to convey the purpose, though it doesn't explicitly differentiate from sibling query tools like query_vault or get_journal.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance: prefer cuecrux_session for routing, and this tool is for backward compatibility. This clearly tells when to use this tool vs. alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
annotate_session (Annotate Session), grade C
Add an annotation to the current session.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| content | Yes | Annotation content. | |
| session_id | No | Session ID (defaults to 'default'). | |
| annotation_type | No | Type of annotation. | |
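A sketch under the same session assumption; only `content` is required, and the `annotation_type` value shown is hypothetical since the schema does not enumerate types.

```python
from mcp import ClientSession

async def leave_note(session: ClientSession):
    # session_id is omitted, so the server falls back to 'default'.
    return await session.call_tool(
        "annotate_session",
        {
            "content": "Verified invoice totals against the ledger.",
            "annotation_type": "note",  # assumed type value
        },
    )
```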
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. While 'Add an annotation' implies a write/mutation operation, it doesn't specify permissions needed, whether annotations are editable/deletable, rate limits, or what happens on success/failure. The description is minimal and lacks important operational context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that communicates the core purpose without any wasted words. It's appropriately sized for a simple annotation tool and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is insufficient. It doesn't explain what an 'annotation' means in this context, how annotations are stored/retrieved, or what the tool returns. The minimal description leaves too many operational questions unanswered.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all three parameters (content, session_id, annotation_type). The description adds no parameter-specific information beyond what's in the schema, meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Add an annotation') and target resource ('to the current session'), making the purpose immediately understandable. However, it doesn't differentiate this tool from potential annotation-related siblings (none are listed, but the description doesn't explicitly address uniqueness).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, prerequisites, or contextual constraints. It simply states what the tool does without indicating appropriate scenarios or limitations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
browse_bundles (Browse Bundles), grade C
List available credit bundles for purchase.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of bundles to return. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions listing bundles but does not cover key traits like whether this is a read-only operation, if it requires authentication, rate limits, pagination behavior, or what the return format looks like. This leaves significant gaps in understanding how the tool behaves beyond its basic function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy to parse quickly while conveying the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete for a tool that likely returns a list of bundles. It does not explain the return format, error conditions, or behavioral aspects like authentication needs. For a tool with no structured metadata, the description should provide more context to be fully actionable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage for the single parameter 'limit', so the schema already documents it fully. The description does not add any parameter-specific details beyond what the schema provides, such as default values or usage context. With high schema coverage, a baseline score of 3 is appropriate, as the description adds no extra semantic value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'List' and resource 'available credit bundles for purchase', making the purpose specific and understandable. However, it does not explicitly differentiate from sibling tools like 'get_credit_balance' or 'purchase_bundle', which might handle related credit operations, leaving some ambiguity in distinguishing its exact scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'get_credit_balance' for checking current balances or 'purchase_bundle' for buying bundles. It lacks context on prerequisites, exclusions, or specific scenarios, offering only a basic functional statement without usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
change_seat_role (Change Seat Role), grade C
Change the role assigned to an existing organisation seat.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| role | Yes | The new role to assign. | |
| seat_id | Yes | The ID of the seat to update. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states this is a mutation operation ('Change'), but doesn't mention required permissions, whether changes are reversible, potential side effects, or what happens on success/failure. For a mutation tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that gets straight to the point with zero wasted words. It's appropriately sized for a simple mutation operation and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't address behavioral aspects like permissions, side effects, or response format. While concise, it lacks the contextual information needed for safe and effective tool invocation given the complexity of role changes in organizational systems.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already fully documents both parameters (seat_id and role). The description doesn't add any parameter-specific information beyond what's in the schema, such as role format constraints or seat_id validation rules. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Change the role') and target resource ('an existing organisation seat'), making the purpose immediately understandable. It doesn't specifically differentiate from sibling tools like 'invite_seat' or 'revoke_seat', but the verb 'change' versus 'invite' or 'revoke' provides some implicit distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'invite_seat' (for new seats) or 'revoke_seat' (for removal). It mentions 'existing organisation seat' which implies a prerequisite but doesn't explicitly state when this tool is appropriate or inappropriate compared to siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_coalition (Create Coalition), grade C
Create a multi-agent coalition to address a knowledge gap.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| expires_at | No | ISO 8601 expiry timestamp. | |
| budget_cap_crux | No | Budget cap in crux credits (defaults to 10). | |
| gap_description | Yes | Description of the knowledge gap. | |
| initial_pledge_crux | No | Initial pledge in crux credits (defaults to 1). | |
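A hedged sketch of creating a coalition; it assumes, since the description does not say, that the initial pledge must fit within the budget cap.

```python
from mcp import ClientSession

async def open_coalition(session: ClientSession):
    # Only gap_description is required; budget_cap_crux defaults to 10
    # and initial_pledge_crux to 1. Whether the pledge must stay under
    # the cap is an assumption, not documented behaviour.
    return await session.call_tool(
        "create_coalition",
        {
            "gap_description": "No coverage of 2024 EU AI Act guidance.",
            "budget_cap_crux": 20,
            "initial_pledge_crux": 5,
            "expires_at": "2025-02-01T00:00:00Z",
        },
    )
```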
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but offers minimal behavioral insight. 'Create' implies a write/mutation operation, but the description doesn't disclose permission requirements, what happens after creation, whether coalitions are persistent, or any side effects. It mentions the purpose but not the operational behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core purpose without unnecessary words. It's appropriately sized for a creation tool and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a creation tool with 4 parameters, no annotations, and no output schema, the description is insufficient. It doesn't explain what a 'coalition' entails operationally, what happens after creation, success/failure conditions, or return values. The context signals indicate this is a non-trivial tool that needs more complete documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so all parameters are documented in the schema. The description adds no additional parameter context beyond what's in the schema - it doesn't explain relationships between parameters (e.g., how budget_cap_crux relates to initial_pledge_crux) or provide usage examples. Baseline 3 is appropriate when schema does the documentation work.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Create') and resource ('multi-agent coalition') with a specific purpose ('to address a knowledge gap'). It distinguishes from obvious siblings like 'join_coalition' but doesn't explicitly differentiate from other creation tools like 'create_handoff_package'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides minimal guidance - it implies this tool should be used when there's a knowledge gap to address, but offers no explicit when/when-not criteria, prerequisites, or alternatives. No comparison to sibling tools like 'get_knowledge_gaps' or 'join_coalition' is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_handoff_package (Create Handoff Package), grade C
Create a handoff package for multi-agent session transfer.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Scope object for the handoff. | |
| session_id | No | Session ID (defaults to agent ID). | |
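A sketch of a minimal call; the schema calls `scope` an object but never defines its shape, so the keys below are purely hypothetical.

```python
from mcp import ClientSession

async def start_handoff(session: ClientSession):
    return await session.call_tool(
        "create_handoff_package",
        {
            "session_id": "sess_abc",          # defaults to agent ID if omitted
            "scope": {"topics": ["billing"]},  # assumed shape; schema gives none
        },
    )
```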
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions creation but doesn't specify whether this is a write operation, what permissions are needed, if it's idempotent, what happens on failure, or what the output looks like. For a creation tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any unnecessary words or structural complexity. It's appropriately sized and front-loaded with the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a creation tool with no annotations and no output schema, the description is insufficient. It doesn't explain what a handoff package contains, how it's used, what the creation process entails, or what happens after creation. The context signals indicate complexity (nested objects) that isn't addressed in the minimal description.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds no additional parameter information beyond what's in the schema, maintaining the baseline score of 3 for adequate but not enhanced parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'create' and the resource 'handoff package', specifying it's for 'multi-agent session transfer'. This is specific enough to understand the tool's function, though it doesn't explicitly differentiate from its sibling 'accept_handoff_package' beyond the action direction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites, timing considerations, or exclusions. It simply states what the tool does without contextual usage information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cuecrux_session (CueCrux Session), grade A
Opens a CueCrux session and returns a typed capability plan (retrieval, proofing, memory, journaling, audit) across VaultCrux and MemoryCrux. Call this first, once. Every subsequent action routes through the channels the plan returns — do not browse the legacy per-service tool list when a plan channel already covers the capability. Identical behaviour for local Crux CE and hosted CueCrux. Hosted deployments stage v1 flat-list or v2 typed-graph plan shapes behind feature flags; callers treat the returned plan as the single source of routing truth. Bulk-capable agents transparently use the HTTP/2 binary channel; MCP-only agents use the MCP fallback URLs the plan provides. Implements RCX-Protocol v1.0.
| Name | Required | Description | Default |
|---|---|---|---|
| hints | No | Optional routing / shaping hints. | |
| model | No | Optional model declaration used for capability-model policy gating. | |
| intent | No | Optional intent hint (e.g., 'audit_review', 'document_ingest'). Lets the capability graph be reordered to put intent-relevant capabilities first. | |
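Since the description positions this as the single entry point, a sketch of the call-first pattern; the `model` string is a hypothetical declaration, and the plan's shape is not documented on this page, so the result is treated opaquely.

```python
from mcp import ClientSession

async def open_plan(session: ClientSession):
    # First and only direct call of the session, per the description.
    plan = await session.call_tool(
        "cuecrux_session",
        {
            "intent": "audit_review",     # example value from the schema docs
            "model": "example-model-v1",  # hypothetical model declaration
        },
    )
    # All subsequent work should follow the channels the plan returns,
    # not the per-tool surface listed on this page.
    return plan
```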
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behavioral traits: the tool returns a capability plan with typed routing hints, works identically for local and hosted deployments, and specifies protocol implementation ('Implements RCX-Protocol v1.0'). It also explains how different agents (bulk-capable vs. MCP-only) interact with the plan. However, it lacks details on error handling or performance characteristics, preventing a perfect score.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded, starting with the core purpose and usage guidelines. Every sentence adds value, such as explaining deployment compatibility and agent interactions. However, some sentences are complex and could be slightly streamlined (e.g., the one about hosted deployments and feature flags), making it very good but not perfectly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (session initialization with routing plans) and no annotations or output schema, the description does a strong job of covering context. It explains the tool's role, return value (capability plan), deployment scenarios, and agent interactions. However, it lacks details on the plan's structure or example outputs, which would be helpful for a tool with no output schema, keeping it from a perfect score.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the baseline is 3. The description adds value by contextualizing the parameters: the schema documents 'hints' as 'routing / shaping hints', and the description explains that 'hosted deployments stage v1 flat-list or v2 typed-graph plan shapes behind feature flags', which bears on how those hints shape the returned plan. However, it doesn't elaborate on the 'model' or 'intent' parameters beyond what the schema provides, keeping it from a score of 5.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Opens a CueCrux session and returns a typed capability plan (retrieval, proofing, memory, journaling, audit) across VaultCrux and MemoryCrux.' It distinguishes itself from siblings as the foundational tool to call 'first, once' before any other, making the purpose specific and well-differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidelines: 'Call this first, once. Every subsequent action routes through the channels the plan returns.' It also distinguishes deployment scenarios, noting that hosted deployments stage v1 flat-list or v2 typed-graph plan shapes behind feature flags while 'callers treat the returned plan as the single source of routing truth', guiding users on how to handle different deployments.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
declare_revenue_willingness (Declare Revenue Willingness), grade B
Declare willingness to pay for a feature or category, helping prioritize the product roadmap.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | Free-form notes about the declaration. | |
| category | No | Category of the declaration (default: other). | |
| metadata | No | Additional metadata to attach. | |
| confidence | No | Confidence level in the willingness declaration (default: medium). | |
| request_id | No | The ID of a specific feature request this declaration relates to. | |
| billing_cycle | No | Preferred billing cycle (default: monthly). | |
| willingness_band | No | Price band the agent is willing to pay (default: lt_100). | |
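A hedged sketch; the schema's stated defaults (category 'other', confidence 'medium', billing cycle 'monthly', band 'lt_100') are the only enum values documented, so the values below either reuse a documented default or are marked as assumed.

```python
from mcp import ClientSession

async def declare_willingness(session: ClientSession):
    return await session.call_tool(
        "declare_revenue_willingness",
        {
            "request_id": "req_42",        # hypothetical feature-request ID
            "willingness_band": "lt_100",  # documented default band
            "billing_cycle": "monthly",    # documented default cycle
            "confidence": "medium",        # documented default level
            "notes": "Would pay for hosted audit exports.",
        },
    )
```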
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool helps 'prioritize the product roadmap,' suggesting it's a write operation that may influence product decisions, but it doesn't disclose critical behavioral traits such as authentication requirements, rate limits, whether the declaration is reversible, or what happens after submission (e.g., confirmation, impact). For a tool with potential product impact and no annotations, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence: 'Declare willingness to pay for a feature or category, helping prioritize the product roadmap.' It is front-loaded with the core action and outcome, with no wasted words. Every part of the sentence contributes to understanding the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (7 parameters, no output schema, no annotations), the description is adequate but incomplete. It explains the high-level purpose but lacks details on behavioral aspects (e.g., mutability, side effects) and doesn't leverage the rich parameter schema to guide usage. For a tool that could influence product decisions, more context on outcomes and constraints would be beneficial.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with detailed parameter descriptions (e.g., 'Free-form notes about the declaration,' 'Category of the declaration (default: other)'). The description doesn't add any parameter-specific information beyond what's in the schema, such as examples or usage tips. With high schema coverage, the baseline score is 3, as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Declare willingness to pay for a feature or category, helping prioritize the product roadmap.' It specifies the verb ('declare') and resource ('willingness to pay'), and mentions the outcome ('prioritize the product roadmap'). However, it doesn't explicitly differentiate from sibling tools like 'submit_feature_request' or 'vote_feature_request', which might have overlapping purposes in product feedback.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides minimal usage guidance. It implies this tool is for declaring payment willingness to influence roadmap prioritization, but it doesn't specify when to use it versus alternatives like 'submit_feature_request' or 'vote_feature_request' (which are sibling tools). No exclusions, prerequisites, or explicit alternatives are mentioned, leaving the agent with little context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
diff_receipts (Diff Receipts), grade C
Compare two provenance receipts and highlight differences.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| receipt_id_a | Yes | First receipt ID. | |
| receipt_id_b | Yes | Second receipt ID. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool compares and highlights differences, but doesn't disclose behavioral traits such as whether it's read-only, what format the output takes, if there are rate limits, or if it requires specific permissions. This is inadequate for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core functionality. There is no wasted verbiage, and it directly communicates the tool's purpose without unnecessary details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is insufficient. It doesn't explain what a 'provenance receipt' is, what 'highlight differences' entails (e.g., output format), or any behavioral constraints. This leaves significant gaps for an agent to understand and use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with both parameters ('receipt_id_a' and 'receipt_id_b') documented as receipt IDs. The description adds no additional semantic meaning beyond this, such as explaining what a 'provenance receipt' is or how IDs should be formatted. Baseline 3 is appropriate given the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Compare two provenance receipts and highlight differences.' It specifies the verb ('compare'), resource ('provenance receipts'), and outcome ('highlight differences'). However, it doesn't explicitly differentiate from sibling tools like 'find_contradictions' or 'get_proof_receipt', which might have overlapping functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context, or exclusions. Given sibling tools like 'find_contradictions' and 'get_proof_receipt', the lack of differentiation leaves the agent without clear usage criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
explain_last_answer (Explain Last Answer), grade C
Explain how the last answer was derived.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| answer_id | No | The answer ID to explain (defaults to last answer). | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool explains derivations but doesn't disclose behavioral traits such as whether it's read-only, requires specific permissions, has rate limits, or what happens if no last answer exists. For a tool with zero annotation coverage, this leaves significant gaps in understanding its operation and constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's function. It's front-loaded with the core purpose and avoids unnecessary details, making it easy to parse. Every word earns its place, with no waste or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of explaining answer derivations, no annotations, and no output schema, the description is incomplete. It doesn't cover what the explanation includes (e.g., steps, sources, confidence), error conditions, or output format. For a tool that likely returns detailed reasoning, this lack of context makes it inadequate for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with one parameter 'answer_id' documented as defaulting to the last answer. The description adds no additional meaning beyond this, as it only mentions 'last answer' without elaborating on parameter usage or semantics. Baseline 3 is appropriate since the schema adequately covers the parameter, but no extra value is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the tool's purpose as explaining how an answer was derived, which is easy to grasp but underspecified. It specifies 'last answer' as the default target, but doesn't detail what constitutes an 'answer' or 'explanation' in this context. Compared to siblings like 'get_counterfactual_summary' or 'get_proof_status', it's distinguishable but lacks specificity about the explanation format or content.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., requiring a prior answer), exclusions, or related tools like 'get_reasoning_profile' or 'get_proof_status' that might offer similar insights. Usage is implied only by the tool name and description, with no explicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
find_contradictions (Find Contradictions), grade C
Scan for contradictions across knowledge sources.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| depth | No | Scan depth (e.g. 'shallow', 'deep'). | |
| scope | No | Scope object to narrow the scan. | |
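A sketch assuming the session from the first example; the `depth` values come from the schema's own examples, while the `scope` object's shape is undocumented and the key below is assumed.

```python
from mcp import ClientSession

async def scan_for_conflicts(session: ClientSession):
    return await session.call_tool(
        "find_contradictions",
        {
            "depth": "shallow",                 # schema suggests 'shallow' or 'deep'
            "scope": {"domain": "compliance"},  # assumed shape; schema gives none
        },
    )
```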
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but offers minimal behavioral insight. It mentions 'scan' but doesn't clarify what 'contradictions' mean operationally, potential side effects, performance characteristics, or output format. For a tool with no annotation coverage, this is inadequate disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's appropriately sized and front-loaded, directly stating the tool's function without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and a tool that presumably performs non-trivial analysis ('scan for contradictions'), the description is incomplete. It lacks details on what constitutes a contradiction, how results are returned, or any behavioral context, making it insufficient for an agent to use effectively without additional information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('depth' and 'scope'). The description adds no additional parameter semantics beyond what's in the schema, such as examples of 'knowledge sources' or how parameters affect the scan. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('scan') and target ('contradictions across knowledge sources'), making the purpose understandable. It doesn't explicitly differentiate from sibling tools, but given the unique nature of 'find_contradictions' among the listed siblings, the purpose is sufficiently clear without direct comparison.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, prerequisites, or specific contexts. It lacks any mention of related tools or scenarios where this scanning operation is appropriate, leaving usage entirely implicit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forecast_obsolescence (Forecast Obsolescence), grade C
Forecast which artefacts are likely to become obsolete.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | No | Domain to scope the forecast. | |
| artefacts | No | Artefacts to evaluate. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It fails to describe what the forecast returns (e.g., scores, rankings, explanations), how it's computed, any limitations (e.g., data recency, confidence intervals), or side effects. This is inadequate for a forecasting tool with no structured behavioral hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's appropriately sized and front-loaded, with zero wasted content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a forecasting tool. It lacks details on return values, behavioral traits, and usage context, leaving significant gaps for an agent to understand how to effectively invoke and interpret results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are fully documented in the schema. The description adds no additional meaning about 'domain' or 'artefacts' beyond what the schema provides, such as domain examples or artefact types. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('forecast') and resource ('artefacts'), and specifies the forecast target ('likely to become obsolete'). It doesn't distinguish from sibling tools, but none appear to be direct alternatives for obsolescence forecasting, making the purpose adequately clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, prerequisites, or exclusions. It simply states what the tool does without contextual usage information, leaving the agent to infer appropriate scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_active_policy (Get Active Policy), grade C
Get the currently active policy.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| policy_name | No | Policy name to retrieve. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions 'currently active policy,' which hints at real-time or latest data, but doesn't disclose behavioral traits like permissions needed, rate limits, error conditions, or what 'active' means (e.g., effective date, user-specific). The description is minimal and lacks operational context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It's front-loaded with the core action and resource, making it easy to parse. Every word earns its place, though it could benefit from more detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and a simple input schema, the description is incomplete. It doesn't explain what a 'policy' is in this context, what data is returned, or how 'active' is determined. For a tool with potential complexity (e.g., policy management), more context is needed to guide effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'policy_name' documented in the schema. The description adds no parameter-specific information beyond implying retrieval of an active policy. Since the schema does the heavy lifting, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the action ('Get') and resource ('currently active policy'), which is clear but vague. It doesn't specify what a 'policy' entails in this context or differentiate from sibling tools like 'set_policy' beyond the verb difference. The purpose is understandable but lacks specificity about the policy domain or content.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. With sibling tools like 'set_policy' and 'get_policy_status' (implied by context), the description doesn't indicate if this retrieves a default policy, the latest policy, or how it relates to other policy-related tools. Usage is implied only by the verb 'Get'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_beliefs: Get Beliefs (B)
List registered beliefs with optional filters.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results. | |
| since | No | ISO 8601 timestamp to filter beliefs after. | |
| offset | No | Offset for pagination. | |
| answer_ids | No | Filter by answer IDs. |
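To make the filters concrete, a minimal call might look as follows, reusing the connected `client` from the get_active_policy sketch; the timestamp and paging values are purely illustrative.

```typescript
// All four filters are optional; `since` expects an ISO 8601 timestamp.
const beliefs = await client.callTool({
  name: "get_beliefs",
  arguments: {
    since: "2024-01-01T00:00:00Z", // illustrative cutoff
    limit: 50,
    offset: 0,
  },
});
```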
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but offers minimal behavioral insight. It implies a read-only operation ('List') but doesn't disclose pagination behavior (beyond the 'offset' parameter in schema), rate limits, authentication needs, or what 'registered beliefs' entails. The agent must infer behavior from the tool name and sparse description.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It front-loads the core purpose ('List registered beliefs') and adds only essential qualification ('with optional filters'). This is appropriately sized for a list tool with well-documented parameters.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a list tool with no annotations and no output schema, the description is minimally adequate. It states what the tool does but lacks context about the belief system, return format, or error conditions. Given the 4 parameters and 100% schema coverage, it's complete enough for basic use but leaves behavioral aspects to inference.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all four parameters. The description adds no additional meaning beyond mentioning 'optional filters', which is already implied by the parameter names. This meets the baseline of 3 where the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List') and resource ('registered beliefs'), and mentions optional filtering. It distinguishes from sibling tools like 'register_belief' by focusing on retrieval rather than creation. However, it doesn't explicitly differentiate from other list-style tools like 'get_watches' or 'get_watch_alerts' beyond the resource type.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context for filtering, or compare it to other list/query tools in the sibling set. The phrase 'with optional filters' is generic and doesn't help the agent choose between this and tools like 'query_vault' or 'get_journal'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_blast_radius: Get Blast Radius (B)
Estimate the impact radius if an artefact or receipt changes.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | No | Domain scope for the analysis. | |
| receipt_id | No | The receipt ID to analyze. | |
| artefact_id | No | The artefact ID to analyze. |
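A hedged sketch of a call, again reusing the connected `client`: the schema marks every parameter optional, so the assumption that at least one target ID is needed for a meaningful estimate is ours, and both values are hypothetical.

```typescript
// Presumably at least one of artefact_id / receipt_id identifies the
// change being analyzed; the schema itself requires neither.
const radius = await client.callTool({
  name: "get_blast_radius",
  arguments: {
    artefact_id: "artefact-123", // hypothetical ID
    domain: "billing",           // hypothetical domain scope
  },
});
```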
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'Estimate', which implies a read-only calculation, but doesn't specify whether this requires specific permissions, what format the estimation returns, whether it's a real-time or cached analysis, or whether rate limits apply. For an analysis tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that gets straight to the point with zero wasted words. It's appropriately sized for a tool with three parameters and no complex behavioral nuances to explain, making it perfectly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with three parameters, 100% schema coverage, but no annotations and no output schema, the description is minimally adequate. It explains what the tool does but lacks crucial context about the estimation methodology, output format, and when to use it versus similar analysis tools. The absence of output schema means the description should ideally explain what 'impact radius' estimation returns, which it doesn't.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so all three parameters are documented in the schema. The description mentions 'artefact or receipt changes' which aligns with the artefact_id and receipt_id parameters, but doesn't add meaningful semantic context beyond what the schema already provides. The baseline score of 3 reflects adequate but minimal value addition.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Estimate') and resource ('impact radius'), and specifies the trigger condition ('if an artefact or receipt changes'). However, it doesn't distinguish this tool from potential siblings like 'get_break_analysis' or 'forecast_obsolescence' that might also analyze impacts, which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools like 'get_break_analysis' and 'forecast_obsolescence' that might serve similar analytical purposes, there's no indication of when this specific impact estimation is appropriate or what distinguishes it from other analysis tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_break_analysis: Get Break Analysis (C)
Analyze what would break if a given answer or receipt is invalidated.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | No | Domain scope for the analysis. | |
| answer_id | No | The answer ID to analyze. | |
| receipt_id | No | The receipt ID to analyze. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool performs analysis but doesn't describe what 'break' means, whether this is a read-only operation, what the output format is, or what side effects it may have. This is inadequate for a tool with potential complexity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with zero waste. It's appropriately sized and front-loaded, efficiently conveying the core purpose without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and a potentially complex analysis tool, the description is incomplete. It doesn't explain what 'break' entails, the scope of analysis, or the return format, leaving significant gaps for the agent to understand the tool's behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all three parameters (domain, answer_id, receipt_id). The description doesn't add any meaning beyond the schema, such as explaining relationships between parameters or usage constraints. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Analyze what would break if a given answer or receipt is invalidated.' It specifies the action ('analyze') and the resource ('what would break'), but doesn't explicitly differentiate from siblings like 'get_blast_radius' or 'forecast_obsolescence' which might have related functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context, or exclusions, leaving the agent to infer usage based on the purpose alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_counterfactual_summary: Get Counterfactual Summary (C)
Generate a counterfactual summary for an answer.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| answer_id | No | The answer ID to summarize. | |
| receipt_id | No | The receipt ID to summarize. |
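Because the description never says which ID takes precedence when both are supplied, a cautious call passes exactly one; the sketch below does so with a hypothetical ID, reusing the connected `client`.

```typescript
// Pass exactly one of answer_id / receipt_id; precedence is undocumented.
const summary = await client.callTool({
  name: "get_counterfactual_summary",
  arguments: { answer_id: "answer-42" }, // hypothetical ID
});
```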
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Generate a counterfactual summary,' implying a read-only operation, but does not specify whether it requires permissions, affects data, or has rate limits, nor what the output format might be. For a tool with no annotations, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words, making it easy to parse. However, it front-loads only the bare action and omits critical details such as context or differentiation, which slightly reduces its effectiveness despite its brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of generating a 'counterfactual summary' and the absence of annotations and output schema, the description is incomplete. It does not explain what a counterfactual summary is, how it differs from other summary tools, or what the output entails. This leaves significant gaps for an AI agent to understand and use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with parameters 'answer_id' and 'receipt_id' clearly documented. The description does not add any meaning beyond the schema, such as explaining the relationship between these IDs or which is prioritized. Given the high schema coverage, a baseline score of 3 is appropriate, as the schema handles the parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the tool's purpose as 'Generate a counterfactual summary for an answer,' which is clear but vague. It specifies the action (generate) and target (counterfactual summary for an answer), but does not define what a 'counterfactual summary' entails or how it differs from similar tools like 'explain_last_answer' or 'get_proof_receipt.' This lack of differentiation from siblings reduces its effectiveness.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites, context, or exclusions, such as whether it requires specific answer states or how it relates to other tools like 'explain_last_answer' or 'get_reasoning_profile.' This absence of usage context leaves the agent without clear direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_credit_balance: Get Credit Balance (B)
Retrieve the current credit balance for the agent, including receipt verification and passport data.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
No parameters.
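A call is correspondingly trivial; the sketch below reuses the connected `client`, and the assumption that agent identity is inferred from the session is ours.

```typescript
// No arguments: the agent whose balance is returned presumably comes
// from the session context rather than an explicit parameter.
const balance = await client.callTool({
  name: "get_credit_balance",
  arguments: {},
});
```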
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions 'retrieve', which implies a read-only operation, but doesn't disclose behavioral traits like authentication needs, rate limits, or what 'including receipt verification and passport data' entails in terms of output or side effects. This leaves gaps in understanding how the tool behaves beyond its basic function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose. However, the phrase 'including receipt verification and passport data' could be more precise or structured to avoid ambiguity, slightly reducing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (0 parameters, no output schema, no annotations), the description is minimally adequate but incomplete. It lacks details on output format, error handling, or how 'receipt verification and passport data' integrate, leaving the agent with unanswered questions about the tool's full context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0 parameters and 100% schema description coverage, the baseline is high. The description adds value by specifying that the retrieval includes 'receipt verification and passport data', which provides context beyond the empty schema, though it doesn't fully explain what these inclusions mean semantically.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'retrieve' and resource 'current credit balance for the agent', making the purpose understandable. It distinguishes from siblings like 'get_credit_escrow' by focusing on balance rather than escrow, though it doesn't explicitly contrast with all siblings. The inclusion of 'including receipt verification and passport data' adds specificity but slightly muddles the primary purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_credit_escrow' or 'get_spend_receipt'. It lacks context such as prerequisites, frequency of use, or scenarios where it's most applicable, leaving the agent without clear usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_credit_escrow: Get Credit Escrow (B)
List active escrow holds for the tenant.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
No parameters.
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states 'List active escrow holds', which implies a read-only operation, but does not specify details like pagination, error handling, authentication needs, or rate limits. This leaves significant gaps for a tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and appropriately sized for a simple tool, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (0 parameters, no output schema, no annotations), the description is minimally adequate but lacks depth. It does not explain what 'active escrow holds' entail or the return format, which could be important for an AI agent to understand the output fully. With no annotations, more context would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters, and the schema description coverage is 100%, so there is no need for parameter details in the description. The description does not add or detract from parameter semantics, aligning with the baseline for zero parameters, but it could have mentioned any implicit filters (e.g., 'active' status) for a higher score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('active escrow holds for the tenant'), making the purpose specific and understandable. However, it does not explicitly differentiate from sibling tools like 'get_credit_balance', which might be related, so it lacks sibling distinction for a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as other 'get_' tools in the sibling list. It implies usage for listing escrow holds but offers no context on prerequisites, exclusions, or comparisons to similar tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_daily_briefing: Get Daily Briefing (C)
Get the daily knowledge briefing.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
No parameters.
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but offers minimal insight. It implies a read-only operation ('Get') but doesn't specify if it requires authentication, has rate limits, returns structured or unstructured data, or involves any side effects. This leaves significant behavioral gaps for a tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It's appropriately sized for a simple tool and front-loaded with the core action, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's apparent simplicity (0 parameters) but lack of annotations and output schema, the description is incomplete. It doesn't explain what the briefing contains, its format, or how it's generated, leaving the agent unsure of the tool's utility or return value in a context with many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters, and the input schema has 100% description coverage (though empty). The description doesn't need to compensate for missing parameter info, so it meets the baseline for a parameterless tool. No additional semantic value is required or provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Get the daily knowledge briefing' is essentially a tautology that restates the tool name 'get_daily_briefing' without adding meaningful specificity. It doesn't explain what constitutes a 'knowledge briefing' or what content it contains, making the purpose vague despite the clear verb+resource structure.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention frequency (e.g., once per day), prerequisites, or how it differs from sibling tools like 'get_journal' or 'get_session_context', leaving the agent without contextual usage cues.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_domain_affinity: Get Domain Affinity (B)
Get the agent's domain affinity scores.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
No parameters.
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves scores but doesn't explain what 'domain affinity' means, how scores are calculated, whether this is a read-only operation, or what the output format might be. For a tool with zero annotation coverage, this leaves critical behavioral traits unspecified.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy for an AI agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 0 parameters and no output schema, the description is minimally adequate but lacks depth. It doesn't explain the concept of 'domain affinity' or provide context on usage, which could hinder an AI agent's ability to invoke it correctly in complex scenarios. The absence of annotations exacerbates this gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0 parameters with 100% coverage, so no parameter documentation is needed. The description doesn't add any parameter information, which is appropriate. Baseline 4 is applied as per the rules for 0 parameters, indicating the description doesn't need to compensate for schema gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('domain affinity scores'), specifying what the tool does. It distinguishes from siblings by focusing on domain affinity scores, which is a unique concept among the listed tools. However, it doesn't explicitly differentiate from similar-sounding tools like 'get_trust_level' or 'get_reasoning_profile', which might also involve scoring or assessment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context, or exclusions, such as whether it requires specific permissions or is only applicable in certain scenarios. Given the many sibling tools, this lack of differentiation is a significant gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_economy_dashboard: Get Economy Dashboard (B)
Retrieve the economy dashboard for the agent, showing balances, recent transactions, and spending summaries.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
No parameters.
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states 'retrieve', which implies a read-only operation, but doesn't disclose behavioral traits such as authentication needs, rate limits, data freshness, or whether results are cached. For a tool with zero annotation coverage, this is insufficient, as it misses key operational context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the purpose and details the data shown. Every word earns its place with no redundancy or fluff, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity is low (0 parameters, no output schema), the description is complete enough to understand the basic purpose. However, without annotations or output schema, it lacks details on behavioral aspects and return format, leaving gaps for the agent to operate effectively in a real context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters with 100% schema description coverage, so the baseline is 4. The description doesn't need to add parameter details, and it doesn't introduce any confusion about inputs, making this adequate for a parameterless tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'retrieve' and the resource 'economy dashboard', specifying what data it shows: balances, recent transactions, and spending summaries. It distinguishes from siblings like get_credit_balance by providing a broader dashboard view rather than a single metric. However, it doesn't explicitly differentiate from all potential dashboard-related tools, keeping it at 4.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, timing, or compare to siblings like get_credit_balance or get_spend_receipt, leaving the agent to infer usage based on the name alone. This lack of explicit when/when-not statements results in a score of 2.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_feature_requests: Get Feature Requests (B)
List feature requests with optional filtering by category and status.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of requests to return. | |
| cursor | No | Pagination cursor from a previous response. | |
| status | No | Filter by feature request status. | |
| category | No | Filter by feature request category. |
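A first-page call might look like the sketch below, reusing the connected `client`; the status and category values are hypothetical, since the schema does not enumerate them, and where the next-page cursor lives in the response is likewise undocumented.

```typescript
const firstPage = await client.callTool({
  name: "get_feature_requests",
  // status/category values are hypothetical; the schema names no enums.
  arguments: { status: "open", category: "retrieval", limit: 25 },
});
// A follow-up page would echo back the cursor from the previous response:
//   arguments: { cursor: "<cursor from previous response>", limit: 25 }
```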
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. While 'List' implies a read-only operation, it doesn't explicitly state whether this is safe, whether it requires authentication, what the return format looks like, or if there are rate limits. For a tool with 4 parameters and no annotation coverage, this leaves significant behavioral questions unanswered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core action and key capabilities without any wasted words. It's appropriately sized for a straightforward list operation and gets directly to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a list tool with no annotations and no output schema, the description is minimally adequate. It covers the basic purpose and hints at filtering, but doesn't address behavioral aspects like pagination (implied by 'cursor' parameter), return format, or error conditions. Given the 4 parameters and lack of structured metadata, it should provide more context about what the tool actually returns.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, meaning all parameters are documented in the schema itself. The description mentions 'optional filtering by category and status,' which aligns with two of the four parameters but doesn't add meaningful semantic context beyond what the schema already provides. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('feature requests'), making the purpose immediately understandable. It also mentions optional filtering capabilities, which adds specificity. However, it doesn't distinguish this tool from potential sibling list operations (like 'get_watches' or 'get_watch_alerts'), so it doesn't reach the highest score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention any prerequisites, context for filtering, or relationship to other tools like 'submit_feature_request' or 'vote_feature_request' that appear in the sibling list. The agent receives no help in choosing between this and other list operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_journal: Get Journal (C)
Fetch journal entries for the active agent, with optional filtering by time range, entry type, and pagination.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of entries to return. | |
| since | No | ISO 8601 datetime to fetch entries after. | |
| types | No | Entry types to filter by. | |
| cursor | No | Pagination cursor from a previous response. |
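For illustration, a filtered fetch could look like this, reusing the connected `client`; the entry-type names are hypothetical because the schema does not enumerate allowed values.

```typescript
const journal = await client.callTool({
  name: "get_journal",
  arguments: {
    since: "2024-06-01T00:00:00Z",      // ISO 8601, per the schema
    types: ["decision", "observation"], // hypothetical type names
    limit: 10,
  },
});
```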
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool fetches entries with filtering and pagination, but doesn't cover critical aspects like whether this is a read-only operation, potential rate limits, authentication requirements, or what the return format looks like. This leaves significant gaps for an agent to understand the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('fetch journal entries for the active agent') followed by key capabilities. There's no wasted language or redundancy, making it appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 4 parameters, no annotations, and no output schema, the description is insufficient. It doesn't explain what journal entries contain, how results are structured, whether there are default limits, or error conditions. The agent lacks critical context to use this tool effectively despite the good parameter documentation in the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description mentions filtering by time range, entry type, and pagination, which maps to the 'since', 'types', and 'cursor' parameters. However, with 100% schema description coverage, the schema already documents all four parameters thoroughly. The description adds minimal value beyond what's in the schema, meeting the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('fetch journal entries') and target resource ('for the active agent'), making the purpose unambiguous. However, it doesn't differentiate from sibling tools like 'action_journal.query' or 'get_session_context' which might have overlapping functionality, preventing a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions optional filtering by time range, entry type, and pagination, which implies some usage context, but provides no explicit guidance on when to use this tool versus alternatives like 'action_journal.query' or 'get_session_context'. There's no mention of prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_knowledge_gaps: Get Knowledge Gaps (C)
Identify knowledge gaps across domains.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results. | |
| domain | No | Filter by domain. | |
| offset | No | Offset for pagination. |
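An offset-paging sketch, reusing the connected `client`: the domain, page size, upper bound, and stop condition are all guesses, since the listing ships no output schema to page against.

```typescript
const pageSize = 20;
for (let offset = 0; offset < 100; offset += pageSize) {
  const page = await client.callTool({
    name: "get_knowledge_gaps",
    arguments: { domain: "billing", limit: pageSize, offset }, // hypothetical domain
  });
  console.log(page.content); // stop early once a page comes back empty
}
```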
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. 'Identify knowledge gaps' suggests a read-only analysis operation, but doesn't disclose whether this requires specific permissions, what format the results take, whether it's computationally expensive, or how 'knowledge gaps' are determined. The description lacks behavioral details about the tool's operation beyond its basic purpose.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: a single five-word sentence. While this is efficient, it may be too brief given the complexity of identifying 'knowledge gaps', a concept that likely requires more explanation. Every word earns its place, but the description might benefit from slightly more elaboration to be truly helpful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool that presumably analyzes complex knowledge structures across domains, the description is insufficient. With no annotations, no output schema, and a vague purpose statement, an agent would struggle to understand what this tool actually returns or how to interpret its results. The description doesn't compensate for the lack of structured metadata about the tool's behavior and outputs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all three parameters (limit, domain, offset) having clear descriptions in the schema. The tool description adds no parameter information beyond what's already documented in the structured schema. According to scoring rules, when schema coverage is high (>80%), the baseline is 3 even with no param info in the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Identify knowledge gaps across domains' states a general purpose but lacks specificity about what constitutes a 'knowledge gap' or how they are identified. It mentions 'across domains' which provides some scope, but doesn't distinguish this tool from potential siblings that might analyze knowledge in other ways. The verb 'identify' is clear but the object 'knowledge gaps' is vague without operational definition.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools like 'get_beliefs', 'get_domain_affinity', 'get_reasoning_profile', and 'find_contradictions' that might relate to knowledge analysis, there's no indication of when this specific gap identification tool is appropriate versus those other tools. No context, prerequisites, or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_passport: Get Passport (B)
Retrieve the agent's trust passport.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
No parameters.
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action ('Retrieve') but doesn't describe what a 'trust passport' entails, how it's formatted, any authentication requirements, rate limits, or error conditions. This leaves significant gaps for a tool that likely returns sensitive or structured data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any fluff or redundancy. It's front-loaded and wastes no words, making it highly concise and well-structured for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It doesn't explain what a 'trust passport' is, what data it returns, or how to interpret the result. For a tool that likely provides critical agent information, more context is needed to guide effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters, and the schema description coverage is 100%, so there are no parameters to document. The description doesn't need to add parameter semantics, and it appropriately avoids unnecessary details, earning a baseline score of 4 for this dimension.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Retrieve') and resource ('the agent's trust passport'), making the purpose specific and understandable. It distinguishes this from sibling tools like 'verify_passport' by focusing on retrieval rather than verification, though it doesn't explicitly contrast with all siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_trust_level' or 'verify_passport'. It lacks context about prerequisites, timing, or scenarios where this tool is appropriate, leaving usage entirely implicit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_pricing: Get Pricing (B)
Retrieve current pricing information for the tenant.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
No parameters.
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states 'Retrieve', which implies a read operation, but doesn't disclose behavioral traits like whether this requires authentication, returns real-time or cached data, has rate limits, or what format the pricing information comes in. This leaves significant gaps for a tool that likely involves sensitive financial data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that gets straight to the point with no wasted words. It's appropriately sized for a simple retrieval tool and front-loads the essential information without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations, no output schema, and potentially complex financial data, the description is insufficient. It doesn't explain what 'pricing information' includes (e.g., plans, rates, tiers), how current the data is, or what format it returns. The agent would need to guess about the tool's behavior and output structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters with 100% schema description coverage, so the schema already fully documents the input requirements. The description doesn't need to add parameter information, and it appropriately doesn't mention any parameters. A baseline of 4 is appropriate for parameterless tools.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Retrieve') and resource ('current pricing information for the tenant'), making the purpose understandable. However, it doesn't differentiate from potential sibling tools like 'get_credit_balance' or 'get_spend_receipt' that might also involve financial information, leaving some ambiguity about scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With sibling tools like 'get_credit_balance' and 'get_spend_receipt' present, there's no indication of whether this tool is for general pricing, subscription plans, or specific cost calculations, leaving the agent to guess about appropriate contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_proof_chunks (Get Proof Chunks, Grade A)
Retrieve the chunk-level hashes for a completed proof job. Supports cursor-based pagination.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor for the next page of chunks | |
| proof_job_id | Yes | The proof job ID |
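The cursor parameter implies the usual drain-the-pages loop. A sketch under stated assumptions: call_tool comes from the get_pricing example (saved as mcp_sketch.py), and the 'chunks' and 'next_cursor' response fields are guesses, since the tool publishes no output schema.

```python
from mcp_sketch import call_tool  # the helper from the get_pricing sketch above

def fetch_all_chunks(proof_job_id: str) -> list:
    """Drain every page of chunk-level hashes for a completed proof job."""
    chunks, cursor = [], None
    while True:
        args = {"proof_job_id": proof_job_id}
        if cursor:
            args["cursor"] = cursor
        page = call_tool("get_proof_chunks", args)
        chunks.extend(page.get("chunks", []))  # response field name assumed
        cursor = page.get("next_cursor")       # response field name assumed
        if not cursor:
            return chunks
```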
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It adds useful context by specifying that it 'supports cursor-based pagination,' which clarifies how results are handled. However, it lacks details on permissions, rate limits, error conditions, or the format of returned hashes, leaving gaps in behavioral understanding for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first clause and adds a useful behavioral detail in the second sentence. It is appropriately sized with zero waste, making it efficient and easy to parse for an AI agent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (retrieving hashes with pagination), no annotations, and no output schema, the description provides basic context but lacks completeness. It covers the purpose and pagination behavior but misses details on output format, error handling, or prerequisites, which could hinder effective tool invocation without further information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with clear documentation for both parameters ('proof_job_id' and 'cursor'). The description does not add any additional meaning beyond what the schema provides, such as explaining parameter interactions or constraints. According to the rules, with high schema coverage, the baseline score is 3, as the schema adequately handles parameter semantics without extra description input.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('retrieve') and resource ('chunk-level hashes for a completed proof job'), making the purpose specific and understandable. However, it does not explicitly differentiate this tool from sibling tools like 'get_proof_receipt' or 'get_proof_status', which might also relate to proof jobs, leaving some ambiguity in sibling context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by mentioning 'completed proof job,' suggesting it should be used after a proof job is done, but it does not provide explicit guidance on when to use this tool versus alternatives like 'get_proof_receipt' or 'get_proof_status.' No exclusions or clear alternatives are stated, leaving usage context somewhat vague.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_proofpack (Get Proofpack, Grade C)
Download the full proofpack bundle for a receipt. Includes all chunk hashes, Merkle tree, signature, and verification instructions.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| receipt_id | Yes | The receipt ID to get the proofpack for |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the bundle contents (chunk hashes, Merkle tree, etc.) and that it's for verification, but doesn't cover critical aspects like whether this is a read-only operation, authentication requirements, rate limits, error conditions, or what the download format/response looks like. For a tool with zero annotation coverage, this leaves significant gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: two short sentences that communicate the core action and the bundle contents without any wasted words. It's front-loaded with the main action and resource, making it immediately understandable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is insufficiently complete. While it explains what a proofpack contains, it doesn't describe the return format, error handling, authentication needs, or operational constraints. Given the complexity implied by terms like 'Merkle tree' and 'verification instructions', more context would be helpful for an AI agent to use this tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'receipt_id' fully documented in the schema. The description doesn't add any additional parameter semantics beyond what the schema already provides (e.g., format examples, validation rules, or context about what constitutes a valid receipt_id). The baseline score of 3 reflects adequate coverage through the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Download') and resource ('full proofpack bundle for a receipt'), specifying what the tool does. Its emphasis on the comprehensive bundle implicitly sets it apart from siblings like 'get_proof_chunks' or 'get_proof_receipt', but the description never draws that contrast explicitly.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'get_proof_chunks' or 'get_proof_receipt'. The description implies it's for obtaining a complete verification package, but lacks explicit context about prerequisites, timing, or comparisons to sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_proof_receipt (Get Proof Receipt, Grade C)
Retrieve the cryptographic proof receipt for a specific answer. Contains the Merkle root, signature, and verification metadata.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| answer_id | Yes | The answer ID to get the proof receipt for |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions the content of the receipt but does not cover critical aspects like whether this is a read-only operation, authentication requirements, rate limits, error handling, or response format. This leaves significant gaps for a tool that likely involves cryptographic data retrieval.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two well-structured sentences that efficiently convey the purpose and key components without any wasted words. It is front-loaded with the main action and resource, making it concise and effective.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of cryptographic proof receipts and the lack of annotations and output schema, the description is incomplete. It does not explain the return values, error conditions, or behavioral traits, leaving the agent with insufficient context for reliable invocation in a potentially sensitive domain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'answer_id' clearly documented. The description does not add any additional meaning beyond the schema, such as format examples or constraints, but the schema is sufficient, so the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Retrieve') and resource ('cryptographic proof receipt for a specific answer'), with specific components listed (Merkle root, signature, verification metadata). However, it does not explicitly differentiate from sibling tools like 'get_proof_chunks' or 'get_proof_status', which may have overlapping domains, so it falls short of a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., 'get_proof_chunks' or 'get_proof_status'), nor does it mention any prerequisites or exclusions. Usage is implied by the purpose but lacks explicit context for selection among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_proof_status (Get Proof Status, Grade A)
Poll the status of a proof job. Returns the current state (queued, processing, complete, failed) and progress details.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| proof_job_id | Yes | The proof job ID to check |
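Because the description names the possible states (queued, processing, complete, failed) but no output schema backs them, a polling loop has to guess which field carries the state. A hedged sketch, reusing call_tool from the get_pricing example and assuming a top-level 'state' field:

```python
import time

from mcp_sketch import call_tool  # the helper from the get_pricing sketch above

def wait_for_proof(proof_job_id: str, interval: float = 2.0, timeout: float = 120.0) -> dict:
    """Poll get_proof_status until the job reports complete or failed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = call_tool("get_proof_status", {"proof_job_id": proof_job_id})
        state = result.get("state")  # field name assumed; values per the description
        if state in ("complete", "failed"):
            return result
        time.sleep(interval)  # the description gives no polling guidance, so the interval is a guess
    raise TimeoutError(f"proof job {proof_job_id} still pending after {timeout}s")
```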
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions the tool 'polls' status and returns state/progress details, which implies a read-only operation, but it does not disclose critical behaviors such as rate limits, authentication requirements, or whether it's idempotent. For a status-checking tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short, well-structured sentences that efficiently convey the tool's action ('poll'), target ('proof job'), and return information ('state and progress details'). It is front-loaded with the core purpose and avoids unnecessary details, making it concise and effective.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (one parameter, no output schema, no annotations), the description is minimally complete but lacks depth. It covers the basic purpose and return types, but without annotations or output schema, it misses details like error handling or response structure. This is adequate for a simple status tool but could be more informative.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'proof_job_id' fully documented in the schema. The description does not add any additional meaning beyond what the schema provides (e.g., format examples or constraints), so it meets the baseline of 3 for adequate but not enhanced parameter semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('poll') and resource ('proof job'), distinguishing it from siblings like 'get_proof_chunks' or 'get_proof_receipt' by focusing on status monitoring rather than content retrieval. It explicitly mentions what information is returned (state and progress details), making the function unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for checking the status of a proof job, but it does not explicitly state when to use this tool versus alternatives (e.g., 'get_proof_receipt' for results or 'get_proof_chunks' for content). No exclusions or prerequisites are mentioned, leaving the agent to infer context from the tool name and description alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_reasoning_profile (Get Reasoning Profile, Grade B)
Get the agent's current reasoning profile.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states this is a retrieval operation but provides no information about authentication requirements, rate limits, error conditions, or what format the reasoning profile returns. For a tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that communicates the core purpose without any wasted words. It's appropriately sized for a simple retrieval tool and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and a simple zero-parameter design, the description is insufficient. It doesn't explain what a 'reasoning profile' contains, how it's structured, or what the agent should expect as output. For a tool that presumably returns configuration data, more context about the return value would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has zero parameters, and schema description coverage is 100%, so there are no parameters to document. The description appropriately doesn't mention parameters, which is correct for a parameterless tool. A baseline of 4 reflects that the description doesn't need to compensate for any parameter documentation gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('agent's current reasoning profile'), making the purpose immediately understandable. However, it doesn't differentiate this tool from its sibling 'set_reasoning_profile' beyond the verb difference, missing an opportunity to clarify the read-only vs. write distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided about when to use this tool versus alternatives. While the name implies retrieval, there's no mention of prerequisites, typical use cases, or how it relates to sibling tools like 'set_reasoning_profile' or 'get_session_context'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_session_context (Get Session Context, Grade B)
Retrieve the current session context for the active agent, including recent interactions and state.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves data, implying a read-only operation, but doesn't specify permissions, rate limits, data format, or whether it's real-time vs. cached. For a tool with zero annotation coverage, this leaves significant gaps in understanding its behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('retrieve') and resource. It wastes no words, making it easy to parse and understand quickly without unnecessary detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (0 parameters, no output schema, no annotations), the description is minimally adequate. It explains what the tool does but lacks details on behavior, output format, or usage context, which could be helpful for an agent despite the simple schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0 parameters with 100% coverage, so no parameter documentation is needed. The description appropriately doesn't discuss parameters, focusing on the tool's purpose instead, which aligns with the schema's completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('retrieve') and resource ('current session context for the active agent'), specifying what the tool does. It distinguishes from siblings by focusing on session context rather than other data types like beliefs, receipts, or policies, though it doesn't explicitly contrast with similar tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, timing, or compare it to siblings like 'get_journal' or 'annotate_session', leaving the agent to infer usage context without explicit direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_spend_receipt (Get Spend Receipt, Grade C)
Retrieve a specific spend receipt by ID.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| receipt_id | Yes | The receipt ID to look up. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It only states the retrieval action without mentioning permissions, rate limits, error handling, or what the output looks like (e.g., receipt details format). This is inadequate for a tool that likely involves sensitive financial data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any fluff. It's front-loaded and appropriately sized, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits (e.g., authentication needs, response format) and doesn't provide enough context for a retrieval tool in a complex system with many sibling tools, leaving significant gaps for agent understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the input schema already documents the 'receipt_id' parameter fully. The description adds no meaning beyond implying it's for lookup, which aligns with the schema but neither compensates for gaps nor enhances it, meeting the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Retrieve') and resource ('spend receipt'), specifying it's for a specific receipt by ID. However, it doesn't distinguish this from sibling tools like 'get_proof_receipt' or 'diff_receipts', which might handle similar receipt-related operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. For example, it doesn't mention if this is for viewing details after a purchase or how it differs from 'get_proof_receipt' or 'diff_receipts' in the sibling list, leaving the agent without context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_stale_pins (Get Stale Pins, Grade B)
List pinned items for the active agent that may be outdated and need refresh or removal.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of stale pins to return. | |
| cursor | No | Pagination cursor from a previous response. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the tool lists items 'that may be outdated and need refresh or removal,' hinting at a read-only diagnostic function, but doesn't clarify permissions, rate limits, response format, or whether the tool itself performs any actions on the pins. This leaves significant gaps for a tool with potential mutation implications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose without unnecessary words. Every part of the sentence contributes meaning, making it appropriately sized and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is minimally complete. It explains what the tool does but omits critical details like return format, error conditions, or how 'stale' is determined. For a tool with potential read/write implications and no structured safety hints, this leaves the agent under-informed about behavioral expectations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with clear documentation for 'limit' and 'cursor' parameters. The description adds no additional parameter semantics beyond what the schema provides, such as explaining what constitutes 'stale' or default behaviors. This meets the baseline score of 3 for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'List pinned items for the active agent that may be outdated and need refresh or removal.' It specifies the verb ('List'), resource ('pinned items'), and scope ('for the active agent'), but doesn't explicitly differentiate from sibling tools like 'get_watches' or 'get_watch_alerts' which might have overlapping functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, exclusions, or compare it to sibling tools like 'get_watches' or 'get_watch_alerts', leaving the agent to infer usage context solely from the tool name and description.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_trust_level (Get Trust Level, Grade B)
Get the current agent's trust escalation level.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool retrieves the 'current' trust level, implying it's a read operation, but does not disclose behavioral traits like authentication needs, rate limits, or what 'current' entails (e.g., real-time vs. cached). This leaves significant gaps for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it highly concise and well-structured for its simplicity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has no parameters, no annotations, and no output schema, the description is minimally adequate but incomplete. It explains what the tool does but lacks details on return values, error conditions, or behavioral context, which are important for a tool that might involve sensitive trust data.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters, and the input schema has 100% description coverage (vacuously, since there is nothing to describe). The description does not need to add parameter semantics, as there are none to explain. A baseline of 4 is appropriate since no parameters exist, and the description does not mislead about inputs.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and the resource ('current agent's trust escalation level'), making the purpose specific and understandable. However, it does not differentiate from sibling tools like 'get_active_policy' or 'get_credit_balance', which follow a similar 'get X' pattern, so it lacks explicit distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as other 'get_' tools like 'get_active_policy' or 'get_credit_balance'. It implies usage for retrieving trust level but offers no context on prerequisites, timing, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_watch_alerts (Get Watch Alerts, Grade C)
Retrieve alerts triggered by a specific watch, with optional filtering by time and pagination.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of alerts to return. | |
| since | No | ISO 8601 datetime to fetch alerts after. | |
| cursor | No | Pagination cursor from a previous response. | |
| watch_id | Yes | The ID of the watch to retrieve alerts for. |
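The 'since' parameter expects an ISO 8601 datetime, which Python's datetime.isoformat() produces directly. A sketch combining the time filter with pagination, reusing call_tool from the get_pricing example; the watch ID and the 'next_cursor' response field are assumptions:

```python
from datetime import datetime, timedelta, timezone

from mcp_sketch import call_tool  # the helper from the get_pricing sketch above

WATCH_ID = "watch_123"  # placeholder ID; list real ones with get_watches

# Fetch alerts from the last 24 hours, 50 at a time.
since = (datetime.now(timezone.utc) - timedelta(hours=24)).isoformat()
page = call_tool("get_watch_alerts", {"watch_id": WATCH_ID, "since": since, "limit": 50})

cursor = page.get("next_cursor")  # response field name assumed; no output schema is published
while cursor:
    page = call_tool("get_watch_alerts", {"watch_id": WATCH_ID, "cursor": cursor})
    cursor = page.get("next_cursor")
```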
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'optional filtering by time and pagination,' which hints at some behavior, but lacks critical details like whether this is a read-only operation, if it requires specific permissions, rate limits, or what the response format looks like. For a tool with no annotation coverage, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and includes key optional features. Every word contributes to understanding without redundancy, making it appropriately concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of retrieving alerts with filtering and pagination, no annotations, and no output schema, the description is incomplete. It doesn't explain the return format, error conditions, or behavioral nuances like ordering of results, which are essential for effective tool use in this context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds minimal value beyond the input schema, which has 100% coverage. It implies filtering by time ('since') and pagination ('cursor', 'limit'), but doesn't provide additional context like default values, typical usage patterns, or how parameters interact. Since the schema already documents all parameters well, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Retrieve') and resource ('alerts triggered by a specific watch'), making the purpose immediately understandable. However, it doesn't explicitly differentiate from sibling tools like 'get_watches' or 'get_blast_radius', which might also involve alert-related operations, so it doesn't reach the highest score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as whether it's for real-time monitoring or historical analysis, or if other tools like 'query_vault' might be better for broader searches. It mentions optional filtering but doesn't specify use cases or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_watches (Get Watches, Grade B)
List all active watches for the current agent.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of watches to return. | |
| cursor | No | Pagination cursor from a previous response. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool lists active watches but doesn't clarify if this is a read-only operation, whether it requires authentication, or details about rate limits or pagination behavior. This leaves significant gaps in understanding how the tool behaves beyond its basic function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence that efficiently conveys the tool's purpose without unnecessary words. It is front-loaded with the core action and resource, making it easy to parse and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (listing with pagination), no annotations, and no output schema, the description is minimally adequate but incomplete. It covers the basic purpose but lacks details on behavior, output format, or error handling, which are important for effective tool use in this context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the schema fully documents the 'limit' and 'cursor' parameters. The description adds no additional meaning or context about these parameters, such as default values or usage examples. This meets the baseline score when schema coverage is high.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List') and resource ('all active watches for the current agent'), making the purpose specific and understandable. However, it doesn't explicitly differentiate from sibling tools like 'get_watch_alerts', which might serve a related but distinct purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'get_watch_alerts' or other sibling tools. It lacks context about prerequisites, exclusions, or typical scenarios for invocation, leaving the agent without usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
invite_seat (Invite Seat, Grade C)
Invite a new member to the organisation by email address.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| role | No | Role to assign to the new member (default: member). | |
| email | Yes | Email address of the person to invite. | |
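Both calls below reuse call_tool from the get_pricing example. Omitting 'role' exercises the documented default of 'member'; the 'admin' value in the second call is a guess, since the schema does not enumerate valid roles.

```python
from mcp_sketch import call_tool  # the helper from the get_pricing sketch above

# Default role ('member', per the schema): only the email is required.
call_tool("invite_seat", {"email": "new.colleague@example.com"})

# Explicit role; 'admin' is an assumed value, not one the schema documents.
call_tool("invite_seat", {"email": "new.admin@example.com", "role": "admin"})
```

As the quality notes below point out, the description never says whether this sends an invitation email or what permissions it needs, so treat the call as a mutation and confirm the outcome out of band.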
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but only states the basic action. It doesn't disclose behavioral traits like whether this sends an email invitation, requires specific permissions, has rate limits, or what happens on success/failure. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core purpose and appropriately sized for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't cover behavioral aspects like permissions, side effects, or response format, leaving gaps that could hinder an AI agent's correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('email' and 'role'). The description adds no additional meaning beyond implying the 'email' parameter is used for invitation, which is already clear from the schema. Baseline 3 is appropriate as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('invite') and resource ('new member to the organisation') with the mechanism ('by email address'). It's specific but doesn't explicitly distinguish from sibling tools like 'change_seat_role' or 'revoke_seat', which would require mentioning it's for adding new members rather than modifying existing ones.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'change_seat_role' for existing members or 'list_seats' for viewing members. It lacks context such as prerequisites (e.g., admin permissions) or exclusions (e.g., not for updating roles).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
join_coalition (Join Coalition, Grade C)
Join an existing coalition with a credit pledge.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| pledge_crux | No | Pledge amount in crux credits (defaults to 1). | |
| coalition_id | Yes | The coalition ID to join. |
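A sketch reusing call_tool from the get_pricing example. The coalition ID is a placeholder, and the balance check is a prudent guess rather than a documented prerequisite (the quality notes below flag that the credit requirement is undisclosed):

```python
from mcp_sketch import call_tool  # the helper from the get_pricing sketch above

COALITION_ID = "coal_42"  # placeholder ID

# Checking the balance first is an assumption, not a documented prerequisite;
# get_credit_balance is a sibling tool in this listing, assumed parameterless.
balance = call_tool("get_credit_balance")
print(balance)  # inspect before pledging; the response shape is not documented

# Omitting pledge_crux uses the documented default of 1 crux credit.
call_tool("join_coalition", {"coalition_id": COALITION_ID})

# Or pledge explicitly.
call_tool("join_coalition", {"coalition_id": COALITION_ID, "pledge_crux": 5})
```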
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but lacks critical behavioral details. It mentions a 'credit pledge' but doesn't specify if this is irreversible, requires sufficient credit balance, triggers notifications, or has rate limits. The description implies a write operation ('join') but doesn't disclose permissions, side effects, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('join an existing coalition') and includes the key constraint ('with a credit pledge'). There is no wasted verbiage or redundant information, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't explain what happens after joining (e.g., confirmation message, updated membership list), error scenarios (e.g., invalid ID, insufficient credits), or dependencies on other tools. Given the complexity of joining a coalition with financial implications, more context is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds no additional semantic context beyond implying 'pledge_crux' relates to the 'credit pledge' mentioned. This meets the baseline of 3 when schema coverage is high, but doesn't enhance understanding of parameter interactions or constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('join') and resource ('existing coalition'), specifying it involves a 'credit pledge'. It distinguishes from sibling 'create_coalition' by focusing on joining rather than creating. However, it doesn't explicitly contrast with other potential sibling tools like 'invite_seat' or 'list_seats' that might involve coalition membership.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a coalition ID from another tool), exclusions (e.g., cannot join if already a member), or comparisons to related tools like 'invite_seat' or 'get_credit_balance' for checking eligibility.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_seats (List Seats, Grade B)
List all seats (members) in the current organisation.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of seats to return. | |
| cursor | No | Pagination cursor from a previous response. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states 'List all seats' but doesn't mention pagination behavior (implied by 'limit' and 'cursor' parameters), permissions required, rate limits, or what 'all' entails (e.g., active vs. inactive seats). For a listing tool with zero annotation coverage, this leaves significant gaps in understanding how it behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core purpose and uses parentheses to clarify 'seats' as 'members'. Every word earns its place without redundancy or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (listing operation), 100% schema coverage, and no output schema, the description is minimally adequate. However, it lacks context on pagination behavior, permissions, or return format, which would be helpful for an agent to use it correctly. Without annotations, it should do more to compensate for missing behavioral details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both 'limit' and 'cursor' parameters fully documented in the schema. The description adds no additional parameter semantics beyond what's in the schema (e.g., it doesn't explain default values, pagination flow, or typical usage patterns). Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List all seats') and resource ('seats (members) in the current organisation'), providing a specific verb+resource combination. However, it doesn't explicitly distinguish this tool from sibling tools like 'change_seat_role' or 'invite_seat', which would require mentioning that this is a read-only listing operation versus mutation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing organisation context), exclusions, or comparisons to similar tools like 'get_passport' or 'get_trust_level' that might retrieve related member information. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
memory_reason_about - Memory Reason About (A)
Reason over previously retrieved memory chunks and optional curated facts using the cached Pattern B prompt.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| facts | No | Optional ESI facts returned by memory_retrieve. | |
| chunks | Yes | Chunks returned by one or more memory_retrieve calls. | |
| intent | Yes | Intent returned by memory_retrieve. | |
| question | Yes | The user question to answer. | |
| retrievalReceiptIds | Yes | Receipt IDs from prior memory_retrieve calls. |
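To make the implied prerequisite concrete, here is a minimal call sketch reusing the initialized `session` from the list_seats example above. The chunk and intent shapes are assumptions, since memory_retrieve publishes no output schema; all IDs are illustrative.

```python
# Placeholders standing in for fields returned by earlier memory_retrieve
# calls; exact shapes are assumptions (no output schema is published).
chunks_from_retrieve = [{"id": "chk_01", "text": "Q3 rollout was approved."}]
intent_from_retrieve = "answer-question"

result = await session.call_tool(
    "memory_reason_about",
    arguments={
        "question": "What did the user decide about the Q3 rollout?",
        "chunks": chunks_from_retrieve,
        "intent": intent_from_retrieve,
        "retrievalReceiptIds": ["rcp_01"],  # receipts from prior retrievals
        # "facts": [...],  # optional ESI facts, also from memory_retrieve
    },
)
```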
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description only mentions 'using the cached Pattern B prompt' without explaining behavioral traits like state changes, permissions, or side effects. It does not add transparency beyond the basic purpose, leaving critical behavioral details unspecified.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences long, front-loading the primary purpose, then providing usage guidance. Each sentence adds value with no redundancy, making it highly efficient and structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description does not specify the return value or output format, which is problematic given the absence of an output schema. It implies a prerequisite (prior memory retrieval) but does not make it explicit. With 5 parameters and a reasoning task, the tool's complexity warrants more detail about what it produces.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already provides clear parameter definitions. The description adds no additional parameter-level meaning beyond noting the source of chunks and facts. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The tool's description clearly states its purpose: 'Reason over previously retrieved memory chunks and optional curated facts using the cached Pattern B prompt.' The verb 'reason' and resource 'memory chunks and facts' are specific. It is distinguishable from related tools like memory_retrieve, and the routing note pointing to cuecrux_session provides context for selection among siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises the agent to prefer cuecrux_session and notes that this tool is directly callable for backward compatibility. This provides clear guidance on when to use this tool versus the alternative, making it highly actionable.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
memory_retrieve - Memory Retrieve (A)
Retrieve memory chunks, optional curated ESI facts, and passport-driven engrams for Pattern B memory reasoning. The pre_logic field in the response is a ready-to-inject system prompt preamble containing structural data-shape facts calibrated to the calling model's capability class — insert it before reasoning.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | The memory question or retrieval query. | |
| groupId | No | Optional enrichment config group. | |
| modelId | No | The LLM model ID making this call (e.g. 'claude-sonnet-4-6'). Used to calibrate which engrams are dispatched and how pre_logic is formatted. Omit if unknown. | |
| iteration | No | 1-based retrieval iteration number. | |
| sessionId | No | Optional session identifier for receipt grouping. | |
| topicHints | No | Optional topic hints. |
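A hedged call sketch, again reusing the initialized `session` from the first example; the query, hints, and session ID are illustrative, and the final comment restates the description's own instruction about pre_logic.

```python
result = await session.call_tool(
    "memory_retrieve",
    arguments={
        "query": "decisions about the Q3 rollout",
        "modelId": "claude-sonnet-4-6",   # calibrates engrams and pre_logic
        "iteration": 1,                    # 1-based retrieval iteration
        "sessionId": "sess_042",           # optional; groups receipts
        "topicHints": ["rollout", "Q3"],
    },
)
# Per the description, the response's pre_logic field is a ready-to-inject
# system prompt preamble: insert it before reasoning over the chunks.
```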
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description covers the pre_logic field's behavior and calibration but says nothing about side effects, auth, or rate limits. Without annotations, the description carries the full disclosure burden and remains incomplete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two paragraphs: the first sets the purpose, the second adds usage guidance. The second is lengthy, but neither wastes words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
There is no output schema, but the description explains pre_logic and the high-level return contents (chunks, facts, engrams), which is adequate for understanding the return value.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description does not add extra meaning beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool retrieves memory chunks, ESI facts, and engrams for Pattern B reasoning. That specific verb-plus-resource pairing distinguishes it from its many sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly recommends calling cuecrux_session first, explains why, and notes that this tool remains callable only for backward compatibility, which amounts to clear when-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pin_receipt - Pin Receipt (C)
Pin a receipt to prevent it from being garbage collected.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Reason for pinning. | |
| expires_at | No | ISO 8601 expiry timestamp. | |
| receipt_id | Yes | The receipt ID to pin. |
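A minimal call sketch assuming the initialized `session` from the first example; the receipt ID, reason, and expiry are illustrative.

```python
result = await session.call_tool(
    "pin_receipt",
    arguments={
        "receipt_id": "rcpt_0123",             # required
        "reason": "referenced by open audit",  # optional
        "expires_at": "2026-01-01T00:00:00Z",  # optional ISO 8601 expiry
    },
)
```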
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It implies a mutation ('Pin') but doesn't disclose behavioral traits such as required permissions, whether pinning is reversible, effects on system resources, or error conditions. This is inadequate for a tool that likely modifies state.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded and appropriately sized for its function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is insufficient. It lacks details on behavioral context, error handling, and what happens post-pinning, leaving gaps in understanding the tool's full impact.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description adds no meaning beyond the schema, as it doesn't explain parameter interactions or provide examples. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Pin') and resource ('a receipt') with the specific purpose 'to prevent it from being garbage collected.' It distinguishes the tool's function but doesn't explicitly differentiate it from sibling tools like 'get_stale_pins' or 'get_spend_receipt'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites, conditions for pinning, or related tools like 'get_stale_pins' for checking pinned receipts, leaving usage context unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
proof_document - Proof Document (A)
Submit a document artefact for cryptographic proof. Creates an async proof job that retrieves the artefact, chunks it, hashes each chunk, and produces a Merkle receipt. Returns the job ID for status polling.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Proof mode (default: light) | |
| metadata | No | Optional metadata to attach to the proof job | |
| artefact_id | Yes | The artefact ID to proof |
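A sketch of the submit-then-poll pattern the description implies, assuming the initialized `session` from the first example; the artefact ID and metadata are illustrative, and the status-polling tool is not named in this definition.

```python
result = await session.call_tool(
    "proof_document",
    arguments={
        "artefact_id": "art_0456",         # required: artefact to proof
        "mode": "light",                   # optional; 'light' is the default
        "metadata": {"source": "upload"},  # optional job metadata
    },
)
# The response carries a job ID; per the description, poll for status with
# it until the Merkle receipt is produced.
```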
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden and does well by disclosing key behavioral traits: it's an async operation ('creates an async proof job'), describes the processing steps ('retrieves the artefact, chunks it, hashes each chunk'), and specifies the return value ('Returns the job ID for status polling'). It doesn't mention error conditions, rate limits, or authentication requirements, but provides substantial operational context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. First sentence states purpose and outcome, second sentence describes the async nature and return value. Every word earns its place, and the information is front-loaded with the core functionality.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 3 parameters, 100% schema coverage, and no output schema, the description provides good context about the async nature and processing steps. It doesn't explain what happens to the proof job after creation or error scenarios, but covers the essential operational behavior well given the available structured data.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema (like explaining 'artefact_id' format or 'mode' implications). Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Submit a document artefact for cryptographic proof'), the resource ('document artefact'), and the outcome ('creates an async proof job...produces a Merkle receipt'). It distinguishes from sibling tools like 'get_proof_status' or 'get_proof_receipt' by focusing on the submission/creation action rather than retrieval operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by mentioning 'async proof job' and 'status polling', suggesting this initiates a process that needs follow-up. However, it doesn't explicitly state when to use this versus alternatives like 'get_proofpack' or 'diff_receipts', nor does it mention prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
purchase_bundle - Purchase Bundle (C)
Purchase a credit bundle by ID.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| metadata | No | Optional metadata for the purchase. | |
| bundle_id | Yes | The bundle ID to purchase. |
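A minimal call sketch assuming the `session` from the first example; the bundle ID and metadata are illustrative. Given the review's note that balance and authorization behavior are undocumented, treat this as a financial mutation and gate it accordingly.

```python
result = await session.call_tool(
    "purchase_bundle",
    arguments={
        "bundle_id": "bnd_starter_100",           # required
        "metadata": {"requested_by": "agent-7"},  # optional
    },
)
```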
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states this is a purchase operation (implying a financial transaction and system mutation), but doesn't mention critical aspects like whether this deducts from a credit balance, requires payment authorization, has side effects on user accounts, or what happens on success/failure. This leaves significant gaps for an agent to understand the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at just one sentence with no wasted words. It's front-loaded with the core action and immediately specifies the required parameter. Every word earns its place in this minimal but complete statement of purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a financial transaction tool with no annotations and no output schema, the description is insufficient. It doesn't explain what happens after purchase (does it return a receipt? update balances? trigger notifications?), doesn't mention authentication requirements, and provides no error handling context. Given the complexity of a purchase operation, this leaves too many unknowns for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds no additional semantic context about parameters beyond what's in the schema (e.g., what format bundle_id should be, what metadata is used for, or examples of valid values). This meets the baseline expectation when schema coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Purchase') and resource ('a credit bundle by ID'), making the purpose immediately understandable. However, it doesn't differentiate from sibling tools like 'browse_bundles' or 'get_pricing' which might be related to bundles, leaving room for confusion about when to use this specific purchase function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided about when to use this tool versus alternatives like 'browse_bundles' (which might list available bundles) or 'get_pricing' (which might show bundle costs). The description doesn't mention prerequisites such as authentication, payment methods, or whether the user needs sufficient credits/balance to make the purchase.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_vault - Query Vault (C)
Retrieve relevant documents from the vault using semantic search across one or more corpora.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| lane | No | Retrieval lane controlling depth and cost. Accepts `light|verified|audit` — map informal terms (e.g. 'quick'→`light`, 'strict'→`audit`) to the nearest enum value. Defaults to `light`. | |
| limit | No | Maximum number of results to return (1-50, default 8). | |
| query | Yes | The search query to retrieve documents for. | |
| corpusIds | No | Corpus IDs to search within. | |
| includeCommons | No | Whether to include common/shared corpora in the search. |
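A call sketch showing the lane mapping the schema describes, assuming the `session` from the first example; the corpus IDs and query are illustrative.

```python
result = await session.call_tool(
    "query_vault",
    arguments={
        "query": "incident response runbook",
        "lane": "verified",         # light | verified | audit ('strict' maps to 'audit')
        "limit": 8,                 # 1-50; 8 is the default
        "corpusIds": ["corp_ops"],  # restrict the search to specific corpora
        "includeCommons": True,     # also search common/shared corpora
    },
)
```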
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool performs retrieval via semantic search but omits critical details such as authentication requirements, rate limits, error handling, or the format of returned documents. For a tool with 5 parameters and no output schema, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose without unnecessary words. Every part of the sentence contributes to understanding the tool's function, making it appropriately concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (5 parameters, semantic search functionality) and lack of annotations and output schema, the description is incomplete. It doesn't explain return values, error conditions, or behavioral traits like performance or limitations, leaving significant gaps for an AI agent to use it effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all 5 parameters. The description adds minimal value beyond the schema by implying semantic search functionality and corpus scope, but doesn't provide additional syntax, format details, or examples. Baseline 3 is appropriate when the schema handles parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Retrieve') and resource ('documents from the vault'), and specifies the method ('semantic search across one or more corpora'). It distinguishes itself from siblings like 'query_with_threshold' by focusing on semantic retrieval rather than threshold-based filtering, though it doesn't explicitly name alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'query_with_threshold' or other search-related tools. It mentions the scope ('across one or more corpora') but lacks explicit when/when-not instructions, prerequisites, or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_with_threshold - Query with Trust Threshold (C)
Execute a trust-routed query that filters results by minimum confidence and respects budget constraints.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | The search query. | |
| budget_cap | No | Maximum budget units to spend on this query. | |
| min_confidence | No | Minimum confidence threshold (0-1, default 0.8). | |
| requested_mode | No | Requested routing mode override. |
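A call sketch assuming the `session` from the first example; the thresholds are illustrative, tightening the documented 0.8 default.

```python
result = await session.call_tool(
    "query_with_threshold",
    arguments={
        "query": "current data-retention policy",
        "min_confidence": 0.9,  # 0-1; tighter than the 0.8 default
        "budget_cap": 25,       # max budget units to spend on this query
        # "requested_mode": ...  # optional routing mode override
    },
)
```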
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'trust-routed query', 'filters results by minimum confidence', and 'respects budget constraints', which imply some behavioral traits like routing based on trust and cost management. However, it lacks details on permissions, rate limits, error handling, or what happens if budget is exceeded, leaving significant gaps for a tool with potential financial or trust implications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads key information ('Execute a trust-routed query') and avoids redundancy. Every word contributes to understanding the tool's purpose, making it appropriately concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity implied by terms like 'trust-routed' and budget constraints, along with no annotations and no output schema, the description is incomplete. It doesn't explain the return format, error conditions, or deeper behavioral aspects, leaving the agent with insufficient context to use the tool effectively in varied scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds minimal value by implying that 'min_confidence' and 'budget_cap' are used for filtering and constraints, but doesn't provide additional semantics beyond what's in the schema descriptions. This meets the baseline of 3 for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('execute', 'filters') and resources ('trust-routed query', 'results'), specifying it filters by minimum confidence and respects budget constraints. However, it doesn't explicitly distinguish this tool from sibling tools like 'query_vault' or 'get_beliefs', which might also involve querying operations, leaving some ambiguity about its unique role.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'query_vault' or other query-related siblings. It mentions filtering by confidence and budget constraints but doesn't specify scenarios, prerequisites, or exclusions, offering minimal usage context beyond the basic purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
register_agent - Register Agent (B)
Self-register a new agent with the VaultCrux platform. No API key or tenant ID required.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| callback_url | No | URL for the platform to send callbacks to. | |
| agent_framework | No | The agent framework being used (default: unknown). | |
| agent_display_name | No | A human-readable display name for the agent. | |
| framework_fingerprint | No | Unique fingerprint of the agent framework instance. |
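Since no credentials are needed, a registration sketch only has to fill the optional fields; all values below are illustrative, and the `session` is the one initialized in the first example.

```python
result = await session.call_tool(
    "register_agent",
    arguments={
        "agent_display_name": "research-assistant",
        "agent_framework": "langchain",                    # default: unknown
        "callback_url": "https://agent.example/callbacks",
        "framework_fingerprint": "lc-0.3.1-abc123",
    },
)
# The review notes the return value is undocumented; request_sponsor's
# description suggests registration yields a vcrx_self_ session token.
```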
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions that no API key or tenant ID is required, which is useful context about authentication needs. However, it lacks details on what the registration entails (e.g., whether it creates persistent resources, requires confirmation, has rate limits, or returns specific data like an agent ID), leaving significant gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the key information ('self-register a new agent') and adds necessary context ('No API key or tenant ID required') without any wasted words. Every part earns its place, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that this is a mutation tool with no annotations and no output schema, the description is incomplete. It covers the purpose and authentication context but lacks details on behavioral outcomes (e.g., what happens after registration, error conditions, or return values), which are critical for an agent to use this tool effectively in a complex environment with many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the schema already documents all four parameters thoroughly. The description doesn't add any meaning beyond what the schema provides (e.g., it doesn't explain parameter relationships or usage examples), resulting in a baseline score of 3 where the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('self-register') and resource ('new agent with the VaultCrux platform'), making the purpose unambiguous. However, it doesn't explicitly differentiate this from sibling tools like 'invite_seat' or 'create_coalition', which might also involve agent/entity creation in different contexts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('No API key or tenant ID required'), indicating it's for initial registration without existing credentials. However, it doesn't specify when NOT to use it or name explicit alternatives among the many sibling tools, such as 'invite_seat' for adding users to an existing setup.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
register_belief - Register Belief (C)
Register a belief about an answer for trust tracking.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| answer_id | No | The answer ID this belief relates to. | |
| cost_crux | No | Credit cost of the belief. | |
| receipt_id | No | The receipt ID backing this belief. | |
| confidence_band | No | Confidence band object. | |
| decision_context | No | Context for the decision. |
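A hedged sketch assuming the `session` from the first example; the confidence_band and decision_context shapes are assumptions, since the schema only names them as an object and a context value.

```python
result = await session.call_tool(
    "register_belief",
    arguments={
        "answer_id": "ans_0789",
        "receipt_id": "rcpt_0123",                     # backing receipt
        "confidence_band": {"low": 0.7, "high": 0.9},  # shape assumed
        "cost_crux": 2,                                # credit cost
        "decision_context": "routing decision for Q3 rollout",  # shape assumed
    },
)
```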
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden but only hints at behavior ('trust tracking'). It doesn't disclose whether this is a mutation, whether it requires authentication, what side effects it has, or how it interacts with the system. More behavioral context is needed for a tool with 5 parameters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It's appropriately sized and front-loaded with the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 5-parameter mutation tool with no annotations and no output schema, the description is insufficient. It lacks details on behavior, return values, error conditions, and integration with sibling tools, leaving significant gaps in understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all 5 parameters. The description adds no additional parameter semantics beyond what's in the schema, meeting the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('register') and resource ('belief about an answer'), specifying it's for 'trust tracking'. It distinguishes from obvious siblings like 'get_beliefs' (read vs. write) but doesn't explicitly differentiate from all other tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'get_beliefs' or 'watch_answer', nor any prerequisites or contextual triggers for belief registration. The description lacks explicit usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_sponsor - Request Sponsor (B)
Request a sponsor for the current agent session. Requires a session token (vcrx_self_ prefixed).
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| session_token | Yes | Session token obtained from agent registration. |
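A minimal sketch assuming the `session` from the first example; the token value is a placeholder with the documented vcrx_self_ prefix.

```python
result = await session.call_tool(
    "request_sponsor",
    arguments={
        # Token obtained from agent registration; note the required prefix.
        "session_token": "vcrx_self_0123456789abcdef",
    },
)
```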
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions the session token requirement, which is a behavioral trait, but doesn't disclose other critical aspects like what 'sponsor' means, whether this is a read or write operation, potential side effects, or response format. For a tool with no annotation coverage, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loaded, consisting of two clear sentences: one stating the purpose and another specifying the requirement. There is no wasted text, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It lacks details on what 'sponsor' entails, the tool's behavioral impact (e.g., read vs. write), and expected outcomes. For a tool with such minimal structured data, the description should provide more context to be fully helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the parameter 'session_token' fully documented in the schema. The description adds minimal value by specifying the token prefix ('vcrx_self_'), but doesn't provide additional semantics beyond what the schema already covers. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Request a sponsor') and the resource ('for the current agent session'), making the purpose understandable. However, it doesn't differentiate this tool from sibling tools like 'register_agent' or 'get_session_context', which might involve similar session-related operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by specifying 'Requires a session token (vcrx_self_ prefixed)', which suggests this tool is used after agent registration. However, it doesn't explicitly state when to use it versus alternatives (e.g., 'register_agent' for initial setup or 'get_session_context' for session info), leaving some ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
revoke_seat - Revoke Seat (C)
Remove a member from the organisation by revoking their seat.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| seat_id | Yes | The ID of the seat to revoke. |
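A minimal sketch assuming the `session` from the first example. Because the review flags this as a destructive mutation with unknown reversibility, verifying the seat ID first (for example via the seat listing) is a sensible precaution.

```python
# Destructive: removes the member. The seat_id value is illustrative.
result = await session.call_tool(
    "revoke_seat",
    arguments={"seat_id": "seat_0042"},
)
```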
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states this is a removal action ('remove a member'), implying a destructive mutation, but doesn't specify whether this is reversible, what permissions are required, if it triggers notifications, or what happens to associated data. For a mutation tool with zero annotation coverage, this leaves critical behavioral aspects undocumented.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded with the core action ('Remove a member') and avoids redundancy with the name or title.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (a destructive mutation tool), lack of annotations, and no output schema, the description is incomplete. It doesn't cover behavioral aspects like permissions, reversibility, or side effects, nor does it explain the result (e.g., what confirmation is returned). For a tool that removes organizational members, this leaves significant gaps for an AI agent to use it safely and effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage (the 'seat_id' parameter is fully documented in the schema), so the baseline is 3. The description doesn't add any parameter-specific details beyond what the schema provides (e.g., it doesn't explain how to obtain the seat_id or format constraints), but it doesn't need to since the schema is comprehensive.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('remove a member') and the resource ('from the organisation') with the specific mechanism ('by revoking their seat'). It distinguishes itself from sibling tools like 'change_seat_role' or 'list_seats' by focusing on removal rather than modification or listing. However, it doesn't explicitly differentiate itself from other potential removal tools (none appear among the siblings), so it falls short of a perfect 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., seat must exist, user must have permissions), exclusions (e.g., cannot revoke own seat), or related tools like 'invite_seat' or 'change_seat_role'. The agent must infer usage from the name and context alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
schedule_recheck - Schedule Recheck (C)
Schedule a periodic re-check of knowledge freshness.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Scope object for the recheck. | |
| cron_expr | No | Cron expression (defaults to '0 0 * * *'). | |
| next_run_at | No | ISO 8601 timestamp for the next run. |
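A call sketch assuming the `session` from the first example; the cron expression and timestamp are illustrative, and the scope shape is an assumption since the schema only names it as an object.

```python
result = await session.call_tool(
    "schedule_recheck",
    arguments={
        "cron_expr": "0 6 * * 1",               # Mondays 06:00; default is '0 0 * * *'
        "next_run_at": "2025-07-07T06:00:00Z",  # ISO 8601
        "scope": {"corpusIds": ["corp_ops"]},   # shape assumed
    },
)
```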
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the action ('Schedule a periodic re-check') but doesn't describe what happens after scheduling (e.g., how re-checks are triggered, if they require permissions, or if there are rate limits). This is a significant gap for a scheduling tool with no structured safety hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's appropriately sized and front-loaded, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of scheduling periodic tasks and the lack of annotations and output schema, the description is insufficient. It doesn't explain what 'knowledge freshness' entails, how the re-check operates, or what the tool returns, leaving critical behavioral aspects undocumented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all three parameters (scope, cron_expr, next_run_at). The description doesn't add any parameter-specific details beyond what's in the schema, such as examples or constraints, resulting in the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Schedule') and resource ('periodic re-check of knowledge freshness'), making the purpose understandable. However, it doesn't differentiate this tool from potential siblings like 'get_stale_pins' or 'forecast_obsolescence' that might also relate to knowledge freshness, so it doesn't reach the highest score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, exclusions, or related tools, leaving the agent to infer usage context solely from the tool name and purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
set_policy - Set Policy (C)
Set or update an active policy for the agent.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| rules | No | Policy rules object. | |
| policy_name | No | Policy name (defaults to 'default'). | |
| principal_id | No | Principal ID to apply the policy to. |
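A hedged sketch assuming the `session` from the first example; the rules shape is an assumption, since the schema only calls it a 'policy rules object'.

```python
result = await session.call_tool(
    "set_policy",
    arguments={
        "policy_name": "default",           # the documented default name
        "principal_id": "agent_0007",       # who the policy applies to
        "rules": {"max_lane": "verified"},  # rules shape assumed
    },
)
```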
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It implies a write operation ('set or update') but doesn't specify if this requires special permissions, whether changes are permanent or reversible, or what happens on success/failure. This is inadequate for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It's front-loaded with the core action and resource, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits (e.g., side effects, error handling), usage context, and return values, leaving significant gaps for an AI agent to understand how to invoke it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents the three parameters (rules, policy_name, principal_id). The description adds no additional meaning beyond what's in the schema, such as explaining the structure of 'rules' or how 'principal_id' relates to the agent. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('set or update') and the resource ('active policy for the agent'), making the purpose understandable. However, it doesn't differentiate from sibling tools like 'get_active_policy' or 'set_reasoning_profile', which could cause confusion about when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_active_policy' (for viewing) or 'set_reasoning_profile' (for a different configuration). There's no mention of prerequisites, such as whether a policy must exist to update it, or context for when setting vs. updating applies.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
set_reasoning_profile - Set Reasoning Profile (C)
Set the agent's reasoning methodology profile.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| constraints | No | Reasoning constraints array. | |
| methodology | No | Reasoning methodology object. |
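A hedged sketch assuming the `session` from the first example; both argument shapes are assumptions, since the schema only names an object and an array.

```python
result = await session.call_tool(
    "set_reasoning_profile",
    arguments={
        "methodology": {"style": "stepwise"},                # shape assumed
        "constraints": ["cite-receipts", "stay-in-budget"],  # shape assumed
    },
)
```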
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. 'Set' implies a mutation operation, but the description doesn't specify whether this is reversible, requires specific permissions, has side effects (e.g., affecting other agent functions), or what the expected outcome is. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with zero wasted words. It's front-loaded with the core action and target, making it easy to parse. Every word earns its place, and there's no redundancy or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (a mutation tool with nested objects and no output schema) and lack of annotations, the description is incomplete. It doesn't explain what a 'reasoning methodology profile' entails, how changes affect the agent's behavior, or what the tool returns (if anything). For a tool that likely impacts core agent functionality, more context is needed to guide proper use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with both parameters ('constraints' and 'methodology') documented in the schema. The description adds no additional parameter semantics beyond what's in the schema (e.g., no examples of valid constraints or methodology structures). Given the high schema coverage, the baseline score of 3 is appropriate, as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Set') and the target ('the agent's reasoning methodology profile'), which is specific and unambiguous. It distinguishes from the sibling tool 'get_reasoning_profile' by indicating a write operation rather than a read. However, it doesn't fully differentiate from other configuration tools like 'set_policy', leaving some ambiguity about the exact scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., whether the agent must have certain permissions), when it's appropriate (e.g., during setup or dynamic adjustment), or what happens if used incorrectly. With no usage context, the agent must infer from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
submit_feature_request: Submit Feature Request (Grade B)
Submit a new feature request or suggestion to the VaultCrux product team.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| title | Yes | Short title for the feature request. | |
| category | No | Category for the request. | other |
| metadata | No | Additional metadata to attach to the request. | |
| description | Yes | Detailed description of the requested feature. |
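As a sketch only, here is a hypothetical MCP tools/call payload for this tool, assuming the standard request shape; the argument values are invented placeholders, and 'other' is simply the documented default category (other accepted values are undocumented):
{
  "name": "submit_feature_request",
  "arguments": {
    "title": "Bulk watch removal",
    "description": "Allow removing several watches in one call instead of one unwatch_answer call per watch.",
    "category": "other"
  }
}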
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the action ('Submit') but lacks details on permissions required, whether submissions are public or private, confirmation mechanisms, rate limits, or what happens after submission (e.g., ticket creation, email notification). This leaves significant gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words or fluff. It is appropriately sized and front-loaded with the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is insufficient. It lacks behavioral context (e.g., side effects, response format) and does not compensate for the absence of structured data, leaving the agent with incomplete information for proper invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters (title, category, metadata, description). The description does not add any additional meaning, syntax, or examples beyond what the schema provides, meeting the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Submit a new feature request or suggestion') and identifies the target resource ('VaultCrux product team'), distinguishing it from sibling tools like 'vote_feature_request' or 'get_feature_requests' which involve different operations on similar resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'vote_feature_request' or 'get_feature_requests', nor any context about prerequisites, timing, or exclusions. The description only states what it does, not when it should be used.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tip_agent: Tip Agent (Grade C)
Send a credit tip to another agent.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Reason for the tip. | |
| amount_crux | Yes | Tip amount in crux credits. | |
| recipient_principal_id | Yes | The recipient agent's principal ID. |
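A minimal hypothetical call, assuming the standard MCP tools/call shape; the principal ID, amount, and reason are placeholders, not real values:
{
  "name": "tip_agent",
  "arguments": {
    "recipient_principal_id": "agent-principal-123",
    "amount_crux": 5,
    "reason": "Helpful answer handoff."
  }
}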
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action ('Send a credit tip') but doesn't describe whether this is a transactional operation, if it requires authentication, what happens on success/failure, or if there are rate limits. For a financial tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste, front-loaded with the core action and appropriately sized for the tool's complexity. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a financial transaction tool with no annotations and no output schema, the description is incomplete. It doesn't cover behavioral aspects like authorization needs, transaction outcomes, or error handling. While the schema covers parameters well, the overall context for safe and effective use is insufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description doesn't add any meaning beyond what's in the schema (e.g., it doesn't explain what 'crux credits' are or provide context for 'recipient_principal_id'). With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a credit tip') and the target ('to another agent'), which provides a specific verb+resource combination. However, it doesn't differentiate this tool from sibling tools like 'tip_platform' or explain how tipping an agent differs from tipping the platform, leaving room for improvement.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'tip_platform', nor does it mention prerequisites such as having sufficient credits (implied by the sibling tool 'get_credit_balance') or appropriate permissions. It lacks explicit when/when-not guidance and names no alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tip_platform: Tip Platform (Grade C)
Send a credit tip to the platform. Amount must be a positive number.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Tip amount (must be > 0). | |
| reason | No | Optional reason for the tip. |
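For comparison with tip_agent, a hypothetical payload with placeholder values; only the > 0 constraint on the amount is documented:
{
  "name": "tip_platform",
  "arguments": {
    "amount": 10,
    "reason": "Reliable service this week."
  }
}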
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions that the amount must be positive, which is useful, but fails to describe critical aspects like whether this is a write operation (implied by 'Send'), what happens upon success (e.g., credit deduction, confirmation), error conditions, or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences that directly convey the core purpose and a key constraint. There is no wasted language, and it is front-loaded with the main action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool that performs a financial transaction (tipping) with no annotations and no output schema, the description is insufficient. It lacks details on behavioral outcomes (e.g., what is returned, error handling), usage context (e.g., authentication needs, credit balance implications), and differentiation from similar tools like 'tip_agent'.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('amount' and 'reason') adequately. The description adds minimal value by reiterating that the amount must be positive, but does not provide additional context beyond what the schema specifies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a credit tip') and target ('to the platform'), which is specific and unambiguous. However, it does not explicitly differentiate this tool from sibling tools like 'tip_agent', leaving room for potential confusion about when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'tip_agent' (which likely tips an agent instead of the platform). It also lacks information about prerequisites, like whether the user must have sufficient credits or be authenticated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
unwatch_answer: Unwatch Answer (Grade A)
Remove an existing watch by its watch ID.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| watch_id | Yes | The ID of the watch to remove. |
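A hypothetical invocation; the watch ID is a placeholder and would normally come from an earlier watch_answer or get_watches call:
{
  "name": "unwatch_answer",
  "arguments": {
    "watch_id": "watch-7f3a"
  }
}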
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool removes a watch, implying a destructive mutation, but does not mention permissions required, whether the action is reversible, error handling (e.g., invalid watch IDs), or side effects. For a mutation tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words, front-loading the core action ('Remove an existing watch') and specifying the key input ('by its watch ID'). It is appropriately sized for a simple tool with one parameter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, no output schema, no annotations), the description is minimally adequate but lacks completeness for a mutation tool. It does not cover behavioral aspects like permissions, reversibility, or error responses, which are important for safe invocation by an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds meaning by specifying that the watch_id parameter corresponds to 'an existing watch' to remove, which clarifies the context beyond the schema's description ('The ID of the watch to remove'). With 100% schema description coverage and only one parameter, the description compensates adequately, earning a baseline above 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Remove') and the resource ('an existing watch by its watch ID'), making the purpose specific and unambiguous. It distinguishes from sibling tools like 'watch_answer' (which creates watches) and 'get_watches' (which lists them), establishing a clear functional boundary.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when a watch needs to be removed, but provides no explicit guidance on when to use this tool versus alternatives (e.g., if there are other ways to manage watches) or any prerequisites (e.g., needing an existing watch ID). It lacks context about when not to use it and names no alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
verify_passport: Verify Passport (Grade C)
Verify another agent's trust passport.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| principal_id | No | Principal ID to verify (defaults to own agent ID). |
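A hypothetical call with a placeholder principal ID; per the schema default, omitting principal_id would verify the calling agent's own passport:
{
  "name": "verify_passport",
  "arguments": {
    "principal_id": "agent-principal-456"
  }
}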
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the action is to 'verify,' which implies a read operation, but doesn't disclose behavioral traits such as authentication needs, rate limits, what 'verify' entails (e.g., checks validity, returns status), or potential side effects. This is a significant gap for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with zero waste. It's front-loaded and efficiently conveys the core purpose without unnecessary elaboration, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of verification operations and the lack of annotations and output schema, the description is incomplete. It doesn't explain what 'verify' means in practice, what the expected output is, or any error conditions, leaving the agent with insufficient context to use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'principal_id' documented as 'Principal ID to verify (defaults to own agent ID).' The description adds no additional meaning beyond this, as it doesn't elaborate on parameter usage or implications. Baseline 3 is appropriate since the schema fully describes the parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Verify another agent's trust passport' clearly states the action (verify) and the resource (trust passport), with the qualifier 'another agent's' indicating it's for external verification. However, it doesn't explicitly differentiate from sibling tools like 'get_passport' or 'get_trust_level', which might retrieve rather than verify.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description implies verification of another agent's passport, but it doesn't specify use cases, prerequisites, or exclusions, leaving the agent to infer context from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
vote_feature_request: Vote on Feature Request (Grade C)
Cast an upvote on an existing feature request to signal interest.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| metadata | No | Additional metadata to attach to the vote. | |
| request_id | Yes | The ID of the feature request to vote on. |
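An illustrative payload; the request ID is a placeholder, and since the expected shape of the metadata object is undocumented, the key shown here is a guess:
{
  "name": "vote_feature_request",
  "arguments": {
    "request_id": "req-0042",
    "metadata": { "source": "triage-agent" }
  }
}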
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the action is to 'cast an upvote', implying a write operation, but doesn't disclose behavioral traits such as whether this requires authentication, if votes are reversible, rate limits, or what happens on success/failure. The description is minimal and lacks critical operational context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action and purpose. There's no wasted wording, and it's appropriately sized for a simple tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a mutation tool. It lacks details on permissions, side effects, response format, or error handling. While concise, it doesn't provide enough context for safe and effective use by an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('request_id' and 'metadata'). The description doesn't add any meaning beyond the schema—it doesn't explain parameter usage, constraints, or examples. The baseline score of 3 is appropriate as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Cast an upvote') and the target resource ('an existing feature request'), with the goal 'to signal interest'. It's specific about the verb and resource, but doesn't explicitly differentiate from sibling tools like 'submit_feature_request' or 'get_feature_requests' beyond the action itself.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing an existing feature request ID), exclusions, or comparisons to sibling tools like 'get_feature_requests' for viewing requests or 'submit_feature_request' for creating them.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
watch_answer: Watch Answer (Grade B)
Create a watch on an answer to receive alerts when it changes or becomes stale.
Prefer cuecrux_session as your first and only direct MCP call. It returns a typed capability plan that routes this tool (and every other) to its preferred channel, tier, and cost class. One call per session is enough; the plan is the source of routing truth for all subsequent work. This tool remains directly callable for backward compatibility; the collapsed surface is the intended surface.
| Name | Required | Description | Default |
|---|---|---|---|
| answer_id | Yes | The ID of the answer to watch. | |
| frequency | No | How often to check for changes. | daily |
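A hypothetical call with a placeholder answer ID, using the documented default frequency; accepted frequency values beyond 'daily' are undocumented:
{
  "name": "watch_answer",
  "arguments": {
    "answer_id": "ans-9981",
    "frequency": "daily"
  }
}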
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden but only states the outcome ('receive alerts') without detailing alert mechanisms (e.g., notifications, webhooks), permissions required, rate limits, or whether watches are user-specific. It mentions 'changes or becomes stale' but doesn't define what counts as 'stale', leaving behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the purpose and outcome with zero wasted words. It directly communicates the tool's function without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a mutation tool ('Create'). It lacks details on return values (e.g., watch ID, confirmation), error conditions, or side effects, leaving significant gaps for an agent to understand full behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('answer_id' and 'frequency'). Beyond implying that 'answer_id' identifies the answer to watch, the description adds nothing: it doesn't clarify parameter interactions or usage past what the schema already states.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Create a watch'), target resource ('on an answer'), and outcome ('to receive alerts when it changes or becomes stale'). It distinguishes itself from sibling tools like 'unwatch_answer' and 'get_watch_alerts' by focusing on creation rather than removal or retrieval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'unwatch_answer' or 'get_watches', nor does it mention prerequisites (e.g., needing answer access) or exclusions. Usage is implied through the action but lacks explicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Claiming the connector lets you:
- Control your server's listing on Glama, including description and metadata
- Access analytics and receive server usage reports
- Get monitoring and health status updates for your server
- Feature your server to boost visibility and reach more users
For users:
- Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management – store and rotate API keys and OAuth tokens in one place
- Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
- Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.