arifOS — Constitutional AI Kernel
Server Details
Constitutional AI kernel with 13 MCP tools, 888_JUDGE verdict pipeline, and VAULT999 ledger.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 3.1/5 across 48 of 48 tools scored. Lowest: 1.1/5.
Many tools have overlapping purposes, such as multiple vault query tools (arif_vault_query and hermes_vault_query), multiple health probes (arif_floor_status, arif_stack_health_probe, hermes_system_status, arif_organ_attest_all), and multiple lease management tools (arif_lease_inspect, arif_lease_issue, arif_lease_revoke) that could be combined. The descriptions often contain similar language, making it difficult for an agent to distinguish between them.
Naming conventions are inconsistent: tools use multiple prefixes (arif_, hermes_, forge_, mcp_) without clear rationale, and while most are snake_case, the prefixes create confusion about organ ownership. Some tool names are overly long and include multiple verbs (e.g., arif_detect_institutional_shadow_drift) while others are very short (e.g., arif_act). The deprecated forge_ tools have a different naming pattern from active ones.
48 tools is excessive for a coherent interface. Many tools are redundant (e.g., two vault query tools, multiple health probes) or deprecated (forge_ tools). A leaner set of 20-30 well-scoped tools would be more manageable. The high count increases cognitive load on agents without proportional benefit.
The tool surface covers a wide range of capabilities: constitutional pipeline (init, observe, think, judge, act, seal), federation health, vault, wiki, leases, and fact-checking. However, there are notable gaps: no tool for user feedback or error handling, and the presence of deprecated tools suggests incomplete cleanup. The wiki tools require strict ordering (ingest before ask) which may cause agent failures if not followed.
Available Tools
48 tools
arif_act (A)
Execution gate (900). REQUIRES: seal_verdict_id + approved_action_hash from prior judge→seal pipeline. Without these, returns 888_HOLD structurally. Routes through A2ASealVerifier before execution. Skip when: still planning (use arif_think) or no prior SEAL exists. Last tool in constitutional pipeline — seal the result after.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| manifest | No | | |
| _envelope | No | | |
| session_id | No | | |
| seal_verdict_id | Yes | | |
| approved_action_hash | Yes | | |
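The two required arguments can be checked on the client before the call is sent. The following is a minimal sketch, assuming the server is reached through a standard MCP `tools/call` JSON-RPC request; the helper name `build_act_request` and the placeholder values are illustrative and do not come from arifOS.

```python
import json
from typing import Any

# Required argument names, taken from the arif_act input table above.
REQUIRED_ACT_ARGS = ("seal_verdict_id", "approved_action_hash")


def build_act_request(arguments: dict[str, Any], request_id: int = 1) -> dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC request for arif_act.

    Raises ValueError when a required argument is missing or empty, so the
    caller fails locally instead of receiving a structural 888_HOLD.
    """
    missing = [name for name in REQUIRED_ACT_ARGS if not arguments.get(name)]
    if missing:
        raise ValueError(f"arif_act requires: {', '.join(missing)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "arif_act", "arguments": dict(arguments)},
    }


if __name__ == "__main__":
    # Placeholder values: real ones come from the prior judge -> seal pipeline.
    request = build_act_request(
        {
            "seal_verdict_id": "<seal_verdict_id>",
            "approved_action_hash": "<approved_action_hash>",
        }
    )
    print(json.dumps(request, indent=2))
```

The optional arguments (actor_id, manifest, _envelope, session_id) are left out because the listing does not document their types or meaning.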
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
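A client may want to sanity-check a response against this table before acting on it. The sketch below encodes only what the table states (the required field names and the listed status and verdict values); the function name `check_envelope` is hypothetical, and the shape of nested fields such as result and nine_signal is not checked because the listing does not describe it.

```python
from typing import Any

# Required fields and allowed values, copied from the output schema table.
REQUIRED_FIELDS = ("tool", "result", "status", "reasons", "verdict", "nine_signal")
STATUSES = {"OK", "ERROR", "TIMEOUT", "DRY_RUN"}
VERDICTS = {"SEAL", "HOLD", "VOID", "SABAR", "PROVISIONAL", "PARTIAL"}


def check_envelope(envelope: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a response envelope (empty if none)."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in envelope
    ]
    if "status" in envelope and envelope["status"] not in STATUSES:
        problems.append(f"unknown status: {envelope['status']!r}")
    if "verdict" in envelope and envelope["verdict"] not in VERDICTS:
        problems.append(f"unknown verdict: {envelope['verdict']!r}")
    return problems
```

Returning a list of problems rather than raising on the first one keeps the full picture available for logging.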
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Despite missing annotations, the description details the hard requirement, the gradient toward act, the structural 888_HOLD return without required params, and the cryptographic verification route through A2ASealVerifier. It also notes the post-execution action of sealing the result.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured, front-loading the core purpose, then logically progressing through requirements, usage guidance, and exclusions. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description adequately covers the tool's role in the pipeline, preconditions, and process. Minor shortfall: optional parameters are not explained, but the core functionality is well-covered.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description explains the purpose of the two required parameters (seal_verdict_id and approved_action_hash) in context. However, it does not describe the other parameters (actor_id, manifest, _envelope, session_id), leaving them undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is for executing an approved action as the '900 execution gate'. It distinguishes itself from sibling tools by positioning it as the 'LAST tool in the constitutional pipeline' and explicitly ties it to prior judge and seal steps.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to select (when seal_verdict_id and approved_action_hash are present) and when NOT to select (during planning/critique phase or without prior SEAL). Provides clear context for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_canary (A, Read-only, Idempotent)
Unified transport diagnostic probe. One tool, six modes. Use for liveness checks, protocol version verification, schema round-trip testing, transport detail dumps, MCP handshake tests, and full conformance spine. Modes: ping | schema_echo | version_echo | transport_echo | initialize_probe | conformance_report
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | ping | |
| payload | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| client_capabilities | No | | |
| include_constitution | No | | |
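Because the mode parameter is a closed set of six values, a client can reject typos before making a network call. This is a rough sketch that assumes the tool is invoked through MCP `tools/call`; `canary_call` and `canary_sweep` are hypothetical helper names, and only the mode argument is populated since the other parameters are undocumented.

```python
from typing import Any

# The six modes named in the arif_canary description.
CANARY_MODES = (
    "ping",
    "schema_echo",
    "version_echo",
    "transport_echo",
    "initialize_probe",
    "conformance_report",
)


def canary_call(mode: str = "ping") -> dict[str, Any]:
    """Return the `params` object of an MCP `tools/call` request for arif_canary."""
    if mode not in CANARY_MODES:
        raise ValueError(f"unknown arif_canary mode: {mode!r}")
    return {"name": "arif_canary", "arguments": {"mode": mode}}


def canary_sweep() -> list[dict[str, Any]]:
    """One call per mode, in the order the description lists them."""
    return [canary_call(mode) for mode in CANARY_MODES]
```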
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description aligns with annotations (readOnlyHint, idempotentHint, destructiveHint) and adds behavioral context by detailing the modes. It provides transparency beyond the annotations by explaining the diagnostic capabilities.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: two sentences that front-load the purpose and enumerate modes. Every sentence adds value without redundancy.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has 7 parameters and multiple modes; the description provides a high-level overview but lacks detail on parameter usage per mode. Given the output schema exists, return values are not needed, but param guidance is lacking.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds semantic value for the 'mode' parameter by listing allowed values, but fails to describe the other 6 parameters (payload, actor_id, _envelope, etc.). With 0% schema coverage, this is a significant gap.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool is a 'unified transport diagnostic probe' with six specific modes, distinguishing it from sibling diagnostic tools like arif_heartbeat and arif_stack_health_probe. It explicitly lists all modes and their purposes.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description tells what the tool is used for (liveness checks, protocol verification, etc.), implying clear use cases. However, it does not explicitly mention when not to use it or provide direct alternatives among siblings.
arif_compose (A, Read-only)
Governed response composition — formats final output for Arif with citations and tone calibration. Call as the LAST step before presenting results. Modes: compose | summarize | cite | tone_shift | style | format | nudge | repo_answer.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | compose | |
| style | No | | |
| message | No | | |
| actor_id | No | | |
| language | No | en | |
| _envelope | No | | |
| citations | No | | |
| session_id | No | | |
| ai_involvement | No | full | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds that it formats with citations and tone calibration, and that it's governed. This adds useful context beyond annotations, though it does not detail all behavioral aspects.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two clear sentences plus a list of modes. No unnecessary words. Front-loaded with purpose and usage instruction.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 9 parameters and multiple modes, the description is minimal. It conveys purpose and timing but lacks details on mode semantics, parameter roles, or return behavior. Output schema exists but is not referenced.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 9 parameters with 0% description coverage, yet the description only indirectly hints at the 'mode' parameter by listing values. No other parameters (style, message, etc.) are explained, failing to compensate for the missing schema descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it formats final output for Arif with citations and tone calibration, and is identified as the LAST step. This distinguishes it from sibling tools like arif_act, arif_think, etc., which are earlier steps.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Call as the LAST step before presenting results,' providing clear timing context. Lists modes but does not specify when NOT to use or alternatives, though the last-step instruction implies exclusivity.
arif_conformance_report (A, Read-only, Idempotent)
Full conformance spine proof machine. Runs 9 live checks against the running arifOS kernel: arifos_alive, mcp_initialize, protocol_version, schema_echo_stable, session_starts, authority_checked, hold_blocks_mutation, vault_replay, cooling_ledger. Returns structured PASS/FAIL report with evidence per check. No session required. Read-only. Zero side effects. Use when: you need to prove arifOS is a substrate, not just a narrative.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | full | |
| limit | No | | |
| payload | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| client_capabilities | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
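Since the output schema lists no fields, the structure of the PASS/FAIL report is not visible from this listing. The sketch below therefore makes a loud assumption: that the report can be reduced to a mapping from check name to the string "PASS" or "FAIL". Only the nine check names come from the tool description; `failing_checks` is a hypothetical helper.

```python
# Check names copied from the arif_conformance_report description.
CONFORMANCE_CHECKS = (
    "arifos_alive",
    "mcp_initialize",
    "protocol_version",
    "schema_echo_stable",
    "session_starts",
    "authority_checked",
    "hold_blocks_mutation",
    "vault_replay",
    "cooling_ledger",
)


def failing_checks(report: dict[str, str]) -> list[str]:
    """Names of checks that are absent from the report or not marked PASS.

    ASSUMPTION: `report` maps check name -> "PASS" or "FAIL". The real
    report also carries evidence per check and may be shaped differently.
    """
    return [name for name in CONFORMANCE_CHECKS if report.get(name) != "PASS"]
```

Treating a missing check as a failure is a deliberately conservative choice: a report that silently omits a check should not count as conformant.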
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds context beyond annotations by listing the 9 checks and confirming zero side effects. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences plus usage statement. Concise and front-loaded with key action. No wasted words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
An output schema exists, so the return format is partially covered. But with 7 undocumented parameters, the description gives no guidance on how to invoke the tool correctly (e.g., which values mode accepts).
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 7 parameters with 0% description coverage, and the description does not explain any of them (e.g., mode, limit), so it does not compensate for the low schema coverage.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states tool runs 9 specific live checks to prove arifOS is a substrate. Distinguishes from siblings like arif_heartbeat which are single checks.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use when: you need to prove arifOS is a substrate, not just a narrative.' Also notes no session required and read-only. Lacks explicit exclusions or alternatives.
arif_detect_institutional_shadow_drift (B)
Detect when a sovereign institution's observed functions have outgrown its declared name (GENESIS/006 Petronas Paradox). Returns drift_score, sovereignty_score, risk_class, verdict, and cross-node routing hints. Use before high-blast-radius deals involving national institutions.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| institution_name | Yes | | |
| declared_function | Yes | | |
| last_shadow_update | No | | |
| observed_functions | Yes | | |
| capital_exposure_myr | No | | |
| declared_function_keywords | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so description must cover behavioral traits. It only states the function's purpose and outputs, but does not disclose side effects, state changes, authorization needs, or rate limits. Leaves agent with insufficient behavioral understanding.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences, no redundancy. Purpose and usage are front-loaded. Every sentence adds value.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 9 parameters, 0% schema coverage, and no annotations, the description is far from complete. It fails to guide parameter usage or explain the tool's behavior adequately, despite having an output schema.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and description adds no explanation for any of the 9 parameters. It only mentions return fields, leaving parameter meanings completely undocumented.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'detect' and resource 'institutional shadow drift', names the specific outputs (drift_score, sovereignty_score, etc.), and references a known concept (GENESIS/006 Petronas Paradox). This clearly distinguishes it from sibling tools like arif_detect_narrative_tension.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'before high-blast-radius deals involving national institutions'. Provides clear context for invocation, though it does not mention when not to use or list alternatives explicitly.
arif_detect_narrative_tension (C)
Detect paradox tension, power asymmetry, and implicit frames in news articles or institutional text. Returns a FrameGraph with actors, claims, tensions, and a kernel verdict. Golden case: Putra Heights Kosmo 2026-06-12.
| Name | Required | Description | Default |
|---|---|---|---|
| url | No | | |
| tags | No | Subject tags. | |
| text | Yes | Full article text. | |
| title | Yes | Article headline. | |
| author | No | | |
| source | No | | |
| actor_id | No | | |
| _envelope | No | | |
| article_id | No | Optional stable identifier; triggers golden-case lookup for known articles. | |
| session_id | No | | |
| actor_hints | No | Optional hints for actor extraction. | |
| published_at | No | | |
| public_interest | No | LOW \| MEDIUM \| HIGH \| CRITICAL | MEDIUM |
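The documented rows of this table (title, text, article_id, public_interest) are enough to sketch a typed argument builder. The example below is illustrative only: the class names are hypothetical, string types are assumed because the listing does not give parameter types, and the non-empty check is a client-side convenience rather than a documented server rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class PublicInterest(str, Enum):
    """The four levels listed for public_interest."""
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"
    CRITICAL = "CRITICAL"


@dataclass
class NarrativeTensionArgs:
    title: str  # required: article headline
    text: str   # required: full article text
    public_interest: PublicInterest = PublicInterest.MEDIUM  # default per the table
    article_id: Optional[str] = None  # optional stable identifier

    def to_arguments(self) -> dict[str, Any]:
        """Return the arguments object for arif_detect_narrative_tension."""
        if not self.title.strip() or not self.text.strip():
            raise ValueError("title and text are required and must be non-empty")
        arguments: dict[str, Any] = {
            "title": self.title,
            "text": self.text,
            "public_interest": self.public_interest.value,
        }
        if self.article_id is not None:
            arguments["article_id"] = self.article_id
        return arguments
```

Using an enum for public_interest means an unsupported level fails when the arguments are constructed rather than when the server rejects the call.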
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full responsibility for behavioral context. It does not disclose whether the tool is read-only, has side effects, requires specific permissions, or has any rate or resource limits. The lack of such details is a significant gap.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and front-loaded with the most important information. Every sentence serves a purpose, and there is no superfluous content.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description does not need to detail return values. However, with 13 parameters and low schema coverage, the description could provide more context about typical usage scenarios or constraints. It is adequate but leaves gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 13 parameters with only 46% description coverage, yet the tool description adds no parameter-level explanation. Many parameters have empty descriptions, and the description does not clarify their usage beyond the general purpose.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool's action (detect) and resource (narrative tension in text), and specifies the return type (FrameGraph). However, it does not explicitly differentiate this tool from the similar sibling 'arif_detect_institutional_shadow_drift', so it's not a perfect 5.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context for when to use the tool (news articles or institutional text) but offers no guidance on when not to use it or alternatives. The golden case is mentioned but insufficient for usage decisions.
arif_floor_status (C)
Report the current state of all 13 constitutional floors.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| session_id | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided. The description implies a read operation ('report') but does not explicitly state idempotency, safety, auth requirements, or side effects. Without annotations, the description carries full burden and fails to disclose behavioral traits.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, concise and front-loaded. However, it could benefit from a brief explanation of parameters or context without becoming verbose.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, no annotations, and two undocumented parameters, the description is too brief to fully support agent decision-making. It does not explain defaults, expected output, or prerequisites.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not mention the two optional parameters (actor_id, session_id) or their semantics. No value added beyond the nullable types already in the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Report', the resource 'current state of all 13 constitutional floors', and the scope 'all 13'. It distinguishes from sibling tools like arif_observe or arif_stack_health_probe, which cover different actions or resources.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description provides no guidance on when to use this tool vs alternatives. With many sibling tools that might report status (e.g., arif_stack_health_probe), the lack of explicit context or exclusions hinders correct tool selection.
arif_get_affordance (A)
Return the complete constitutional affordance contract for a tool name. Use this to decide 'why this tool', 'why not others', risk, agency level, and expected metacognition shape. Preferred over guessing from name alone.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that the tool returns a 'constitutional affordance contract' covering risk and agency level. However, it does not mention behavioral traits like idempotency, required permissions, or rate limits. The description is adequate but not detailed.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the primary action and purpose. Every sentence adds value, and there is no extraneous text.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the concept of a 'constitutional affordance contract' is complex, the description covers the key points: what it returns and why to use it. An output schema exists, so return values need not be detailed. The tool has 4 parameters but only 1 is required; the optional parameters might be standard context, so their omission is acceptable. Overall, the description is fairly complete for its purpose.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not add any meaning beyond the schema. The only required parameter 'name' is implied in the description ('for a tool name'), but the other three parameters (actor_id, _envelope, session_id) are not explained at all. With low coverage, the description should compensate but fails to do so.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns the 'constitutional affordance contract for a tool name', which is a specific verb and resource. It also distinguishes itself from siblings by positioning it as the preferred method for understanding tool rationale over guessing from names alone.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use the tool: 'to decide why this tool, why not others, risk, agency level, and expected metacognition shape.' It recommends it over guessing from names. While it doesn't explicitly state when not to use it, the guidance is strong and contextually clear given the sibling tools.
arif_heartbeat (C)
Record or query federation heartbeats. Returns liveness verdict for known organs.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| organ_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions 'record or query' and 'returns liveness verdict', but lacks detail on side effects, permissions, or what 'record' entails. The term 'liveness verdict' is vague.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no wasted words, but it omits critical information about parameters and behavior. Conciseness does not compensate for lack of completeness.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, 4 undocumented parameters, and existence of an output schema not referenced, the description is insufficient for an AI to correctly invoke the tool. It fails to provide enough context about inputs, outputs, or behavioral nuances.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description does not explain any of the four parameters (actor_id, organ_id, _envelope, session_id). The AI agent has no guidance on how to populate them.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool records or queries federation heartbeats and returns a liveness verdict for organs. This is a specific verb+resource combination that distinguishes it from many sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for heartbeat recording or querying, but gives no explicit guidance on when to use it instead of other arif tools. It names no when-not-to cases and no alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_init · Init Session · Grade A · Idempotent
START HERE. Bind governed session before any arif_* call. Without session_id, no governed action possible. Returns: session_id, authority level, floor status, next_tool. Modes: ping (<1s probe) | light (<1s pointers) | init (~60s full) | resume | validate | epoch_open | epoch_seal. Skip when: live session exists (use arif_triage) or factual question (use arif_observe).
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | init |
| nonce | No | | |
| intent | No | | |
| context | No | | |
| payload | No | | |
| tooling | No | | |
| verbose | No | | |
| actor_id | No | | |
| epoch_id | No | | |
| evidence | No | | |
| trace_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| agent_policy | No | | |
| counterparty | No | | |
| sovereign_id | No | | |
| actor_signature | No | | |
| caller_actor_id | No | | |
| delegation_mode | No | | |
| idempotency_key | No | | |
| ack_irreversible | No | | |
| executor_actor_id | No | | |
| declared_model_key | No | | |
| client_capabilities | No | | |
| requested_authority | No | | OBSERVE_ONLY |
| previous_session_hash | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
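As a rough illustration of the call shape, the sketch below builds an MCP `tools/call` JSON-RPC request for arif_init. The mode names and the two defaults (`init`, `OBSERVE_ONLY`) come from the listing above. The helper name `build_init_request`, the request `id`, and the assumption that `mode` and `requested_authority` are plain strings are illustrative only, since the schema documents no parameter types.

```python
import json

# Modes listed in the arif_init description above.
INIT_MODES = {"ping", "light", "init", "resume", "validate", "epoch_open", "epoch_seal"}


def build_init_request(mode="init", requested_authority="OBSERVE_ONLY", request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for arif_init (hypothetical helper)."""
    if mode not in INIT_MODES:
        raise ValueError(f"unknown arif_init mode: {mode!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "arif_init",
            # Assumption: both arguments are plain strings.
            "arguments": {"mode": mode, "requested_authority": requested_authority},
        },
    }


# A cheap liveness probe before committing to the ~60s full init.
request = build_init_request(mode="ping")
print(json.dumps(request, indent=2))
```

Rejecting unknown modes on the client side is a design choice of this sketch, not documented server behavior.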
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds behavioral context beyond annotations: session binding, return values, mode durations (init ~60s, light <1s), and prerequisite ('no session_id, no governed action'). No contradiction with annotations, which are not detailed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with critical info, but the single dense paragraph could be more readable with bullet points. No wasted words, but could be slightly more structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers core usage and return values, but lacks detailed guidance on when to use each mode (e.g., validate, epoch_open) given the tool's complexity and many modes.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 26 parameters and 0% schema description coverage, the description only explains the 'mode' parameter's enum values. No meaning added for other parameters like nonce, intent, context, etc., leaving the agent underinformed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool's purpose: 'Bootstrap a governed constitutional session and bind actor identity.' It identifies itself as the starting point ('START HERE') and clearly distinguishes from siblings like arif_triage for live sessions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use scenarios (starting new conversation, resuming session, before other arif_* calls) and when-not-to-use (already have session_id and authority, or factual questions). Names alternative tool 'arif_triage'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
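The "Skip when" guidance in the arif_init description can be read as a small routing rule. The sketch below is one possible encoding, not the server's logic: the function name `first_tool` and its two boolean inputs are hypothetical, and checking the factual-question case before the live-session case is an assumption, since the description does not say which takes precedence.

```python
def first_tool(has_live_session: bool, is_factual_question: bool) -> str:
    """Pick the entry tool following the arif_init 'Skip when' guidance (sketch)."""
    # Assumption: a factual question is routed to arif_observe even if a
    # session is already live; the description gives no precedence.
    if is_factual_question:
        return "arif_observe"
    if has_live_session:
        return "arif_triage"
    # No session yet and a governed action is intended: bind a session first.
    return "arif_init"


print(first_tool(has_live_session=False, is_factual_question=False))
```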
arif_judge · Judge Deliberate · Grade A · Read-only
Constitutional verdict gate. Select when: evidence gathered, plan ready, floor compliance needs verification. Returns: SEAL | HOLD | SABAR | VOID + violated floors + receipts. Modes: judge | compare | history | explain | floor_status | witness_consensus. REQUIRES: actor, intent, domain, reversibility_level, blast_radius. Skip when: evidence incomplete (use arif_observe), plan not ready (use arif_think), or action is reversible + low-risk (advisory sufficient).
| Name | Required | Description | Default |
|---|---|---|---|
| actor | Yes | | |
| domain | Yes | | |
| intent | Yes | | |
| actor_id | No | | |
| evidence | No | | |
| _envelope | No | | |
| session_id | No | | |
| blast_radius | Yes | | |
| authority_token | No | | |
| epistemic_state | No | | UNKNOWN |
| reversibility_level | Yes | | |
| requested_capability | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
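The input table marks six parameters as required, while the description's "REQUIRES" list names only five. A client-side check along the following lines would catch the gap before a call is sent. It is a sketch: the helper name `missing_judge_arguments` is hypothetical, and the example values are placeholders, because the schema gives no types or allowed values.

```python
# Required parameters per the arif_judge input table above.
JUDGE_REQUIRED = (
    "actor",
    "domain",
    "intent",
    "blast_radius",
    "reversibility_level",
    "requested_capability",
)


def missing_judge_arguments(arguments: dict) -> list:
    """Return the required arif_judge parameters absent from `arguments`."""
    return [name for name in JUDGE_REQUIRED if name not in arguments]


# Placeholder values; only the five names from the description's REQUIRES list.
from_description = {
    "actor": "example-actor",
    "intent": "example intent",
    "domain": "example-domain",
    "reversibility_level": "example-level",
    "blast_radius": "example-radius",
}
print(missing_judge_arguments(from_description))
```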
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true and destructiveHint=false, so the description carries less burden. It adds context about return values ('SEAL | HOLD | SABAR | VOID'), required authority verification, and modes. The verb 'render' could be read as conflicting with readOnlyHint, but the tool only returns a verdict, so there is no contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is fairly concise with meaningful sentences, but the list of required parameters and modes could be streamlined. Still, it is well-structured and front-loaded with purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 12 parameters, 6 required, and no schema descriptions, the description should cover all required parameters and give context for optional ones. It omits one required parameter (requested_capability) and does not explain the defaults or usage of parameters such as evidence and authority_token. The output schema is not provided, but the description lists return types; coverage is still incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage, so description must explain parameters. It lists 5 required parameters but omits 'requested_capability' which is required per schema. It provides no definitions for any parameter beyond listing them, leaving their meaning unclear.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Render final constitutional verdict' with a specific verb and resource. It distinguishes from siblings by referencing when not to use (choose observe or think first). There are no other judge tools among siblings, so purpose is unique and clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to select: 'decision is ready for binding judgment', evidence gathered, reasoning done. Also lists when NOT to select: incomplete evidence (choose observe), plan not ready (choose think), reversible low-risk (advisory mode). Alternative tools are directly named.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_lease_inspect · Grade D
Issue, inspect, or revoke a bounded authority lease. Leases scope organ/agent tool access and action class.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| lease_id | No | | |
| organ_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully convey behavior. It claims three distinct actions (issue, inspect, revoke) with no explanation of side effects, permissions required, or implications. This is insufficient for an agent to safely invoke the tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very short (two sentences), but the first sentence lists three actions, which is ambiguous and not concise in conveying the tool's single purpose. It could be improved by focusing on inspection and referencing the specialized sibling tools.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Even though an output schema exists, the description does not explain what 'inspect' returns or how the tool processes inputs. For a tool with five parameters and no parameter documentation, this description is insufficient for complete understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, and the description adds no meaning to any of the five parameters (actor_id, lease_id, organ_id, _envelope, session_id). The agent cannot infer what each parameter represents or how to use them.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the tool can 'issue, inspect, or revoke a bounded authority lease,' but the name 'arif_lease_inspect' and the presence of sibling tools 'arif_lease_issue' and 'arif_lease_revoke' indicate this tool should focus on inspection only. This overlap creates ambiguity and reduces clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus its siblings arif_lease_issue and arif_lease_revoke. The description does not specify that inspection is the primary use case, leaving the agent without direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_lease_issue · Grade C
Issue, inspect, or revoke a bounded authority lease. Leases scope organ/agent tool access and action class.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | Yes | | |
| actor_id | Yes | | |
| max_uses | No | | |
| organ_id | Yes | | |
| _envelope | No | | |
| forbidden | No | | |
| session_id | No | | |
| ttl_seconds | No | | |
| max_action_class | No | | OBSERVE |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description must disclose behavior. It does not mention destructive nature (issuing/revoking likely modifies state), required permissions, or side effects. The phrase 'bounded authority lease' is abstract.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Only two sentences, but the conciseness sacrifices meaningful information. The second sentence is a definition that could be more precise. No structure or bullet points.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 9 parameters, no parameter documentation, and no behavioral caveats, the description is severely incomplete. An output schema exists but the description does not reference it or clarify return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description does not explain any of the 9 parameters (organ_id, actor_id, scope, etc.). Parameters like max_uses, ttl_seconds, forbidden, max_action_class are critical for safe use but undefined.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states three actions (issue, inspect, revoke) and mentions 'bounded authority lease', but siblings arif_lease_inspect and arif_lease_revoke exist, causing ambiguity about which tool to use for each action. The verb 'scope' is vague.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus the sibling tools arif_lease_inspect and arif_lease_revoke. The description includes all three actions, potentially misleading an agent to use this for inspection/revocation when dedicated tools exist.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
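Because none of the nine arif_lease_issue parameters is documented, any example has to guess at types. The sketch below assembles an arguments object from the schema's required names (`scope`, `actor_id`, `organ_id`) and its one documented default (`max_action_class` = `OBSERVE`). The helper name, the use of a list for `scope`, an integer for `ttl_seconds`, and all sample values are assumptions for illustration.

```python
def build_lease_issue_arguments(scope, actor_id, organ_id, *,
                                max_action_class="OBSERVE",
                                ttl_seconds=None, max_uses=None, forbidden=None):
    """Assemble arif_lease_issue arguments (hypothetical helper, assumed types)."""
    arguments = {
        "scope": scope,
        "actor_id": actor_id,
        "organ_id": organ_id,
        # Default taken from the input table above.
        "max_action_class": max_action_class,
    }
    optional = {"ttl_seconds": ttl_seconds, "max_uses": max_uses, "forbidden": forbidden}
    # Leave unset optionals out so the server applies its own defaults.
    arguments.update({key: value for key, value in optional.items() if value is not None})
    return arguments


# Sample values are placeholders; GEOX is one of the organs named on this page.
lease_args = build_lease_issue_arguments(
    scope=["example_tool"], actor_id="example-actor", organ_id="GEOX", ttl_seconds=300
)
print(lease_args)
```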
arif_lease_revoke · Grade D
Issue, inspect, or revoke a bounded authority lease. Leases scope organ/agent tool access and action class.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | | sovereign_revoke |
| actor_id | No | | |
| lease_id | Yes | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description must disclose behavioral traits. It states 'revoke a bounded authority lease', which implies a destructive action, but then conflates this with 'issue, inspect' without clarifying side effects, prerequisites, or reversibility. This conflation contradicts the tool's implied purpose.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is only two sentences, but the first sentence is misleading by including 'issue, inspect' which does not match the tool name. Brevity is not beneficial when it causes confusion.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 5 parameters and no output schema visibility, the description is severely incomplete. It fails to explain input semantics, return values, or any operational context, leaving the agent guessing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, yet the description provides no explanation for any of the 5 parameters (lease_id, reason, actor_id, _envelope, session_id). The description adds no meaning beyond the schema's empty definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description says 'Issue, inspect, or revoke a bounded authority lease' but the tool name is 'revoke', implying only revocation. This inconsistency makes the purpose unclear. It does not distinctly differentiate from sibling tools arif_lease_inspect and arif_lease_issue.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like arif_lease_inspect or arif_lease_issue. The description does not include any context for appropriate usage or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
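All three lease tools share one description, so an agent cannot tell them apart from text alone. Until the descriptions are split, a client could pin the intended mapping explicitly. The sketch below assumes, as the reviews do, that each tool performs only the action in its name; the function `lease_tool_for` is hypothetical.

```python
# Assumed one-to-one mapping from lease action to tool, based on tool names.
LEASE_TOOLS = {
    "issue": "arif_lease_issue",
    "inspect": "arif_lease_inspect",
    "revoke": "arif_lease_revoke",
}


def lease_tool_for(action: str) -> str:
    """Return the lease tool whose name matches the requested action."""
    try:
        return LEASE_TOOLS[action]
    except KeyError:
        raise ValueError(f"unsupported lease action: {action!r}") from None


print(lease_tool_for("revoke"))
```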
arif_observe · Sense & Observe · Grade A · Read-only · Idempotent
Reality grounding: web search, URL fetch, vitals, repo map, entropy. Select when: answer requires data not in your context. Modes: search | ingest | compass | atlas | entropy_dS | vitals | repo_map | hybrid_discovery. Returns: search results, ingested content, vitals, entropy scores, repo map. Skip when: purely reasoning (use arif_think) or evidence already in context.
| Name | Required | Description | Default |
|---|---|---|---|
| url | No | | |
| mode | No | | search |
| query | No | | |
| layers | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| result_limit | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
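The same response envelope appears for arif_init, arif_judge, and arif_observe, so a single client-side check can cover all three. The sketch below tests the fields the output schema marks as required and the enumerated `status` and `verdict` values. The function name `envelope_problems` and the sample envelope are hypothetical, and treating `status` and `verdict` as plain strings is an assumption.

```python
# Taken from the output schema table above.
REQUIRED_FIELDS = ("tool", "result", "status", "reasons", "verdict", "nine_signal")
STATUSES = {"OK", "ERROR", "TIMEOUT", "DRY_RUN"}
VERDICTS = {"SEAL", "HOLD", "VOID", "SABAR", "PROVISIONAL", "PARTIAL"}


def envelope_problems(envelope: dict) -> list:
    """List schema problems found in a response envelope (empty list if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in envelope]
    if "status" in envelope and envelope["status"] not in STATUSES:
        problems.append(f"unknown status: {envelope['status']!r}")
    if "verdict" in envelope and envelope["verdict"] not in VERDICTS:
        problems.append(f"unknown verdict: {envelope['verdict']!r}")
    return problems


# Hypothetical envelope; the inner shapes of result and nine_signal are not documented.
sample = {
    "tool": "arif_observe",
    "result": {},
    "status": "OK",
    "reasons": ["example reason"],
    "verdict": "SEAL",
    "nine_signal": {},
}
print(envelope_problems(sample))
```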
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, destructiveHint=false, etc. The description adds no contradictions and further explains behavioral traits: returns from various modes (search, ingest, etc.) and specifies outputs (search results, ingested content, system vitals, entropy scores, repo map). Provides transparency beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loaded with purpose. It efficiently conveys key information in a single paragraph. However, some phrasing ('The gradient pulling toward this tool') is stylistically verbose, slightly reducing conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Provides a good overview of tool functionality, modes, and return types, but lacks explanation of most parameters (url, query, layers, actor_id, etc.). Given the tool's complexity and 0% schema coverage, the description is incomplete for full agent understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description has to explain individual parameters, and it does not. Although it lists modes and outputs, it gives no details on parameters like 'url', 'query', 'layers', etc., which appear in the schema without descriptions. The description fails to compensate for low coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: grounding decisions in reality by gathering external evidence, mapping repository structure, or computing entropy. It uses specific verbs ('Ground', 'Select when') and explicitly distinguishes from sibling tool 'arif_think' for reasoning tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use criteria (e.g., need external evidence, map repository, compute entropy) and when-not-to-use (purely reasoning/analysis, choose arif_think). Includes exclusions and alternative tool names, offering clear guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_organ_attest · Grade C
Probe and attest a federation organ (GEOX, WEALTH, WELL). Returns organ heartbeat, schema hash, tool count, and kernel envelope.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| organ_id | Yes | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It does not disclose whether the operation is read-only, destructive, or any behavioral traits beyond stating what is returned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that efficiently conveys action and output. However, it could be more structured with bullet points for clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of 4 parameters and the existence of an output schema, the description is too minimal. It fails to explain parameter usage or provide behavioral context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description should add meaning to parameters, but it only mentions 'federation organ' implicitly linked to 'organ_id'. No details on 'actor_id', '_envelope', or 'session_id' are provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action 'probe and attest', the resource 'federation organ' with specific examples (GEOX, WEALTH, WELL), and lists the return items. It distinguishes from siblings like 'arif_organ_attest_all'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like 'arif_organ_attest_all' or 'arif_organ_consensus'. The description does not provide context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_organ_attest_all · Grade B
Attest arifOS plus all federation organs in one call. Returns per-organ heartbeat and a degraded-organ list.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the burden. It discloses return values (per-organ heartbeat and degraded list) and indicates batch operation, but does not mention side effects, permissions, or whether it is read-only.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two sentences, front-loading the action and return values without extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and typical parameters, the description is adequate but could elaborate on concepts like 'degraded organ' or the meaning of 'attest' for newcomers. It provides basic completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description does not mention any parameters, providing no additional meaning beyond the schema's definitions for actor_id, _envelope, and session_id.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool attests arifOS plus all federation organs in one call, distinguishing it from sibling tools like arif_organ_attest (single organ) and arif_os_attest (only OS).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use when needing to attest all organs at once, but lacks explicit guidance on when not to use it or prerequisites, relying on inference from sibling tool names.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_organ_consensus · Grade C
Request cross-organ consensus from WELL, WEALTH, and GEOX.
| Name | Required | Description | Default |
|---|---|---|---|
| organs | No | Subset of organs to consult (default: all three). | |
| actor_id | No | Sovereign actor identifier. | |
| session_id | No | Governed session ID for audit trace. | |
| action_description | No | Free-text description of the proposed action. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It only states the action without revealing side effects, authentication needs, or whether it is a read-only or mutating operation. This is insufficient for safe invocation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence with no wasted words, but its brevity sacrifices necessary detail. It is front-loaded but underinformative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that the tool has 4 optional parameters and no behavioral disclosure, the description is incomplete. An output schema exists (though not shown) and may document return values, but the lack of behavioral context leaves significant gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all 4 parameters. The tool description mentions the default organs list, which adds slight context, but the schema already clarifies defaults and meanings. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Request' and the resource 'cross-organ consensus' from specific organs (WELL, WEALTH, GEOX). It distinguishes from sibling tools like arif_organ_attest by focusing on consensus across organs rather than individual attestation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, prerequisites, or conditions that would make this tool inappropriate. Sibling tools are numerous, but the description offers no comparative context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_os_attest (grade A)
arifOS organ.attest(): live self-attestation of the constitutional kernel. Returns constitution_hash, schema_hash, tool_surface, health, and active lease state. Required before any kernel-grade federation call.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that it is a 'live self-attestation' and lists return fields, but does not discuss side effects, permissions, or whether it modifies state. The behavioral disclosure is adequate but not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three short, well-structured sentences covering the tool's purpose, return values, and usage prerequisite. Every word adds value, with no redundancy or extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description's mention of return fields is sufficient. However, with 0% parameter coverage, the description does not tell the agent how to invoke the tool correctly. The prerequisite note helps but doesn't offset the missing parameter documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%: none of the three parameters (actor_id, _envelope, session_id) is described in the schema. The description should compensate for this gap but does not mention parameters at all, leaving the agent with bare property names and no semantic context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs 'live self-attestation of the constitutional kernel' and lists specific return fields (constitution_hash, schema_hash, etc.). It distinguishes itself from the sibling 'arif_organ_attest_all' by focusing on a single organ attestation, and the verb 'attest' is specific and actionable.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Required before any kernel-grade federation call,' providing clear context for when to use the tool. It does not mention alternative tools or when not to use it, but the prerequisite statement gives strong guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_peer_contract_attest (grade D)
Peer Federation Contract v1 — attest.
| Name | Required | Description | Default |
|---|---|---|---|
| organ | No | | arifOS |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description must disclose behavioral traits. It offers no information about side effects, permissions, data mutation, or constraints. The agent has zero insight into what changes or impacts result from attesting.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely short but not effectively front-loaded. It is under-specified, making it insufficient rather than concise. Every sentence should earn its place; here, one short fragment fails to convey necessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's 4 parameters and 0% schema coverage, plus an output schema that is not described, the description is severely inadequate. It fails to provide any context beyond a name and action, leaving critical gaps for an AI agent to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description adds no parameter meanings. Four parameters (organ, actor_id, _envelope, session_id) remain completely unexplained, leaving the agent unable to understand their roles or how to set them correctly.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Peer Federation Contract v1 — attest' indicates an action (attest) on a specific resource (Peer Federation Contract v1). It is more than a tautology and distinguishes from siblings like 'forbid' and 'validate'. However, it lacks a specific verb+resource clarity (e.g., what attesting entails) and does not describe the tool's scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., arif_peer_contract_forbid, arif_peer_contract_validate). The description gives no context for appropriate usage scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_peer_contract_forbid (grade D)
Peer Federation Contract v1 — forbid.
| Name | Required | Description | Default |
|---|---|---|---|
| organ | Yes | | |
| reason | No | | sovereign_forbid |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, and the description discloses zero behavioral traits (e.g., side effects, permissions, reversibility). The agent has no insight into the tool's consequences.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely short but at the expense of all clarity. It is under-specified, not concise; every sentence must earn its place, and this single phrase does not.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters, 1 required, no param docs, and siblings performing related actions, the description is entirely insufficient. Output schema exists but is ignored.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%. The description does not explain any of the 5 parameters (organ, reason, actor_id, _envelope, session_id). The agent cannot infer parameter purpose from the text.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description is 'Peer Federation Contract v1 — forbid.' It restates the tool name with no verb or resource, providing no actionable purpose. The agent cannot determine what operation 'forbid' performs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus siblings like arif_peer_contract_attest or arif_peer_contract_validate. The description gives no context for appropriate invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_peer_contract_validate (grade D)
Peer Federation Contract v1 — validate.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| contract | No | | |
| _envelope | No | | |
| session_id | No | | |
| contract_url | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, and the description does not disclose any behavioral traits such as side effects, permissions needed, or output format. The tool's behavior is entirely opaque.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (5 words) but at the cost of utility. It under-specifies the tool's purpose and behavior, making it insufficient for an agent to use correctly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters and an output schema, the description is woefully incomplete. It provides no context about the validation process, input requirements, or output interpretation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description fails to explain any of the five parameters. Since schema description coverage is 0%, the description should compensate but does not. Parameters like 'contract', 'contract_url', 'actor_id' remain unexplained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'validate' but does not specify what validation entails, what the input contract should be, or what the output indicates. Among sibling tools like 'arif_peer_contract_attest' and 'arif_peer_contract_forbid', there is implied differentiation but no explicit clarification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description lacks context about prerequisites, appropriate scenarios, or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_resolve_tool (grade A)
Resolve a tool name or alias to the canonical arifOS tool name. Use when you have a tool name but aren't sure if it's the canonical name. Returns: canonical_name, aliases, callable, schema_valid, authority_class, use_when, examples. Only name is used — all other fields are absorbed and ignored.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that only the 'name' parameter is used and all others are absorbed and ignored, which is critical behavioral information. It also lists the return fields. It does not explicitly state read-only nature or authorization needs, but the description is sufficient for a lookup tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four short sentences, front-loaded with the action and purpose, and every sentence adds value. There is no verbosity or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description does not need to fully detail return values but still lists them. It could mention error handling (e.g., if name not found) for completeness, but the tool's role as a resolver is well-defined among many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It clarifies that only 'name' is used and explains the return values. It also notes that other fields like 'actor_id' are ignored, adding meaning beyond the schema's property names. However, it does not describe the format or constraints of 'name'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Resolve' and the resource 'tool name or alias to the canonical arifOS tool name'. It distinguishes itself from sibling tools like 'arif_act' or 'arif_canary' by being a meta-tool for name resolution, leaving no ambiguity about its function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use when you have a tool name but aren't sure if it's the canonical name.' This gives direct guidance. However, it does not mention when not to use it or suggest alternatives among siblings, though the context makes the purpose clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_retrieve_tools (grade A)
BM25 lexical tool retrieval across the arifOS federation tool catalog. Schema-aware text projection: tool name + description + parameter names + enum values, JSON syntax stripped. k1=0.9 b=0.4 tuned for short tool descriptions (Ratel ADR-0004). Cross-organ by default; set organ= to restrict to one organ. F2 TRUTH: BM25 scores are TERM FREQUENCY lexical match, NOT confidence. F7 HUMILITY: Scores measure keyword overlap, not semantic relevance. Use as DISCOVERY hints to find the right tool, then verify with arif_resolve_tool. Pair with arif_route for the full pipeline: intent → organ → BM25 rank → select. Parameters: query (natural language intent), organ (optional, restrict to organ), top_k (1-20, default 5), include_scores (default true).
| Name | Required | Description | Default |
|---|---|---|---|
| organ | No | Restrict to organ (arifOS/GEOX/WEALTH/WELL/A-FORGE). None = cross-organ. | |
| query | Yes | Natural language intent. E.g. "load well logs from LAS file" | |
| top_k | No | Number of top results (1-20, default 5). | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| include_scores | No | Include BM25 scores in output (default True). | |
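The BM25 ranking named in the description above can be sketched in a few lines. This is a minimal illustration, not the arifOS implementation: the regex tokenizer, the three-entry `CATALOG`, the Lucene-style IDF variant, and the function names are all assumptions. Only k1=0.9, b=0.4, and the top_k range of 1-20 come from the tool description.

```python
import math
import re
from collections import Counter

K1, B = 0.9, 0.4  # values quoted in the tool description

# Hypothetical mini-catalog; the texts are shortened from descriptions on this page.
CATALOG = {
    "arif_resolve_tool": "Resolve a tool name or alias to the canonical arifOS tool name.",
    "arif_route": "Canonical intent router. Routes natural-language intent to the correct federation organ.",
    "arif_seal": "Immutable ledger append (VAULT999).",
}


def tokenize(text):
    """Lowercase and split on non-alphanumerics (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, top_k=5):
    """Return up to top_k (name, score) pairs, best lexical match first."""
    top_k = max(1, min(20, top_k))  # documented range: 1-20
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized.values()) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = K1 * (1 - B + B * len(toks) / avg_len)
            score += idf * tf[term] * (K1 + 1) / (tf[term] + length_norm)
        scores[name] = score
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

As the description stresses, the scores only reflect keyword overlap: a tool that shares no terms with the query scores zero even if it is semantically the right choice, which is why the result is meant to be confirmed with arif_resolve_tool.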
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
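A client may want to sanity-check a response against the output schema above before acting on it. The following is a minimal sketch, assuming the response arrives as a plain dict; `check_envelope` and the sample responses are hypothetical, while the required field names and the status and verdict values are taken from the table.

```python
# Fields marked "Yes" in the Required column of the output schema.
REQUIRED_FIELDS = ("tool", "result", "status", "reasons", "verdict", "nine_signal")
STATUSES = {"OK", "ERROR", "TIMEOUT", "DRY_RUN"}
VERDICTS = {"SEAL", "HOLD", "VOID", "SABAR", "PROVISIONAL", "PARTIAL"}


def check_envelope(envelope):
    """Return a list of problems found; an empty list means the envelope looks well-formed."""
    problems = [f"missing required field: {key}" for key in REQUIRED_FIELDS if key not in envelope]
    if "status" in envelope and envelope["status"] not in STATUSES:
        problems.append(f"unknown status: {envelope['status']}")
    if "verdict" in envelope and envelope["verdict"] not in VERDICTS:
        problems.append(f"unknown verdict: {envelope['verdict']}")
    return problems


# Hypothetical responses, for illustration only.
good = {
    "tool": "arif_retrieve_tools",
    "result": {},
    "status": "OK",
    "reasons": [],
    "verdict": "SEAL",
    "nine_signal": {},
}
bad = {"tool": "arif_retrieve_tools", "status": "DONE", "verdict": "SEAL"}
```

Here `check_envelope(good)` yields no problems, while `check_envelope(bad)` reports the three missing required fields and the unrecognized status.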
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It transparently explains that BM25 scores are term frequency lexical matches and not confidence, and that scores measure keyword overlap, not semantic relevance. This is excellent transparency about the tool's limitations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and concise. It front-loads the core purpose, then explains technical details (k1, b, BM25 tuning), usage scope, truth/humility statements, and finally parameter explanations. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (retrieval with multiple parameters, output schema exists), the description is largely complete. It covers the retrieval mechanism, limitations, usage context, and parameter details. It does not explain the output format, but this is acceptable since an output schema is provided.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema describes four of the seven parameters (organ, query, top_k, include_scores), and the description restates them: query (natural language intent), organ (optional, restrict), top_k (1-20, default 5), and include_scores (default true). Neither covers actor_id, _envelope, or session_id, which are likely envelope/context parameters, but the essential functional parameters are explained with meaningful context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs BM25 lexical retrieval across the arifOS federation tool catalog. It provides specific details about schema-aware text projection and distinguishes itself from siblings by mentioning arif_route and arif_resolve_tool as partners in the pipeline.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use the tool ('Use as DISCOVERY hints') and when not to rely on it ('NOT confidence'). It suggests pairing with arif_resolve_tool and arif_route, and explains cross-organ vs. restricted organ usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_route: Route (grade A; Read-only, Idempotent)
Canonical intent router. Select when: you know what you want but not which tool or organ to call. Routes natural-language intent to the correct federation organ (GEOX, WEALTH, WELL, A-FORGE) or kernel tool. Optionally accepts organ_tool to bridge-call directly — bypassing the routing decision. Returns: organ, port, tool_prefix, suggested_tools. Do NOT select when: you already know the exact tool to call.
| Name | Required | Description | Default |
|---|---|---|---|
| task | No | Alias for intent (backward compat). | |
| organ | No | Optional explicit organ override. If provided, intent matching is skipped and this organ is used directly. | |
| intent | Yes | Natural-language description of what the user wants. e.g. "interpret this seismic section", "assess portfolio risk" | |
| actor_id | No | Calling actor. | |
| _envelope | No | | |
| arguments | No | Arguments to pass to organ_tool. | |
| organ_tool | No | The tool name on the target organ to call. If absent, returns routing decision only (no bridge call). | |
| session_id | No | Governing session. | |
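The two ways of calling arif_route described above (routing decision only versus a direct bridge call) differ only in whether organ_tool is supplied. A minimal sketch of building the request in the JSON-RPC shape that MCP uses for `tools/call`; the helper name `build_route_call` and the organ tool name `geox_example_tool` are hypothetical, and the transport that would send the payload is left out.

```python
def build_route_call(intent, organ_tool=None, arguments=None, request_id=1):
    """Build an MCP tools/call payload for arif_route.

    Without organ_tool, the tool is documented to return a routing decision only.
    With organ_tool, it bridge-calls that tool on the target organ.
    """
    tool_args = {"intent": intent}  # intent is the only required parameter
    if organ_tool is not None:
        tool_args["organ_tool"] = organ_tool
        tool_args["arguments"] = arguments or {}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "arif_route", "arguments": tool_args},
    }


# Routing decision only.
decision_request = build_route_call("interpret this seismic section")

# Bridge call; "geox_example_tool" is a made-up name for illustration.
bridge_request = build_route_call(
    "interpret this seismic section",
    organ_tool="geox_example_tool",
    arguments={"section_id": "demo"},
    request_id=2,
)
```

Keeping organ_tool out of the arguments entirely, rather than sending it as null, avoids relying on how the server treats null values, which the listing does not specify.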
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds behavioral context by explaining it routes intent and returns organ, port, tool_prefix, and suggested_tools, and that providing organ_tool bypasses routing. This adds value beyond annotations, though the return behavior could be more detailed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is six short sentences, all of them useful: purpose, usage conditions, the organ_tool bypass, and return values. No filler. Front-loaded with the core function. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (8 parameters, 1 required, output schema exists) and the richness of annotations and output schema, the description tells the agent exactly what the tool does, when to use it, and what it returns. Together with the structured data, the agent has full context to select and invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is high (88%), so the baseline is 3. The description adds meaning beyond the schema by clarifying that organ_tool allows a direct bridge call bypassing routing decisions, and implies intent is a natural-language description. This additional context justifies a slightly higher score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool is a 'Canonical intent router' that routes natural-language intent to the correct federation organ or kernel tool. It distinguishes use cases with explicit 'Select when' and 'Do NOT select when' guidance, directly addressing differentiation from siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit conditions for use: 'Select when: you know what you want but not which tool or organ to call' and 'Do NOT select when: you already know the exact tool to call.' It also explains the optional bypass via organ_tool, offering clear context for when to choose this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_scan_local_instructions (grade C)
Scan files for hidden instructions or governance violations.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | | |
| root_dir | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It does not disclose whether the tool is read-only, what permissions are needed, side effects, or scanning scope (e.g., recursive). This is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise (one short sentence) but lacks necessary detail. It is front-loaded but too minimal to be useful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With three undocumented parameters, no annotations, and a vague description, the tool definition is incomplete. The agent lacks essential context to invoke it correctly, despite having an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description does not explain any of the three parameters (actor_id, root_dir, session_id). The agent has no clue what to provide, which is a critical gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans files for hidden instructions or governance violations, which is a specific verb+resource. It is distinct from many siblings that focus on drift detection or observation, but lacks explicit differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool over alternatives, prerequisites, or context. The description does not mention any exclusions or scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_seal · Vault Seal · B · Destructive
Immutable ledger append (VAULT999). Select when: SEAL verdict needs permanent anchoring or execution result needs audit trail. Irreversible. Modes: seal | verify | chain | list | dry_run | seal_card | render. Requires ack_irreversible=True for seal mode. Skip when: verdict is HOLD/SABAR/VOID (seal only SEAL) or testing (use dry_run).
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | seal |
| nonce | No | | |
| payload | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| drift_events | No | | |
| witness_type | No | | ai |
| actor_signature | No | | |
| ack_irreversible | No | | |
| judge_state_hash | No | | |
| constitutional_chain_id | No | | |
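The selection rules in the description above (seal only SEAL verdicts, use dry_run when testing, pass ack_irreversible for seal mode) could be enforced on the client side before the call is made. The following is a minimal sketch under assumptions: `build_seal_arguments` is a hypothetical helper, the payload is treated as an opaque dict because its shape is not documented, and the mode names are taken from the description, which the review below notes may not match the schema enum.

```python
from typing import Any


def build_seal_arguments(
    verdict: str, payload: dict[str, Any], *, testing: bool = False
) -> dict[str, Any]:
    """Return an arguments dict for arif_seal, or raise if sealing is not appropriate.

    Hypothetical helper: only the argument names (mode, payload,
    ack_irreversible) come from the parameter table.
    """
    if testing:
        # The description says to use dry_run instead of seal when testing.
        return {"mode": "dry_run", "payload": payload}
    if verdict != "SEAL":
        # HOLD, SABAR and VOID verdicts are not to be sealed.
        raise ValueError(f"refusing to seal verdict {verdict!r}; only SEAL is anchored")
    # seal mode is irreversible and requires an explicit acknowledgement.
    return {"mode": "seal", "payload": payload, "ack_irreversible": True}
```

A guard like this keeps the irreversible path behind an explicit verdict check, so an accidental HOLD or VOID never reaches the ledger.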
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
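The output table above marks six fields as required and enumerates the possible verdicts. A client could check a response against that contract before trusting it. This is a sketch only: `check_envelope` is a hypothetical name, and the response is assumed to arrive as an already-parsed dict.

```python
# Fields marked "Yes" in the Required column of the output schema table.
REQUIRED_FIELDS = ("tool", "result", "status", "reasons", "verdict", "nine_signal")
# Verdict values listed in the table's description of the verdict field.
KNOWN_VERDICTS = {"SEAL", "HOLD", "VOID", "SABAR", "PROVISIONAL", "PARTIAL"}


def check_envelope(response: dict) -> list[str]:
    """Return a list of problems found in a parsed tool response (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in response]
    verdict = response.get("verdict")
    if verdict is not None and verdict not in KNOWN_VERDICTS:
        problems.append(f"unknown verdict: {verdict}")
    return problems
```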
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Irreversibility is emphasized and aligns with destructiveHint=true, but description does not distinguish that non-seal modes (verify, list) are not destructive. Also, mode list includes items not in schema, potentially misleading.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, front-loaded with purpose and usage. No wasted words, though the mode mismatch adds slight confusion. Efficient overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, usage, and key behavioral traits, but lacks parameter details and fails to align mode list with schema. Output schema exists but description doesn't reference return values. Adequate but not thorough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 12 parameters and 0% schema coverage, description only explains 'ack_irreversible' and 'mode' partially. No detail on payload, nonce, actor_id, etc., leaving most parameters opaque.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states it appends to immutable ledger, but lists modes that partially mismatch schema enums (e.g., 'chain', 'list' not in schema; 'ledger', 'changelog' in schema not described). This reduces clarity for the agent.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use (verdict or execution result needing permanent anchoring), when not to (HOLD/SABAR/VOID verdicts, testing), and alternatives (dry_run mode). Also notes mode prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_session_budget · C
Check token, compute, and financial budget for this session.
| Name | Required | Description | Default |
|---|---|---|---|
| cost | No | | |
| mode | No | | status |
| risk | No | | |
| delta_s | No | | |
| actor_id | No | | |
| session_id | No | | |
| objective_id | No | | |
| decision_class | No | | C1 |
| ceiling_override | No | | |
| action_description | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description implies a read-only operation ('Check'), but with no annotations, it should explicitly state that it does not modify state. It does not disclose side effects, authentication requirements, or rate limits, which is acceptable given the simple nature of checking, but still leaves ambiguity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence, but it is too brief for a tool with 10 parameters. It would benefit from additional structure, such as listing key parameters or their roles.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 10 parameters, no schema descriptions, and no annotations, the description is severely incomplete. It does not explain parameter usage, behavior, or output format. Output schema exists but is not referenced.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description provides no explanation for any of the 10 parameters. Users have no guidance on what each parameter does or how to use them.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states the specific verb 'Check' and resources 'token, compute, and financial budget' for the session. It is clear but does not differentiate from sibling tools like arif_floor_status or arif_stack_health_probe, which may have overlapping purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. The description only states what it does, with no context for when it is appropriate or when to choose another tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_stack_health_probe · C
Probe federation health: arifOS, organs, vault, and model registry.
| Name | Required | Description | Default |
|---|---|---|---|
| actor_id | No | Sovereign actor identifier. | |
| session_id | No | Governed session ID for audit trace. | |
| include_services | No | Subset of services to probe. Default = all. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. The description implies a read-only health probe but does not explicitly state it is non-destructive, mention permissions, or describe any side effects. This lack of transparency could lead to misuse.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with no fluff. It is appropriately front-loaded, though it could benefit from additional structure such as a note on default behavior or usage hints.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description does not need to detail return values. However, the description is minimal for a tool with 3 optional parameters and many sibling tools. It provides the essential purpose but lacks depth for comprehensive understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the input schema already documents all three parameters. The tool description adds no extra meaning beyond the schema, matching the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool probes federation health for specific components (arifOS, organs, vault, model registry). It is a specific verb ('probe') and resource ('federation health'), but does not explicitly differentiate from siblings like arif_heartbeat or arif_os_attest, which might have overlapping purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. There is no mention of context, prerequisites, or exclusions. The description is too brief to help an agent decide when to invoke this tool over sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_think · Mind Reason · A · Read-only · Idempotent
Cognitive engine: reason, plan, reflect, critique, synthesize. Select when: problem needs decomposition, plan generation, hypothesis evaluation, or evidence synthesis. Gradient: cognitive overload → structured reasoning. Modes: reason | reflect | verify | critique | plan | plan_review | plan_approve | refactor_plan | metabolize | axioms. Returns: epistemic labels (OBS/DER/INT/SPEC), facts, inferences, next_safe_action. Skip when: factual question (use arif_observe) or immediate action needed (use arif_act).
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | reason |
| query | No | | |
| plan_id | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| witness_type | No | | ai |
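The "Select when" and "Skip when" guidance in the description above amounts to a small routing rule between arif_think, arif_observe, and arif_act. A sketch of that rule follows. The category labels (`factual`, `action`, `decomposition`, and so on) are hypothetical names chosen for illustration, and only the three tool names come from the description.

```python
# Hypothetical labels for requests the description assigns to arif_think.
REASONING_NEEDS = {"decomposition", "plan", "hypothesis", "synthesis"}


def choose_tool(need: str) -> str:
    """Map a request category to the tool the description recommends."""
    if need == "factual":
        return "arif_observe"  # factual question: skip arif_think
    if need == "action":
        return "arif_act"  # immediate action needed: skip arif_think
    if need in REASONING_NEEDS:
        return "arif_think"
    raise ValueError(f"unrecognised need: {need!r}")
```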
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, so the description's role is to add context. It does so by describing the structured reasoning output with epistemic labels and next_safe_action. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose and includes structured lists for modes and exclusions. While it contains some redundancy (e.g., listing modes twice), it remains focused and scannable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (7 parameters, multiple modes, and an output schema), the description covers the essential behavioral context: purpose, usage triggers, return structure, and mode explanations. It does not fully detail every parameter, but the output schema likely fills return value details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It sufficiently explains the 'mode' parameter with its enum values and their meanings. However, other parameters like query, plan_id, actor_id are not explained, leaving gaps for an AI agent.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as a cognitive engine for reasoning, planning, reflecting, and critiquing. It distinguishes itself from siblings by providing explicit 'Select when' and 'Skip when' clauses, referencing sibling tools like arif_observe and arif_act.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use the tool (e.g., decomposing complex problems, generating plans) and when not to use it (e.g., factual queries, immediate action). It also lists modes with brief explanations, giving clear context for invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_triage · A · Read-only
Session status, priority queue, and preflight checks. Modes: status (active session count + stage) | preflight (pre-session safety probe) | triage (priority assessment). Core immune function — run before arif_init when unsure of session state.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | "status" \| "preflight" \| "triage" | status |
| stage | No | Stage hint (used if session_id not provided) | |
| actor_id | No | Calling actor | |
| priority | No | Task priority hint for triage mode | |
| _envelope | No | | |
| session_id | No | Optional session to query | |
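Since the server is reached over MCP, a preflight check before arif_init would typically travel as a JSON-RPC `tools/call` request. The sketch below builds such a request body for arif_triage. The helper name `triage_request` is hypothetical, and the argument names and mode values are taken from the table above.

```python
import json
from typing import Optional

# Mode values as listed in the parameter table.
TRIAGE_MODES = ("status", "preflight", "triage")


def triage_request(request_id: int, mode: str = "status", session_id: Optional[str] = None) -> str:
    """Serialize an MCP tools/call request for arif_triage as a JSON string."""
    if mode not in TRIAGE_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    arguments = {"mode": mode}
    if session_id is not None:
        # session_id is optional; omit it when the session is not yet known.
        arguments["session_id"] = session_id
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": "arif_triage", "arguments": arguments},
        }
    )
```

Calling `triage_request(1, "preflight")` yields the body to send before arif_init when the session state is uncertain.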
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint true and destructiveHint false, so safety is clear. Description adds behavioral context about being a core immune function and the three modes, though could detail mode behaviors further.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise: three sentences with a front-loaded summary and a clear expansion of modes and usage. Every sentence adds value with no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With output schema present, return values are covered. Description adequately covers the three modes, their purposes, and usage ordering relative to arif_init, making it complete for a status tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is high (83%), so parameters are already well-documented in the schema. Description does not add additional parameter meaning beyond the schema's descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool provides session status, priority queue, and preflight checks with three explicit modes. It distinguishes from sibling arif_init by noting it should be run before that tool when unsure of session state.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('run before arif_init when unsure of session state'), providing clear context. Does not list alternatives or when not to use, but the usage context is adequate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_vault_query · B · Read-only
ARIF_VAULT_QUERY: Query VAULT999 audit ledger.
Modes:
recent — return N most recent VAULT entries (default)
search — search VAULT entries by keyword
organ — filter entries by organ name
date — filter entries by date prefix (YYYY-MM-DD)
Args:
query — search term or organ name or date (depends on mode)
limit — max entries to return (default 10, max 50)
actor_id — acting agent identity
F2: All data direct from VAULT999 filesystem.
Cycle 3 fix (2026-06-21): the conformance spine's `vault_replay` check
needs each entry to expose `ts`/`timestamp` (alias of mtime), and the
response itself to expose `status` + `chain_ok` INSIDE the result dict
(because `_extract_tool_result` strips the outer `status` key when
returning parsed["result"]). Without these the kernel could not prove
it can read its own sealed past — the load-bearing property of a
substrate. Now it can.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | recent |
| limit | No | | |
| query | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
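The description for this tool gives four modes, a limit that defaults to 10 and is capped at 50, and says `chain_ok` sits inside the `result` dict rather than at the top level. The sketch below reflects those three points. It rests on assumptions: both helper names are hypothetical, the non-default modes are assumed to require a query, and `chain_ok` is assumed to be a boolean.

```python
from typing import Any, Optional

VAULT_MODES = {"recent", "search", "organ", "date"}
DEFAULT_LIMIT, MAX_LIMIT = 10, 50


def build_vault_query(
    mode: str = "recent", query: Optional[str] = None, limit: int = DEFAULT_LIMIT
) -> dict[str, Any]:
    """Build an arguments dict for arif_vault_query with the limit clamped to 1..50."""
    if mode not in VAULT_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode != "recent" and not query:
        # Assumption: search, organ and date modes are meaningless without a query.
        raise ValueError(f"mode {mode!r} needs a query")
    args: dict[str, Any] = {"mode": mode, "limit": max(1, min(limit, MAX_LIMIT))}
    if query:
        args["query"] = query
    return args


def chain_is_ok(response: dict[str, Any]) -> bool:
    """Read chain_ok from inside the result dict, where the description places it."""
    result = response.get("result") or {}
    return result.get("chain_ok") is True
```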
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds that responses include 'status' and 'chain_ok' inside the result dict, which is useful. However, the verbose Cycle 3 fix paragraph may confuse agents and detracts from clarity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is overly long due to the Cycle 3 fix paragraph, which is internal implementation detail irrelevant to an agent. A more concise version would improve usability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description adequately covers modes and parameter behaviors. The fix paragraph adds some response structure info beyond what annotations provide, but it's redundant and poorly integrated. With an output schema present, the detail is partially unnecessary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description effectively explains 4 of 6 parameters (mode, limit, query, actor_id) including defaults and constraints. Session_id and _envelope are omitted, but the essential ones are covered.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool queries the VAULT999 audit ledger and explains four modes. However, it does not differentiate from sibling tool hermes_vault_query, missing an opportunity to clarify when to use which.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like hermes_vault_query or other arif_ tools. The modes are described but without context on which situation fits which mode.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_wiki_ask · C
Ask a natural-language question over the wiki index. Run arif_wiki_ingest first.
| Name | Required | Description | Default |
|---|---|---|---|
| top_k | No | | |
| question | Yes | | |
| repo_path | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full responsibility. It lacks details on side effects, authentication needs, or output format. The description only indicates it queries a pre-ingested index, leaving behavioral traits largely unspecified.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise—two short sentences with no filler. Every word is necessary and the most critical information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is too minimal: it fails to explain parameters or output despite having a rich output schema (not shown). It only covers usage prerequisite, leaving significant gaps for effective tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning the description adds no information about the parameters (repo_path, question, top_k). Without any parameter explanation, the agent cannot understand what values to provide beyond schema constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool asks a natural-language question over the wiki index. It provides a specific verb ('ask') and resource ('wiki index'), and implies a different approach from siblings like 'arif_wiki_search' by explicitly mentioning 'natural-language'. However, it does not explicitly differentiate from other wiki tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a prerequisite ('Run arif_wiki_ingest first'), which guides usage timing. But it does not mention when to use this tool versus alternatives like 'arif_wiki_search' or 'arif_wiki_map', nor does it specify when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_wiki_ingest · B
Index a local repository into the wiki. Run this FIRST before map, search, or ask.
| Name | Required | Description | Default |
|---|---|---|---|
| repo_path | Yes | | |
| scope_name | No | | arifOS |
| write_wiki | No | | |
| exclude_globs | No | | |
| include_globs | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
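The ordering rule in the description (ingest first, then map, search, or ask) could be tracked by the client so that a dependent tool is never called against a repository that has not been indexed. A minimal sketch follows. `WikiSession` is a hypothetical class, and it only tracks calls made through it, so it knows nothing about indexes created elsewhere.

```python
# Tools the descriptions say must run after arif_wiki_ingest.
DEPENDENT_TOOLS = {"arif_wiki_map", "arif_wiki_search", "arif_wiki_ask"}


class WikiSession:
    """Tracks which repository paths have been ingested in this client session."""

    def __init__(self) -> None:
        self._ingested: set[str] = set()

    def next_call(self, tool: str, repo_path: str) -> str:
        """Return the tool to call now: the requested one, or ingest if it must run first."""
        if tool == "arif_wiki_ingest":
            self._ingested.add(repo_path)
            return tool
        if tool in DEPENDENT_TOOLS and repo_path not in self._ingested:
            return "arif_wiki_ingest"
        return tool
```

The caller repeats the request after the returned ingest call has completed, at which point the dependent tool is let through.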
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool indexes a local repository, implying a write operation, but provides no details on what happens to existing data, required permissions, side effects, or output. This is insufficient for understanding the tool's impact.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loads the core purpose, and immediately follows with usage guidance. Every word adds value—no fluff, no repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters, no annotations, but has an output schema, the description is incomplete. It fails to explain parameters like repo_path, scope_name, exclude_globs, etc., which are critical for correct use. The description adequately conveys the primary action and sequence but lacks the detail needed for a complex tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 5 parameters with 0% description coverage, so no parameter is documented. The tool description adds no meaning for any of them, and there are no enums to constrain values, so the agent must guess each parameter's role. The presence of an output schema does not compensate for this. This severely hinders correct invocation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Index a local repository into the wiki.' It also distinguishes from siblings by specifying it should be run first before map, search, or ask, making the purpose and ordering explicit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'Run this FIRST before map, search, or ask.' This clearly indicates when to use this tool relative to its siblings, though it does not discuss when not to use it or mention alternatives beyond the sequence.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_wiki_map · B
Get a structural map of an indexed repository. Run arif_wiki_ingest first.
| Name | Required | Description | Default |
|---|---|---|---|
| max_depth | No | | |
| repo_path | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It only states what the tool does but does not mention any behavioral traits such as read-only nature, required permissions, rate limits, or what constitutes a 'structural map'.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long, directly stating the purpose and a prerequisite. It is front-loaded and contains no extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Although an output schema exists, the description does not elaborate on the nature of the 'structural map' output. Given the tool's complexity and the lack of parameter explanations, the description feels incomplete for a tool that likely returns a significant data structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage, meaning the description must explain parameter meaning. However, the description adds no information about the parameters max_depth or repo_path, leaving their semantics entirely to inference from the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves a structural map of an indexed repository, using a specific verb and resource. It also mentions a prerequisite (run arif_wiki_ingest first), which distinguishes it from sibling tools like arif_wiki_ingest and arif_wiki_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear when-to-use guidance by noting the prerequisite of prior ingestion. It does not explicitly state when not to use or name alternatives, but the context implies it is for exploring structure after ingest.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arif_wiki_search (grade B)
Search the wiki index for scored evidence chunks. Run arif_wiki_ingest first.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | | |
| top_k | No | | |
| repo_path | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
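Because the schema carries no parameter descriptions, a client may want to check arguments before calling. The following is a minimal sketch of such a check, based only on the Required column above. The function name `validate_search_args` is hypothetical, and treating `top_k` as a positive integer is an assumption, not something the listing states.

```python
REQUIRED = ("query", "repo_path")  # marked "Yes" in the input table
OPTIONAL = ("top_k",)              # marked "No" in the input table


def validate_search_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look usable."""
    problems = [f"missing required parameter: {k}" for k in REQUIRED if k not in args]
    unknown = sorted(set(args) - set(REQUIRED) - set(OPTIONAL))
    problems += [f"unknown parameter: {k}" for k in unknown]
    # Assumption: top_k is a positive integer count of chunks to return.
    top_k = args.get("top_k")
    if top_k is not None:
        is_int = isinstance(top_k, int) and not isinstance(top_k, bool)
        if not is_int or top_k < 1:
            problems.append("top_k should be a positive integer")
    return problems


print(validate_search_args({"query": "verdict pipeline"}))
```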
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. It only states the basic function and fails to mention whether the tool is read-only, destructive, or any other important behaviors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two short sentences that front-load the primary action. It is efficient with no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the core function and a prerequisite, but leaves gaps about the nature of 'scored evidence chunks' and how top_k affects results. An output schema exists, so return values need not be described, but more context would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage, and the description adds no explanation for parameters like query, top_k, or repo_path. While parameter names are somewhat self-explanatory, the description does not compensate for the lack of schema details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches the wiki index for scored evidence chunks, specifying the verb and resource. However, it does not differentiate from sibling tools like arif_wiki_ask or arif_wiki_map, which could cause confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a prerequisite ('Run arif_wiki_ingest first') but does not specify when to use this tool over alternatives or any exclusions, leaving guidance incomplete.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forge_dry_run (grade D)
forge_* tools have moved to A-FORGE MCP. Connect to forge.arif-fazil.com/mcp for engineering tools. This arifOS endpoint will be removed 2026-07-15.
| Name | Required | Description | Default |
|---|---|---|---|
| cwd | No | | . |
| plan_id | No | | |
| actor_id | No | | |
| manifest | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
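A client that still lists forge_* tools on arifOS could encode the migration notice as a small routing rule. The sketch below is one way to do that under stated assumptions: the server labels `"a-forge"` and `"arifos"` and both function names are hypothetical, and "will be removed 2026-07-15" is read as "unavailable on or after that date", which the description does not spell out.

```python
from datetime import date

REMOVAL_DATE = date(2026, 7, 15)             # from the tool description
FORGE_ENDPOINT = "forge.arif-fazil.com/mcp"  # from the tool description


def preferred_server(tool_name: str) -> str:
    """Pick the server label a client should prefer for a given tool name."""
    return "a-forge" if tool_name.startswith("forge_") else "arifos"


def arifos_copy_available(tool_name: str, today: date) -> bool:
    """Assumption: forge_* copies on arifOS are gone on or after the removal date."""
    if not tool_name.startswith("forge_"):
        return True
    return today < REMOVAL_DATE


print(preferred_server("forge_dry_run"), FORGE_ENDPOINT)
```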
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description does not disclose any behavioral traits such as side effects or requirements. The description only states deprecation, ignoring the tool's actual behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very short, but it is not concise for a functional tool; it is a deprecation notice. It could be considered underspecified rather than concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the high parameter count, lack of schema coverage, and no output schema description, the description is completely inadequate for an agent to understand and use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 6 parameters with 0% description coverage. The description adds no meaning to the parameters, leaving the agent without any guidance on how to fill them.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description only states that the tool has moved to a different endpoint and will be removed. It does not explain what the tool does, making it misleading for an agent trying to select the tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool or which alternatives exist. It mentions the migration but does not name the equivalent tool on the new MCP server.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forge_plan (grade D)
forge_* tools have moved to A-FORGE MCP. Connect to forge.arif-fazil.com/mcp for engineering tools. This arifOS endpoint will be removed 2026-07-15.
| Name | Required | Description | Default |
|---|---|---|---|
| goal | No | | |
| actor_id | No | | |
| _envelope | No | | |
| workspace | No | | . |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description does not disclose any behavioral traits such as side effects, permissions, or return behavior. It only mentions deprecation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very short, but it fails to provide essential information about the tool's functionality. It is underspecified rather than concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having an output schema and 5 parameters, the description says nothing about inputs, outputs, or behavior. It is incomplete for any meaningful use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 5 parameters with 0% description coverage, and the description adds no meaning to any parameter. Parameters like 'goal' and 'actor_id' are left entirely unexplained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description only states that the tool has moved to another endpoint and will be removed, but does not describe what the tool actually does. It fails to specify the verb and resource, making the purpose unclear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidance is provided. The description only indicates deprecation without explaining when or why to use this tool versus alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forge_plan_and_simulate (grade D)
forge_* tools have moved to A-FORGE MCP. Connect to forge.arif-fazil.com/mcp for engineering tools. This arifOS endpoint will be removed 2026-07-15.
| Name | Required | Description | Default |
|---|---|---|---|
| intent | No | | |
| context | No | | |
| _envelope | No | | |
| risk_tier | No | | medium |
| force_simulation | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. However, it does not describe any behavior (e.g., whether it is read-only, destructive, or requires authorization). It only states deprecation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three short sentences, so it is brief. However, it does not describe the tool; it is a migration notice. A brief but uninformative description still fails the conciseness criterion, because none of its sentences earns its place as tool documentation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters, no schema coverage, and a complex name implying planning and simulation, the description is completely inadequate. It lacks any information about how to invoke the tool or what it returns.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not explain any parameters. The five parameters (intent, context, _envelope, risk_tier, force_simulation) are left entirely undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description does not state what the tool does; it only indicates that forge_* tools have moved to another MCP server and this endpoint will be removed. There is no verb or resource to describe the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. The description mentions a migration but does not explicitly tell the agent to use the A-FORGE MCP's tool instead, nor does it provide any usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forge_query (grade D)
forge_* tools have moved to A-FORGE MCP. Connect to forge.arif-fazil.com/mcp for engineering tools. This arifOS endpoint will be removed 2026-07-15.
| Name | Required | Description | Default |
|---|---|---|---|
| cwd | No | | . |
| query | No | | |
| actor_id | No | | |
| manifest | No | | |
| _envelope | No | | |
| session_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No behavioral traits are disclosed. The description adds confusion by stating the tool has moved, yet the tool is still available. No annotations exist to compensate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very short, but it omits essential information and instead provides a deprecation notice. It is not effectively concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 6 parameters, no annotations, no parameter descriptions, and an output schema that is not referenced, the description fails to provide a complete understanding of the tool's functionality.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description provides no information about the 6 parameters, and the schema has 0% coverage. Without any parameter documentation, the agent cannot understand how to use the tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description does not state what forge_query does; it only announces deprecation and redirection. The tool name suggests it performs queries, but no details are given.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when or how to use forge_query. The description suggests not to use it because it will be removed and recommends an alternative MCP, which is confusing as the tool still exists.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
hermes_cross_verify (grade A, read-only)
HERMES_CROSS_VERIFY: Submit a CLAIM for cross-agent verification.
Delegates fact-check to a second agent (OpenCode by default) for
independent verification. Returns structured result only.
Args:
claim — the claim to verify
target — verification target: "opencode" (default)
time_budget_seconds — max time to wait for verification (default 60)
Returns:
status: OK | ERROR | TIMEOUT
verdict: VERIFIED | CONTRADICTED | INSUFFICIENT
evidence: list of evidence discovered
confidence: 0.0–1.0
F2 + F3: Cross-agent witness methodology.
| Name | Required | Description | Default |
|---|---|---|---|
| claim | No | | |
| target | No | | opencode |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| time_budget_seconds | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
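The description and the output schema each mention a `verdict`, with different value sets. The sketch below shows one way a client might read a response, assuming the tool-specific fields (`verdict`, `confidence`, `evidence`) live inside the envelope's `result` payload; the listing does not confirm that layout. The function name and the sample response are made up for illustration.

```python
def summarize_cross_verify(response: dict) -> str:
    """Summarize a hermes_cross_verify response envelope in one line."""
    status = response.get("status")
    if status != "OK":
        # ERROR and TIMEOUT carry no usable verification outcome.
        return f"no verification: status={status}"
    # Assumption: tool-specific fields are nested under "result".
    result = response.get("result") or {}
    verdict = result.get("verdict", "INSUFFICIENT")
    confidence = float(result.get("confidence", 0.0))
    evidence = result.get("evidence", [])
    return f"{verdict} (confidence {confidence:.2f}, {len(evidence)} evidence item(s))"


# Fabricated sample, shaped after the output schema table above.
sample = {
    "tool": "hermes_cross_verify",
    "status": "OK",
    "verdict": "SEAL",
    "reasons": [],
    "nine_signal": {},
    "result": {"verdict": "VERIFIED", "confidence": 0.8, "evidence": ["source A"]},
}

print(summarize_cross_verify(sample))
```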
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark readOnlyHint=true and destructiveHint=false, but the description adds valuable context: it returns structured results, explains the timeout via time_budget_seconds, and lists return fields. The cryptic 'F2 + F3' note slightly reduces clarity, but overall it adds behavioral insight beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with Args and Returns sections, and is fairly concise. The 'F2 + F3' line is cryptic and adds little value, but doesn't significantly harm conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 6 parameters (some internal) and an output schema, the description covers the main functional aspects but leaves actor_id, _envelope, and session_id unexplained. The output schema reduces the need to detail return values, but the missing parameter explanations and the cryptic methodology note prevent full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must compensate. It explains claim, target, and time_budget_seconds clearly, but misses actor_id, _envelope, and session_id, which are present in the schema but undocumented. This partial coverage gives a baseline of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool submits a claim for cross-agent verification, distinguishing it from siblings like hermes_fact_check by emphasizing independent delegation to another agent. The verb 'Submit' and resource 'CLAIM' are specific, and the cross-agent nature differentiates it.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains that the tool delegates fact-checking to a second agent (default OpenCode) for independent verification, giving clear context for when to use it. However, it does not explicitly state when not to use or mention alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
hermes_epistemic_check (grade A, read-only)
HERMES_EPISTEMIC_CHECK: Pre-flight epistemic confidence check for a CLAIM.
Evaluates a claim against available evidence to produce a
CONFIDENCE_LEVEL, GAPS, and RECOMMENDATION before the claim
enters the metabolic pipeline.
Modes:
quick — fast heuristic check (default)
vault — cross-reference against VAULT999 entries
full — vault + memory + organ consensus (slowest)
Args:
claim — the claim to evaluate
evidence_context — optional context describing available evidence
actor_id — acting agent identity
F2: Returns CONFIDENCE label with explicit gap analysis.
F7: Never returns overconfident labels.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | quick |
| claim | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| evidence_context | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
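The three modes named in the description can be captured as an enum so that an unsupported mode fails on the client before any call is made. This is a minimal sketch: the class and function names are hypothetical, and rejecting an empty claim is a client-side assumption, since the schema marks `claim` as optional.

```python
from enum import Enum
from typing import Optional, Union


class EpistemicMode(str, Enum):
    QUICK = "quick"  # fast heuristic check (default)
    VAULT = "vault"  # cross-reference against VAULT999 entries
    FULL = "full"    # vault + memory + organ consensus (slowest)


def epistemic_check_args(
    claim: str,
    mode: Union[EpistemicMode, str] = EpistemicMode.QUICK,
    evidence_context: Optional[str] = None,
    actor_id: Optional[str] = None,
) -> dict:
    """Build the arguments object for hermes_epistemic_check."""
    mode = EpistemicMode(mode)  # raises ValueError for an unknown mode
    if not claim.strip():
        raise ValueError("claim must not be empty")  # client-side assumption
    args = {"claim": claim, "mode": mode.value}
    if evidence_context is not None:
        args["evidence_context"] = evidence_context
    if actor_id is not None:
        args["actor_id"] = actor_id
    return args


print(epistemic_check_args("The ledger is append-only.", "vault"))
```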
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true and destructiveHint=false. The description adds behavioral traits beyond these, such as 'Never returns overconfident labels' (F7) and 'Returns CONFIDENCE label with explicit gap analysis' (F2), along with mode-specific behavior. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a header, brief explanation, modes list, args list, and behavioral notes (F2/F7). It is concise enough, though the use of uppercase for some terms and the block formatting could be slightly streamlined. Overall, every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (as indicated by context) and the annotations, the description is complete. It explains the tool's purpose, modes, parameters, return values (confidence with gap analysis), and behavioral guarantees. The six parameters are sufficiently described for an agent to invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema description coverage, the description adds meaningful explanations for key parameters: claim (the claim to evaluate), evidence_context (optional context describing available evidence), actor_id (acting agent identity), and mode (with enum values explained). It does not explain _envelope or session_id, but the essential parameters are well covered.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: a pre-flight epistemic confidence check for a claim, evaluating against evidence to produce confidence level, gaps, and recommendation. It distinguishes from siblings by emphasizing its position before the metabolic pipeline and listing explicit modes (quick, vault, full) that differentiate it from related tools like hermes_fact_check or hermes_cross_verify.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use the tool ('pre-flight ... before the claim enters the metabolic pipeline') and describes the modes ('quick' for fast heuristic, 'vault' for cross-reference, 'full' for comprehensive). However, it does not explicitly state when NOT to use it or name alternatives among siblings, though the mode selection offers implicit guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
hermes_fact_check (grade A, read-only)
HERMES_FACT_CHECK: Verify a CLAIM against evidence sources.
Uses web_search + VAULT999 + available tools to produce a structured
verdict about a factual claim.
Modes:
quick — local heuristic + VAULT999 check only (default)
web — quick + web_search for external corroboration
deep — web + VAULT999 + memory cross-reference
Returns:
verdict: CONFIRMED | REFUTED | MIXED | UNKNOWN
confidence: 0.0–1.0
evidence: list of sources found
gaps: list of missing evidence
F2: Returns structured evidence, not narrative.
F7: Never overclaims confidence.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | quick |
| claim | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
| time_sensitive | No | | |
| evidence_context | No | | |
| required_confidence | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
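Since each mode adds sources on top of the previous one, a caller could escalate from `quick` to `web` to `deep` only when confidence is too low. The sketch below illustrates that idea on the client side. It is not the server's behavior: `call_tool` is a hypothetical callable that returns the tool-specific payload (`verdict`, `confidence`, `evidence`, `gaps`), the 0.7 threshold is arbitrary, and the stub's numbers are invented.

```python
from typing import Callable

# Ordered from the narrowest to the broadest source set, per the description.
MODES = ("quick", "web", "deep")


def fact_check_with_escalation(
    claim: str,
    call_tool: Callable[[str, dict], dict],
    min_confidence: float = 0.7,
) -> tuple[str, dict]:
    """Try each mode in order; stop at the first sufficiently confident verdict."""
    mode, result = MODES[0], {}
    for mode in MODES:
        result = call_tool("hermes_fact_check", {"claim": claim, "mode": mode})
        decided = result.get("verdict", "UNKNOWN") != "UNKNOWN"
        if decided and result.get("confidence", 0.0) >= min_confidence:
            break
    return mode, result


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stub standing in for a real MCP client; confidence values are invented."""
    confidence = {"quick": 0.4, "web": 0.75, "deep": 0.9}[arguments["mode"]]
    return {"verdict": "CONFIRMED", "confidence": confidence, "evidence": [], "gaps": []}


mode_used, outcome = fact_check_with_escalation("Example claim", fake_call_tool)
print(mode_used, outcome["confidence"])
```

If no mode reaches the threshold, the function returns the `deep` result as the last attempt, leaving the caller to decide how to treat it.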
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral details beyond annotations: it explains the return structure (verdict, confidence, evidence, gaps), mentions constraints like 'F2: Returns structured evidence, not narrative' and 'F7: Never overclaims confidence.' These traits are not available in the readOnlyHint/destructiveHint annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-organized with sections for modes, returns, and meta-notes. It is slightly verbose but each sentence adds value. The structure aids readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (8 parameters, multiple modes, output schema), the description covers the core behavior and return values but omits parameter details. The output schema existence is noted but not fully leveraged in the description.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage and 8 parameters, the description fails to explain most parameters (e.g., actor_id, _envelope, session_id, time_sensitive, evidence_context, required_confidence). It only mentions 'claim' implicitly and 'mode' via modes. This is insufficient compensation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Verify a CLAIM against evidence sources.' It specifies the verb 'verify', the resource 'claim', and the method using web search and VAULT999. The modes differentiate it from sibling tools like hermes_cross_verify, making the purpose distinct.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use each mode (quick, web, deep), describing the sources used for each. However, it does not explicitly state when not to use this tool or name alternatives, though the sibling list implies differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
hermes_memory_steward (grade A, read-only)
HERMES_MEMORY_STEWARD: Classify content for memory storage.
Evaluates content and recommends where it should go in the
memory hierarchy:
- STORE_IN_VAULT — high-importance, permanent audit trail
- STORE_IN_GRAPHITI — relational knowledge, edges between concepts
- STORE_IN_MEMORY — operational context, medium-term
- DISCARD — ephemeral, not worth storing
- TODO_FOR_ARIF — requires human review
Args:
content — the content to classify
importance — low | medium | high | critical
mode — classify | compact
Returns:
recommendation: storage target
justification: why this classification
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | classify |
| content | No | | |
| actor_id | No | | |
| _envelope | No | | |
| importance | No | | medium |
| session_id | No | | |
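As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request, which is the standard MCP method for tool invocation. The `importance` and `mode` values come from the description above; the helper name `build_steward_call` and the client-side validation are assumptions added for illustration, and the server may accept or reject values differently.

```python
import json

# Allowed values as listed in the tool description.
IMPORTANCE_LEVELS = {"low", "medium", "high", "critical"}
STEWARD_MODES = {"classify", "compact"}


def build_steward_call(content, importance="medium", mode="classify", request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for hermes_memory_steward.

    Hypothetical helper: the validation here is client-side only.
    """
    if importance not in IMPORTANCE_LEVELS:
        raise ValueError(f"importance must be one of {sorted(IMPORTANCE_LEVELS)}")
    if mode not in STEWARD_MODES:
        raise ValueError(f"mode must be one of {sorted(STEWARD_MODES)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "hermes_memory_steward",
            "arguments": {"content": content, "importance": importance, "mode": mode},
        },
    }


request = build_steward_call("Deployment runbook for the staging cluster", importance="high")
print(json.dumps(request, indent=2))
```

The optional `actor_id`, `session_id`, and `_envelope` parameters are left out because the page does not document them.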
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
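The output schema above marks six fields as required and enumerates the allowed `status` and `verdict` values. A minimal sketch of a client-side check against those constraints might look like the following; the function name `check_envelope` and the sample values are hypothetical, and the internal shape of `nine_signal` is not documented on this page.

```python
# Required fields and enumerations copied from the output schema table.
REQUIRED_FIELDS = ("tool", "result", "status", "reasons", "verdict", "nine_signal")
STATUS_VALUES = {"OK", "ERROR", "TIMEOUT", "DRY_RUN"}
VERDICT_VALUES = {"SEAL", "HOLD", "VOID", "SABAR", "PROVISIONAL", "PARTIAL"}


def check_envelope(envelope):
    """Return a list of problems; an empty list means the envelope looks well-formed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in envelope
    ]
    if "status" in envelope and envelope["status"] not in STATUS_VALUES:
        problems.append(f"unexpected status: {envelope['status']}")
    if "verdict" in envelope and envelope["verdict"] not in VERDICT_VALUES:
        problems.append(f"unexpected verdict: {envelope['verdict']}")
    return problems


# Hypothetical response: field names come from the schema, values are invented.
sample = {
    "tool": "hermes_memory_steward",
    "status": "OK",
    "verdict": "SEAL",
    "reasons": ["content flagged as high importance"],
    "result": {"recommendation": "STORE_IN_VAULT", "justification": "audit-relevant"},
    "nine_signal": {},  # internal structure is not documented on this page
}

print(check_envelope(sample))
print(check_envelope({"tool": "hermes_memory_steward", "status": "MAYBE"}))
```

Because the same envelope is listed for every tool on this page, a check like this could be shared across tools.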
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds meaningful behavioral context by explaining the evaluation and recommendation process, the modes (classify/compact), and the exact storage targets. This goes beyond annotations without contradicting them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear opening, bullet list of classification targets, and labelled arguments. It is not overly verbose, though the args block could be slightly more concise. The front-loading is good.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Because an output schema exists, the description only needs to convey the classification logic and the main return concepts (recommendation, justification), and it does so adequately. The tool has 6 parameters, none of them required, and the description explains the three most critical ones. It is complete enough for an AI agent to understand the tool's operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description adds meaning for key parameters: content, importance (with enum values), and mode (classify vs compact). However, actor_id, session_id, and _envelope are not explained, and with no schema descriptions nothing else fills the gap. The description compensates partially but not fully.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool classifies content for memory storage and enumerates the five possible storage targets (STORE_IN_VAULT, STORE_IN_GRAPHITI, etc.). It uses a specific verb (classify) and resource (memory storage), distinguishing it from sibling hermes tools like hermes_cross_verify.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains what the tool does but does not provide explicit guidance on when to use it versus alternatives like hermes_epistemic_check or hermes_vault_query. Usage is implied by the classification task, but no when-to-use or when-not-to-use conditions are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
hermes_plan_review (Grade A, Read-only)
HERMES_PLAN_REVIEW: Review a multi-step plan for safety and completeness.
Checks:
- Missing safety/verify steps
- F1-F13 floor violations
- Unclear success criteria
- Risk assessment per step
- E7 autonomy ceiling
Args:
plan — JSON string or numbered list of steps
goal — what the plan aims to achieve
mode — quick | full
Returns:
status: OK | WARN | BLOCK
issues: list of potential problems
recommendations: list of suggested changes
| Name | Required | Description | Default |
|---|---|---|---|
| goal | No | | |
| mode | No | | quick |
| plan | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
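The description says `plan` may be a JSON string or a numbered list of steps. The sketch below shows one way to produce either form from a list of step strings. The helper names are hypothetical, and the exact JSON shape the tool expects is not documented here, so a plain array of strings is assumed.

```python
import json

PLAN_REVIEW_MODES = {"quick", "full"}  # values listed in the description


def plan_as_numbered_list(steps):
    """Render steps as '1. ...' lines, one per step."""
    return "\n".join(f"{number}. {step}" for number, step in enumerate(steps, start=1))


def build_plan_review_args(steps, goal, mode="quick", as_json=False):
    """Build the arguments dict for hermes_plan_review (hypothetical helper)."""
    if mode not in PLAN_REVIEW_MODES:
        raise ValueError(f"mode must be one of {sorted(PLAN_REVIEW_MODES)}")
    # Assumption: the JSON form is a plain array of step strings.
    plan = json.dumps(steps) if as_json else plan_as_numbered_list(steps)
    return {"plan": plan, "goal": goal, "mode": mode}


steps = ["Back up the database", "Apply the migration", "Verify row counts"]
args = build_plan_review_args(steps, goal="Migrate the schema safely", mode="full")
print(args["plan"])
```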
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare read-only and non-destructive. The description adds detail about specific checks and return fields (status, issues, recommendations), enhancing transparency beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with bullet points and an Args section. It is concise and front-loaded with the purpose, though some redundancy exists (e.g., repeating the name in all caps).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Return values are explained, and the output schema exists. However, missing explanations for optional parameters and a lack of contextual usage among many siblings reduce completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, but the description explains plan, goal, and mode in an Args section. However, actor_id, _envelope, and session_id are not described, leaving gaps for an agent.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool reviews multi-step plans for safety and completeness, listing specific checks. It distinguishes itself from sibling tools like forge_plan (which creates plans) and hermes_cross_verify (which cross-verifies).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Usage is implied to follow plan creation, but no explicit when-not-to-use or alternative guidance is provided. Sibling tools cover plan creation and verification, but the description makes no comparison with them.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
hermes_system_status (Grade A, Read-only)
HERMES_SYSTEM_STATUS: Return current federation state snapshot.
Modes:
brief — organ health + latest event count (default)
full — brief + VAULT999 seal count + memory stats
organs — organ health only
events — NATS governance events only
F2: All organ health data from live TCP probes.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | brief |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
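Since the description lists what each mode returns but not how to choose between them, a caller could encode the mode list as data and pick the narrowest mode that covers what it needs. The sketch below is only an illustration: the section labels such as `organ_health` are shorthand invented here, not field names from the tool's response.

```python
# Shorthand labels for what each mode returns, paraphrased from the description.
STATUS_MODES = {
    "brief": {"organ_health", "event_count"},
    "full": {"organ_health", "event_count", "vault_seal_count", "memory_stats"},
    "organs": {"organ_health"},
    "events": {"governance_events"},
}


def pick_mode(needed):
    """Return the mode with the fewest sections that still covers `needed`."""
    candidates = [mode for mode, sections in STATUS_MODES.items() if needed <= sections]
    if not candidates:
        raise ValueError(f"no single mode provides {sorted(needed)}")
    return min(candidates, key=lambda mode: len(STATUS_MODES[mode]))


print(pick_mode({"organ_health"}))
print(pick_mode({"organ_health", "memory_stats"}))
print(pick_mode({"governance_events"}))
```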
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds valuable behavioral context beyond annotations: it states that all organ health data comes from 'live TCP probes', indicating real-time data rather than cached. This is useful for an agent deciding on freshness. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (3 sentences plus a 4-line list) with no wasted words. Purpose is front-loaded, modes are clearly enumerated in a structured list, and every sentence adds unique value (purpose, mode definitions, data source).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that the tool has only one meaningful parameter (mode) explained in the description, an output schema that defines return structure, and annotations covering safety, the description provides sufficient context for correct invocation. The addition of 'live TCP probes' adds important nuance about data freshness. No gaps that would hinder use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 4 parameters with 0% description coverage. The description partially compensates by explaining the 'mode' enum parameter, detailing what each enum value returns. However, actor_id, session_id, and _envelope are not mentioned, leaving their purpose unclear. With schema coverage this low, the description helps but does not fully compensate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Return' and resource 'current federation state snapshot', immediately establishing purpose. It further distinguishes itself by listing specific modes (brief, full, organs, events) that output different slices of system status, differentiating it from sibling tools like hermes_cross_verify or hermes_epistemic_check.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description details what each mode returns, it provides no explicit guidance on when to use this tool versus alternatives or when to prefer one mode over another. There is no mention of prerequisites, caveats, or explicit exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
hermes_vault_query (Grade C, Read-only)
ARIF_VAULT_QUERY: Query VAULT999 audit ledger.
Modes:
recent — return N most recent VAULT entries (default)
search — search VAULT entries by keyword
organ — filter entries by organ name
date — filter entries by date prefix (YYYY-MM-DD)
Args:
query — search term or organ name or date (depends on mode)
limit — max entries to return (default 10, max 50)
actor_id— acting agent identity
F2: All data direct from VAULT999 filesystem.
Cycle 3 fix (2026-06-21): the conformance spine's `vault_replay` check
needs each entry to expose `ts`/`timestamp` (alias of mtime), and the
response itself to expose `status` + `chain_ok` INSIDE the result dict
(because `_extract_tool_result` strips the outer `status` key when
returning parsed["result"]). Without these the kernel could not prove
it can read its own sealed past — the load-bearing property of a
substrate. Now it can.
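The fix note above can be illustrated with a small sketch. Only `_extract_tool_result`, `vault_replay`, `ts`/`timestamp`, `status`, and `chain_ok` are named in the description; the stand-in functions, the `entries` key, and the sample values below are assumptions made for illustration, not the kernel's actual code.

```python
def extract_tool_result(parsed):
    """Stand-in for `_extract_tool_result`: keeps only the inner result dict,
    so the outer `status` key is lost."""
    return parsed["result"]


def replay_check(result):
    """Hypothetical version of the `vault_replay` check described above."""
    has_chain_fields = "status" in result and "chain_ok" in result
    entries_have_time = all(
        "ts" in entry or "timestamp" in entry for entry in result.get("entries", [])
    )
    return has_chain_fields and entries_have_time


# Before the fix: status lives only on the outer envelope, entries have no timestamp.
before_fix = {"status": "OK", "result": {"entries": [{"id": "a"}]}}

# After the fix: status and chain_ok are repeated inside result, entries expose ts.
after_fix = {
    "status": "OK",
    "result": {
        "status": "OK",
        "chain_ok": True,
        "entries": [{"id": "a", "ts": "2026-06-21T00:00:00Z"}],
    },
}

print(replay_check(extract_tool_result(before_fix)))
print(replay_check(extract_tool_result(after_fix)))
```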
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | recent |
| limit | No | | |
| query | No | | |
| actor_id | No | | |
| _envelope | No | | |
| session_id | No | | |
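The description states that `query` depends on `mode` and that `limit` defaults to 10 with a maximum of 50. A sketch of client-side argument preparation under those rules follows. The helper name is hypothetical, and two behaviors are assumptions: that a date prefix may be partial (for example `2026-06`), and that clamping is an acceptable response to an over-limit request.

```python
import re

VAULT_MODES = {"recent", "search", "organ", "date"}  # from the description
MAX_LIMIT = 50
# Assumption: "date prefix" allows YYYY, YYYY-MM, or YYYY-MM-DD.
DATE_PREFIX = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def build_vault_query_args(mode="recent", query=None, limit=10):
    """Build the arguments dict for hermes_vault_query (hypothetical helper)."""
    if mode not in VAULT_MODES:
        raise ValueError(f"mode must be one of {sorted(VAULT_MODES)}")
    if mode != "recent" and not query:
        raise ValueError(f"mode '{mode}' needs a query value")
    if mode == "date" and not DATE_PREFIX.match(query):
        raise ValueError("date queries must look like YYYY, YYYY-MM, or YYYY-MM-DD")
    # Clamp client-side; how the server treats larger values is not documented.
    args = {"mode": mode, "limit": max(1, min(int(limit), MAX_LIMIT))}
    if query is not None:
        args["query"] = query
    return args


print(build_vault_query_args())
print(build_vault_query_args("date", "2026-06", limit=200))
```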
Output Schema
| Name | Required | Description |
|---|---|---|
| meta | No | Metadata including actor_id, mode, circuit |
| tool | Yes | Canonical tool name that produced this response |
| result | Yes | Tool-specific payload |
| status | Yes | Execution status: OK, ERROR, TIMEOUT, DRY_RUN |
| delta_S | No | Thermodynamic entropy change |
| reasons | Yes | Human-readable justification list |
| verdict | Yes | Constitutional verdict: SEAL, HOLD, VOID, SABAR, PROVISIONAL, PARTIAL |
| actor_id | No | Sovereign or agent actor ID |
| timestamp | No | ISO-8601 timestamp |
| session_id | No | Active session identifier |
| _violations | No | Non-compliance audit trail |
| nine_signal | Yes | F2 addendum nine-signal block |
| output_policy | No | Policy constraints: DOMAIN_SEAL, DOMAIN_HOLD, DOMAIN_VOID, SIMULATION_ONLY |
| stage_progression | No | Next stage auto-chain hint |
| _nine_signal_compliant | No | Internal compliance flag |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark it as read-only. The description adds that data comes directly from the filesystem and reveals the internal response structure (ts/timestamp, status, chain_ok). However, the long fix paragraph consists of implementation details that may confuse an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is verbose: it includes a lengthy, technical fix paragraph that is not essential for the agent and could be cut.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Describes only three of the six parameters in Args (mode, _envelope, and session_id are omitted) and does not fully explain the output schema, even though one exists.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Description explains three parameters (query, limit, actor_id) with context, but misses mode, _envelope, and session_id. Schema coverage is 0% but description adds value for those covered.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it queries the VAULT999 audit ledger and lists modes, but does not distinguish it from sibling tools like arif_vault_query; the description even opens with the label ARIF_VAULT_QUERY rather than the tool's own name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool vs alternatives; no mention of prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
mcp_drift_check (Grade C)
Compare live tool registry against canonical manifest for drift.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | | report |
| target_manifest | No | | canonical13 |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
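The page does not say how drift is computed. One plausible reading of "compare live tool registry against canonical manifest" is a set comparison of tool names, sketched below. The function `diff_registry`, the report keys, and the example manifest contents are all assumptions; the real contents of `canonical13` are not documented here.

```python
def diff_registry(live_tools, canonical_tools):
    """Report names missing from, or added to, the live registry."""
    live, canonical = set(live_tools), set(canonical_tools)
    missing = sorted(canonical - live)  # in the manifest but not live
    extra = sorted(live - canonical)    # live but not in the manifest
    return {"missing": missing, "extra": extra, "drifted": bool(missing or extra)}


# Hypothetical inputs: these tool names appear on this page, but the
# manifest contents shown here are invented for the example.
canonical = ["arif_act", "arif_vault_query", "mcp_drift_check"]
live = ["arif_act", "hermes_vault_query", "mcp_drift_check"]

report = diff_registry(live, canonical)
print(report)
```

A fuller check would probably also compare each tool's schema and description, not just its name.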
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full responsibility for disclosing behavioral traits. It only states the action without explaining whether it is read-only or destructive, what 'drift' detection entails, or what side effects occur. Minimal transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence with no redundancy. It is appropriately short, though it sacrifices depth for brevity. Could benefit from slightly more detail without being verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having an output schema, the description omits any explanation of output, return values, or the implications of drift checking. For a tool with two parameters and no annotations, the description is incomplete, leaving the agent uncertain about expected results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not explain the parameters 'mode' or 'target_manifest' at all. The agent receives no guidance on what the values mean or how they affect behavior, so the schema defaults (report, canonical13) remain unexplained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Compare' and identifies the two resources being compared: 'live tool registry' and 'canonical manifest' for 'drift'. It is specific and distinct from siblings like 'arif_detect_institutional_shadow_drift' which targets a different kind of drift.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives. Among many siblings, there is no mention of contexts, prerequisites, or exclusions. The agent must infer usage from the name and description alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
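For servers whose static files are generated by a script, the claim file could be produced as sketched below. The schema URL and structure are taken from the snippet above; the helper name `write_glama_json` and the idea of a local web-root directory are assumptions, and the demo writes into a temporary directory, not a real site.

```python
import json
import tempfile
from pathlib import Path

SCHEMA_URL = "https://glama.ai/mcp/schemas/connector.json"


def write_glama_json(web_root, email):
    """Write <web_root>/.well-known/glama.json and return its path."""
    document = {"$schema": SCHEMA_URL, "maintainers": [{"email": email}]}
    target = Path(web_root) / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(document, indent=2) + "\n", encoding="utf-8")
    return target


# Demo against a temporary directory standing in for the real web root.
with tempfile.TemporaryDirectory() as web_root:
    path = write_glama_json(web_root, "your-email@example.com")
    written = json.loads(path.read_text(encoding="utf-8"))
    print(path.relative_to(web_root).as_posix())
```

Replace the placeholder address with the email tied to the Glama account before publishing.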
- Control your server's listing on Glama, including description and metadata
- Access analytics and receive server usage reports
- Get monitoring and health status updates for your server
- Feature your server to boost visibility and reach more users
For users:
- Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management – store and rotate API keys and OAuth tokens in one place
- Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
- Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.