healthclaw-guardrails

Name: healthclaw-guardrails
Author: aks129

by io.github.aks129

Server Details

Guardrailed FHIR access for AI agents: PHI redaction, audit trail, step-up auth, tenant isolation

Status: Unhealthy
Last Tested: 2026-07-25 15:44
Transport: Streamable HTTP
Repository: aks129/HealthClawGuardrails
GitHub Stars: 27
Server Listing: HealthClawGuardrails

Tool Definition Quality

A (3.9/5.0)

Tool Descriptions: A

Average 4.1/5 across 24 of 24 tools scored. Lowest: 3.5/5.

Server Coherence: A

Disambiguation: 5/5

Each tool has a clearly distinct purpose, with explicit descriptions that avoid overlap. The action_*, curatr_*, fhir_*, questionnaire_*, shl_*, sources_*, and wearables_* groups each cover unique functionality without ambiguity.

Naming Consistency: 3/5

The naming follows a consistent prefix pattern (e.g., fhir_, curatr_), but within the fhir_ group there is inconsistency: some use verbs (read, search), others use nouns (stats, lastn, compiled_truth). This mix can cause slight confusion.

Tool Count: 4/5

With 24 tools, the set is comprehensive for a health data guardrail server, covering FHIR operations, data quality, actions, questionnaires, SHL generation, and external source checks. The count is high but justified by the domain's complexity.

Completeness: 4/5

The tool set covers core operations: read, search, write (propose/commit), validate, statistics, lab interpretation, permission evaluation, data quality fixes, actions, questionnaires, SHL, and source checking. Explicit delete and update operations are missing, although writes may encompass updates and deletions may be out of scope.

Available Tools

29 tools

action_commit: Commit Real-World Action (A)

Destructive

Execute a previously proposed action AFTER the patient has explicitly approved the draft. Requires step-up authorization (call fhir_get_token first; pass as _stepUpToken). Only call this after the patient says yes.

Parameters

Name	Required	Description	Default
`action_id`	Yes	ID returned by action_propose
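
As a rough illustration of the call described above, the sketch below builds the JSON-RPC 2.0 request that an MCP client sends for `tools/call`. The `build_tool_call` helper and both example values are hypothetical, and placing `_stepUpToken` inside `arguments` is an assumption based on the phrase "pass as _stepUpToken"; the listing does not show the full input schema.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Placeholder values: a real action_id comes from action_propose and a real
# token from fhir_get_token. Putting _stepUpToken in `arguments` is an assumption.
request = build_tool_call(
    "action_commit",
    {"action_id": "example-action-id", "_stepUpToken": "example-step-up-token"},
)
print(json.dumps(request, indent=2))
```

The request should only be sent once the patient has approved the draft, as the tool description requires.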

Tool Definition Quality

A (4.5/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveHint: true, but the description adds context about step-up authorization and the need for explicit patient consent, which goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with no superfluous information; front-loaded with the core action and conditions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the single parameter, no output schema, and adequate annotations, the description fully covers when, how, and prerequisites for using the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter. Description adds minor context that action_id comes from action_propose, but this is not substantially new beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'execute' and the resource 'previously proposed action', distinguishing it from siblings like action_propose and action_status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly specifies when to use (only after patient approval), prerequisites (step-up token via fhir_get_token), and the condition for invoking the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

action_propose: Propose Real-World Action (A)

Read-only

Propose a real-world action (phone call or SMS) on the patient's behalf. Returns a draft (id + script) the patient MUST review before commit. Does not execute anything.

Parameters

Name	Required	Description	Default
`kind`	Yes	Action type
`payload`	Yes	Action content: { to: recipient label, phone: number to dial/text, body: call script or message text }

Tool Definition Quality

A (4.4/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond the annotations: it clarifies that the tool returns a draft and the patient MUST review before commit, and that it does not execute anything. This aligns with the readOnlyHint annotation and provides clear transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, consisting of three sentences that efficiently convey the purpose and key behavioral constraints without extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool without an output schema, the description mentions the return value (draft with id and script) and the workflow (review before commit). This, combined with the sibling tools like action_commit, provides sufficient context for correct usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already documents the 'kind' enum and 'payload' object. The description does not add new parameter details beyond what the schema provides, so the baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool proposes a real-world action (phone call or SMS) on the patient's behalf, specifying the verb 'propose' and the resource 'real-world action'. It distinguishes itself from siblings like action_commit by noting that it only returns a draft and does not execute anything.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states the tool returns a draft that must be reviewed before commit and does not execute anything, thereby guiding when to use this tool versus action_commit. However, it does not explicitly list alternative tools or when-not-to-use scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

action_status: Action Status (A)

Read-only

Check the status and outcome of an action (proposed/executing/completed/failed). Use after commit to report the result back to the patient.

Parameters

Name	Required	Description	Default
`action_id`	Yes	ID returned by action_propose
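
One way an agent might use the four listed statuses is to poll until the action settles, then report the result. The sketch below is hypothetical: `wait_for_outcome` and `fake_get_status` are illustrative names, and treating `completed` and `failed` as the only terminal statuses is an assumption. A real client would replace the stand-in with an actual `action_status` call.

```python
import time

# Assumption: these two statuses are terminal; the other two are in progress.
TERMINAL_STATUSES = {"completed", "failed"}
KNOWN_STATUSES = {"proposed", "executing"} | TERMINAL_STATUSES


def wait_for_outcome(get_status, action_id, max_polls=10, delay_seconds=0.0):
    """Poll until the action reaches a terminal status or max_polls is hit."""
    status = None
    for _ in range(max_polls):
        status = get_status(action_id)
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(delay_seconds)
    return status  # still non-terminal after max_polls


# Stand-in for a real action_status call: replays a fixed sequence.
_fake_sequence = iter(["executing", "executing", "completed"])


def fake_get_status(action_id):
    return next(_fake_sequence)


final_status = wait_for_outcome(fake_get_status, "example-action-id")
print(final_status)
```

Passing the status getter as a parameter keeps the polling logic testable without a live server.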

Tool Definition Quality

A (3.6/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the safety profile is clear. The description adds some behavioral context by listing statuses (proposed/executing/completed/failed) and mentioning reporting, but does not disclose further traits like permissions or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long, the first fully covering the tool's purpose and outcome types, the second providing a usage direction. No superfluous words; every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read operation with one parameter and no output schema, the description is nearly complete. It specifies the action and possible statuses. It could have elaborated on the exact return structure (e.g., JSON fields) but the current text is sufficient for most use cases.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the schema already describes the single parameter action_id as 'ID returned by action_propose'. The main description does not add additional parameter meaning beyond what the schema provides, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Check' and targets the resource 'status and outcome of an action', listing possible statuses. It distinguishes from sibling tools like action_propose and action_commit by implying its role as the read/check step, though not explicitly naming alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The phrase 'Use after commit to report the result back to the patient' provides specific usage context and a workflow hint. However, it does not include when-not-to-use instructions or explicitly mention alternative tools for other scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

care_gaps: Preventive Care Gaps (A)

Read-only

Check which preventive-care screenings/immunizations a patient may be due for (blood pressure, cholesterol, colorectal/cervical/breast cancer screening, flu, diabetes A1c), from their own connected records. Decision support based on USPSTF/ACIP/ADA guidelines — not a diagnosis or directive.

Parameters

Name	Required	Description	Default
`subject`	No	Patient reference (e.g. 'Patient/pt-1')

Tool Definition Quality

A (4.5/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and destructiveHint, so the description adds value by specifying the tool is based on USPSTF/ACIP/ADA guidelines and is not a diagnosis or directive. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, the first with a parenthetical list of examples; no wasted words. Front-loaded with the core action and resource.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (preventive care decision support), the description covers what screenings are included, the data source, and the advisory nature. Adequate without an output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Parameter schema has 100% coverage with a description for 'subject'. The description adds context that the tool operates on the patient's 'own connected records', complementing the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description precisely states the tool checks preventive-care screenings/immunizations a patient may be due for, listing specific examples (blood pressure, cholesterol, cancer screenings, flu, A1c). It clearly distinguishes from siblings by focusing on preventive care gaps based on clinical guidelines.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for checking due preventive care from a patient's own records. It provides context but does not explicitly exclude scenarios or name alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

context_get: Get Health Context (A)

Read-only

Retrieve a pre-built context envelope with patient-centric FHIR resources. Returns bounded, policy-stamped, time-limited context.

Parameters

Name	Required	Description	Default
`context_id`	Yes	Context envelope ID

Tool Definition Quality

A (3.5/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds value beyond annotations by stating the returned context is 'bounded, policy-stamped, time-limited.' This provides behavioral context that is not in the annotations, which already indicate read-only and non-destructive behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences that front-load the purpose and key characteristics, with no wasted words. Every part contributes to understanding.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity (one parameter, no output schema), the description provides adequate context for basic understanding. However, it could mention what kind of resource the envelope contains or how to interpret the output.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the single parameter 'context_id' is described in the schema. The description does not add additional meaning beyond what the schema provides, warranting the baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves a pre-built context envelope with patient-centric FHIR resources using a specific verb and resource. It does not explicitly differentiate from sibling tools, but the unique 'context' concept sets it apart.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, nor any context on prerequisites or typical use cases. The agent must infer usage solely from the purpose.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

curatr_apply_fix: Apply Data Quality Fix (A)

Apply patient-approved data quality fixes to a FHIR resource. Creates a linked Provenance record with full attribution. Requires step-up authorization (X-Step-Up-Token) and human confirmation (X-Human-Confirmed: true) for clinical resources like Condition.

Parameters

Name	Required	Description
`fixes`	Yes	List of field fixes to apply. Each fix has 'field_path' (dot-notation, e.g. 'Condition.code.coding[0].system') and 'new_value' (the corrected value).
`resource_id`	Yes	ID of the resource to fix
`resource_type`	Yes	FHIR resource type to fix (e.g. 'Condition')
`patient_intent`	Yes	Plain-language reason for the fix, provided by the patient (recorded in Provenance).

Tool Definition Quality

A (3.8/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses creation of a linked Provenance record and step-up authorization needs, adding behavioral context beyond the annotations (which only indicate readOnlyHint=false and destructiveHint=false). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: two for the primary action (apply the fix, create the Provenance record) and one for authorization requirements. No wasted words; front-loads key behavior.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Missing output schema and no mention of return value or side effects (e.g., whether resource is updated in place). Given it's a write tool with no output schema, the description should explain what the agent gets back (e.g., success indicator, updated resource). Incomplete for a tool of this complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers all parameters with descriptions, so baseline is 3. The description adds no extra meaning beyond the schema; it only restates 'patient-approved' context. No parameter details beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it applies patient-approved data quality fixes to FHIR resources and creates a Provenance record. Differentiates from sibling tools like curatr_evaluate (evaluation) and action_commit (generic commit) by specifying the 'apply fix' and 'patient-approved' context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit authorization requirements (step-up token, human confirmation) for clinical resources, which guides when to use. However, it lacks explicit when-not-to-use guidance or direct comparison to siblings like action_propose or fhir_commit_write, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

curatr_evaluate: Evaluate Data Quality (A)

Read-only

Evaluate a FHIR resource for data quality issues. Checks coding elements against public terminology services (tx.fhir.org for SNOMED/LOINC, NLM for ICD-10-CM, RXNAV for RxNorm) and structural rules. Returns issues in plain language with patient-facing impact descriptions and resolution suggestions. Read-only — no step-up required.

Parameters

Name	Required	Description	Default
`resource_id`	Yes	ID of the resource to evaluate
`resource_type`	Yes	FHIR resource type to evaluate (e.g. 'Condition')
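
The description pairs each terminology with a public service. A hedged sketch of that routing, keyed by the canonical FHIR code system URIs, might look like the following. The `services_for` helper and the sample resource are hypothetical; how the server actually selects a service is not shown in the listing.

```python
# Canonical FHIR code system URIs mapped to the service named in the description.
TERMINOLOGY_SERVICE_BY_SYSTEM = {
    "http://snomed.info/sct": "tx.fhir.org",
    "http://loinc.org": "tx.fhir.org",
    "http://hl7.org/fhir/sid/icd-10-cm": "NLM",
    "http://www.nlm.nih.gov/research/umls/rxnorm": "RXNAV",
}


def services_for(resource):
    """List (system, code, service) for each coding under resource['code']."""
    results = []
    for coding in resource.get("code", {}).get("coding", []):
        system = coding.get("system")
        service = TERMINOLOGY_SERVICE_BY_SYSTEM.get(system, "unrecognized system")
        results.append((system, coding.get("code"), service))
    return results


# Invented sample resource with placeholder codes.
sample = {
    "resourceType": "Condition",
    "code": {
        "coding": [
            {"system": "http://snomed.info/sct", "code": "example-code"},
            {"system": "urn:example:local", "code": "x1"},
        ]
    },
}
for row in services_for(sample):
    print(row)
```

A coding with an unrecognized system is the kind of issue a data quality check would be expected to surface.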

Tool Definition Quality

A (4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds behavioral details: it checks against specific public terminology services, applies structural rules, and returns plain-language issues with patient-facing impact and resolution suggestions. This enriches the annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, front-loading the core action and then the details. Every sentence adds value: purpose, what it checks, output format, and safety guarantee. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple terminology services, structural rules, plain-language output) and the absence of an output schema, the description covers the essential aspects: what it evaluates, how it evaluates, and what it returns. It could mention whether the evaluation is synchronous or the format of the issues, but it is sufficiently complete for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema coverage is 100% with clear parameter descriptions ('ID of the resource to evaluate', 'FHIR resource type to evaluate'). The description does not add parameter-specific details beyond what the schema provides, so the description's added value is limited.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool evaluates FHIR resources for data quality issues, specifying checks on coding elements against public terminology services (SNOMED, LOINC, ICD-10-CM, RxNorm) and structural rules. It differentiates from siblings like curatr_apply_fix by focusing on evaluation rather than applying fixes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'Read-only — no step-up required' implying safe usage without special permissions, but does not explicitly guide when to choose this tool over siblings like fhir_validate or curatr_apply_fix. The context of data quality evaluation is clear, but alternative usage scenarios are not addressed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fetch: Fetch Health Record (A)

Read-only

ChatGPT-connector-compatible fetch of one FHIR resource by id ('ResourceType/id', as returned by search). Returns the full document (PHI-redacted server-side) with metadata.

Parameters

Name	Required	Description	Default
`id`	Yes	Resource reference: 'ResourceType/id'
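
Because `id` must follow the 'ResourceType/id' form, a client may want to check a reference before calling the tool. The helper below is a hypothetical sketch of such a check, not part of the server.

```python
def parse_reference(reference):
    """Split 'ResourceType/id' into (resource_type, resource_id)."""
    resource_type, separator, resource_id = reference.partition("/")
    if not separator or not resource_type or not resource_id or "/" in resource_id:
        raise ValueError(f"expected 'ResourceType/id', got {reference!r}")
    return resource_type, resource_id


print(parse_reference("Patient/pt-1"))
```

Rejecting malformed references locally avoids a round trip for input the server would presumably refuse.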

Tool Definition Quality

A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, non-destructive behavior. Description adds server-side PHI redaction and metadata return, which is useful context beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences efficiently convey core purpose, input format, and output characteristics with no wasted words. Front-loaded with key action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with one parameter and no output schema, description adequately explains inputs and outputs. Doesn't mention error handling or performance, but not critical.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already describes parameter, but description reinforces the format and source ('as returned by search'), adding practical context for correct usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool fetches one FHIR resource by id in 'ResourceType/id' format, distinguishes from search tools by referencing 'as returned by search', and specifies outputs (full document, PHI-redacted, with metadata).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies usage after search results, but does not explicitly exclude alternatives like fhir_read or state when not to use. Provides context for typical workflow.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_commit_write: Commit FHIR Write (A)

Destructive

Commit a previously proposed write. Requires step-up authorization token. This is a destructive operation.

Parameters

Name	Required	Description	Default
`resource`	Yes	The FHIR resource to commit
`operation`	Yes

Tool Definition Quality

A (4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveHint: true, and description reinforces this. Adds step-up authorization requirement, providing useful extra behavior context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, 14 words, front-loaded with the key action. No fluff; every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers main purpose, authorization, and destructiveness. However, lacks details on return value, how to use parameters effectively, and what constitutes a 'previously proposed write'. Adequate but has gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 50% schema description coverage (resource has a description, operation does not), the description adds no parameter guidance. It could explain the expected format for resource or the valid values for operation, but does not.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'commit' and resource 'previously proposed write', distinguishing from siblings like fhir_propose_write. Also mentions authorization and destructive nature, making purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies use after a proposal action and specifies authorization requirement, but does not explicitly state when not to use or compare with alternatives like fhir_read.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_compiled_truth: Compiled Truth Timeline (A)

Read-only

Return the current best understanding of a FHIR resource plus the append-only evidence trail (Provenance entries) of how it got there. Use this before presenting resource-specific facts to a patient — surfaces curation_state and quality_score so the agent can say not just WHAT the record says but WHY it says it. Redacted, audited. Response includes _meta.ui.resourceUri pointing to an embeddable review UI.

Parameters

Name	Required	Description	Default
`resource_id`	Yes	ID of the resource
`resource_type`	Yes	FHIR resource type (e.g. 'Condition', 'AllergyIntolerance')

Tool Definition Quality

A (4.5/5.0)

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false. The description adds that the response is 'Redacted, audited' and includes a `_meta.ui.resourceUri` pointing to an embeddable review UI. This provides extra behavioral context beyond annotations, such as redaction and audit features, without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact and front-loaded. The first sentence states the core function, the second gives usage guidance and highlights curation_state and quality_score, and two short closing statements cover redaction, auditing, and the UI link. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 2 required parameters, clear annotations, and no output schema, the description sufficiently covers what the tool does and what it returns (compiled truth, evidence trail, curation_state, quality_score, resourceUri). It is complete for its intended use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description does not add additional semantics beyond the schema's definitions for resource_type and resource_id. It mentions FHIR types in examples but does not enhance parameter understanding further.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool returns the current best understanding of a FHIR resource along with an append-only evidence trail. It uses a specific verb ('Return') and identifies the resource type ('FHIR resource') and the additional provenance entries. This clearly distinguishes it from siblings like fhir_read or fhir_search which return raw resources or search results.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises using the tool before presenting resource-specific facts to a patient, highlighting that it surfaces curation_state and quality_score. This provides a clear when-to-use scenario. It does not explicitly mention when not to use it or alternatives, but the given context is sufficient for most cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_get_token: Mint Step-Up Token (Grade A)

Read-only

Get a fresh step-up authorization token for write operations. Call this before fhir_propose_write, fhir_commit_write, or curatr_apply_fix. Tokens expire after 5 minutes. Returns the token string — pass it as _stepUpToken in subsequent write tool calls.

Parameters

Name	Required	Description	Default
`tenant_id`	Yes	Tenant ID to scope the token to
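
The token lifecycle described above (mint, reuse for up to 5 minutes, pass as `_stepUpToken`) can be sketched as a small client-side cache. This is a minimal sketch, not the server's implementation: `call_tool` is a hypothetical callable standing in for whatever MCP client function sends a tool call, and the 30-second safety margin is an assumption. Only the tool name, the `tenant_id` parameter, the 5-minute expiry, and the `_stepUpToken` argument name come from the listing.

```python
import time
from typing import Callable

TOKEN_TTL_SECONDS = 5 * 60  # from the tool description: tokens expire after 5 minutes


class StepUpTokenCache:
    """Caches one step-up token per tenant and refreshes it shortly before expiry."""

    def __init__(
        self,
        call_tool: Callable[[str, dict], str],  # hypothetical: (tool_name, arguments) -> token string
        clock: Callable[[], float] = time.monotonic,
        safety_margin: float = 30.0,  # assumption: refresh 30 s early to avoid racing the expiry
    ) -> None:
        self._call_tool = call_tool
        self._clock = clock
        self._margin = safety_margin
        self._tokens = {}  # tenant_id -> (token, expires_at)

    def get(self, tenant_id: str) -> str:
        """Return a cached token if it is still fresh, otherwise mint a new one."""
        now = self._clock()
        cached = self._tokens.get(tenant_id)
        if cached is not None and now < cached[1] - self._margin:
            return cached[0]
        token = self._call_tool("fhir_get_token", {"tenant_id": tenant_id})
        self._tokens[tenant_id] = (token, now + TOKEN_TTL_SECONDS)
        return token

    def with_token(self, tenant_id: str, arguments: dict) -> dict:
        """Return a copy of the write-tool arguments with _stepUpToken attached."""
        return {**arguments, "_stepUpToken": self.get(tenant_id)}
```

A caller would wrap the arguments of each write tool call with `cache.with_token(tenant_id, arguments)` so that an expired token is replaced before the call rather than after a rejection.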

Tool Definition Quality

A (4.3/5.0)

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and destructiveHint=false, and the description's 'Get' aligns. Adds context about token expiry (5 minutes) and return value (token string), enhancing transparency beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each purposeful: the first states the purpose, the second gives usage context, the third states the expiry, and the fourth describes the output. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one required parameter and no output schema, the description covers purpose, usage order, expiry, and output handling completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only one parameter (tenant_id) with 100% schema coverage. The description does not add new detail beyond what the schema provides, so baseline score applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the resource 'step-up authorization token', specifying its purpose for write operations. It differentiates from sibling tools by naming specific subsequent tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use this tool: 'Call this before fhir_propose_write, fhir_commit_write, or curatr_apply_fix'. Also mentions token expiry. Lacks explicit when-not-to-use, but context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_interpret_labs: Interpret Lab Results (Grade A)

Read-only

Interpret lab Observations against reference ranges — flags each value low/normal/high/critical (HL7 v3 ObservationInterpretation) and returns clinician + consumer summaries. Decision support, not diagnosis. Read-tier.

Parameters

Name	Required	Description
`bundle`	No	A FHIR Bundle of Observations to interpret
`subject`	No	Patient reference (e.g. 'Patient/pt-1') — interpret the tenant's stored Observations for this subject
`observation`	No	A single FHIR Observation to interpret
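
To make the flagging step concrete, here is a minimal sketch of classifying one numeric value against a reference range. It is an illustration under assumptions, not the server's logic: the `ReferenceRange` type, the optional critical thresholds, and the mapping of "critical" to the HL7 v3 ObservationInterpretation codes `LL` and `HH` are choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferenceRange:
    """Hypothetical range holder; real ranges may vary with patient demographics."""
    low: float
    high: float
    critical_low: Optional[float] = None
    critical_high: Optional[float] = None


def interpret(value: float, rng: ReferenceRange) -> str:
    """Return an HL7 v3 ObservationInterpretation code: LL, L, N, H, or HH."""
    # Critical thresholds are checked first because they are the more severe flags.
    if rng.critical_low is not None and value < rng.critical_low:
        return "LL"  # critical low
    if rng.critical_high is not None and value > rng.critical_high:
        return "HH"  # critical high
    if value < rng.low:
        return "L"   # low
    if value > rng.high:
        return "H"   # high
    return "N"       # normal (range bounds are treated as inclusive)
```

The thresholds passed in are the caller's responsibility; any numbers used to exercise this function are placeholders and not clinical guidance.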

Tool Definition Quality

A (3.9/5.0)

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, so the description's addition of 'Decision support, not diagnosis' and 'Read-tier' adds valuable behavioral context beyond what annotations provide, clarifying non-diagnostic use and read-only nature.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is one sentence plus two short fragments, front-loaded with the core purpose and followed by a critical caveat ('Decision support, not diagnosis'). Every part adds value with no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the purpose and safety are clear, the description does not explain how multiple optional parameters interact (e.g., is bundle preferred over subject?), nor does it mention that reference ranges may depend on patient demographics. With no output schema, the return format details are absent. Adequate but has gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with parameter descriptions in the input schema. The tool description does not add any additional meaning or constraints beyond what is already in the schema, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool interprets lab Observations against reference ranges, flags values (low/normal/high/critical), and returns clinician and consumer summaries. It distinguishes itself from sibling tools like fhir_read by adding interpretative functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes 'Decision support, not diagnosis' and 'Read-tier,' which provide some usage context but no explicit comparison to alternatives or when-not-to-use guidance. It lacks explicit differentiation from similar tools like fhir_search or fhir_validate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_lastn: Latest Observations (Grade A)

Read-only

Get the last N observations per code. Standard FHIR $lastn (since R4). Returns most recent observations by storage order.

Parameters

Name	Required	Description
`max`	No	Max observations per code (default 1)
`code`	No	LOINC code filter
`patient`	No	Patient reference filter
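
The "last N per code, by storage order" behaviour can be sketched as a grouping pass over a list of Observations. This is a sketch under assumptions: the input is taken to be a plain list of Observation dicts already in storage order, and the function name and return shape are hypothetical rather than the server's response format.

```python
from collections import defaultdict
from typing import Optional


def last_n_per_code(
    observations: list,
    max_per_code: int = 1,
    code: Optional[str] = None,
    patient: Optional[str] = None,
) -> dict:
    """Group Observations by code and keep the last N of each, most recent first."""
    if max_per_code < 1:
        raise ValueError("max_per_code must be at least 1")
    grouped = defaultdict(list)
    for obs in observations:  # assumed to be in storage order, oldest first
        if patient is not None and obs.get("subject", {}).get("reference") != patient:
            continue
        # An Observation with several codings is listed under each of its codes.
        for coding in obs.get("code", {}).get("coding", []):
            value = coding.get("code")
            if value is None or (code is not None and value != code):
                continue
            grouped[value].append(obs)
    # Slice the tail of each group, then reverse so the newest entry comes first.
    return {c: items[-max_per_code:][::-1] for c, items in grouped.items()}
```

Because ordering follows storage order rather than a clinical timestamp, a backfilled older measurement stored later would be treated as the most recent one.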

Tool Definition Quality

A (3.7/5.0)

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the description's mention of 'Returns most recent observations by storage order' adds minimal behavioral context. It doesn't discuss potential limitations or performance implications.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences with no unnecessary words. The first states the core action, the second names the standard operation, and the third describes result ordering. Well-structured and concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool is a standard FHIR operation and has no output schema, the description is adequate but does not explain response format or pagination. For a simple tool with high schema coverage, it is minimally complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so each parameter (max, code, patient) is documented in the schema. The description adds 'per code' context but does not elaborate on parameter usage beyond what the schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Get the last N observations per code'), identifies the FHIR standard operation ('$lastn'), and distinguishes from sibling tools like fhir_search by specifying 'per code' grouping. The title 'Latest Observations' reinforces the purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for retrieving recent observations but does not explicitly state when to prefer this tool over alternatives (e.g., fhir_search with sorting). No guidance on when not to use it or specific prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_permission_evaluate: Evaluate Access Permission (Grade A)

Read-only

Evaluate R6 Permission resources for access control decisions. Returns permit/deny based on stored Permission rules. Separates access control (Permission) from consent records (Consent).

Parameters

Name	Required	Description
`action`	Yes	Action to evaluate
`subject`	No	Subject reference (e.g., 'Practitioner/dr-1')
`resource`	No	Resource reference to evaluate access for
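
The listing does not say how multiple stored rules are combined, so the following is only one plausible reading: a deny-overrides evaluation with a default of deny. The flattened rule dicts (`type`, `action`, `subject`, `resource`) are a hypothetical simplification for illustration and not the structure of the FHIR Permission resource or of this server's storage.

```python
from typing import Optional


def _matches(rule: dict, action: str, subject: Optional[str], resource: Optional[str]) -> bool:
    """A rule matches when its action is equal and its subject/resource are equal or unset."""
    return (
        rule.get("action") == action
        and rule.get("subject") in (None, subject)    # None in a rule acts as a wildcard
        and rule.get("resource") in (None, resource)
    )


def evaluate(
    rules: list,
    action: str,
    subject: Optional[str] = None,
    resource: Optional[str] = None,
) -> str:
    """Return 'permit' or 'deny' using deny-overrides (an assumption of this sketch)."""
    decision = "deny"  # assumption: nothing is permitted unless a rule says so
    for rule in rules:
        if not _matches(rule, action, subject, resource):
            continue
        if rule["type"] == "deny":
            return "deny"  # any matching deny rule wins immediately
        decision = "permit"
    return decision
```

Under this reading, a broad permit for a practitioner can still be cut back by a narrower deny on one resource, which is the usual reason to prefer deny-overrides for access control.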

Tool Definition Quality

A (3.7/5.0)

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. The description confirms it evaluates and returns results, adding context about separating from consent. No additional behavioral traits (e.g., side effects, auth needs) are disclosed. Consistent with annotations, no contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each serving a purpose: states function, outputs, and differentiation from consent. No redundant or extraneous text. Efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, but the description notes the return is 'permit/deny', which is sufficient for the tool's simplicity. However, it does not cover prerequisites (e.g., authentication) or how multiple rules are handled. Adequate but could be more complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description does not add any parameter-level details beyond what the schema provides (action, subject, resource). No additional meaning or usage context is given for parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool evaluates R6 Permission resources for access control decisions, returning 'permit/deny'. It distinguishes this from consent records, making the purpose specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions separating access control (Permission) from consent (Consent), hinting at when to use this tool instead of a consent evaluator. However, it does not explicitly state when to use or not use this tool relative to the many sibling tools (e.g., fhir_read, curatr_evaluate). Guidance is implied but not strong.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_propose_write: Propose FHIR Write (Grade A)

Read-only

Propose a write — validates the resource and returns a preview. Does NOT commit. Safe to call without step-up authorization.

Parameters

Name	Required	Description	Default
`resource`	Yes	The FHIR resource to write
`operation`	Yes	Write operation type
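
The propose-then-commit ordering implied by this tool and its siblings can be sketched as a single helper. It is a sketch under assumptions: `call_tool` is a hypothetical stand-in for an MCP client call, `approve` is a hypothetical review hook (for example a human confirmation), and the shape of the preview is not documented in the listing, so it is passed through untouched.

```python
from typing import Any, Callable, Optional


def propose_then_commit(
    call_tool: Callable[[str, dict], Any],  # hypothetical: (tool_name, arguments) -> result
    tenant_id: str,
    resource: dict,
    operation: str,
    approve: Callable[[Any], bool],         # hypothetical review hook for the preview
) -> Optional[Any]:
    """Validate a write first, and commit it only after the preview is approved."""
    write_args = {"resource": resource, "operation": operation}

    # Step 1: the proposal needs no step-up token and does not commit anything.
    preview = call_tool("fhir_propose_write", write_args)
    if not approve(preview):
        return None

    # Step 2: mint the token only once a commit is actually going to happen,
    # so the 5-minute lifetime is not spent waiting on the review.
    token = call_tool("fhir_get_token", {"tenant_id": tenant_id})

    # Step 3: commit with the token attached as _stepUpToken.
    return call_tool("fhir_commit_write", {**write_args, "_stepUpToken": token})
```

Minting the token after approval is a design choice of this sketch; a client that reviews previews quickly could equally mint it up front.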

Tool Definition Quality

A (4.1/5.0)

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only and non-destructive behavior. The description adds value by clarifying that no commit occurs and that no special authorization is needed, enhancing transparency beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at three short sentences, front-loads the core action ('Propose a write'), and contains no superfluous words. Every sentence contributes essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of an output schema, the description should explain what the 'preview' contains (e.g., validation errors, success confirmation). It does not, leaving a gap in understanding what the tool returns.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description does not add new parameter details beyond what the schema provides; it simply mentions validation and preview, which are already implied by the tool's purpose.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: propose a write by validating the resource and returning a preview. It explicitly distinguishes from actual commits, and the verb 'propose' accurately captures the dry-run nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description notes that the tool is safe to call without step-up authorization, implying it should be used before committing. However, it does not explicitly mention alternatives like fhir_validate or fhir_commit_write, leaving some ambiguity about when to use each.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_read: Read FHIR Resource (Grade A)

Read-only

Read a specific FHIR resource by type and ID. Supports FHIR R4 US Core v9 stable resources and FHIR R6 ballot3 experimental resources. Returns redacted resource with PHI protection.

Parameters

Name	Required	Description	Default
`resource_id`	Yes	The resource ID
`resource_type`	Yes

Tool Definition Quality

A (4.3/5.0)

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false, establishing a safe read operation. The description adds behavioral details: returns a redacted resource with PHI protection and supports specific FHIR versions. This adds value beyond annotations, though it doesn't cover edge cases like error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences with no superfluous text. The primary action is front-loaded, the second sentence adds version context, and the third adds privacy context. Every word serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple nature of a read operation with comprehensive annotations and schema, the description covers all essential aspects: action, parameters, supported resources, versions, and response characteristics (redaction). No gaps for typical use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50% (resource_id has a description, resource_type does not). The description mentions 'by type and ID' but doesn't enrich beyond the schema's basic definitions. It adds some context about supported FHIR versions, marginally compensating for the coverage gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Read') and the target resource ('specific FHIR resource by type and ID'). It distinguishes from sibling tools like fhir_search (which retrieves resources based on criteria) by specifying direct retrieval via type and ID.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides usage context by listing supported FHIR versions (R4 US Core and R6 ballot) and noting PHI protection. However, it lacks explicit guidance on when not to use this tool or alternatives (e.g., fhir_search for batch queries). The implied usage is clear but could be more directive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_search: Search FHIR Resources (Grade A)

Read-only

Search for FHIR resources. Supports FHIR R4 US Core v9 stable resources and FHIR R6 ballot3 experimental resources. Supports patient, code, status, _lastUpdated, _count, _sort parameters. Returns paginated, redacted Bundle.

Parameters

Name	Required	Description
`code`	No	Code filter — matches code.coding[].code in JSON (e.g., '2339-0' for Glucose)
`_sort`	No	Sort order: '_lastUpdated' (asc) or '-_lastUpdated' (desc, default)
`_count`	No	Max results (1-50, capped for token safety)
`status`	No	Status filter (e.g., 'final', 'active', 'completed')
`patient`	No	Patient reference filter (e.g., 'Patient/pt-1')
`_lastUpdated`	No	Date filter with prefix (e.g., 'ge2024-01-01', 'le2024-12-31')
`resource_type`	Yes
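
The constraints in the table above (a 1-50 cap on `_count`, two allowed `_sort` values, a prefixed date for `_lastUpdated`) can be checked on the client before the call is sent. This is a minimal sketch with assumptions: the helper name is hypothetical, the date pattern accepts the standard FHIR search prefixes with a date-only value, and whether the server accepts every one of those prefixes is not stated in the listing.

```python
import re
from typing import Optional

# Optional FHIR search prefix followed by YYYY, YYYY-MM, or YYYY-MM-DD (date-only is an assumption).
_DATE_FILTER = re.compile(r"^(eq|ne|gt|lt|ge|le|sa|eb|ap)?\d{4}(-\d{2}(-\d{2})?)?$")
_SORT_VALUES = ("_lastUpdated", "-_lastUpdated")


def build_search_args(
    resource_type: str,
    *,
    patient: Optional[str] = None,
    code: Optional[str] = None,
    status: Optional[str] = None,
    last_updated: Optional[str] = None,
    count: Optional[int] = None,
    sort: Optional[str] = None,
) -> dict:
    """Build the arguments dict for fhir_search, omitting any filter that is unset."""
    if sort is not None and sort not in _SORT_VALUES:
        raise ValueError(f"_sort must be one of {_SORT_VALUES}")
    if last_updated is not None and not _DATE_FILTER.match(last_updated):
        raise ValueError("_lastUpdated must look like 'ge2024-01-01'")
    if count is not None:
        count = max(1, min(50, count))  # the schema caps results at 1-50

    optional = {
        "patient": patient,
        "code": code,
        "status": status,
        "_lastUpdated": last_updated,
        "_count": count,
        "_sort": sort,
    }
    args = {"resource_type": resource_type}
    args.update({key: value for key, value in optional.items() if value is not None})
    return args
```

For example, `build_search_args("Observation", patient="Patient/pt-1", code="2339-0", count=200)` yields a dict whose `_count` has been clamped to 50.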

Tool Definition Quality

A (4/5.0)

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. Description adds behavioral context: supports two FHIR versions, returns paginated and redacted Bundle. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four short sentences, front-loaded with purpose and supported versions, then parameters and return type. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description adequately mentions return format (paginated, redacted Bundle). Covers supported versions and key parameters. Could mention read-only nature but annotations cover it. Reasonably complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 86%, so baseline is 3. Description merely lists parameters already detailed in schema, with minimal extra context (e.g., code matching example, sort defaults). No significant added meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it searches for FHIR resources, specifies supported FHIR versions (R4 and R6), lists key parameters, and mentions the return type (paginated, redacted Bundle). This distinguishes it from siblings like fhir_read (single resource) and fhir_lastn.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implicitly suggests it is the general search tool but does not explicitly state when to use it versus alternatives. No 'when not to use' or exclusions provided, though sibling names offer some context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_seed: Seed Demo Data (Grade A)

Seed a tenant with a realistic Patient + Observations + Condition bundle for live testing. Use this at the start of a demo session to populate data. Returns created resource IDs and a ready-to-use step_up_token.

Parameters

Name	Required	Description	Default
`tenant_id`	No	Tenant to seed (default: desktop-demo)

Tool Definition Quality

A (4.5/5.0)

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and destructiveHint=false. The description adds that the tool returns 'created resource IDs and a ready-to-use step_up_token', providing behavioral details beyond annotations. It does not specify idempotency or overwrite behavior, but the context is sufficient for a non-destructive seeding tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: the first states the action and content, the second advises when to use it, and the third describes what it returns. Every word is purposeful, with no superfluous content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 1-parameter, non-nested tool with no output schema, the description covers the purpose, usage context, and return values (resource IDs and step_up_token), providing complete information for an AI agent to invoke it correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema provides 100% coverage for the single optional parameter (tenant_id) with a description and default. The description's mention of 'Seed a tenant' aligns with the parameter, adding minimal extra value. A baseline of 3 is justified, but the clarity of the default earns a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Seed') and resource ('tenant with realistic Patient + Observations + Condition bundle'), clearly distinguishing it from sibling tools like fhir_commit_write or fhir_read, which handle general operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool ('at the start of a demo session'), providing clear context. It does not enumerate alternatives or when not to use it, but the purpose is narrow enough that no exclusion is needed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_stats: Observation Statistics (Grade A)

Read-only

Compute statistics (count, min, max, mean) over numeric Observation values. Standard FHIR $stats (since R4). Only supports valueQuantity. Filter by patient and/or code.

Parameters

Name	Required	Description	Default
`code`	No	LOINC code to filter Observations (e.g., '2339-0' for Glucose)
`patient`	No	Patient reference filter (e.g., 'Patient/pt-1')
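
The four statistics named in the description are simple to state precisely. The sketch below computes them over a list of Observation dicts and skips anything without a numeric `valueQuantity.value`, mirroring the "only supports valueQuantity" limitation. The function name and the returned dict shape are assumptions for illustration, since the listing does not document the response format.

```python
def observation_stats(observations: list) -> dict:
    """Compute count, min, max, and mean over numeric valueQuantity values."""
    values = []
    for obs in observations:
        quantity = obs.get("valueQuantity") or {}
        value = quantity.get("value")
        # Observations carrying valueString, valueCodeableConcept, etc. are skipped.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            values.append(value)

    if not values:
        return {"count": 0, "min": None, "max": None, "mean": None}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }
```

Note that this sketch ignores units; mixing Observations recorded in different units for the same code would give a meaningless mean, so a real caller would filter or convert first.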

Tool Definition Quality

A (4.2/5.0)

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly hint. Description adds behavioral detail: follows standard $stats operation, limited to valueQuantity, and filter capabilities. Does not describe response format but overall context is sufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences covering purpose, standard compliance, limitation, and filters. No redundant information; highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity and absence of output schema, description covers essential aspects. Could mention response aggregation type or limits, but not necessary for basic usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Parameter descriptions in schema are already detailed (examples for code and patient). Tool description only briefly restates filtering. With 100% schema coverage, description adds little beyond baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it computes statistics (count, min, max, mean) over numeric Observation values. Specifies the FHIR standard ($stats) and resource type. Distinguishes from sibling tools like fhir_search or fhir_read by its specific aggregative purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Mentions FHIR standard, only supports valueQuantity, and filter options (patient/code). Provides clear conditions but lacks explicit when-not-to-use or comparison to alternative tools like fhir_search with aggregations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_subscription_topicsList Subscription TopicsA

Read-only

Inspect

List available SubscriptionTopics for event-driven subscriptions. R6 moves topic-based subscriptions toward Normative. Agents discover what events they can subscribe to.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A3.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds context about R6 normative status and discovery purpose, enhancing understanding beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: defines purpose, provides background, reiterates use. Minor redundancy but overall concise and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Simple tool but lacks description of return value (list of topic definitions). No output schema, so description should compensate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has no parameters, so description does not need to explain any. Baseline for 0 parameters is 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states verb 'List' and resource 'SubscriptionTopics'. Distinguishes purpose as discovering available subscription topics for events. However, does not explicitly differentiate from sibling tools like 'fhir_search' or 'fhir_read'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implied usage: agents use this to discover events they can subscribe to. No explicit guidance on when not to use or alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fhir_validateValidate FHIR ResourceA

Read-only

Inspect

Validate a proposed FHIR R6 resource against structural rules. Returns OperationOutcome.

ParametersJSON Schema

Name	Required	Description	Default
`resource`	Yes	The FHIR resource to validate
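
The listing does not say which structural rules the server applies. The sketch below shows only the general shape of such a check and of the OperationOutcome it returns; `validate_structure` and its two rules are assumptions for illustration.

```python
def _issue(severity: str, code: str, diagnostics: str) -> dict:
    """One OperationOutcome.issue entry."""
    return {"severity": severity, "code": code, "diagnostics": diagnostics}


def validate_structure(resource: object) -> dict:
    """Hypothetical structural check that always returns an OperationOutcome."""
    issues = []
    if not isinstance(resource, dict):
        issues.append(_issue("error", "structure",
                             "Resource must be a JSON object"))
    else:
        resource_type = resource.get("resourceType")
        if not isinstance(resource_type, str) or not resource_type:
            issues.append(_issue("error", "required",
                                 "Missing resourceType"))
    if not issues:
        # A clean result is still reported as an OperationOutcome.
        issues.append(_issue("information", "informational",
                             "No structural issues found"))
    return {"resourceType": "OperationOutcome", "issue": issues}


ok = validate_structure({"resourceType": "Patient", "id": "pt-1"})
bad = validate_structure({"id": "pt-1"})
```

An agent can treat any issue with severity `error` as a reason not to proceed with a write.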

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds that the return type is OperationOutcome, confirming a safe, read-only validation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short sentences, front-loaded with the key action, no redundancy. Perfectly concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description mentions OperationOutcome return. Low complexity; annotations cover safety. Could add more detail on structural rules scope, but adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema parameter description is adequate (100% coverage), but tool description adds context about FHIR R6 and structural rules, enhancing meaning beyond the schema field.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool validates a FHIR R6 resource against structural rules and returns OperationOutcome. Distinguishable from siblings like fhir_read or fhir_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for validating proposed resources before committing, but no explicit when-to-use or alternatives among siblings (e.g., fhir_commit_write).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

guardrail_conformanceGuardrail Conformance ScorecardA

Read-only

Inspect

Run the guardrail conformance self-test on the connected HealthClaw deployment and return the graded scorecard (A-F across PHI redaction, immutable audit, step-up auth, human-in-the-loop, tenant isolation, medical disclaimers). Uses synthetic data only. Set fresh=true to force a new run instead of the cached result.

ParametersJSON Schema

Name	Required	Description	Default
`fresh`	No	Force a fresh probe run instead of the cached (<=10 min old) result
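
The caching behavior described for `fresh` can be sketched as a small time-bounded cache. The class name `ScorecardCache`, the injected clock, and the probe callable are assumptions; only the 10-minute window and the meaning of `fresh` come from the listing.

```python
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 600  # "<=10 min old", per the parameter description


class ScorecardCache:
    """Hypothetical cache: reuse a recent scorecard unless fresh=True."""

    def __init__(self, run_probe: Callable[[], dict],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._run_probe = run_probe
        self._clock = clock
        self._cached: Optional[dict] = None
        self._cached_at: Optional[float] = None

    def get(self, fresh: bool = False) -> dict:
        now = self._clock()
        expired = (self._cached_at is None
                   or now - self._cached_at > CACHE_TTL_SECONDS)
        if fresh or expired:
            # Re-run the self-test and remember when it ran.
            self._cached = self._run_probe()
            self._cached_at = now
        return self._cached


# Demo with a fake clock and a probe that counts its runs.
runs = []
fake_now = [0.0]
cache = ScorecardCache(lambda: runs.append(1) or {"run": len(runs)},
                       clock=lambda: fake_now[0])
first = cache.get()
fake_now[0] = 300.0
cached = cache.get()            # within 10 minutes: served from cache
forced = cache.get(fresh=True)  # fresh=True: probe runs again
```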

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds critical context: 'Uses synthetic data only' and explains caching behavior, which adds significant transparency beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, no wasted words. The action is front-loaded, and all information is relevant and concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input schema (one optional boolean) and no output schema, the description adequately explains the return value (graded scorecard A-F across categories), synthetic data usage, and caching behavior. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already describes the 'fresh' parameter well (100% coverage). The description adds explanatory value by stating 'set fresh=true to force a new run instead of the cached result', reinforcing the parameter's purpose.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool runs a guardrail conformance self-test and returns a graded scorecard (A-F across six specific categories). It specifies the resource (HealthClaw deployment) and distinguishes itself from the wide range of sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates when to use the tool: to run the guardrail conformance self-test on the connected HealthClaw deployment. However, it does not explicitly mention when not to use it or provide alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

questionnaire_extractExtract Form Data to FHIRA

Destructive

Inspect

SDC $extract — extract FHIR resources from a completed QuestionnaireResponse into a transaction Bundle. Write tier; requires step-up unless dry_run=true.

ParametersJSON Schema

Name	Required	Description
`dry_run`	No	Preview the Bundle without committing
`questionnaire`	No	The referenced Questionnaire (optional if resolvable by reference)
`questionnaire_response`	Yes	Completed QuestionnaireResponse
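
The rule "requires step-up unless dry_run=true" amounts to a gate in front of the commit. The sketch below illustrates only that gate: the `step_up_token` argument, the `StepUpRequired` exception, and the returned dict are hypothetical, and the actual SDC extraction logic is left out.

```python
from typing import Optional


class StepUpRequired(Exception):
    """Raised when a write is attempted without step-up authorization."""


def extract(questionnaire_response: dict, *, dry_run: bool,
            step_up_token: Optional[str] = None) -> dict:
    """Hypothetical gate: previews are open, commits need a step-up token."""
    if questionnaire_response.get("resourceType") != "QuestionnaireResponse":
        raise ValueError("Expected a QuestionnaireResponse")
    if not dry_run and not step_up_token:
        raise StepUpRequired("Committing the Bundle requires step-up auth")
    # Placeholder Bundle; real extraction would add entries here.
    bundle = {"resourceType": "Bundle", "type": "transaction", "entry": []}
    return {"committed": not dry_run, "bundle": bundle}


response = {"resourceType": "QuestionnaireResponse", "status": "completed"}
preview = extract(response, dry_run=True)
```

Calling with `dry_run=True` first lets an agent show the Bundle to the user before asking for step-up approval.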

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds write-tier and step-up context beyond annotations (destructiveHint=true). Explains that dry_run=true avoids actual writing, clarifying safe usage. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences front-loaded with key purpose and immediate usage condition. No extraneous content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers main purpose, write behavior, and dry_run condition. Absence of output schema is mitigated by mentioning 'transaction Bundle'. Missing details on return structure are acceptable given schema covers inputs well.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so baseline is 3. Description mentions dry_run behavior but doesn't add significant new detail beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'extract FHIR resources from a completed QuestionnaireResponse into a transaction Bundle', identifying specific verb (extract) and resource (QuestionnaireResponse). Distinguished from sibling 'questionnaire_populate' which fills rather than extracts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance: 'Write tier; requires step-up unless dry_run=true'. Indicates when to use write vs preview mode, and hints at permissions needed. No explicit when-not-to-use or sibling comparison.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

questionnaire_populatePre-fill Health FormA

Read-only

Inspect

SDC $populate — pre-fill a Questionnaire for a subject. Returns a QuestionnaireResponse. Read tier; mints a tenant token for non-public tenants.

ParametersJSON Schema

Name	Required	Description
`questionnaire`	No	Inline Questionnaire (overrides questionnaire_id)
`questionnaire_id`	No	Stored Questionnaire id
`subject_reference`	Yes	Subject reference, e.g. 'Patient/p1'

Tool Definition Quality

A4.0/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds context beyond annotations: it specifies 'SDC $populate', declares it as a 'Read tier' operation, and notes token minting for non-public tenants. This supplements the readOnlyHint=true annotation without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (three sentences) and front-loaded with the core purpose. Every sentence adds value with no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the tool's main function, side effect (token minting), and return type (QuestionnaireResponse). While it does not elaborate on pagination or error states, the annotations and schema provide sufficient completeness for a read operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description does not provide additional detail beyond what the input schema already covers for each parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'pre-fill', the resource 'Questionnaire', and the target 'subject'. It distinguishes from sibling tools like questionnaire_extract, which likely performs the inverse operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not explicitly state when to use this tool versus alternatives. It mentions 'Read tier' and token minting, but no direct comparison with siblings like questionnaire_extract.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rx_transfer_requestRequest Prescription TransferA

Read-only

Inspect

Draft a prescription-transfer request: assembles the patient's active medications and stages a phone call to the RECEIVING pharmacy asking it to pull the prescriptions from the current pharmacy (how US transfers actually work). Schedule II medications are refused (never transferable — new prescription required). Returns a draft the patient MUST review; execute with action_commit after explicit approval.

ParametersJSON Schema

Name	Required	Description
`medication_names`	No	Limit to these medication names (default: all active orders)
`to_pharmacy_name`	Yes	Receiving pharmacy name
`to_pharmacy_phone`	Yes	Receiving pharmacy phone number
`from_pharmacy_name`	No	Current pharmacy name (optional)
`from_pharmacy_phone`	No	Current pharmacy phone (optional)
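
A rough sketch of the drafting step, assuming each active medication is a dict with a `name` and an optional `schedule` field. Both the input shape and the `draft_transfer` helper are hypothetical; the listing does not show how the server determines a drug's schedule.

```python
from typing import Optional


def draft_transfer(active_meds: list[dict],
                   to_pharmacy_name: str,
                   to_pharmacy_phone: str,
                   medication_names: Optional[list[str]] = None) -> dict:
    """Build a draft for patient review; nothing is executed here."""
    wanted = ({name.lower() for name in medication_names}
              if medication_names else None)  # None means all active orders
    included, refused = [], []
    for med in active_meds:
        if wanted is not None and med["name"].lower() not in wanted:
            continue
        # Schedule II medications are refused rather than transferred.
        if med.get("schedule") == "II":
            refused.append(med["name"])
        else:
            included.append(med["name"])
    return {
        "status": "draft",
        "requires_patient_approval": True,
        "call_to": {"name": to_pharmacy_name, "phone": to_pharmacy_phone},
        "medications": included,
        "refused_schedule_ii": refused,
    }


meds = [{"name": "Lisinopril"},
        {"name": "Oxycodone", "schedule": "II"},
        {"name": "Metformin"}]
draft = draft_transfer(meds, "Main St Pharmacy", "555-0100")
```

The draft carries no side effects; per the description, execution happens only through action_commit after the patient approves.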

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and destructiveHint=false. The description adds behavioral context: it stages a phone call, refuses Schedule II drugs, and returns a draft requiring review. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with purpose and mechanism, followed by a constraint and workflow guidance. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers the full workflow: what it does, how US transfers work, legal limitations, and the required next step (review and commit). Adequate for a 5-parameter tool with no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents each parameter. The description adds context about the default behavior for medication_names and explains the overall transfer process, but does not significantly enhance parameter meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool drafts a prescription-transfer request, explains the US transfer process, specifies it refuses Schedule II medications, and distinguishes it from the sibling tool action_commit which executes the draft.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool (to draft a transfer request) and that it must be followed by action_commit after patient approval. It does not explicitly list alternatives, but the workflow is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

searchSearch Health RecordsA

Read-only

Inspect

ChatGPT-connector-compatible search over the tenant's FHIR records. Query is a FHIR search string (e.g. 'Observation?code=4548-4' or 'Patient?name=smith'); bare resource type works too. Returns compact results: id, title, url. Reads are PHI-redacted and audit-logged server-side.

ParametersJSON Schema

Name	Required	Description	Default
`query`	Yes	FHIR search string: 'ResourceType?params' or just 'ResourceType'
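
To make the query format concrete, here is a small sketch that splits a search string into resource type and parameters using only the standard library. The helper `parse_fhir_query` is an illustration of the 'ResourceType?params' shape, not the server's parser.

```python
from urllib.parse import parse_qsl


def parse_fhir_query(query: str) -> tuple[str, list[tuple[str, str]]]:
    """Split 'ResourceType?params' (or a bare 'ResourceType') into parts."""
    resource_type, _, raw_params = query.strip().partition("?")
    if not resource_type.isalpha():
        raise ValueError(f"Not a FHIR resource type: {resource_type!r}")
    # A list of pairs is kept because FHIR search parameters may repeat.
    return resource_type, parse_qsl(raw_params, keep_blank_values=True)


with_params = parse_fhir_query("Observation?code=4548-4")
bare = parse_fhir_query("Patient")
```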

Tool Definition Quality

A3.9/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, destructiveHint), description adds that reads are PHI-redacted and audit-logged server-side, and returns compact results. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, zero waste. Front-loaded with purpose and compatibility, then query format, then return characteristics and server-side behavior. Efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter search tool with no output schema, description covers query format, return structure, and behavioral notes. Missing pagination/error info, but adequate given simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage 100% for single parameter. Description adds format examples (e.g., 'Observation?code=4548-4'), clarifying acceptable FHIR search strings beyond schema's generic description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it searches FHIR records with compatibility note. Provides examples. Does not explicitly differentiate from sibling fhir_search, but adds context of compact results and PHI-redaction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implied usage with ChatGPT connector, but no explicit when-to-use or alternatives. Sibling fhir_search exists but no guidance on choosing between them.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

shl_generateGenerate SMART Health LinkA

Inspect

Generate a SMART Health Link (shlink:/ QR payload) sharing the patient's record with a clinic. Fetches the guardrailed share-bundle from HealthClaw (step-up required — pass _stepUpToken), encrypts it client-side (the SHL server never sees plaintext), uploads ciphertext, and returns the shlink URI, viewer link, and the patient's private manage link. ALWAYS get the patient's explicit consent before generating, and deliver the manage link ONLY to the patient.

ParametersJSON Schema

Name	Required	Description
`label`	No	Short label shown in SHL viewers (<=80 chars), e.g. 'Records for Winters Healthcare'. No PHI beyond what the patient approves.
`profile`	No	intake = identified record for clinic check-in (default); deidentified = strips name/contact/institutional IDs
`patient_id`	No	Optional patient id filter for multi-patient tenants
`expires_in_days`	No	Link lifetime in days (default 7, max 90)
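
The final step, turning an upload location and key into a `shlink:/` URI, can be sketched with the standard library. This follows the payload layout of the SMART Health Links specification (a base64url-encoded JSON object with `url`, `key`, `exp`, and `label`); the manifest URL, the helper names, and the one-day minimum lifetime are assumptions, and the client-side encryption and upload steps are omitted.

```python
import base64
import json
import time
from typing import Optional

MAX_LABEL_CHARS = 80            # label limit from the parameter table
DEFAULT_DAYS, MAX_DAYS = 7, 90  # lifetime default and cap from the table


def b64url(data: bytes) -> str:
    """Base64url without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_shlink(manifest_url: str, key: bytes, label: str = "",
                 expires_in_days: int = DEFAULT_DAYS,
                 now: Optional[float] = None) -> str:
    """Hypothetical helper: encode the link payload as a shlink:/ URI."""
    if len(key) != 32:
        raise ValueError("key must be 32 random bytes")
    if len(label) > MAX_LABEL_CHARS:
        raise ValueError("label exceeds 80 characters")
    # Lower bound of 1 day is an assumption; the table only states the max.
    if not 1 <= expires_in_days <= MAX_DAYS:
        raise ValueError("expires_in_days must be between 1 and 90")
    issued = time.time() if now is None else now
    payload = {"url": manifest_url, "key": b64url(key),
               "exp": int(issued + expires_in_days * 86400)}
    if label:
        payload["label"] = label
    return "shlink:/" + b64url(json.dumps(payload).encode("utf-8"))


link = build_shlink("https://shl.example.org/manifest/abc", bytes(32),
                    label="Records for Winters Healthcare", now=0)
```

In real use the key would come from a cryptographic random source such as `secrets.token_bytes(32)`; the all-zero key above only keeps the demo deterministic.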

Tool Definition Quality

A4.3/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description discloses that the tool writes (not read-only), performs client-side encryption, uploads ciphertext, and requires a step-up token. It adds significant detail beyond annotations, such as consent and security constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is well-structured with purpose first, then process, then warnings. At three long sentences, it is fairly concise but could be slightly tighter. Good front-loading of key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description adequately explains the complex workflow (fetch, encrypt, upload, return) and lists all three return values. It covers security and consent. No output schema exists, but the description sufficiently fills that gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, and the description does not add meaningfully beyond what the schema already provides for each parameter (e.g., label length, profile enum, expires default). Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Generate', the resource 'SMART Health Link', and the purpose 'sharing the patient's record with a clinic'. It distinguishes itself from the listed sibling tools, none of which involve SHL generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description provides explicit guidance: always get patient consent, deliver manage link only to patient, and mentions step-up token requirement. It lacks explicit alternatives but given the sibling list, no other SHL tool exists, making the guidance sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sources_checkCheck Data SourcesA

Read-only

Inspect

Survey ALL connected health data sources (Fasten, HealthEx, Health Bank One, MEDENT, Flexpa, Epic/Health Skillz, wearables) at once — returns each source's connection status and the patient's record counts by type. Use when the patient asks what's connected or to check for data across services.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. Description adds behavioral context: it surveys all sources at once and returns specific data. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose, followed by usage scenario. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description explains return values (status and record counts). Sufficient for a simple, parameterless tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters in input schema, so no need for additional param info. Baseline score of 4 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Survey') and resource ('all connected health data sources'), lists example sources, and clearly states what it returns (connection status and record counts). It distinguishes from siblings like wearables_sync_status by focusing on all sources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'when the patient asks what's connected or to check for data across services.' Does not mention when not to use, but the context is clear enough for an agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

wearables_sync_statusWearables Sync StatusA

Read-only

Inspect

List wearable connections (Garmin, Oura, Polar, Suunto, Whoop, Fitbit, Strava, Ultrahuman) for a tenant, with last sync time, observation count, and status. Use this to tell a patient what's connected, when data last arrived, and surface a connection-management UI (via _meta.ui.resourceUri) so they can connect more providers. Data flows into HealthClaw as FHIR Observations with LOINC codes — agents read it via fhir_search like any other Observation.

ParametersJSON Schema

Name	Required	Description	Default
`tenant_id`	No	Tenant to inspect. Defaults to the incoming X-Tenant-Id header.

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and destructiveHint=false, so the description adds value by detailing the returned fields (last sync time, observation count, status) and mentioning the UI URI. It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loaded with purpose and specific details. Every sentence earns its place with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-only tool with one optional parameter and no output schema, the description fully covers purpose, usage, return information, and integration with FHIR and UI. It is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single parameter tenant_id, and the schema already describes its default behavior. The description adds no further parameter meaning, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool lists wearable connections for a tenant, naming specific brands and the returned last sync time, observation count, and status. It distinguishes itself from siblings such as fhir_search by noting that the data flows into HealthClaw as FHIR Observations, which agents retrieve via that tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use: 'Use this to tell a patient what's connected, when data last arrived, and surface a connection-management UI.' It provides clear context but does not explicitly mention when not to use or suggest alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
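
A minimal sketch of generating that file's contents, assuming only the two fields shown above. The function names `build_glama_claim` and `render_glama_claim` are hypothetical, and the email is the placeholder from the example, not a real address. How the file is served at /.well-known/glama.json depends on your own hosting setup.

```python
import json


def build_glama_claim(emails):
    """Return the claim document as a dict, one maintainer entry per email."""
    if not emails:
        raise ValueError("at least one maintainer email is required")
    return {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email} for email in emails],
    }


def render_glama_claim(emails):
    """Serialize the claim document as indented JSON text."""
    return json.dumps(build_glama_claim(emails), indent=2)


if __name__ == "__main__":
    # Replace the placeholder with the email tied to your Glama account.
    print(render_glama_claim(["your-email@example.com"]))
```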
