Server Details
AmpelOracle — 50-tool compliance traffic-light: Go/Caution/Stop signals for ESG, MiCA, AML.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 3.3/5 across 50 of 50 tools scored. Lowest: 2.4/5.
Most tools have specific, distinct purposes (e.g., bridge_*, contract_*, whatif_* groups). Some overlap exists between evidence_summary and evidence_pack, and between check_contract and contract_analyze, but descriptions clarify differences. With 50 tools, minor ambiguity is inevitable.
Names use snake_case but mix verb_noun (assess_all, create_entity) and noun_verb (article_status, bridge_report) patterns. Subgroups like bridge_* and contract_* are consistent internally, but there is no uniform pattern across the entire set. Some names, such as 'bus_status' and 'llm_clause_check', break the pattern.
50 tools is very high for a single MCP server. While the domain is broad (DORA compliance), the count exceeds typical well-scoped ranges (3-15). The set could potentially be split into multiple servers (e.g., assessments, contracts, reporting) to improve manageability.
The tool set covers most aspects of DORA compliance: entity management, assessments, contracts, incident lifecycle, reporting, regulatory watch, and gap analysis. Minor gaps exist (e.g., no entity deletion, no manual evidence upload), but core workflows are supported.
Available Tools
50 tools

article_status (Grade A)
Detailed Ampel for a specific DORA article. Each check with GREEN/YELLOW/RED conditions and evidence.
| Name | Required | Description | Default |
|---|---|---|---|
| article | No | e.g. Art. 28 | |
| entity_id | No | Entity ID | |
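For orientation, a minimal sketch of the JSON-RPC `tools/call` request an MCP client would typically send to invoke this tool. The tool name and argument keys come from the table above; the entity ID and request ID values are hypothetical.

```python
import json

# Hypothetical tools/call payload for article_status; argument values are examples only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "article_status",
        "arguments": {
            "article": "Art. 28",       # free-text article reference, per the schema example
            "entity_id": "entity-001",  # hypothetical entity ID
        },
    },
}
print(json.dumps(request, indent=2))
```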
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It adds context about condition/evidence breakdown but does not disclose side effects, permissions, or error behavior. Moderate transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with key phrase 'Detailed Ampel', no extraneous words. Efficient and to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks output schema, and description only hints at return structure (GREEN/YELLOW/RED with evidence) without detailing JSON fields, pagination, or error scenarios. Partially complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and parameter descriptions are already clear. The description adds no extra meaning beyond the schema, so baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a detailed Ampel (traffic light) for a DORA article, with specific conditions and evidence, distinguishing it from broader sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like 'assess_all' or 'bus_status'; lacks when-not-to-use and prerequisite information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
assess_all (Grade A)
Re-run full assessment for an entity. Recomputes Ampel statuses from all available evidence.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (optional) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden for behavioral disclosure. It states that it recomputes statuses, but fails to mention side effects, required permissions, idempotency, or potential impacts. This is insufficient for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no filler, front-loaded with purpose. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool is simple with one optional parameter and no output schema. The description covers the main action but lacks details on return values, when to call, or what 'full assessment' entails. Adequate but leaves gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% as the only parameter 'entity_id' has a description. The tool description adds no additional semantics beyond what the schema provides, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Re-run full assessment for an entity' and specifies 'Recomputes Ampel statuses from all available evidence'. This distinguishes it from siblings like 'cross_oracle_assess' or 'run_trial_assessment' which are more focused.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for re-running a full assessment, but does not explicitly state when to use it versus alternatives, nor does it provide any prerequisites or when-not guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
audit_trail (Grade C)
Chain-linked audit log with integrity check.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max entries | |
| entity_id | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosing behavioral traits. 'Chain-linked' and 'integrity check' hint at verification but do not clarify whether the tool is read-only, mutates data, requires special permissions, or has rate limits. The agent is left guessing about side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (6 words) but is under-specified. It is not verbose, but it sacrifices informativeness. Every word is earned, but the description could be improved by adding more detail without becoming overly long.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and annotations, the description should explain what the tool returns (e.g., a list of audit entries, integrity verification result). It fails to do so, making the tool incomplete for an agent to understand the full context of invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 50% (only 'limit' has a description). The tool description adds no meaning to parameters; it does not explain 'entity_id' or how 'limit' interacts with the audit log. With half the parameters undocumented and no compensation from the description, the agent lacks clarity on usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Chain-linked audit log with integrity check' vaguely indicates the resource (audit log) but lacks a verb to specify the action (e.g., get, list, verify). It distinguishes from siblings like 'entity_list' by mentioning chain-linking and integrity, but without a clear action, an AI agent may struggle to know if this is a read or write operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. Sibling tools like 'bridge_status' or 'freshness_check' are unrelated, but the description does not state that this tool is for retrieving audit entries or performing integrity checks, leaving the agent without context for decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
azure_ad_check (Grade C)
Live Azure AD integration: MFA registration %, risky users, conditional access policies. DORA Art. 9 evidence. Requires Azure AD config in integrations_config.json.
| Name | Required | Description | Default |
|---|---|---|---|
| force_refresh | No | Force fresh API call (default true) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavior. It mentions 'Live' but does not clarify caching behavior, rate limits, or potential side effects. The force_refresh parameter implies caching but the description does not explain default behavior or impact.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence listing key outputs, followed by a requirement. It is efficient and front-loaded with purpose, though it could be better structured with separate sentences for behavioral notes.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers purpose, example outputs, and a prerequisite. However, it lacks details on caching behavior, pagination, or error handling. Given the tool has only one parameter and no output schema, the description is minimally adequate but not fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the parameter description in the schema already explains force_refresh. The description does not add additional parameter semantics beyond what is in the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is a live Azure AD integration and lists specific outputs (MFA registration %, risky users, conditional access policies) and ties to DORA Art. 9 evidence. It is specific and actionable, though it could more explicitly differentiate from sibling health check tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The only guidance is the prerequisite that Azure AD config must be in integrations_config.json. There is no indication of when to use this tool versus alternatives like health_check or assess_all, nor any mention of when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bafin_approve_send (Grade B)
Approve BaFin report for submission (4-eyes principle). Creates signed approval evidence.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | | |
| report_id | No | | |
| approver_name | No | | |
| approver_role | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries full burden. It discloses that the tool creates evidence (a write operation) and involves the 4-eyes principle, but does not mention whether it is destructive, idempotent, or requires specific permissions. Missing details like rate limits or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no fluff. Every word adds value, front-loading the core action and principle.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 4 undocumented parameters and no output schema, the description is incomplete. It does not explain what 'signed approval evidence' entails, how to use parameters, or what the tool returns. Significant gaps remain for an agent to correctly invoke it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description provides no explanation of the four parameters (entity_id, report_id, approver_name, approver_role). The description offers no value beyond the schema, failing to compensate for the lack of schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool's purpose: approving a BaFin report for submission under the 4-eyes principle, and creating signed approval evidence. It uses specific verb 'Approve' and resource 'BaFin report', distinguishing it from sibling tools like 'bafin_report_draft' or 'assess_all'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions the '4-eyes principle', implying it is for a second approval step, but does not explicitly state when to use this tool versus alternatives (e.g., after drafting, before submission). No exclusion criteria or prerequisites are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bafin_report_draft (Grade A)
Generate ITS 2024/1772 compliant BaFin incident report draft. All mandatory fields per DORA Art. 19/20. Preview mode — requires board approval.
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | | |
| entity_id | No | | |
| root_cause | No | | |
| description | No | | |
| incident_id | No | | |
| remediation | No | | |
| report_type | No | initial \| intermediate \| final | |
| classification | No | major \| significant \| minor | |
| affected_clients | No | | |
| affected_services | No | | |
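Since this tool takes ten largely undocumented parameters, a hedged sketch of the argument object an agent might assemble may help. It is based only on the parameter names and enumerations in the table above; every concrete value is illustrative, and the types of the two "affected_*" fields are assumptions, not confirmed by the schema.

```python
import json

# Illustrative arguments for bafin_report_draft; all values below are hypothetical.
draft_args = {
    "incident_id": "INC-2024-0042",
    "entity_id": "entity-001",
    "title": "Core banking outage",
    "description": "Unplanned downtime of the payment gateway",
    "root_cause": "Failed database failover",
    "remediation": "Failover runbook updated; redundant cluster added",
    "report_type": "initial",           # one of: initial | intermediate | final
    "classification": "major",          # one of: major | significant | minor
    "affected_clients": 1200,           # assumed numeric; type not stated in the schema
    "affected_services": ["payments"],  # assumed list; type not stated in the schema
}
print(json.dumps(draft_args, indent=2))
```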
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes preview mode (generates a draft, does not send it) and the need for board approval. No annotations exist, and the description does not detail return format, storage behavior, or permission requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no redundancy, front-loaded with compliance context and behavioral hint. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Ten parameters, no output schema, no annotations. The description lacks parameter guidance, return-value information, and clarification of which fields are mandatory. The explanation of preview mode is minimal.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 10 params with only 20% coverage (2 have descriptions). Description adds no parameter details beyond compliance reference. For a complex tool, this is insufficient compensation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it generates a BaFin incident report draft compliant with ITS 2024/1772 and DORA Art. 19/20. It distinguishes from sibling tools like bafin_approve_send by noting 'Preview mode — requires board approval.'
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly mentions preview mode and board approval requirement, implying when to use (draft stage) and when not (final submission). Lacks explicit alternatives but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
board_summary (Grade A)
Executive board summary: overall score, top 5 risks, overdue findings, SLA breaches, concentration risk, evidence health, owner workload. Designed for management/board reporting.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description only lists output contents. It fails to disclose any behavioral traits such as data source, permissions, or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences cover the main components and purpose with no waste. Every word is informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the single parameter and no output schema, the description adequately sets expectations for what the tool returns. It doesn't cover prerequisites or data generation, but the tool is simple enough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter 'entity_id', which is described in the schema as 'Entity ID'. The description does not add meaning beyond this, so baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description lists specific components (overall score, top 5 risks, etc.) and states the tool is for management/board reporting. This clearly differentiates it from sibling tools like evidence_summary or score_trend.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Designed for management/board reporting,' providing clear context for when to use. It does not mention when not to use or alternatives, but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bridge_approve (Grade B)
Approve or reject a bridge resolution. On approval: creates signed evidence, upgrades Ampel to GREEN, logs to audit chain.
| Name | Required | Description | Default |
|---|---|---|---|
| reject | No | Set true to reject instead of approve | |
| approved_by | No | Name + role of approver (e.g. Dr. Mueller, CISO) | |
| resolution_id | No | Resolution ID from bridge_resolve | |
| rejection_reason | No | Reason for rejection (if rejecting) | |
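Because the tool switches between approval and rejection via the `reject` flag, a short sketch of the two argument shapes may help. The resolution ID is hypothetical and would normally come from a prior bridge_resolve call, per the schema description.

```python
# Approval path: signs off on an existing resolution (resolution_id is hypothetical).
approve_args = {
    "resolution_id": "RES-123",
    "approved_by": "Dr. Mueller, CISO",   # name + role, following the schema example
}

# Rejection path: set reject=True and supply a reason instead.
reject_args = {
    "resolution_id": "RES-123",
    "reject": True,
    "rejection_reason": "Risk acceptance template incomplete",
}
print(approve_args, reject_args, sep="\n")
```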
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully disclose behavior. While it covers approval effects (creates signed evidence, upgrades Ampel to GREEN, logs to audit chain), it omits rejection side effects, permissions required, and whether the action is reversible.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first states purpose, second details approval outcomes. No redundant information; every word contributes meaning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It does not explain return values, error handling, prerequisite states (e.g., resolution must exist), or behavior on rejection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All 4 parameters have schema descriptions (100% coverage). The description adds context by linking approval to consequences, but does not elaborate on parameter formats or usage beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool's action: approve or reject a bridge resolution. It uses specific verbs ('Approve or reject') and a distinct resource ('bridge resolution'), making its purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description lacks explicit guidance on when to use this tool versus siblings like bridge_resolve or bridge_status. It does not mention prerequisites, preconditions, or scenarios where approval/rejection is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bridge_report (Grade B)
Bridge gap analysis: classifies gaps by DATA/EVIDENCE/POLICY/WORKFLOW with closure path, owner, effort level.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (optional) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully disclose behavioral traits. It states the tool classifies gaps but does not indicate whether it is read-only, requires authentication, has side effects, or any limitations. This leaves significant ambiguity for an AI agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence that conveys purpose, classification dimensions, and output components without unnecessary words. Every element earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (one optional parameter, no output schema), the description is mostly complete. However, it does not describe the return format or how the report is delivered, which could help the agent use it effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter, and the schema already describes 'Entity ID (optional)'. The description adds no additional meaning beyond the schema, so the baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb phrase 'Bridge gap analysis' and explicitly lists the classification categories (DATA/EVIDENCE/POLICY/WORKFLOW) and output elements (closure path, owner, effort level). This clearly distinguishes it from sibling tools like 'gap_report' and 'bridge_status'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'gap_report' or other 'bridge_' tools. It lacks any context about prerequisites, appropriate scenarios, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bridge_resolve (Grade A)
Start bridge resolution workflow. Generates templates (Risk Acceptance, Contract Renegotiation, Concentration Policy, Exit Strategy), tracks approval process. Call bridge_approve to sign off.
| Name | Required | Description | Default |
|---|---|---|---|
| check_id | No | Check to resolve: art30_c1, art30_c2, art30_c3, art8_c3, art31_c1 | |
| entity_id | No | Entity ID (optional) | |
| expiry_days | No | Days until resolution expires (default 30) | |
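A minimal sketch of starting the resolution workflow; the check_id values come from the schema description, while the entity ID is hypothetical. Per the tool description, sign-off then follows via bridge_approve (sketched above under that tool).

```python
# Start a bridge resolution for one of the enumerated checks; values are illustrative.
resolve_args = {
    "check_id": "art30_c1",     # one of: art30_c1, art30_c2, art30_c3, art8_c3, art31_c1
    "entity_id": "entity-001",  # hypothetical entity ID
    "expiry_days": 30,          # default per the schema description
}
print(resolve_args)
```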
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses that the tool generates templates and tracks approval, implying state changes, but does not explicitly state side effects (e.g., creation, mutation), required permissions, or potential destructive actions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no redundancy. Front-loads the primary action and key details (template types, approval tracking, follow-up tool). Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately covers the tool's purpose, generated outputs, and workflow sequence. It could mention the initial state or prerequisites (e.g., if a check must exist), but overall is sufficient for an LLM to understand usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and each parameter has a clear description. The description does not add new meaning beyond the schema; it only reiterates the overall workflow. Baseline score applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action ('Start bridge resolution workflow') and the specific resources generated ('templates for Risk Acceptance, Contract Renegotiation, Concentration Policy, Exit Strategy'). It differentiates from siblings like bridge_approve (explicitly mentioned as follow-up) and bridge_status (status queries).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly indicates it is the first step ('Start') and directs to use bridge_approve for sign-off. Provides context but lacks explicit when-not-to-use or alternative tools for similar tasks, such as bridge_report or bridge_status.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bridge_status (Grade B)
Check status of all bridge resolution workflows for an entity. Shows open, pending, closed, rejected.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (optional) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description must fully convey behavioral traits. It indicates a read operation ('Check status'), but does not disclose permissions needed, rate limits, side effects, or behavior when entity_id is omitted. This leaves ambiguity for the agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, consisting of two sentences with no wasted words. It efficiently conveys the purpose and the types of statuses returned.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one optional parameter, no output schema, no annotations), the description covers the essential aspects. It explains the action and output states. However, it could mention the default behavior when entity_id is not provided.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with a description for entity_id. The tool description adds minimal value by implying the tool operates 'for an entity', but does not clarify the optionality or effect of omitting the parameter. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool checks status of bridge resolution workflows and lists the statuses shown (open, pending, closed, rejected). It is specific about the verb and resource, but does not differentiate from sibling tools like article_status or bus_status, which may have similar purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. There is no mention of prerequisites, exclusions, or context for when this tool is appropriate compared to other status-checking tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bus_status (Grade C)
Oracle Event Bus status: events, cross-refs, connected oracles.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID for cross-refs | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavior. It only lists what information is returned but does not state that it is a read-only operation, nor does it mention side effects, permissions, or error handling. This is a significant gap for a status tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence with no wasted words, front-loading the core information. It may be too terse for full clarity, but it meets conciseness standards without repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, the description should provide more context (e.g., return format, error cases). It is minimally adequate for a simple status check but lacks details on what the status response contains.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the parameter 'entity_id' having a description. The tool description mentions 'cross-refs' which aligns with the parameter but adds no additional semantic nuance beyond the schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states 'Oracle Event Bus status: events, cross-refs, connected oracles' which clearly identifies the resource and what aspects are covered. It distinguishes from sibling 'bridge_status' by mentioning specific components of the event bus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like 'bridge_status' or 'health_check'. No prerequisites or exclusions are mentioned, leaving the agent to infer context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_contract (Grade B)
Check DORA Art. 30 contract clauses for a provider. Returns PASS/WARN/BLOCK with missing clauses and bridge classification.
| Name | Required | Description | Default |
|---|---|---|---|
| is_cif | No | Is this a CIF (Critical/Important Function) provider? | |
| entity_id | No | | |
| cif_clauses | No | CIF clauses if applicable | |
| provider_id | No | Provider ID | |
| exit_strategy | No | Exit strategy documented? | |
| standard_clauses | No | Present standard clauses: service_description, data_location, data_protection, service_availability_sla, incident_notification, audit_right, termination_notice, cooperation_with_authorities | |
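Since standard_clauses expects a specific set of clause identifiers, a sketch of a plausible argument object may help. The clause names are taken from the schema description; everything else is illustrative, and the list types are assumptions, not confirmed by the schema.

```python
# Illustrative arguments for check_contract; provider_id is hypothetical and the
# list types for standard_clauses / cif_clauses are assumed, not confirmed by the schema.
check_args = {
    "provider_id": "PRV-007",
    "entity_id": "entity-001",
    "is_cif": True,                    # provider supports a Critical/Important Function
    "standard_clauses": [              # clause identifiers listed in the schema description
        "service_description",
        "data_location",
        "data_protection",
        "audit_right",
    ],
    "exit_strategy": True,
    "cif_clauses": ["subcontracting_conditions"],  # hypothetical CIF clause identifier
}
print(check_args)
```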
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description must fully disclose behavior. It mentions the output type but does not state whether the tool is read-only, has side effects, requires authentication, or modifies state. This is insufficient for a tool with 6 parameters and no annotation safety net.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence contains the core purpose and output, no fluff. Could be slightly improved with structured bullet points, but it is effectively concise for the information conveyed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite 6 parameters and no output schema, the description lacks return format details, prerequisites, and edge case behavior. For a tool with this complexity, the description is too sparse to be fully useful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 83% (5 of 6 parameters described). The description adds context about output but does not explain entity_id or provide additional parameter guidance beyond what the schema already offers. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it checks DORA Art. 30 contract clauses and returns PASS/WARN/BLOCK with missing clauses and bridge classification. Distinguishes from siblings like contract_analyze (broader analysis) and contract_status (workflow status).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool vs alternatives. The description only states what it does, without indicating when not to use or recommending other tools for different needs.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collect_art10 (Grade B)
Collect live Art. 10 evidence from NVD, CISA KEV, CERT-Bund. Auto-assesses.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (optional) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It states 'Collect' and 'Auto-assesses' but doesn't clarify side effects (e.g., creation of records, destructive actions) or required permissions. The minimal description leaves ambiguity about the tool's full behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise, though it could be better structured with front-loaded key actions. It includes the essential information without excess.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, and the description does not mention what the tool returns (e.g., evidence records, assessment results). Given the complexity of collecting from multiple sources and auto-assessing, the lack of return information makes it incomplete for an agent to understand the outcome.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has one optional parameter with a description, and schema coverage is 100%. The tool description adds no additional meaning beyond the schema, so baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it collects live Article 10 evidence from specific sources (NVD, CISA KEV, CERT-Bund) and auto-assesses. Verb and resource are specific, but it doesn't fully distinguish from sibling assessment tools like 'assess_all' or 'run_trial_assessment' which may overlap in functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Usage is implied (when needing to collect evidence from listed sources) but no explicit guidance on when to use this tool versus alternatives, nor any conditions or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contract_analyze (Grade A)
Analyze contract against 15 DORA Art. 30 mandatory clauses. Returns compliance status per clause with confidence score, extracted text, gap reasoning, suggested fix.
| Name | Required | Description | Default |
|---|---|---|---|
| document_id | No | Document ID from contract_upload | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavioral traits. It only describes the output (compliance status, confidence, etc.) but does not mention side effects, authorization needs, rate limits, or whether the tool is read-only.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that efficiently enumerates all key output components. No unnecessary words; every part adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex tool analyzing 15 clauses and returning multiple fields per clause, the description covers the return structure (status, confidence, text, reasoning, fix) reasonably well. It lacks details on execution time or prerequisites but is largely complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With only one parameter and 100% schema coverage, the description adds minimal extra meaning by stating the document ID comes from contract_upload. The baseline is 3, and the description does not significantly enhance understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies the tool analyzes contracts against 15 DORA Art. 30 mandatory clauses, clearly indicating the verb 'analyze' and the resource 'contract with specific clauses'. It distinguishes from siblings like check_contract and llm_clause_check by focusing on DORA compliance.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage after uploading a contract (document_id from contract_upload) but does not explicitly state when to use this tool versus alternatives, nor does it provide exclusions or prerequisites beyond the document ID.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contract_status (Grade B)
Overview of all analyzed contracts per entity. Shows clause gaps, review status, document versions.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the burden of behavioral disclosure. It indicates a read operation by stating 'shows' information, but does not detail permissions, data freshness, or any side effects. Adequate but not transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no wasted words. The first sentence immediately conveys the core purpose, making it front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains what the tool shows, but no output schema exists. It would benefit from specifying the output format, scope (e.g., all contracts or filterable), or whether version history is included. Adequate for a simple tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has one parameter 'entity_id' with a description already covering its purpose. The tool description does not add further context beyond what the schema provides, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides an overview of analyzed contracts per entity, showing clause gaps, review status, and document versions. This distinguishes it from sibling tools like 'check_contract' or 'contract_analyze' which likely have different focus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. The description implies it's for getting a status overview, but lacks when-not-to-use or mention of related tools like 'contract_analyze' or 'gap_report'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contract_upload (Grade A)
Upload contract text for DORA Art. 30 analysis. Creates document record with SHA-256 hash, version tracking, audit trail.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID | |
| file_name | No | Original file name | |
| contract_text | No | Contract text (extracted from PDF) | |
| document_type | No | ict_outsourcing_agreement \| dpa \| sla \| master_service_agreement | |
| provider_name | No | Provider name | |
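The parameter descriptions imply a two-step flow: upload the extracted contract text, then analyze the stored document with contract_analyze (described earlier). A hedged sketch of that sequence, assuming a generic call_tool(name, arguments) helper and a response carrying a document_id field; neither is confirmed by the listed schemas.

```python
def call_tool(name: str, arguments: dict) -> dict:
    """Placeholder for an MCP client call; a real client would send a
    JSON-RPC tools/call request and return the parsed result."""
    print(f"tools/call -> {name}: {arguments}")
    return {"document_id": "DOC-0001"}  # dummy response; the key name is an assumption

# Step 1: upload extracted contract text (all values are illustrative).
upload_result = call_tool("contract_upload", {
    "entity_id": "entity-001",
    "provider_name": "ExampleCloud GmbH",
    "file_name": "msa_examplecloud_2024.pdf",
    "document_type": "master_service_agreement",
    "contract_text": "... extracted contract text ...",
})

# Step 2: analyze the stored document against the 15 DORA Art. 30 clauses.
analysis = call_tool("contract_analyze", {"document_id": upload_result["document_id"]})
```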
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses that the tool creates a document record with hash, version tracking, and audit trail, which indicates a write operation. However, it does not address side effects, security requirements, or error behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences with no extraneous words. It front-loads the action and purpose, making it easy to scan.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description lacks information about return values (no output schema), error handling, or parameter relationships. For a tool with 5 parameters and no annotations, the description should provide more guidance on usage context and expectations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (all parameters have descriptions). The tool description does not significantly add meaning beyond the schema; it repeats the document_type enum but adds no extra context. Baseline is 3 per the rubric.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Upload contract text'), the specific purpose ('for DORA Art. 30 analysis'), and what is created ('document record with SHA-256 hash, version tracking, audit trail'). It effectively distinguishes this tool from sibling tools like contract_analyze and contract_status.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies the tool is used for uploading contracts for DORA Art. 30 analysis, providing a clear usage context. However, it does not explicitly state when not to use it or mention alternative tools for other contract operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_entity (Grade C)
Register a new regulated entity.
| Name | Required | Description | Default |
|---|---|---|---|
| lei | No | | |
| name | No | Entity name | |
| entity_type | No | Type | |
| jurisdiction | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so description carries full burden. The description lacks any behavioral details such as side effects, permissions required, or return behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single sentence is concise and front-loaded, but its extreme brevity risks under-specification.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and 4 parameters with incomplete schema descriptions, the tool description is insufficient to fully guide the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 50%, and the tool description adds no meaningful information beyond what is in the schema. Parameters like 'lei' and 'jurisdiction' have no descriptions in either the schema or the tool description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses verb 'Register' and resource 'regulated entity', clearly stating the tool's action. However, sibling tools like 'onboard_entity' may overlap, and no differentiation is provided.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives, no prerequisites or context provided. Agent must infer usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_trial (A)
Create temporary trial entity (48h) for self-service DORA assessment. No login needed.
| Name | Required | Description | Default |
|---|---|---|---|
| providers | No | Comma-separated provider names: AWS,SWIFT,Finastra | |
| entity_name | No | Institute name | |
| entity_type | No | credit_institution \| payment_institution \| insurance_undertaking \| asset_management \| credit_institution_casp | |
| jurisdiction | No | DE \| AT \| FR \| etc | |
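To make the comma-separated and enum-style parameters concrete, here is a minimal sketch of a JSON-RPC `tools/call` request for this tool; all argument values are illustrative assumptions, not values documented by the server.

```python
import json

# Hypothetical JSON-RPC tools/call payload for create_trial; argument values
# are illustrative assumptions, not documented defaults.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "create_trial",
        "arguments": {
            "entity_name": "Example Bank AG",      # institute name (assumed)
            "entity_type": "credit_institution",   # one of the enum-style values
            "jurisdiction": "DE",                  # country code per the schema hint
            "providers": "AWS,SWIFT",              # comma-separated provider names
        },
    },
}

print(json.dumps(request, indent=2))
```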
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries the burden. It discloses temporary duration and no-login requirement, but omits behavioral traits like what happens after 48h (auto-destruction?), whether it's idempotent, or any rate limits. Adequate but not thorough.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with key info (purpose, duration, login requirement). No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, description covers core purpose but lacks details on return values (e.g., trial ID, expiration timestamp) or how to use the created entity. Sufficient for a simple creation tool, but leaves gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with descriptions, so baseline is 3. The tool description adds no extra parameter-level meaning. There is a schema mismatch (the providers enum lists only 'AWS' and 'SWIFT' while the schema description mentions 'Finastra'), but that is a schema issue rather than a flaw in the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Create temporary trial entity (48h) for self-service DORA assessment. No login needed.' This is a specific verb+resource with duration and purpose, distinguishing it from siblings like create_entity for permanent entities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implied usage for self-service without login, but lacks explicit when-to-use or when-not-to-use compared to alternative tools like create_entity. No guidance on prerequisites or limitations beyond the 48h lifespan.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cross_oracle_assess (A)
Enterprise cross-oracle assessment. Runs 18 checks across CyberShield (NIS2/ISO 27001), SupplyChainOracle (LkSG/CSRD), HealthGuard (MDR/GDPR), CFOCoPilot (XRechnung), TaxOracle (DAC6), LegalTechOracle (DORA contracts). Auto-stores evidence and updates Ampel status.
| Name | Required | Description | Default |
|---|---|---|---|
| checks | No | Comma-separated check IDs or omit for all | |
| entity_id | No | Entity to assess |
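A hypothetical pair of argument sets, assuming made-up check IDs, illustrates the two modes the checks parameter hints at:

```python
# Two hypothetical argument sets for cross_oracle_assess; the check IDs are
# placeholders, since the real IDs are not listed in the tool definition.
run_all_checks = {"entity_id": "entity-001"}  # omit 'checks' to run all 18
run_subset = {
    "entity_id": "entity-001",
    "checks": "nis2_governance,dac6_hallmarks",  # comma-separated check IDs (assumed names)
}
```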
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses key behaviors: auto-storing evidence and updating Ampel status, which indicates a write operation. It does not mention rate limits or authorization, but for a tool that runs checks and stores results, this is fairly transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no fluff. The first sentence defines the purpose, the second adds behavioral side effects. Everything earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 2 parameters, no output schema, and moderate complexity (multiple oracles), the description covers the core behavior: what checks are run, that evidence is stored, and status updates. It does not specify what is returned, but given the action-oriented nature, it is sufficiently complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters adequately. The description adds minimal extra meaning beyond the schema, merely mentioning '18 checks' while the schema already describes the checks parameter as 'Comma-separated check IDs or omit for all'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs an enterprise cross-oracle assessment running 18 checks across multiple specific frameworks (CyberShield, SupplyChainOracle, etc.), which uniquely distinguishes it from sibling tools like assess_all or cross_regulation_check.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for enterprise cross-oracle assessments but does not explicitly state when to use this tool versus alternatives like assess_all or cross_regulation_check, nor does it provide any 'when not to use' guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cross_regulation_check (A)
Tag findings with cross-regulation impact (DORA + MiCA + AMLR). Shows which DORA findings also affect MiCA insider info or AMLR screening.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavior. It is ambiguous whether the tool only shows (read-only) or actually tags (modifies) findings. The word 'Tag' suggests mutation, but 'Shows' implies display, leading to confusion about side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences front-load the action and context. No unnecessary words or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with one optional parameter, no output schema, and no annotations, the description explains the core function but fails to specify return format or behavior when no cross-impact exists. It is moderately complete but could be improved.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% for a single parameter 'entity_id'. The description adds no additional meaning beyond the schema's property description, so baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb 'Tag findings' and clearly identifies the resources (findings) and the cross-regulation scope (DORA + MiCA + AMLR). It distinguishes itself from siblings like 'regulation_impact' by focusing on cross-regulatory overlap.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description clearly implies the context of cross-regulation tagging but does not explicitly state when to use this tool versus alternatives (e.g., 'regulation_impact'). No exclusions or when-not-to-use guidance provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cve_asset_map (A)
Map CVE/vulnerability to internal ICT providers and systems. Auto-creates findings for critical matches. DORA Art. 10.
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | No | CVE identifier | |
| vendor | No | Vendor/software name to check | |
| entity_id | No | | |
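An illustrative argument set follows; the CVE and vendor are examples, and entity_id is assumed to scope the mapping, since the schema leaves it undescribed.

```python
# Illustrative cve_asset_map arguments; the CVE and vendor are examples only,
# and entity_id is assumed to scope the mapping (the schema leaves it undescribed).
arguments = {
    "cve_id": "CVE-2021-44228",   # e.g. Log4Shell
    "vendor": "Apache Log4j",     # vendor/software name to match against providers
    "entity_id": "entity-001",    # placeholder entity reference
}
```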
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the key behavioral trait of auto-creating findings for critical matches, but lacks details on side effects, permissions, or rate limits. Partial transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, zero waste, front-loaded with the core action. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite good purpose clarity, the description lacks details: no explanation of entity_id input, no output format, no prerequisites for auto-creation, and minimal context on DORA Art. 10 relevance. For a tool that creates data, this is insufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 67% (two of three parameters described). The description does not add meaning beyond the existing schema descriptions for cve_id and vendor, and entity_id remains undescribed. Baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool maps CVE/vulnerability to internal ICT providers and systems, auto-creates findings for critical matches, and references DORA Art. 10. This verb+resource combination is specific and distinguishes it from sibling tools like assess_all or gap_report.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for mapping vulnerabilities and auto-creating findings but provides no explicit guidance on when to use this tool versus its siblings, nor any when-not-to-use conditions or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
dependency_graph (C)
Full provider dependency graph: providers, systems, checks, blast radius, SPOF detection.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full responsibility. It mentions 'blast radius' and 'SPOF detection' but does not disclose whether the tool is read-only, performance characteristics, or data freshness. The agent cannot infer behavioral constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that conveys the core purpose efficiently. It could be structured as a bullet list for clarity, but it remains appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and only one parameter, the description is insufficient. It does not explain the graph's format, interactivity, or scope of data. Users may need additional context to use it effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% for the single parameter 'entity_id' described as 'Entity ID'. The description adds no additional meaning beyond the schema, leading to a baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Full provider dependency graph' and lists key features like blast radius and SPOF detection. It distinguishes this tool from siblings by focusing on graph visualization rather than reports or status checks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like 'incident_flow' or 'health_check'. The description does not mention when not to use it or any prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
entity_list (A)
List all registered regulated entities.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must disclose behavior. It implies a read operation but does not mention pagination, rate limits, auth requirements, or any side effects. Minimal beyond the basic action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise single sentence stating core purpose with no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple zero-parameter list tool, the description provides the minimum viable information. However, given the presence of many sibling tools, additional context such as sort order, scope, or relationship to other list tools would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters in schema, baseline is 4. No additional parameter info needed as there are none.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action 'list' and the resource 'all registered regulated entities', distinguishing it from sibling tools like create_entity or onboard_entity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. With many sibling list tools (e.g., bus_status), context on when to prefer entity_list is missing.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
escalation_status (C)
Get findings, SLA breaches, escalation status per entity.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (empty=all) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description does not disclose any behavioral traits beyond the basic action. It does not mention permissions, side effects, or limitations, leaving the agent underinformed for a read operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence that is front-loaded with the verb and key actions. No extraneous text, though listing multiple resources could be slightly clearer.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (one optional parameter, no output schema), the description covers the basic purpose. However, it lacks details on the return format or how to interpret results, which would be helpful for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema covers the parameter 'entity_id' with a description ('Entity ID (empty=all)'). The description adds 'per entity', which aligns with the schema but contributes minimal additional meaning beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves findings, SLA breaches, and escalation status per entity. It uses a specific verb ('Get') and lists the resources, but does not differentiate from siblings like 'bridge_status' or 'contract_status'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives, nor any exclusions. The context of use is only implied by the description, leaving the agent without clear direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
evidence_pack (B)
Export evidence pack for article/check/entity. Pruefer-ready: evidence, assessments, findings, audit trail, signatures.
| Name | Required | Description | Default |
|---|---|---|---|
| article | No | DORA article e.g. Art. 10 | |
| check_id | No | Specific check ID | |
| entity_id | No | Entity ID |
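Because the interaction of the three optional filters is not documented, the following sketch shows plausible, assumed combinations:

```python
# Assumed filter combinations for evidence_pack; how the three optional filters
# interact is not documented, so these groupings are guesses for illustration.
by_article = {"entity_id": "entity-001", "article": "Art. 10"}
by_check = {"entity_id": "entity-001", "check_id": "art10_backup_policy"}  # check ID assumed
full_pack = {"entity_id": "entity-001"}  # presumably exports everything for the entity
```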
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description lists output contents (evidence, assessments, etc.), providing some transparency. But it does not state whether the operation is read-only, requires special permissions, or has side effects. Annotations are absent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no filler, front-loaded with key purpose. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and three optional params, the description is adequate but lacks details on parameter combinations, output format, and multi-article support.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%; description adds no additional meaning beyond the schema's descriptions. Baseline of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Export' and the resource 'evidence pack', and lists included components. However, it does not differentiate well from siblings like 'evidence_summary', which has similar scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., evidence_summary, gap_report). No prerequisites or restrictions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
evidence_summary (B)
All evidence artefacts for an entity with hashes and expiry dates.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (optional) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Description implies a read-only retrieval but does not explicitly state side effects, authorization needs, or performance traits. It provides basic behavioral context but lacks thoroughness.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single short sentence captures the purpose efficiently without unnecessary clutter. Every word contributes meaning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Description adequately states the output (artefacts with hashes and expiry dates) but omits details on default behavior when entity_id is omitted, result structure, or limitations (e.g., pagination). Adequate for a simple retrieval tool but incomplete given no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage for the sole parameter (entity_id described as optional). Description adds minimal value, only implying filtering by entity. Baseline 3 as schema already documents the parameter sufficiently.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool returns evidence artefacts with hashes and expiry dates for an entity; the phrasing 'All evidence artefacts' implies a listing operation, though no explicit verb is used. However, it does not differentiate the tool from siblings like evidence_pack or freshness_check.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. With many sibling tools (e.g., evidence_pack, freshness_check), the agent receives no context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
freshness_check (A)
Run freshness watchdog. Expires stale evidence, downgrades GREEN->YELLOW->GREY if evidence too old.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (optional, checks all) |
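A minimal sketch of the two invocation styles the schema hint suggests (single entity versus all entities); the entity ID is a placeholder.

```python
# Hypothetical freshness_check invocations; the "checks all" behaviour when
# entity_id is omitted comes from the schema hint, the ID itself is a placeholder.
single_entity = {"entity_id": "entity-001"}
all_entities = {}  # omit entity_id to run the watchdog across every entity
```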
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses mutation behavior (expires/downgrades) but lacks details on reversibility, scope (all entities vs single), or side effects. Adequate but not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no waste. First sentence is a clear call-to-action ('Run freshness watchdog'), second explains effect. Efficient and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one optional parameter, description covers purpose and main effect. Missing details like return value or idempotency, but acceptable given low complexity. No output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with one optional parameter 'entity_id' described. Description adds no further meaning; baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool runs a freshness watchdog, expires stale evidence, and downgrades statuses (GREEN->YELLOW->GREY). This is a specific verb+resource and distinguishes it from sibling tools like evidence_pack or evidence_summary.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implied usage as a routine maintenance task ('Run freshness watchdog'), but no explicit when-to-use or when-not-to-use. No alternatives mentioned among siblings, leaving the agent to infer context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
gap_report (C)
DORA compliance gaps. RED/GREY/YELLOW items with priority and required actions.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (optional) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It does not disclose behavioral traits such as read-only nature, side effects, or required permissions for the optional entity_id. The description only hints at output content without explaining impact.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence that directly states the core purpose. It is concise without wasted words, though a slightly more structured format could improve clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, yet the description only vaguely mentions color-coded items and priority/actions. It lacks details on return format, error handling, pagination, or behavior when entity_id is omitted, making it incomplete for an agent to fully understand the tool's output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has one optional parameter 'entity_id' with full description coverage (100%). The description does not add any additional meaning beyond the schema, so baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool as reporting DORA compliance gaps with color-coded status (RED/GREY/YELLOW) and includes priority and actions. It distinguishes from sibling tools like 'generate_report' or 'score_trend' by specifying the regulation and output type, though it could be more explicit about generating a report.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like 'generate_report' or 'evidence_summary'. No when-to-use or when-not-to-use context is provided, leaving the agent without decision-making support.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_report (B)
Generate data-driven DORA Ampel PDF report. Score, gap analysis, provider register, audit trail integrity.
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | json (meta) or pdf (download) | json |
| entity_id | No | Entity ID (optional) |
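An illustrative pair of calls showing the format switch; anything beyond the schema's 'json (meta) or pdf (download)' hint is an assumption.

```python
# Illustrative generate_report calls showing the format switch; values beyond
# the schema's 'json (meta) or pdf (download)' hint are assumptions.
metadata_only = {"entity_id": "entity-001", "format": "json"}  # schema default
pdf_download = {"entity_id": "entity-001", "format": "pdf"}
```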
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It mentions generating a PDF report, but the input schema includes a 'format' parameter allowing both 'json' and 'pdf', which is not reflected. No details on side effects, permissions, or response structure are provided, making it incomplete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with no wasted words. It lacks structure (e.g., bullet points) that could improve readability, but it is still efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and only two parameters, the description fails to explain the output format or parameter roles (e.g., that 'format' controls JSON vs PDF output). This leaves significant gaps for an agent to correctly invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds no extra meaning beyond the schema; it neither explains the 'format' parameter's options (json vs pdf) nor clarifies the optionality of 'entity_id'. Thus, it does not enhance parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates a 'data-driven DORA Ampel PDF report' and lists specific content types (score, gap analysis, provider register, audit trail integrity), making the purpose distinct from sibling tools like 'gap_report' or 'assess_all'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'gap_report' or 'assess_all', and does not mention any preconditions or exclusions, leaving the agent without decision context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_trial_report (C)
Generate watermarked trial report with score, gaps, and CTA.
| Name | Required | Description | Default |
|---|---|---|---|
| trial_id | No | | |
| entity_id | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It mentions 'watermarked', which is a useful detail, but does not clarify if the tool has side effects, requires specific permissions, or its safety profile.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence with no extraneous words, making it concise. However, it could be structured better to include more key information without losing brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has two parameters, no output schema, and many siblings, the description is incomplete. It does not explain the output format, the meaning of 'CTA', or how to use the parameters effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description should explain parameters. However, it does not mention 'trial_id' or 'entity_id' at all, leaving the agent with no guidance on what these strings represent.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'generate' and the resource 'trial report', and specifies key content (score, gaps, CTA). It differentiates from siblings like 'generate_report' and 'gap_report' by being more specific.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., gap_report, generate_report). No prerequisites or context for usage are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
health_check (C)
Server + DB status.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, and the description fails to disclose behavioral traits such as whether the check is instant, what it validates, or potential side effects. For a health check, some expectations about speed or connectivity are omitted.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (one short phrase) and front-loaded. It is not verbose, but it could benefit from a complete sentence structure for clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of an output schema, the description should explain return values or what constitutes a 'status'. As written, it is too incomplete for an agent to understand the tool's full behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With zero parameters, schema coverage is trivially 100% and the zero-parameter baseline is 4. The description adds modest context by specifying the focus on server and database, which supports that score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Server + DB status' conveys a basic purpose of checking server and database health, but it is vague about what 'status' entails and does not distinguish it from sibling tools like article_status or bus_status.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. With many sibling status tools, the lack of usage context is a significant gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
incident_flow (C)
DORA incident lifecycle: log, classify, notify (BaFin), close. Each step creates signed evidence.
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | | |
| action | No | log \| classify \| notify \| close | |
| severity | No | | |
| entity_id | No | Entity ID | |
| root_cause | No | | |
| description | No | | |
| incident_id | No | | |
| report_type | No | | |
| classification | No | | |
| lessons_learned | No | | |
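Since the schema does not say which of the ten parameters each action needs, the following per-action grouping is an assumption sketched for illustration:

```python
# Assumed per-action argument grouping for incident_flow; which fields each
# action actually requires is not documented, so this split is illustrative.
log_step = {
    "entity_id": "entity-001",
    "action": "log",
    "title": "Payment gateway outage",
    "severity": "major",  # severity values are not enumerated in the schema
    "description": "Card payment gateway unreachable since 09:12 CET.",
}
classify_step = {
    "entity_id": "entity-001",
    "action": "classify",
    "incident_id": "INC-0001",  # placeholder incident reference
    "classification": "major",
}
close_step = {
    "entity_id": "entity-001",
    "action": "close",
    "incident_id": "INC-0001",
    "root_cause": "Expired TLS certificate at the gateway",
    "lessons_learned": "Automate certificate expiry monitoring.",
}
```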
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that each step creates signed evidence, which is key behavioral context. However, no annotations exist, and the description omits other traits like authentication needs, authorization requirements, rate limits, or consequences of actions (e.g., whether operations are reversible). Partial transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence efficiently conveys the lifecycle steps and side effect. No wasted words, and key information is front-loaded. Could be slightly more structured but is appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 10 parameters, no output schema, and no annotations, the description is too brief. Missing details on parameter dependencies per action, return values (signed evidence format), error handling, and workflow sequencing. Significant gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is only 20% (only 'action' and 'entity_id' have descriptions). The tool description adds no parameter-specific meaning. With 10 parameters and minimal schema descriptions, the description should compensate but does not.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states specific verbs (log, classify, notify, close) and resource (DORA incident lifecycle), making purpose clear. However, it does not differentiate from sibling tools like bafin_approve_send or bafin_report_draft, which are related to BaFin compliance but not identical.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. Does not specify prerequisites, when each action is appropriate, or how the lifecycle steps should be ordered. Agent must infer usage entirely.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
llm_clause_check (A)
LLM-based DORA Art. 30 contract analysis. Paste contract text, get clause-by-clause PRESENT/PARTIAL/MISSING for all 15 mandatory clauses. Uses Claude API.
| Name | Required | Description | Default |
|---|---|---|---|
| contract_text | No | Contract text (plain text from PDF). Paste key sections. | |
| provider_name | No | Provider name e.g. Salesforce |
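A hypothetical call with a made-up contract excerpt standing in for plain text pasted from a PDF:

```python
# Hypothetical llm_clause_check call; the contract excerpt is a made-up fragment
# standing in for plain text pasted from a PDF.
arguments = {
    "provider_name": "Salesforce",
    "contract_text": (
        "Section 12 Audit Rights: The institution may audit the provider's ICT "
        "systems annually. Section 14 Exit: Upon termination, the provider shall "
        "support migration of data and services for twelve months."
    ),
}
```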
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must bear the burden of behavioral disclosure. It does note that the tool uses 'Claude API', implying an LLM-based, non-deterministic operation. However, it does not discuss potential issues like rate limits, cost, or reliability, leaving gaps for the agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences, front-loading the purpose and expected output. Every word adds value, with no redundant or vague phrasing.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (2 params, no output schema), the description covers the key aspects: what it does, input requirements, and output format (clause-by-clause PRESENT/PARTIAL/MISSING). It could mention handling of long texts or edge cases, but overall it is sufficiently complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Both parameters are fully described in the schema (100% coverage). The description adds value by giving examples ('e.g. Salesforce' for provider_name) and instructing to 'Paste key sections' for contract_text, which aids correct usage beyond the schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool performs DORA Art. 30 contract analysis using LLM, producing clause-by-clause status for all 15 mandatory clauses. It clearly distinguishes from sibling tools like 'contract_analyze' or 'cross_regulation_check' by focusing on a specific regulation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates usage for pasting contract text and obtaining a structured clause analysis, implying it is intended for DORA compliance checks. However, it lacks explicit when-not-to-use guidance or alternatives among siblings, which slightly reduces clarity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onboard_entity (A)
Full entity onboarding: creates initial RED assessments for all 39 checks, collects auto-evidence from live sources, re-assesses, and returns readiness score.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID to onboard |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the multi-step nature (creates assessments, collects evidence, re-assesses) and the output (readiness score). It does not mention duration, permissions, or side effects, but still provides solid behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, information-dense sentence that covers key steps and output. It is front-loaded and efficient, though splitting into two sentences could improve readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complex process and one parameter, the description adequately covers purpose, steps, and output (readiness score). Lacks detail on side effects, but overall sufficient for an AI agent to understand the tool's scope.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter (entity_id) with 100% schema description coverage. The description adds no extra meaning beyond 'Entity ID to onboard', so baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it performs full entity onboarding, including creating RED assessments for 39 checks, collecting auto-evidence, re-assessing, and returning a readiness score. This differentiates it from sibling tools like 'create_entity' or 'assess_all' which likely cover only parts of the process.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use for initial onboarding but does not explicitly state when to use versus alternatives like 'readiness_check' or 'assess_all'. No prerequisites or exclusions are mentioned, leaving potential ambiguity for an AI agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ping (A)
Quick connectivity test.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It implies a non-destructive read operation but does not explicitly state safety, side effects, or return format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (three words) and front-loaded. Every word is purposeful with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with no parameters or output schema, the description is functionally complete. It could mention the return type but is sufficient for a basic ping.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With zero parameters, schema coverage is trivially 100%. The baseline for 0 params is 4, and the description adds no parameter info as none exist.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Quick connectivity test' clearly states the tool's purpose with a specific verb (test) and resource (connectivity). It distinguishes itself from siblings like health_check by using the standard 'ping' term.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. The description does not mention scenarios or exclusions, leaving the agent to infer usage from the name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
policy_draft (B)
Generate DORA policy/framework document draft for a specific article. 8 templates available (Art. 5,6,8,10,11,17,28,30). Uses entity data for customization.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | | |
| dora_article | No | dora_art5 \| dora_art6 \| dora_art8 \| dora_art10 \| dora_art11 \| dora_art17 \| dora_art28 \| dora_art30 | |
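An illustrative call using one of the eight template keys enumerated in the schema; the entity ID is a placeholder.

```python
# Illustrative policy_draft call; dora_article must be one of the eight template
# keys enumerated in the schema, and the entity ID is a placeholder.
arguments = {
    "entity_id": "entity-001",
    "dora_article": "dora_art30",  # ICT third-party contract provisions template
}
```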
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden of disclosing behavioral traits. It states the tool generates a draft and uses entity data for customization, but it does not clarify if this operation is read-only or has side effects (e.g., saving drafts, overwriting existing data). It also does not mention authentication requirements or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise: two sentences with no unnecessary words. The first sentence front-loads the main purpose, and the second adds essential details (templates and data source). Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 2 parameters, no output schema, and no annotations, the description is fairly complete. It explains what the tool does, the article templates available, and that it customizes using entity data. However, it does not mention the output format (e.g., text, file) or whether entity_id is required for customization, leaving minor gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 2 parameters, with 50% description coverage (only dora_article has a description). The description adds context: it explicitly lists the 8 allowable values for dora_article (Art. 5,6,8,10,11,17,28,30), which is not fully enumerated in the schema. It also states 'Uses entity data for customization,' clarifying the purpose of entity_id beyond its schema type.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates a DORA policy/framework document draft for a specific article, specifying the action and resource. It also lists the 8 available article templates, distinguishing the tool's scope. However, it does not explicitly differentiate from sibling tools, such as bafin_report_draft or gap_report, which might have overlapping functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide explicit guidelines on when to use this tool versus alternatives. It does not specify prerequisites, such as requiring an existing entity or article selection, nor does it mention conditions where alternatives like 'assess_all' or 'gap_report' would be more appropriate. The usage context is only implied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
provider_country_risk (B)
Enrich provider dependencies with OECD economic risk: GDP, unemployment, CLI per provider country. DORA Art. 28-31 relevant.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavioral traits. 'Enrich' suggests data augmentation but does not clarify whether the tool is read-only or modifies data, what permissions are required, or any side effects. The economic indicators are listed, but behavioral impact is unspecified.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long, front-loaded with the core purpose and key specifics (OECD risk, DORA articles). Every part adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter tool without an output schema, the description adequately covers the enrichment scope (GDP, unemployment, CLI) and regulatory relevance. It lacks details on output format or return values, but the simplicity of the tool reduces the need for extensive completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds no additional meaning beyond the schema's 'Entity ID' definition; it does not explain how 'entity_id' maps to provider countries or the expected format.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the action ('Enrich'), the resource ('provider dependencies'), and the economic indicators (GDP, unemployment, CLI) with regulatory context (DORA Art. 28-31). However, it does not explicitly differentiate from sibling tools like 'cross_regulation_check' or 'regulation_impact', which slightly reduces clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions 'DORA Art. 28-31 relevant', implying a regulatory context, but provides no explicit guidance on when to use this tool versus alternatives (e.g., when to choose this over 'regulation_impact'). No when-not-to-use or exclusion criteria are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
readiness_check (B)
Full DORA readiness score + Ampel per article. Returns GREEN/YELLOW/RED/GREY for all 26 articles, score 0-100, days until deadline.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID (optional) |
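Because entity_id is optional, an agent can call the tool with or without it; the sketch below illustrates both shapes, using an invented placeholder ID and the same MCP tools/call argument structure assumed throughout.

```python
import json

# readiness_check argument sketches; "ent_001" is a placeholder.
scoped = {"name": "readiness_check", "arguments": {"entity_id": "ent_001"}}
default_scope = {"name": "readiness_check", "arguments": {}}  # entity_id omitted
for call in (scoped, default_scope):
    print(json.dumps(call))
```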
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It only states what is returned, not whether the operation is read-only, requires authentication, or has side effects. No mention of error conditions or data freshness.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single sentence that front-loads the key information: purpose, specific output details, and scope. Every word adds value; no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple retrieval tool with one optional parameter and no output schema, the description covers the return structure and scale (26 articles). It could mention the output format more explicitly but is largely complete given low complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter entity_id, described in the schema as 'Entity ID (optional)'. The description does not add further meaning beyond the schema, so baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a full DORA readiness score with Ampel per article, listing specific output (GREEN/YELLOW/RED/GREY for 26 articles, score 0-100, days until deadline). The verb 'returns' and resource 'readiness' are specific, distinguishing it from siblings like article_status, which likely focuses on a single article.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like article_status or assess_all. The description does not mention contexts, prerequisites, or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
register_provider (B)
Register an ICT third-party provider for DORA Art. 28 Register of Information. Stores provider data and creates evidence.
| Name | Required | Description | Default |
|---|---|---|---|
| lei | No | Legal Entity Identifier | |
| services | No | Services provided | |
| entity_id | No | ||
| criticality | No | critical, important, standard | |
| contract_end | No | ||
| headquarters | No | Country e.g. Luxembourg, Germany | |
| data_location | No | Where data is stored e.g. EU (Frankfurt) | |
| provider_name | No | Provider name e.g. Amazon Web Services EMEA SARL | |
| provider_type | No | cloud_infrastructure, saas_application, core_banking, cybersecurity, etc. | |
| certifications | No | ||
| contract_start | No | ||
| annual_cost_eur | No | ||
| substitutability | No |
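With 13 parameters and only partial schema descriptions, an argument sketch helps show the expected shape; every value below is an illustrative placeholder, and the date, cost, and certification formats are assumptions the schema does not confirm.

```python
import json

# Illustrative register_provider arguments assembled from the parameter table.
# All values are placeholders; formats marked "assumed" are not documented.
arguments = {
    "entity_id": "ent_001",
    "provider_name": "Amazon Web Services EMEA SARL",
    "provider_type": "cloud_infrastructure",
    "criticality": "critical",
    "services": "IaaS hosting for core banking workloads",
    "lei": "5299000XXXXXXXXXXXXX",         # placeholder LEI
    "headquarters": "Luxembourg",
    "data_location": "EU (Frankfurt)",
    "contract_start": "2024-01-01",        # assumed ISO-8601 date
    "contract_end": "2026-12-31",          # assumed ISO-8601 date
    "annual_cost_eur": 250000,             # assumed numeric value
    "certifications": "ISO 27001, SOC 2",  # assumed free-text list
    "substitutability": "difficult",       # assumed free-text rating
}
print(json.dumps({"name": "register_provider", "arguments": arguments}, indent=2))
```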
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description only mentions that it 'Stores provider data and creates evidence' but omits key behaviors like idempotency, authorization requirements, or side effects on existing records.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no redundancy, conveying the essential action and purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 13 parameters, no annotations, and no output schema, the description is too brief to provide complete guidance on tool usage and behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
At 54% schema coverage, several parameters lack descriptions in the schema and the tool description does not compensate by explaining them or adding constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Register'), the resource ('ICT third-party provider'), and the regulatory context ('DORA Art. 28 Register of Information'), distinguishing it from sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'provider_country_risk' or 'check_contract', leaving the AI to infer context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
regulation_impact (B)
Show cross-regulation impacts for a specific DORA article. Maps DORA → MiCA + AMLR.
| Name | Required | Description | Default |
|---|---|---|---|
| dora_article | No | DORA article ID (e.g. dora_art28) |
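A one-line sketch, following the article ID format the schema itself gives as an example (dora_art28):

```python
import json

# Minimal regulation_impact arguments; ID format taken from the schema example.
print(json.dumps({"name": "regulation_impact",
                  "arguments": {"dora_article": "dora_art28"}}))
```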
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided; the description does not disclose behavioral traits such as read-only nature, side effects, or performance characteristics. The minimal description leaves significant gaps for an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The two sentences are concise and front-loaded with the core purpose; slightly more structure (e.g., listing the output type) could improve the description further.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, the description does not explain the mapping output format. It is adequate for a simple lookup but incomplete for understanding the tool's full behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with a description for the dora_article parameter. The description adds no extra meaning beyond what the schema already provides, meeting the baseline but not exceeding it.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Show' and the resource 'cross-regulation impacts', specifying the mapping from DORA to MiCA and AMLR. This distinguishes it from siblings like cross_regulation_check or assess_all.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives, nor any exclusion criteria. The description only states what it does without context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reg_watchdog (A)
AI Regulatory Watchdog: scrapes EBA/ESMA/BaFin/CERT-Bund for DORA updates. Returns alerts with affected articles and severity. Run daily via cron or on-demand.
| Name | Required | Description | Default |
|---|---|---|---|
| days_back | No | Check items from last N days (default: 7) |
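Since the description suggests running the tool daily via cron, a sketch of the recurring call is shown below; days_back defaults to 7, so a daily run with the default would give overlapping coverage, which is an interpretation rather than documented behavior.

```python
import json

# Daily reg_watchdog call sketch; days_back defaults to 7 per the schema.
print(json.dumps({"name": "reg_watchdog", "arguments": {"days_back": 7}}))
```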
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully disclose behavior. It states the tool 'scrapes' (implying web requests) and returns alerts, but omits details like rate limits, output format, failure modes, or whether it modifies data. Basic behavior is clear but lacks depth.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences, no unnecessary words. The purpose and usage are front-loaded. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, the description partially covers the tool's function and return (alerts with articles and severity). However, it lacks details on the format of alerts, error handling, and network dependencies. Adequate for a simple tool but not fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The only parameter, days_back, is fully described in the input schema (100% coverage), so the description need not elaborate. Baseline 3 is appropriate; the description adds no additional meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool as a watchdog that scrapes specific regulatory bodies (EBA, ESMA, BaFin, CERT-Bund) for DORA updates, returning alerts with affected articles and severity. It distinguishes itself from sibling tools like article_status or assess_all by its scraping focus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states 'Run daily via cron or on-demand,' giving clear usage context. It does not provide exclusionary guidance or alternatives, but the use case is sufficiently scoped.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
retest_finding (A)
Re-test a finding: collect fresh evidence, reassess check, auto-close if GREEN. Full closed-loop.
| Name | Required | Description | Default |
|---|---|---|---|
| finding_id | No | Finding ID to re-test |
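A minimal sketch with an invented finding ID; note that the call may mutate state (auto-close) if the fresh evidence comes back GREEN.

```python
import json

# retest_finding call sketch; "find_042" is a placeholder finding ID.
print(json.dumps({"name": "retest_finding",
                  "arguments": {"finding_id": "find_042"}}))
```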
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description discloses key behavioral traits: evidence collection, reassessment, and auto-closure if GREEN. It frames the action as a 'full closed-loop', indicating state modification. However, it does not mention permissions, failure handling, or side effects beyond status changes.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long, front-loaded with the primary action ('Re-test a finding'), followed by sequential steps and a summary. No redundant information; every clause adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description explains the process and outcome condition (auto-close if GREEN). It is reasonably complete for a simple tool with one parameter, but it omits what the return value or response looks like (e.g., success message, updated finding object).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (one parameter 'finding_id' with description). The tool description does not add any additional meaning beyond the schema's own description. Baseline score of 3 applies as the schema already documents the parameter adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Re-test') and resource ('finding'), and clearly outlines the steps: collect fresh evidence, reassess check, auto-close if GREEN. It distinguishes itself from siblings like 'assess_all' (bulk assessment) and 'update_finding' (manual update) by emphasizing an automated closed-loop process.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for re-testing a single finding but does not explicitly state when to use this tool versus alternatives like 'update_finding' or 'assess_all'. No exclusion criteria or comparative guidance is provided, leaving the agent to infer context from sibling names.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
run_escalation (B)
Trigger escalation engine: auto-create findings, check SLA, escalate.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden but only mentions actions in vague terms (auto-create findings, check SLA, escalate). It does not disclose side effects, whether it modifies data, or whether it requires specific permissions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short and to the point, with three key phrases separated by commas. It avoids redundancy but could be slightly more structured, e.g., listing outcomes separately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no parameters and no output schema, the description provides a basic understanding. However, it lacks context about triggers, results, or integration with other tools, leaving some gaps for an agent to infer.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are no parameters, and the schema coverage is 100% (empty). The description does not need to add parameter details, so it meets the baseline for zero-parameter tools.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it triggers the escalation engine and lists specific actions (auto-create findings, check SLA, escalate), which differentiates it from sibling tools like escalation_status or retest_finding. However, it could be more precise about the exact outcome.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as escalation_status or retest_finding. The description lacks context on prerequisites or situations where this tool is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
run_trial_assessment (C)
Run complete DORA+MiCA assessment for trial entity. Returns score, gaps, automation potential.
| Name | Required | Description | Default |
|---|---|---|---|
| trial_id | No | ||
| entity_id | No |
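Because neither parameter carries a schema description, an agent can only guess at formats; the sketch below uses invented placeholder IDs to show the likely shape.

```python
import json

# run_trial_assessment argument sketch; both IDs are placeholders, since the
# schema documents neither format.
print(json.dumps({"name": "run_trial_assessment",
                  "arguments": {"trial_id": "trial_001", "entity_id": "ent_001"}}))
```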
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description should provide behavioral details such as side effects, idempotency, or authorization needs. It only states the tool 'runs an assessment' and returns scores, with no mention of data mutation, resource locking, or performance implications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short, clear sentences with no redundant words. It efficiently conveys the core purpose and outputs but could benefit from a brief mention of the parameters or intended use cases.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, the description should elaborate on the return format or structure of 'score, gaps, automation potential.' It remains vague, and the lack of annotations reduces completeness further. The agent is left without sufficient context to handle the output effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (no parameters described in schema), and the description does not mention any parameters. Despite the schema containing two string properties, the description fails to clarify their purpose or required formats, completely shifting the burden to the agent.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'Run complete DORA+MiCA assessment for trial entity,' which clearly identifies the action and scope. However, it does not differentiate from sibling tools like 'assess_all,' which may have overlapping functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is given on when to use this tool versus alternatives. The description lacks context about prerequisites or scenarios where this tool is preferred, forcing the agent to infer usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
score_trend (B)
Score trend over time: weekly deltas, trajectory, peer benchmark. Shows improvement or decline.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must carry full burden. It states 'Shows improvement or decline' but omits whether the tool is read-only, requires permissions, or has side effects. Minimal behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with key purpose. No redundant words. Efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one param and no output schema, the description is adequate but lacks details on output format, time range, or limitations. Slightly incomplete for full self-service.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage and one parameter 'entity_id' described as 'Entity ID', the description adds no extra meaning. Baseline 3, no added value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool shows 'Score trend over time: weekly deltas, trajectory, peer benchmark. Shows improvement or decline.' This distinguishes it from static assessment siblings, but the description could better specify what 'score' refers to.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus siblings (e.g., assess_all, gap_report). No explicit context or alternatives mentioned, leaving the agent to infer usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
servicenow_sync (C)
ServiceNow incident + change management sync. DORA Art. 17/21 evidence. Returns 30-day incident stats, classification, resolution rates.
| Name | Required | Description | Default |
|---|---|---|---|
| days_back | No | Days to look back (default 30) |
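A sketch using the documented default look-back window; whether other values change the '30-day incident stats' framing in the output is not documented, so 30 is the safest assumption.

```python
import json

# servicenow_sync call sketch; days_back defaults to 30 per the schema.
print(json.dumps({"name": "servicenow_sync", "arguments": {"days_back": 30}}))
```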
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully disclose behavioral traits. It claims to 'sync' (implying possible write operations) but primarily describes read-like output (returning stats). No mention of side effects, authentication needs, or whether data is modified, limiting agent understanding.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no filler. The first sentence states purpose, the second describes output. Front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple one-parameter tool, the description outlines what it returns but omits whether it writes or is read-only, and does not explain the output structure. Without an output schema, more detail on return format would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter 'days_back', which has a clear description. The description mentions '30-day incident stats', but this could mislead if the parameter is set to a different value. Still, the baseline is 3 given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states it syncs ServiceNow incidents and change management, and returns DORA evidence plus stats. However, the verb 'sync' is ambiguous (could imply bidirectional update), and it does not clearly differentiate from sibling tools like 'incident_flow' or 'assess_all'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is given on when to use this tool versus alternatives. There is no mention of prerequisites, scenarios, or exclusions, leaving the agent to infer context solely from the description.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_finding (B)
Update finding lifecycle: claim, set remediation plan, request re-test, close, or accept risk. Status flow: open -> in_progress -> retest_pending -> closed | risk_accepted.
| Name | Required | Description | Default |
|---|---|---|---|
| actor | No | Who is performing this action | |
| owner | No | New owner (for claim) | |
| action | No | claim | plan | request_retest | close | accept_risk | |
| reason | No | Close reason (for close) | |
| finding_id | No | Finding ID | |
| accepted_by | No | Name (for accept_risk) | |
| expiry_days | No | Risk acceptance expiry days (default 90) | |
| remediation_plan | No | Remediation plan text (for plan) |
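The parameter descriptions imply that each action has a companion field; the sketch below reads that mapping off the table and shows one illustrative accept_risk payload, with all values invented and the integer type of expiry_days assumed.

```python
import json

# Companion fields per action, as implied by the parameter table; field types
# beyond the table (e.g. expiry_days as an integer) are assumptions.
action_fields = {
    "claim": ["owner"],
    "plan": ["remediation_plan"],
    "request_retest": [],
    "close": ["reason"],
    "accept_risk": ["accepted_by", "expiry_days"],
}

# Illustrative accept_risk call with placeholder values.
arguments = {
    "finding_id": "find_042",
    "action": "accept_risk",
    "actor": "jane.doe",
    "accepted_by": "Jane Doe",
    "expiry_days": 90,  # schema default
}
print(json.dumps({"name": "update_finding", "arguments": arguments}, indent=2))
```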
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must disclose behavior. It mentions status changes but omits side effects, authentication needs, rate limits, or error conditions. The status flow helps but is insufficient for full transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences covering actions and status flow with no fluff. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers core functionality (actions and transitions) but lacks details on prerequisites, error handling, and required parameters. For 8 parameters and no output schema, more context is needed for full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with each parameter described. Description adds context by grouping actions and showing status flow, but does not significantly enhance understanding beyond schema. Baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool updates the finding lifecycle with specific actions (claim, plan, retest, close, accept risk) and includes a status flow. However, it does not distinguish itself from the sibling tool 'retest_finding', which may cause confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Usage is implied through the enumeration of actions and the status flow, but there is no explicit guidance on when to use this tool versus alternatives like 'retest_finding', and no prerequisites are stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whatif_provider (A)
Simulate provider failure: which articles/checks are affected, score impact, risk level.
| Name | Required | Description | Default |
|---|---|---|---|
| entity_id | No | Entity ID | |
| provider_name | No | Provider name (e.g. AWS, Finastra) |
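A minimal simulation sketch; the provider name follows the schema's own examples (AWS, Finastra), and the entity ID is a placeholder.

```python
import json

# whatif_provider simulation sketch; values are illustrative only.
print(json.dumps({"name": "whatif_provider",
                  "arguments": {"entity_id": "ent_001", "provider_name": "AWS"}}))
```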
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must cover behavioral traits. It indicates simulation (non-destructive) and outputs, but does not explicitly state that it is read-only or has no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that effectively front-loads the action and expected outputs. Every word contributes value, with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simulation tool with two parameters and no output schema, the description adequately explains the output (affected articles/checks, score impact, risk level). It could mention that it is non-destructive, but overall it is sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema covers both parameters with descriptions (100% coverage). The description adds no additional meaning beyond what the schema provides, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: simulate provider failure and show affected articles/checks, score impact, and risk level. It differentiates from siblings like 'whatif_stale' by focusing on provider failure specifically.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for simulating provider failure impacts. It does not explicitly state when not to use or mention alternatives, but the context is clear given the sibling tool names.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whatif_stale (B)
Simulate stale evidence: what happens if a check stays stale for N days.
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | Days stale (default 30) | |
| check_id | No | Check ID | |
| entity_id | No | Entity ID |
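A sketch of a stale-evidence simulation; the check ID format is undocumented, so the value below is a guess, and days falls back to 30 when omitted.

```python
import json

# whatif_stale simulation sketch; "chk_art11_backup" is an invented check ID.
print(json.dumps({"name": "whatif_stale",
                  "arguments": {"entity_id": "ent_001",
                                "check_id": "chk_art11_backup",
                                "days": 60}}))
```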
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided; description only mentions simulation without disclosing read-only nature, side effects, or result format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no fluff, front-loaded with key action and parameter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, no details on simulation output or errors; incomplete for a simulation tool with 3 parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with descriptions; tool description adds no extra meaning beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool simulates stale evidence over N days, with a specific verb and resource. It's distinct from sibling tools like whatif_provider.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for assessing the impact of a stale check, but it provides no explicit guidance on when to use it versus alternatives, or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.