clawhup

Server Details

Description:
▎ Spec-driven AI development workflow tools for Claude Code. Manage PRD → Design → Impl → Agent Task lifecycles, search and execute AI Prompts, and ▎ track traceable software delivery — all from Cl

Status: Healthy
Last Tested: 2026-05-07 05:32
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

C2.5/5.0

Tool DescriptionsC

Average 2.5/5 across 29 of 29 tools scored. Lowest: 1.3/5.

Server CoherenceA

Disambiguation4/5

Most tools target distinct actions and resources (e.g., agent tasks, documents, prompts). However, potential overlap exists between 'clawhup_create_decision' and document creation of type 'decision', and between 'clawhup_list_decisions' and listing documents of type 'decision', which could cause slight confusion.

Naming Consistency3/5

Tools follow a prefix 'clawhup_' and mostly verb_noun pattern, but inconsistencies exist: 'agent_task_*' vs 'agent_step_report' (no 'task'), 'doc_create' vs 'doc_create_spec', and 'prompt_detail' uses noun instead of verb. Mixed patterns like 'get_versions' and 'list_my_prompts' show lack of uniform convention.

Tool Count4/5

29 tools cover multiple subdomains (agents, documents, prompts, decisions). While on the higher side, each tool serves a clear purpose and the count reflects the breadth of the platform. Could be slightly trimmed but remains reasonable.

Completeness3/5

Core workflows are covered (e.g., full agent lifecycle, document CRUDL, prompt management). Notable gaps: missing delete for documents, no get/update for decisions, and limited material item operations (only add). These missing operations may cause agents to hit dead ends.

Available Tools

29 tools

clawhup_add_material_itemDInspect

新增素材项。

ParametersJSON Schema

Name	Required	Description	Default
`key`	No
`type`	Yes
`value`	Yes
`weight`	No
`enabled`	No
`libraryId`	Yes

Tool Definition Quality

D1.7/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. '新增' (add/insert) implies a write operation, but the description fails to disclose any behavioral traits: it doesn't mention authentication requirements, whether this is idempotent, what happens on duplicate keys, what the return value looks like, or any error conditions. For a mutation tool with 6 parameters and no annotations, this complete lack of behavioral information is inadequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is maximally concise - a single phrase with no wasted words. While this conciseness comes at the cost of being under-specified, from a pure structural perspective, every character serves the core purpose of stating the tool's action. There is no unnecessary elaboration or redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (6 parameters, 3 required, no output schema, no annotations, and 0% schema description coverage), this description is completely inadequate. It fails to explain what the tool does beyond the name, provides no parameter guidance, no behavioral context, and no usage guidelines. For a creation/mutation tool with multiple parameters, this minimal description leaves the agent unable to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, meaning none of the 6 parameters have descriptions in the schema. The tool description provides absolutely no information about any parameters - it doesn't explain what 'libraryId', 'type', 'value', 'key', 'weight', or 'enabled' mean, their expected formats, or how they interact. With zero parameter guidance in either schema or description, the agent cannot understand what values to provide.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '新增素材项' (Add material item) is a tautology that essentially restates the tool name 'clawhup_add_material_item'. It provides no additional specificity about what 'material item' means, what resource it operates on, or how it differs from sibling tools like 'clawhup_create_material_library' or 'clawhup_patch_material_item'. The description is too vague to be helpful beyond the name itself.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides absolutely no guidance on when to use this tool versus alternatives. There is no mention of prerequisites, appropriate contexts, or distinctions from related tools like 'clawhup_patch_material_item' (for updates) or 'clawhup_list_material_items' (for reading). The agent receives no help in determining when this specific creation tool should be selected.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_step_reportCInspect

上报 Agent 步骤执行结果。

ParametersJSON Schema

Name	Required	Description
`status`	Yes	步骤状态: succeeded / failed / skipped
`stepNo`	Yes	步骤编号
`taskId`	Yes	任务 ID
`stepType`	Yes	步骤类型: llm_call / mcp_tool / human_gate
`toolName`	No	工具名称（stepType=mcp_tool 时）
`errorCode`	No	错误代码
`latencyMs`	No	耗时（毫秒）
`outputJson`	No	步骤输出 JSON
`errorMessage`	No	错误信息

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden of behavioral disclosure. It states 'reporting' but doesn't clarify if this is a write operation, what system it updates, whether it's idempotent, or what happens on failure. This is inadequate for a tool with 9 parameters and no output schema, as critical behavioral traits are missing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's purpose without redundancy. It's appropriately sized and front-loaded, with no wasted words, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (9 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain the reporting mechanism, expected outcomes, error handling, or how it integrates with the agent workflow. This leaves significant gaps for the agent to understand the tool's role and behavior in context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with each parameter well-documented in the schema (e.g., 'status' with allowed values, 'stepType' with types). The description adds no additional parameter semantics beyond what the schema provides, so it meets the baseline of 3 for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '上报 Agent 步骤执行结果' (Report Agent step execution results) states the verb ('report') and resource ('Agent step execution results'), but it's vague about what 'reporting' entails—whether it's logging, updating a database, or sending notifications. It doesn't differentiate from siblings like 'clawhup_agent_task_complete' or 'clawhup_agent_task_fail', which might handle task-level outcomes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., after a step execution), exclusions, or how it relates to sibling tools for task-level reporting, leaving the agent to infer usage from context alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_task_completeCInspect

标记 Agent 任务完成。

ParametersJSON Schema

Name	Required	Description	Default
`taskId`	Yes	任务 ID

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It states 'mark complete' which implies a mutation (status update), but doesn't disclose behavioral traits such as permissions required, whether it's idempotent, side effects (e.g., notifications), or error handling. This is a significant gap for a mutation tool with zero annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence '标记 Agent 任务完成' (Mark Agent task complete), which is appropriately concise and front-loaded. It wastes no words, though it could be more informative without sacrificing brevity. Every word earns its place, but it's slightly under-specified.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (mutation tool with no annotations and no output schema), the description is incomplete. It doesn't explain what 'marking complete' does behaviorally, return values, or error conditions. For a tool that likely changes task state, more context is needed to understand its full impact and usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the single parameter 'taskId' documented as '任务 ID' (task ID). The description adds no meaning beyond the schema—it doesn't explain format constraints, valid ranges, or how the ID relates to task states. Baseline 3 is appropriate since the schema does the heavy lifting, but no extra value is added.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '标记 Agent 任务完成' (Mark Agent task complete) states a clear verb ('mark complete') and resource ('Agent task'), but it's vague about what 'marking complete' entails—does it update status, trigger notifications, or finalize something? It distinguishes from siblings like 'clawhup_agent_task_fail' (fail) and 'clawhup_agent_task_start' (start) by indicating completion, but lacks specificity on the action's scope or effects.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., task must be started), exclusions (e.g., don't use if already completed), or related tools like 'clawhup_agent_task_fail' for error cases. The description implies usage after task execution but offers no explicit context or rules.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_task_createCInspect

创建 Agent 任务。

ParametersJSON Schema

Name	Required	Description
`goal`	No	任务目标详细描述
`title`	Yes	任务标题
`specId`	No	绑定的规格编号
`pipeline`	No	链路模板: feature_dev / release_doc / bug_fix
`planJson`	No	执行计划 JSON
`projectId`	Yes	项目 ID
`autoConfirm`	No	true=human_gate 自动通过，Spec 已确认时使用

Tool Definition Quality

C2.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. '创建' (Create) implies a write/mutation operation, but the description doesn't specify permissions required, whether the task is immediately active, what happens on failure, or the expected response format. For a creation tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise ('创建 Agent 任务。') but this brevity comes at the cost of being under-specified rather than efficient. While it's front-loaded (the entire description is in one phrase), it fails to provide necessary context that would help an agent understand and use the tool effectively.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a task creation tool with 7 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain what an 'Agent task' is in this context, how it relates to other agent operations, what the creation process entails, or what the agent should expect as a result. The minimal description leaves too many open questions for effective tool selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema fully documents all 7 parameters with clear descriptions. The tool description adds no additional parameter information beyond what's already in the schema. According to guidelines, when schema coverage is high (>80%), the baseline score is 3 even with no param info in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '创建 Agent 任务' (Create Agent task) is a tautology that merely restates the tool name in Chinese. It lacks specificity about what an 'Agent task' entails, what resources it creates, or how it differs from sibling tools like clawhup_agent_task_start or clawhup_agent_task_complete. The purpose is vague and provides no meaningful distinction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to agent tasks (e.g., clawhup_agent_task_start, clawhup_agent_task_complete, clawhup_agent_task_get), but the description offers no context about prerequisites, sequencing, or appropriate scenarios for creating a task versus other operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_task_failCInspect

标记 Agent 任务失败。

ParametersJSON Schema

Name	Required	Description
`taskId`	Yes	任务 ID
`errorCode`	No	错误代码
`errorMessage`	No	错误信息

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. '标记 Agent 任务失败' suggests a mutation operation that changes task state to failed, but doesn't describe side effects (e.g., whether this triggers notifications, affects related tasks, or is reversible), authentication needs, rate limits, or error handling. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's purpose with no wasted words. It's appropriately sized and front-loaded, making it easy to understand at a glance without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a mutation operation with 3 parameters, no annotations, and no output schema), the description is incomplete. It doesn't explain what happens after marking a task as failed, potential return values, error conditions, or how it integrates with the broader task management system. For a tool that likely changes system state, more context is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for all three parameters (taskId, errorCode, errorMessage). The description doesn't add any parameter-specific information beyond what the schema provides, such as format examples or usage context. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '标记 Agent 任务失败' (Mark Agent task as failed) states a clear verb ('标记' - mark) and resource ('Agent 任务' - Agent task), but it's somewhat vague about what 'marking as failed' entails operationally. It distinguishes from siblings like 'clawhup_agent_task_complete' by indicating failure vs. completion, but lacks specificity about the failure mechanism or system impact.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives is provided. The description implies it's for marking tasks as failed, but doesn't specify prerequisites (e.g., task must be in progress), conditions for failure, or how it differs from simply not completing a task. Sibling tools like 'clawhup_agent_task_complete' suggest a workflow, but no context is given for choosing between them.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_task_getCInspect

获取 Agent 任务详情。

ParametersJSON Schema

Name	Required	Description	Default
`full`	No	true=返回完整步骤列表（默认 false，只返最近 5 步 + totalSteps）
`taskId`	Yes	任务 ID

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It implies a read operation ('获取' - get) but doesn't disclose behavioral traits like whether it's safe (non-destructive), requires authentication, has rate limits, or returns structured data. For a tool with no annotations, this leaves critical gaps in understanding its behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese ('获取 Agent 任务详情'), which is appropriately sized and front-loaded. There's no wasted text, making it concise, though it could benefit from slight elaboration for clarity without losing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a read operation with one parameter) and lack of annotations and output schema, the description is incomplete. It doesn't explain what 'details' include, potential errors, or return format, leaving the agent with insufficient context to use it effectively beyond basic parameter input.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with 'taskId' documented as '任务 ID' (task ID). The description adds no meaning beyond this, as it doesn't explain parameter usage, format, or constraints. With high schema coverage, the baseline is 3, but the description doesn't compensate with additional insights.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '获取 Agent 任务详情' (Get Agent task details) states a clear verb ('获取' - get) and resource ('Agent 任务详情' - Agent task details), but it's vague about what 'details' include and doesn't distinguish from sibling tools like 'clawhup_agent_task_list' (which likely lists tasks) or 'clawhup_agent_task_complete' (which modifies tasks). It's adequate but lacks specificity about scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a task ID), differentiate from 'clawhup_agent_task_list' for listing vs. getting details, or specify use cases like retrieving status or metadata. Without such context, an agent might misuse it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_task_listCInspect

列出 Agent 任务。

ParametersJSON Schema

Name	Required	Description
`q`	No	关键词搜索
`size`	No	每页条数，默认 50，最大 200
`cursor`	No	游标（首页不传）
`projectId`	No	项目 ID（不传则查全部）

Tool Definition Quality

C2.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states '列出 Agent 任务' (List Agent tasks), which implies a read-only operation but doesn't disclose any behavioral traits like pagination behavior (though the schema hints at cursor-based pagination), authentication requirements, rate limits, or what constitutes an 'Agent task' in this context. For a tool with no annotations, this minimal description is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at just three Chinese characters ('列出 Agent 任务'), which is appropriately brief for a simple listing tool. It's front-loaded with the core action ('列出' - list) and avoids unnecessary elaboration. However, it's arguably too minimal given the lack of sibling differentiation and behavioral context, slightly reducing its effectiveness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 parameters, no output schema, no annotations, and multiple sibling tools), the description is incomplete. It doesn't explain what 'Agent tasks' are, how they differ from other task types in the system, what the output format looks like, or when to use this versus other agent task tools. While the schema covers parameters well, the description fails to provide necessary context for proper tool selection and understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with all four parameters ('q', 'size', 'cursor', 'projectId') clearly documented in the schema itself. The description adds no additional parameter information beyond what's already in the schema. According to the scoring rules, when schema_description_coverage is high (>80%), the baseline is 3 even with no param info in the description, which applies here.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '列出 Agent 任务' (List Agent tasks) is a tautology that essentially restates the tool name 'clawhup_agent_task_list' without adding meaningful differentiation. While it clearly indicates a listing operation, it doesn't specify what 'Agent tasks' are or how this differs from sibling tools like 'clawhup_agent_task_get' (which presumably retrieves a single task). The purpose is vague beyond the basic verb-noun pairing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to agent tasks (e.g., 'clawhup_agent_task_get' for single task retrieval, 'clawhup_agent_task_create' for creation), but the description offers no context about when this list tool is appropriate versus those other options. This leaves the agent without any usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_task_report_filesCInspect

上报 Agent 任务变更文件列表，记录本次任务修改了哪些源文件

ParametersJSON Schema

Name	Required	Description	Default
`files`	Yes	变更文件列表
`taskId`	Yes	任务 ID

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool 'records' file changes, implying a write/mutation operation, but doesn't specify whether this requires authentication, has side effects (e.g., updating a database), or what happens on failure. For a tool that likely modifies state, this lack of detail is a significant gap, though it doesn't contradict any annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's purpose without redundancy. It's front-loaded with the core action and resource, making it easy to parse quickly. No words are wasted, and it aligns well with the tool's name.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool likely performs a mutation (reporting/recording changes) with no annotations and no output schema, the description is incomplete. It doesn't cover behavioral aspects like authentication needs, error handling, or what the tool returns. For a 2-parameter tool with 100% schema coverage but missing critical context for safe invocation, this is inadequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (taskId and files) and their nested properties. The description adds no additional meaning beyond what the schema provides—it mentions 'taskId' and 'files' implicitly but without extra context like format examples or constraints. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('上报' meaning 'report' or 'submit') and the resource ('Agent 任务变更文件列表' meaning 'Agent task change file list'), specifying it records which source files were modified in a task. It distinguishes from siblings like clawhup_agent_task_complete or clawhup_agent_task_fail by focusing on file-level changes rather than task status updates. However, it doesn't explicitly differentiate from all siblings (e.g., clawhup_agent_step_report might also involve reporting).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., must be called after task completion), exclusions, or how it relates to siblings like clawhup_agent_task_complete or clawhup_agent_task_fail. Usage is implied only through the action of reporting file changes, with no explicit context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_task_startCInspect

启动 Agent 任务。

ParametersJSON Schema

Name	Required	Description	Default
`taskId`	Yes	任务 ID

Tool Definition Quality

C2/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full responsibility for behavioral disclosure but offers none. '启动' (start) implies a state-changing operation, but the description doesn't reveal whether this initiates execution, changes status, requires specific permissions, has side effects, or what happens to the task afterward. For a mutation tool with zero annotation coverage, this represents a critical information gap about its behavior and consequences.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise - a single four-character phrase in Chinese. While this represents efficient communication, it's arguably too brief given the tool's likely complexity. However, every character serves a purpose (verb + object), and there's no redundant information. The structure is front-loaded but lacks any elaboration that might be necessary for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given this is a mutation tool (implied by 'start') with no annotations and no output schema, the description is inadequate. It doesn't explain what 'starting' means operationally, what state changes occur, what permissions are required, or what the expected outcome is. The single parameter is well-documented in the schema, but the core behavioral context is missing for what appears to be a state transition operation in a workflow system.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100% with the single parameter 'taskId' documented as '任务 ID' (task ID). The description doesn't add any parameter information beyond what's already in the schema. Since schema coverage is high, the baseline score of 3 is appropriate - the schema does the parameter documentation work, and the description neither compensates nor detracts.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '启动 Agent 任务' (Start Agent task) is a tautology that essentially restates the tool name 'clawhup_agent_task_start' without adding meaningful clarification. It doesn't specify what 'starting' entails operationally or what resource is being acted upon beyond the generic 'Agent task' mentioned in the name. While it identifies the basic action, it fails to distinguish this tool from sibling tools like 'clawhup_agent_task_create' or 'clawhup_agent_task_complete'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There are multiple related agent task tools (create, get, list, complete, fail, wait), but the description doesn't indicate whether this should be used after creation, instead of creation, or in what workflow sequence. No prerequisites, timing considerations, or exclusion criteria are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_agent_task_waitCInspect

Agent 任务进入等待人工确认状态。

ParametersJSON Schema

Name	Required	Description	Default
`reason`	No	等待原因
`taskId`	Yes	任务 ID

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It states the outcome (task enters waiting state) but doesn't describe what this means operationally: Does it pause task execution? Send notifications? Require specific permissions? How long can it wait? What happens to associated resources? For a state-changing tool with zero annotation coverage, this is insufficient behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient Chinese sentence that directly states the tool's purpose. It's appropriately sized for what it conveys, though it could be more front-loaded with additional context. No wasted words or redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a state-changing tool with no annotations and no output schema, the description is incomplete. It doesn't explain what 'waiting for manual confirmation' entails operationally, what permissions are required, what the expected response looks like, or how this differs from other task state transitions. Given the complexity of task management and lack of structured metadata, more context is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters (taskId and reason). The description doesn't add any parameter semantics beyond what's in the schema - it doesn't explain what constitutes a valid taskId, provide examples of waiting reasons, or clarify parameter interactions. Baseline 3 is appropriate when schema does the documentation work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Agent 任务进入等待人工确认状态' (Agent task enters waiting for manual confirmation state) states what the tool does - it transitions an agent task to a waiting state requiring human confirmation. However, it's somewhat vague about the exact action (is it a state change command? a notification?) and doesn't clearly distinguish from sibling tools like clawhup_agent_task_complete or clawhup_agent_task_fail which also manage task states.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites (e.g., task must be in a specific state), when this is appropriate versus completing/failing a task, or what triggers the need for human confirmation. The agent must infer usage from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_create_decisionBInspect

为已 implemented 的 L1 Spec 创建决策卡片，记录"决定了什么"和"为什么"。

ParametersJSON Schema

Name	Required	Description
`why`	No	为什么（≤500 字）
`what`	Yes	决定了什么（≤500 字）
`docId`	Yes	关联的 implemented L1 Spec 文档 ID
`topic`	Yes	功能主题标签（如 spec-decisions）

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions creating a decision card but doesn't specify whether this is a write operation, what permissions are required, if it's idempotent, or what happens on success/failure. For a creation tool with zero annotation coverage, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's purpose without any fluff. It's appropriately sized and front-loaded with the core action, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a creation operation with 4 parameters), no annotations, and no output schema, the description is minimally adequate. It clarifies the purpose but lacks details on behavioral traits, error handling, or output format. It's complete enough to understand what the tool does but not how it behaves or what it returns.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with each parameter well-documented in the schema (e.g., 'docId' as '关联的 implemented L1 Spec 文档 ID'). The description adds no additional parameter semantics beyond what's in the schema, so it meets the baseline of 3 where the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('创建决策卡片' - create decision card) and the resource ('为已 implemented 的 L1 Spec' - for implemented L1 Specs), specifying what information is recorded ('决定了什么' and '为什么'). It doesn't explicitly differentiate from sibling tools like 'clawhup_patch_decision' or 'clawhup_list_decisions', but the purpose is well-defined.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides minimal guidance, stating it's for '已 implemented 的 L1 Spec' (implemented L1 Specs), which implies a prerequisite context. However, it doesn't explain when to use this tool versus alternatives like 'clawhup_patch_decision' (for updates) or 'clawhup_get_decision' (for retrieval), nor does it mention any exclusions or specific scenarios for application.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_create_promptDInspect

创建 Prompt。

ParametersJSON Schema

Name	Required	Default
`tags`	No
`title`	Yes
`content`	No
`outputType`	No	text
`description`	No
`timeoutSeconds`	No

Tool Definition Quality

D1.3/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states '创建 Prompt。' (Create Prompt), which implies a write/mutation operation but provides no information about permissions required, side effects, rate limits, error conditions, or what happens upon successful creation. For a creation tool with 6 parameters and no annotation coverage, this is completely inadequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise ('创建 Prompt。') but this represents under-specification rather than effective brevity. A single sentence with minimal content fails to provide necessary information. While technically brief, it doesn't earn its place by adding value beyond what's already obvious from the tool name.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

This is a creation/mutation tool with 6 parameters, no annotations, no output schema, and 0% schema description coverage. The description provides only the most basic action statement without explaining what a 'Prompt' is, what parameters mean, what happens after creation, or how this differs from related tools. For this level of complexity with no structured documentation support, the description is completely inadequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description mentions no parameters at all. With 6 parameters in the input schema and 0% schema description coverage, the description fails to compensate for the complete lack of parameter documentation. Parameters like 'outputType' with an enum, 'timeoutSeconds' with constraints, and 'title' (required) are completely unexplained in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '创建 Prompt。' (Create Prompt) is a tautology that essentially restates the tool name 'clawhup_create_prompt' in Chinese. While it indicates the tool creates something called a 'Prompt', it doesn't specify what a 'Prompt' is in this context or what resources are involved. It doesn't distinguish this tool from sibling tools like 'clawhup_create_session' or 'clawhup_create_material_library' beyond the object type.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to prompts (e.g., 'clawhup_update_prompt', 'clawhup_delete_prompt', 'clawhup_execute_prompt', 'clawhup_list_my_prompts'), but the description doesn't indicate when creation is appropriate versus updating, executing, or listing prompts. No context, prerequisites, or exclusions are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_delete_promptDInspect

删除 Prompt。

ParametersJSON Schema

Name	Required	Description	Default
`promptId`	Yes

Tool Definition Quality

D1.3/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It only states the action '删除' (delete) without disclosing behavioral traits such as whether deletion is permanent, requires permissions, has side effects, or returns confirmation. This is inadequate for a destructive operation with zero annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise ('删除 Prompt。') but under-specified, not efficient. It consists of two words and a period, lacking necessary details for a deletion tool. This brevity results in insufficient information rather than effective structure.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's destructive nature, no annotations, 0% schema coverage, no output schema, and multiple sibling deletion tools, the description is completely inadequate. It fails to provide essential context about behavior, parameters, or usage, leaving critical gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no information about the single required parameter 'promptId'. It does not explain what a promptId is, how to obtain it, or its format (e.g., numeric ID). The description fails to compensate for the lack of schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '删除 Prompt。' is a tautology that restates the tool name 'clawhup_delete_prompt' in Chinese ('删除' means 'delete'), providing no additional clarity about what a 'Prompt' resource is or what deletion entails. It distinguishes from siblings like 'clawhup_create_prompt' by the verb, but lacks specificity about the resource scope or effects.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. There are sibling tools like 'clawhup_delete_session' and 'clawhup_delete_material_item', but the description offers no context on prerequisites, conditions, or distinctions between deletion operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_doc_add_relationCInspect

添加文档关联（基于/替代/实现/参考）。

ParametersJSON Schema

Name	Required	Description
`docId`	Yes	源文档 ID
`relType`	Yes	关联类型: based_on / supersedes / implements / references
`targetId`	Yes	目标文档 ID

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool adds a relation, implying a write/mutation operation, but doesn't specify permissions required, whether it's idempotent, what happens on duplicate relations, or error conditions. The description lacks details on side effects, rate limits, or response format, which are critical for a mutation tool with no structured safety hints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that front-loads the core action. It wastes no words, though it could be more structured (e.g., separating purpose from examples). The brevity is appropriate given the straightforward tool name, but it risks under-specification rather than true conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with 3 parameters, no annotations, and no output schema, the description is incomplete. It doesn't cover behavioral aspects like error handling, permissions, or what the tool returns. The schema handles parameter documentation well, but the description fails to compensate for the lack of annotations and output schema, leaving gaps in understanding how to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear parameter descriptions in Chinese (e.g., '源文档 ID' for docId). The description adds minimal value beyond the schema by listing relation types in parentheses, but doesn't explain their semantics or usage scenarios. Since the schema already documents parameters thoroughly, the baseline score of 3 is appropriate, as the description doesn't significantly enhance understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '添加文档关联（基于/替代/实现/参考）' states the action (add document relation) and lists relation types, providing a basic purpose. However, it's vague about what 'document relation' means operationally and doesn't distinguish this tool from potential siblings like clawhup_doc_update or clawhup_doc_create, which might also involve document relationships. The Chinese text adds specificity through the parenthetical but remains somewhat abstract.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., existing documents), exclusions, or compare it to sibling tools like clawhup_doc_update that might handle similar functions. Usage is implied through the action but without contextual boundaries, leaving the agent to guess based on the tool name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_doc_createCInspect

创建产研文档（PRD/设计/计划/决策/测试/发布/手册/复盘）。Spec 请用 clawhup_doc_create_spec。

ParametersJSON Schema

Name	Required	Description
`code`	No	文档编码（可选，空则自动生成）
`type`	Yes	文档类型: prd / design / adr / plan / testplan / release / runbook / postmortem / quick（spec 用 clawhup_doc_create_spec；quick 适用于 typo/微调场景，不支持 specLevel/parentDocId，Agent Task ≤5 步）
`title`	Yes	文档标题
`topic`	No	功能主题标签（如 user-auth）
`milestone`	No	迭代版本号（如 v1.0）
`projectId`	Yes	项目 ID
`description`	No	文档描述

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states '创建' (create) which implies a write/mutation operation, but doesn't mention permissions, side effects, error conditions, or what happens on success (e.g., returns a document ID). This leaves significant gaps in understanding how the tool behaves beyond the basic action.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise—just two short sentences that directly state the purpose and reference another tool. There's no fluff or redundancy, making it efficient. However, the second sentence about 'Spec' feels slightly tacked on without clear integration, preventing a perfect score.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a creation tool with 7 parameters, no annotations, and no output schema, the description is insufficient. It doesn't cover behavioral aspects like permissions, side effects, or return values, nor does it explain the relationship with sibling tools. Given the complexity and lack of structured data, more context is needed for the agent to use this tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all 7 parameters with clear descriptions. The description adds no additional parameter information beyond what's in the schema, such as format examples or constraints. This meets the baseline for high schema coverage but doesn't enhance understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('创建产研文档') and the resource type (PRD/design/plan/decision/test/release/manual/postmortem), making the purpose specific and understandable. However, it doesn't explicitly differentiate from sibling tools like clawhup_doc_create_spec or clawhup_doc_update, which would be needed for a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like clawhup_doc_create_spec or clawhup_doc_update. It mentions 'Spec 请用 clawhup_doc_create_spec' which hints at an alternative for spec creation, but doesn't explain the relationship or when to choose one over the other, leaving usage unclear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_doc_create_specCInspect

创建 Spec 规格文档（需指定编码和级别）。

ParametersJSON Schema

Name	Required	Description
`code`	Yes	Spec 编码（唯一）
`title`	Yes	Spec 标题
`topic`	No	功能主题标签
`milestone`	No	迭代版本号
`projectId`	Yes	项目 ID
`specLevel`	Yes	Spec 级别: L0/L1/L2/L3
`description`	No	Spec 描述
`parentDocId`	No	父 Spec ID（L2 必须指定 L1，L3 必须指定 L2）

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states this is a creation operation ('创建'), implying it's a write/mutation tool, but doesn't mention permissions, side effects, error conditions, or what happens on success/failure. For a mutation tool with 8 parameters and no annotation coverage, this is insufficient behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that conveys the core action and key requirements. There's no wasted language, though it could be slightly more informative given the tool's complexity. The structure is front-loaded with the main purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a creation tool with 8 parameters, no annotations, and no output schema, the description is inadequate. It doesn't explain what happens after creation (e.g., returns a document ID), doesn't mention validation rules beyond what's in the schema, and provides no error handling information. The agent would struggle to use this tool correctly without additional context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so all parameters are documented in the schema. The description adds minimal value by mentioning that code and level must be specified, which aligns with the required parameters in the schema. However, it doesn't provide additional context about parameter interactions (e.g., how parentDocId relates to specLevel) beyond what the schema already states.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('创建' - create) and resource ('Spec 规格文档'), making the purpose evident. However, it doesn't differentiate from sibling tools like 'clawhup_doc_create' or 'clawhup_spec_create', which appear to perform similar creation operations. The mention of '需指定编码和级别' (need to specify code and level) adds specificity but not enough to distinguish from alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'clawhup_doc_create' or 'clawhup_spec_create'. It mentions that code and level must be specified, which is a requirement but not usage context. There's no information about prerequisites, dependencies, or typical scenarios for invoking this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_doc_getBInspect

获取文档详情（含正文内容和关联文档）。

ParametersJSON Schema

Name	Required	Description	Default
`docId`	Yes	文档 ID
`includeContent`	No	true=返回正文 contentTemplate（默认 false,返回收窄视图）

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. The description mentions it retrieves document details including body content and related documents, which implies a read-only operation, but it doesn't explicitly state whether this requires authentication, has rate limits, or what happens on errors. For a tool with no annotation coverage, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded: a single sentence in Chinese that directly states the tool's purpose. There's no wasted verbiage or unnecessary elaboration, making it efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (single required parameter, no output schema, no annotations), the description is minimally adequate. It explains what the tool does but lacks details on behavioral aspects like error handling or return format. For a simple retrieval tool, this might suffice, but it doesn't provide a complete picture for optimal agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the single parameter 'docId' described as '文档 ID' (document ID). The description doesn't add any additional parameter information beyond what the schema provides, such as format examples or constraints. With high schema coverage, the baseline score of 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: '获取文档详情（含正文内容和关联文档）' translates to 'Get document details (including body content and related documents).' This specifies the verb (get/retrieve) and resource (document details) with additional scope information. However, it doesn't explicitly differentiate from sibling tools like 'clawhup_doc_list' or 'clawhup_doc_get' might be implied, but no direct comparison is made.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention when this tool is appropriate compared to other document-related tools like 'clawhup_doc_list' (for listing documents) or 'clawhup_doc_create' (for creating documents). There's no context about prerequisites or typical use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_doc_listCInspect

列出项目文档，支持按类型(prd/design/spec/adr/plan/testplan/release/runbook/postmortem)和功能主题筛选。默认隐藏 archived/deprecated 状态的 Spec，传 includeArchived=true 可查看全量。

ParametersJSON Schema

Name	Required	Description
`q`	No	关键词搜索，模糊匹配 title / code
`size`	No	每页条数，默认 50，最大 200
`type`	No	文档类型筛选: prd / design / spec / adr / plan / testplan / release / runbook / postmortem
`topic`	No	功能主题筛选（如 user-auth, model-gateway）
`cursor`	No	游标（首页不传）
`projectId`	Yes	项目 ID
`includeArchived`	No	true=包含 archived/deprecated spec，默认 false（仅返回活跃 spec）

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions filtering but doesn't describe pagination behavior (implied by 'cursor' parameter), rate limits, authentication requirements, or what the return format looks like. For a list operation with 6 parameters, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose and key filtering capabilities. There's no wasted verbiage, though it could be slightly more structured by separating purpose from filtering details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a list tool with 6 parameters and 100% schema coverage but no output schema, the description is minimally adequate. It covers what the tool does but lacks behavioral context (pagination, return format) and usage guidance. The absence of annotations and output schema means the description should do more to compensate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 6 parameters thoroughly. The description adds minimal value by mentioning type and topic filtering, but doesn't provide additional semantic context beyond what's in the schema. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb '列出' (list) and resource '项目文档' (project documents), making the purpose explicit. It also specifies filtering capabilities by type and topic, which adds useful detail. However, it doesn't differentiate from sibling tools like 'clawhup_doc_get' or 'clawhup_spec_list', which prevents a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'clawhup_doc_get' (for single document retrieval) or 'clawhup_spec_list' (for listing specs specifically). It mentions filtering capabilities but doesn't clarify usage context or prerequisites beyond implied project context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_doc_summarizeCInspect

为文档生成 AI 摘要（≤200字），结果存入 ai_summary 字段

ParametersJSON Schema

Name	Required	Description	Default
`docId`	Yes	文档 ID

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the summary is AI-generated and stored in a specific field, which is useful. However, it doesn't disclose critical traits: whether this is a read-only or mutating operation (likely mutating since it '存入' - stores to a field), potential side effects, authentication needs, rate limits, or error conditions. The description is insufficient for a tool that likely modifies document data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (one sentence) and front-loaded with the core action. Every word earns its place: it specifies the action (generate AI summary), target (document), output constraint (≤200 words), and storage location (ai_summary field). No wasted words or redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given this is likely a mutating tool (stores summary in document field) with no annotations and no output schema, the description is incomplete. It doesn't explain what happens on success (e.g., returns confirmation, updated document), failure modes, or important behavioral constraints. For a tool that modifies document data, more context about the operation's impact is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with one parameter (docId) clearly documented in the schema. The description adds no parameter-specific information beyond what the schema provides. According to scoring rules, when schema coverage is high (>80%), the baseline is 3 even with no param info in description, which applies here.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: '为文档生成 AI 摘要' (generate AI summary for a document) with specific output characteristics (≤200 words) and storage location (ai_summary field). It distinguishes from siblings like clawhup_doc_get (retrieves document) and clawhup_doc_update (modifies document) by focusing on summarization. However, it doesn't explicitly contrast with other AI generation tools like clawhup_generate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., document must exist), compare to other summarization methods, or indicate scenarios where summarization is appropriate versus retrieving full content with clawhup_doc_get. Usage is implied but not explicitly stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_doc_updateCInspect

更新文档的描述或正文内容。

ParametersJSON Schema

Name	Required	Description
`docId`	Yes	文档 ID
`status`	No	Spec 状态: confirmed / frozen / implemented
`decision`	No	审核决策: approved / changes_requested / rejected / comment
`aiSummary`	No	Spec 摘要（≤300字），write-time 由 Claude Code 生成后上传
`description`	No	文档描述
`changeSummary`	No	修改原因，随 contentTemplate 更新时传入，自动写入修改记录
`reviewComment`	No	审核意见原文，status 变更时自动落库
`contentTemplate`	No	文档正文内容（Markdown）

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. While '更新' (update) implies a mutation operation, the description doesn't specify whether this requires specific permissions, whether the update is partial or complete, what happens to existing content not mentioned in parameters, or what the response looks like. For a mutation tool with zero annotation coverage, this leaves significant behavioral questions unanswered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that gets straight to the point. There's no wasted language or unnecessary elaboration. However, it could be slightly more specific about what '文档' (document) refers to in this context, given the sibling tools include various document-related operations.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a mutation tool with 3 parameters, no annotations, and no output schema, the description is inadequate. It doesn't explain the update behavior (partial vs. complete), doesn't mention error conditions or permissions required, and provides no information about what the tool returns. The agent would need to guess about important behavioral aspects of this update operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all parameters clearly documented in the schema itself. The description mentions '描述或正文内容' (description or body content) which maps to the 'description' and 'contentTemplate' parameters, but adds no additional semantic context beyond what the schema already provides. The baseline score of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states '更新文档的描述或正文内容' which translates to 'Update document description or body content' - this clearly indicates the verb (update) and resource (document). However, it doesn't distinguish this from sibling tools like 'clawhup_doc_create' or 'clawhup_doc_get', nor does it specify what type of document is being updated. The purpose is understandable but lacks differentiation from related tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, when this should be used instead of creating a new document, or how it relates to other document operations like 'clawhup_doc_create' or 'clawhup_doc_get'. The agent must infer usage from the tool name and description alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_generate_streamDInspect

流式生成并返回聚合结果。

ParametersJSON Schema

Name	Required	Description	Default
`promptId`	Yes
`provider`	No
`modelName`	No
`sessionId`	No
`variables`	No
`materialLibraryIds`	No

Tool Definition Quality

D1.6/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure but offers minimal information. It mentions 'streaming' and 'aggregated results' but doesn't explain what streaming entails (real-time chunks? progressive output?), what 'aggregated' means (combined from multiple sources? summarized?), authentication requirements, rate limits, side effects, or error behavior. For a 6-parameter tool with complex inputs, this is inadequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise - a single Chinese sentence. While this is efficient and front-loaded, it's arguably too brief given the tool's complexity. However, within its limited scope, it wastes no words and gets straight to the point about streaming generation with aggregation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 6 parameters (including nested objects), no annotations, no output schema, and 0% schema description coverage, the description is completely inadequate. It doesn't explain what the tool actually does beyond the name, provides no parameter guidance, no behavioral context, and no differentiation from related tools. The agent would struggle to use this correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% (no parameter descriptions in schema), and the tool description provides absolutely no information about any of the 6 parameters (promptId, provider, modelName, sessionId, variables, materialLibraryIds). The description doesn't explain what these parameters mean, their relationships, or how they affect the streaming generation process. With 0% schema coverage, the description fails completely to compensate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '流式生成并返回聚合结果' (streaming generation and returns aggregated results) is vague and tautological - it essentially restates the name 'clawhup_generate_stream' without specifying what is being generated, what resources are involved, or how it differs from sibling tools like 'clawhup_generate' or 'clawhup_execute_prompt'. It lacks specific verb+resource clarity and sibling differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There are multiple related tools (clawhup_generate, clawhup_execute_prompt, clawhup_create_prompt) but no indication of when this streaming generation approach is appropriate versus batch generation or other methods. No context, prerequisites, or exclusions are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_get_prompt_versionsCInspect

查询 Prompt 版本列表。

ParametersJSON Schema

Name	Required	Description	Default
`promptId`	Yes

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It implies a read-only query operation ('查询'), but doesn't specify permissions, rate limits, pagination, or return format. For a tool with no annotation coverage, this leaves significant gaps in understanding its behavior and constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's purpose with zero waste. It's appropriately sized and front-loaded, making it easy to parse quickly without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a query operation with one parameter), lack of annotations, no output schema, and 0% schema coverage, the description is incomplete. It doesn't provide enough context for an agent to understand how to use it effectively, missing details on parameters, behavior, and output expectations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no parameter information beyond what the schema provides. With 0% schema description coverage and one required parameter ('promptId'), the description doesn't explain what 'promptId' is, its format, or how it relates to the query. This fails to compensate for the low schema coverage, leaving the parameter undocumented.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '查询 Prompt 版本列表' (Query Prompt version list) clearly states the action (query/list) and resource (Prompt versions), but it's vague about scope and doesn't differentiate from sibling tools like 'clawhup_get_prompt_version' (singular) or 'clawhup_prompt_detail'. It specifies the verb and resource but lacks precision on what 'list' entails compared to similar tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. With siblings like 'clawhup_get_prompt_version' (singular), 'clawhup_prompt_detail', and 'clawhup_list_my_prompts', the description offers no context on selection criteria, prerequisites, or exclusions, leaving usage ambiguous.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_list_decisionsBInspect

按 topic 列出 active 决策卡片，L1 创建前必须调用此工具。

ParametersJSON Schema

Name	Required	Description	Default
`topic`	Yes	按 topic 查询 active 决策卡片，L1 创建前必须调用此工具

Tool Definition Quality

B3.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions that the tool lists 'active' decision cards, implying a read-only operation, but doesn't clarify what 'active' means, whether there are pagination limits, authentication requirements, rate limits, or error conditions. For a tool with zero annotation coverage, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core functionality ('按 topic 列出 active 决策卡片') followed by the usage guideline. There's no wasted text, and it's appropriately sized for a simple listing tool. However, it could be slightly more structured by separating the purpose and guideline into distinct parts for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (simple listing with one parameter), 100% schema coverage, and no output schema, the description is minimally adequate. It covers the purpose and a key usage guideline but lacks behavioral details (e.g., what 'active' means, return format) that would be helpful for an AI agent. Without annotations or output schema, more context about the tool's behavior would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the parameter 'topic' documented as '按 topic 查询 active 决策卡片，L1 创建前必须调用此工具'. The description adds no additional parameter information beyond what's already in the schema. According to the rules, when schema_description_coverage is high (>80%), the baseline score is 3 even with no param info in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: '按 topic 列出 active 决策卡片' (list active decision cards by topic). It specifies the verb ('列出' - list) and resource ('决策卡片' - decision cards) with a filtering condition ('active' and 'by topic'). However, it doesn't explicitly differentiate from sibling tools like 'clawhup_get_decision' or 'clawhup_list_sessions', which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'L1 创建前必须调用此工具' (must call this tool before L1 creation). This indicates a prerequisite or mandatory step in a workflow. However, it doesn't specify when NOT to use it or mention alternatives among the many sibling tools, which could help avoid confusion in tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_list_my_promptsCInspect

列出当前用户的 Prompt。

ParametersJSON Schema

Name	Required	Description	Default
`outputType`	No

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states this is a list operation, implying it's likely read-only and non-destructive, but doesn't confirm this or describe other behaviors like pagination, sorting, or error handling. The description adds minimal value beyond the basic action.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's purpose without any unnecessary words. It's appropriately sized and front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a list operation with one parameter), lack of annotations, and no output schema, the description is incomplete. It doesn't explain what 'prompts' are in this context, how results are returned, or the purpose of the 'outputType' parameter, leaving significant gaps for an AI agent to understand and use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description mentions no parameters, while the input schema has one parameter ('outputType') with an enum but 0% schema description coverage. Since no parameters are described, the baseline is 4, but this is reduced to 3 because the schema provides some structure (enum values) that the description doesn't explain or contextualize.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('列出' meaning 'list') and resource ('当前用户的 Prompt' meaning 'current user's prompts'), providing a specific verb+resource combination. However, it doesn't distinguish this tool from sibling tools like 'clawhup_search_prompts' or 'clawhup_prompt_detail', which might offer similar functionality with different scopes or filters.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'clawhup_search_prompts' for filtered searches or 'clawhup_prompt_detail' for detailed views, nor does it specify any prerequisites or contexts for usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_project_listBInspect

列出所有项目。

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. '列出所有项目' implies a read-only operation that returns all projects, but it doesn't specify critical details like pagination, sorting, rate limits, authentication requirements, or error conditions. For a list operation with zero annotation coverage, this is inadequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence ('列出所有项目') that directly states the tool's purpose with zero wasted words. It's front-loaded and perfectly sized for a simple list tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (0 parameters, no output schema), the description is minimally complete but lacks depth. It doesn't address behavioral aspects (e.g., return format, pagination) or usage context, which are important even for simple tools. With no annotations and no output schema, more detail would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0 parameters with 100% coverage, so no parameter documentation is needed. The description doesn't add parameter details, which is appropriate. A baseline of 4 is given because the schema fully handles parameters, and the description doesn't need to compensate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '列出所有项目' (List all projects) clearly states the verb ('list') and resource ('projects'), making the purpose immediately understandable. It distinguishes from siblings like 'clawhup_doc_list' or 'clawhup_spec_list' by specifying 'projects' as the resource. However, it doesn't explicitly differentiate from potential project-related siblings (none are listed), so it's not a perfect 5.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., authentication), when it's appropriate (e.g., for browsing vs. searching), or what to use instead (e.g., if you need filtered results). With many sibling tools present, this lack of context is a significant gap.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_prompt_detailAInspect

获取 ClawHup Prompt 详情，包含所需变量（参数）说明。执行前先调用此工具了解参数。

ParametersJSON Schema

Name	Required	Description	Default
`promptId`	Yes	Prompt ID

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It states this is a retrieval operation ('获取' - get) which implies it's likely read-only, but doesn't explicitly confirm this or mention other behavioral traits like authentication requirements, rate limits, or error conditions. The description adds some context about its preparatory role, but lacks comprehensive behavioral disclosure for a tool with no annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two clear sentences that each serve distinct purposes: the first states what the tool does, the second provides usage guidance. There's no wasted language or redundancy. It could potentially be slightly more structured by separating the purpose and guidance more explicitly, but it's efficiently written.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter retrieval tool with no annotations and no output schema, the description provides adequate but minimal context. It explains the purpose and when to use it, but doesn't describe what the returned '详情' (details) include or the format of the response. Given the tool's apparent simplicity and the schema's good coverage, the description is complete enough for basic understanding but lacks output information that would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with one parameter (promptId) fully documented in the schema. The description mentions '包含所需变量（参数）说明' (including required variable/parameter explanations), which aligns with but doesn't add meaningful semantics beyond what the schema already provides. Since schema coverage is high, the baseline score of 3 is appropriate - the description doesn't compensate but doesn't need to.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: '获取 ClawHup Prompt 详情，包含所需变量（参数）说明' (Get ClawHup Prompt details, including required variable/parameter explanations). It specifies the verb ('获取' - get/retrieve) and resource ('ClawHup Prompt 详情' - ClawHup Prompt details), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'clawhup_get_prompt_version' or 'clawhup_get_prompt_versions', which appear to serve similar retrieval functions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: '执行前先调用此工具了解参数' (Call this tool before execution to understand parameters). This clearly indicates when to use this tool (as a prerequisite step before executing other operations that require parameter knowledge) and implies its preparatory nature. The guidance is specific and actionable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_save_prompt_versionDInspect

保存 Prompt 版本。

ParametersJSON Schema

Name	Required	Default
`tags`	No
`title`	No
`content`	Yes
`promptId`	Yes
`outputType`	No
`description`	No
`changeSummary`	No	MCP save version
`timeoutSeconds`	No

Tool Definition Quality

D1.3/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. The description '保存 Prompt 版本' (Save Prompt version) implies a write operation that creates or persists data, but it doesn't specify whether this is a destructive action, what permissions are required, what happens on success/failure, or any rate limits. For a mutation tool with zero annotation coverage, this lack of behavioral information is a significant gap that could lead to misuse.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise ('保存 Prompt 版本。') but this is under-specification rather than effective brevity. It's a single short sentence that fails to convey necessary information, making it inefficient rather than well-structured. While it's front-loaded (the only sentence), it doesn't earn its place by adding value beyond the tool name.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (8 parameters, 2 required), zero schema description coverage, no annotations, and no output schema, the description is completely inadequate. It doesn't explain what the tool does beyond the name, provides no parameter guidance, offers no behavioral context for a mutation operation, and gives no indication of return values or error conditions. This leaves the agent with insufficient information to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 8 parameters with 0% description coverage, meaning none of the parameters have descriptions in the schema. The tool description provides no information about any parameters—it doesn't mention 'promptId', 'content', 'tags', 'title', or other fields. With 0% schema coverage and no parameter guidance in the description, the agent has no semantic understanding of what these parameters mean or how to use them correctly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '保存 Prompt 版本' (Save Prompt version) is a tautology that essentially restates the tool name 'clawhup_save_prompt_version' in Chinese. It doesn't specify what 'saving a prompt version' entails operationally or what resource it acts upon beyond the obvious. While it indicates a verb ('保存' - save) and resource ('Prompt 版本' - prompt version), it lacks specificity about what constitutes a prompt version or how this differs from related tools like 'clawhup_update_prompt' or 'clawhup_create_prompt'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to prompts (e.g., 'clawhup_create_prompt', 'clawhup_update_prompt', 'clawhup_get_prompt_version', 'clawhup_rollback_prompt_version'), but the description doesn't clarify when saving a version is appropriate versus creating a new prompt or updating an existing one. No context, prerequisites, or exclusions are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_search_promptsBInspect

搜索 ClawHup 平台上的 AI Prompt。支持关键词搜索、按类型/标签筛选。

ParametersJSON Schema

Name	Required	Description
`tag`	No	标签筛选
`limit`	No	返回数量，默认 10
`query`	No	搜索关键词
`outputType`	No	输出类型筛选

Tool Definition Quality

B3.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions the tool '支持关键词搜索、按类型/标签筛选' (supports keyword search and filtering by type/tag), which gives some behavioral context about filtering capabilities. However, it lacks details on permissions, rate limits, pagination (beyond the 'limit' parameter), or what the return format looks like (no output schema). For a search tool with zero annotation coverage, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose ('搜索 ClawHup 平台上的 AI Prompt') and adds supporting details ('支持关键词搜索、按类型/标签筛选'). There is no wasted text, and it's appropriately sized for a search tool with clear parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (search with filtering), 100% schema coverage, no annotations, and no output schema, the description is adequate but has gaps. It covers the purpose and basic capabilities, but lacks behavioral context (e.g., permissions, rate limits) and output details, which are important for a search operation. It's minimally viable but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, with all parameters documented in the schema (query for keywords, tag for filtering, limit for result count, outputType for type filtering). The description adds minimal value by mentioning '关键词搜索、按类型/标签筛选' (keyword search and filtering by type/tag), which loosely maps to 'query', 'tag', and 'outputType', but doesn't provide additional semantics beyond what the schema already states. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('搜索' meaning 'search') and resource ('ClawHup 平台上的 AI Prompt'), specifying it searches for AI Prompts on the ClawHup platform. It distinguishes itself from siblings like 'clawhup_prompt_detail' (which gets details) and 'clawhup_list_my_prompts' (which lists without search), but doesn't explicitly name these alternatives, keeping it at 4 instead of 5.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by mentioning support for keyword search and filtering by type/tag, suggesting it's for finding prompts with specific criteria. However, it doesn't explicitly state when to use this tool versus alternatives like 'clawhup_list_my_prompts' or 'clawhup_prompt_detail', nor does it provide exclusions or prerequisites, making it only implied guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clawhup_update_prompt_contentDInspect

更新 Prompt 模板内容。

ParametersJSON Schema

Name	Required	Description	Default
`content`	Yes
`promptId`	Yes

Tool Definition Quality

D1.7/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden for behavioral disclosure. It only states '更新 Prompt 模板内容' (Update Prompt template content), which implies a mutation operation but provides no details about permissions required, whether changes are reversible, rate limits, error conditions, or what happens to existing content. This is inadequate for a mutation tool with zero annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at just 6 Chinese characters ('更新 Prompt 模板内容'), which translates to 'Update Prompt template content'. It's front-loaded with the core action and resource. While it's under-specified, every character serves a purpose without waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description is completely inadequate for a mutation tool with 2 required parameters, 0% schema description coverage, no annotations, and no output schema. It doesn't explain what the tool does beyond the obvious, provides no parameter guidance, no behavioral context, and no differentiation from similar tools. This leaves the agent with insufficient information to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning neither parameter (promptId, content) has descriptions in the schema. The description adds no information about what these parameters mean, their format, constraints, or examples. For a tool with 2 required parameters and no schema documentation, this represents a significant gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '更新 Prompt 模板内容' (Update Prompt template content) is a tautology that restates the tool name 'clawhup_update_prompt_content' in Chinese. It specifies the verb 'update' and resource 'Prompt template content', but doesn't differentiate from sibling tools like 'clawhup_update_prompt' or 'clawhup_prompt_detail'. The purpose is clear but lacks distinction from alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context, or exclusions. With sibling tools like 'clawhup_update_prompt' and 'clawhup_prompt_detail', there's no indication of how this tool differs or when it should be selected.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Resources

Need Help?