clawhup
Server Details
Description: ▎ Spec-driven AI development workflow tools for Claude Code. Manage PRD → Design → Impl → Agent Task lifecycles, search and execute AI Prompts, and ▎ track traceable software delivery — all from Cl
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 2.2/5 across 60 of 60 tools scored. Lowest: 1.3/5.
Most tools have distinct purposes, but there is notable overlap between some pairs, such as clawhup_generate and clawhup_execute_prompt, which both generate content from prompts, and clawhup_doc_create and clawhup_spec_create, which both create documents. The descriptions help differentiate them, but an agent might still misselect in ambiguous scenarios.
All tool names follow a consistent snake_case pattern with a clawhup_ prefix and a clear verb_noun structure, such as clawhup_create_session or clawhup_get_prompt_version. This uniformity makes the tool set predictable and easy to navigate.
With 60 tools, the count is excessive for a single server, making it overwhelming and difficult for an agent to manage. This many tools suggests poor scoping, as the server covers multiple domains like agent tasks, prompts, documents, and materials, which could be better split into separate, focused servers.
The tool set provides comprehensive coverage for its domains, including full CRUD operations for sessions, prompts, materials, and documents, along with specialized actions like generation and sharing. Minor gaps exist, such as no explicit update for shares or detailed agent task management beyond basic states, but agents can generally work around these.
Available Tools
60 toolsclawhup_add_material_itemDInspect
新增素材项。
| Name | Required | Description | Default |
|---|---|---|---|
| key | No | ||
| type | Yes | ||
| value | Yes | ||
| weight | No | ||
| enabled | No | ||
| libraryId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. '新增' (add/insert) implies a write operation, but the description fails to disclose any behavioral traits: it doesn't mention authentication requirements, whether this is idempotent, what happens on duplicate keys, what the return value looks like, or any error conditions. For a mutation tool with 6 parameters and no annotations, this complete lack of behavioral information is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is maximally concise - a single phrase with no wasted words. While this conciseness comes at the cost of being under-specified, from a pure structural perspective, every character serves the core purpose of stating the tool's action. There is no unnecessary elaboration or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (6 parameters, 3 required, no output schema, no annotations, and 0% schema description coverage), this description is completely inadequate. It fails to explain what the tool does beyond the name, provides no parameter guidance, no behavioral context, and no usage guidelines. For a creation/mutation tool with multiple parameters, this minimal description leaves the agent unable to use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 0%, meaning none of the 6 parameters have descriptions in the schema. The tool description provides absolutely no information about any parameters - it doesn't explain what 'libraryId', 'type', 'value', 'key', 'weight', or 'enabled' mean, their expected formats, or how they interact. With zero parameter guidance in either schema or description, the agent cannot understand what values to provide.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '新增素材项' (Add material item) is a tautology that essentially restates the tool name 'clawhup_add_material_item'. It provides no additional specificity about what 'material item' means, what resource it operates on, or how it differs from sibling tools like 'clawhup_create_material_library' or 'clawhup_patch_material_item'. The description is too vague to be helpful beyond the name itself.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides absolutely no guidance on when to use this tool versus alternatives. There is no mention of prerequisites, appropriate contexts, or distinctions from related tools like 'clawhup_patch_material_item' (for updates) or 'clawhup_list_material_items' (for reading). The agent receives no help in determining when this specific creation tool should be selected.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_agent_step_reportCInspect
上报 Agent 步骤执行结果。
| Name | Required | Description | Default |
|---|---|---|---|
| status | Yes | 步骤状态: succeeded / failed / skipped | |
| stepNo | Yes | 步骤编号 | |
| taskId | Yes | 任务 ID | |
| stepType | Yes | 步骤类型: llm_call / mcp_tool / human_gate | |
| toolName | No | 工具名称(stepType=mcp_tool 时) | |
| errorCode | No | 错误代码 | |
| latencyMs | No | 耗时(毫秒) | |
| outputJson | No | 步骤输出 JSON | |
| errorMessage | No | 错误信息 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden of behavioral disclosure. It states 'reporting' but doesn't clarify if this is a write operation, what system it updates, whether it's idempotent, or what happens on failure. This is inadequate for a tool with 9 parameters and no output schema, as critical behavioral traits are missing.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the tool's purpose without redundancy. It's appropriately sized and front-loaded, with no wasted words, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (9 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain the reporting mechanism, expected outcomes, error handling, or how it integrates with the agent workflow. This leaves significant gaps for the agent to understand the tool's role and behavior in context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with each parameter well-documented in the schema (e.g., 'status' with allowed values, 'stepType' with types). The description adds no additional parameter semantics beyond what the schema provides, so it meets the baseline of 3 for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '上报 Agent 步骤执行结果' (Report Agent step execution results) states the verb ('report') and resource ('Agent step execution results'), but it's vague about what 'reporting' entails—whether it's logging, updating a database, or sending notifications. It doesn't differentiate from siblings like 'clawhup_agent_task_complete' or 'clawhup_agent_task_fail', which might handle task-level outcomes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., after a step execution), exclusions, or how it relates to sibling tools for task-level reporting, leaving the agent to infer usage from context alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_agent_task_completeCInspect
标记 Agent 任务完成。
| Name | Required | Description | Default |
|---|---|---|---|
| taskId | Yes | 任务 ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states 'mark complete' which implies a mutation (status update), but doesn't disclose behavioral traits such as permissions required, whether it's idempotent, side effects (e.g., notifications), or error handling. This is a significant gap for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence '标记 Agent 任务完成' (Mark Agent task complete), which is appropriately concise and front-loaded. It wastes no words, though it could be more informative without sacrificing brevity. Every word earns its place, but it's slightly under-specified.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given complexity (mutation tool with no annotations and no output schema), the description is incomplete. It doesn't explain what 'marking complete' does behaviorally, return values, or error conditions. For a tool that likely changes task state, more context is needed to understand its full impact and usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'taskId' documented as '任务 ID' (task ID). The description adds no meaning beyond the schema—it doesn't explain format constraints, valid ranges, or how the ID relates to task states. Baseline 3 is appropriate since the schema does the heavy lifting, but no extra value is added.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '标记 Agent 任务完成' (Mark Agent task complete) states a clear verb ('mark complete') and resource ('Agent task'), but it's vague about what 'marking complete' entails—does it update status, trigger notifications, or finalize something? It distinguishes from siblings like 'clawhup_agent_task_fail' (fail) and 'clawhup_agent_task_start' (start) by indicating completion, but lacks specificity on the action's scope or effects.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., task must be started), exclusions (e.g., don't use if already completed), or related tools like 'clawhup_agent_task_fail' for error cases. The description implies usage after task execution but offers no explicit context or rules.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_agent_task_createCInspect
创建 Agent 任务。
| Name | Required | Description | Default |
|---|---|---|---|
| goal | No | 任务目标详细描述 | |
| title | Yes | 任务标题 | |
| specId | No | 绑定的规格编号 | |
| pipeline | No | 链路模板: feature_dev / release_doc / bug_fix | |
| planJson | No | 执行计划 JSON | |
| projectId | Yes | 项目 ID | |
| autoConfirm | No | true=human_gate 自动通过,Spec 已确认时使用 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. '创建' (Create) implies a write/mutation operation, but the description doesn't specify permissions required, whether the task is immediately active, what happens on failure, or the expected response format. For a creation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise ('创建 Agent 任务。') but this brevity comes at the cost of being under-specified rather than efficient. While it's front-loaded (the entire description is in one phrase), it fails to provide necessary context that would help an agent understand and use the tool effectively.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a task creation tool with 7 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain what an 'Agent task' is in this context, how it relates to other agent operations, what the creation process entails, or what the agent should expect as a result. The minimal description leaves too many open questions for effective tool selection and invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the input schema fully documents all 7 parameters with clear descriptions. The tool description adds no additional parameter information beyond what's already in the schema. According to guidelines, when schema coverage is high (>80%), the baseline score is 3 even with no param info in the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '创建 Agent 任务' (Create Agent task) is a tautology that merely restates the tool name in Chinese. It lacks specificity about what an 'Agent task' entails, what resources it creates, or how it differs from sibling tools like clawhup_agent_task_start or clawhup_agent_task_complete. The purpose is vague and provides no meaningful distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to agent tasks (e.g., clawhup_agent_task_start, clawhup_agent_task_complete, clawhup_agent_task_get), but the description offers no context about prerequisites, sequencing, or appropriate scenarios for creating a task versus other operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_agent_task_failCInspect
标记 Agent 任务失败。
| Name | Required | Description | Default |
|---|---|---|---|
| taskId | Yes | 任务 ID | |
| errorCode | No | 错误代码 | |
| errorMessage | No | 错误信息 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. '标记 Agent 任务失败' suggests a mutation operation that changes task state to failed, but doesn't describe side effects (e.g., whether this triggers notifications, affects related tasks, or is reversible), authentication needs, rate limits, or error handling. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the tool's purpose with no wasted words. It's appropriately sized and front-loaded, making it easy to understand at a glance without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation operation with 3 parameters, no annotations, and no output schema), the description is incomplete. It doesn't explain what happens after marking a task as failed, potential return values, error conditions, or how it integrates with the broader task management system. For a tool that likely changes system state, more context is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for all three parameters (taskId, errorCode, errorMessage). The description doesn't add any parameter-specific information beyond what the schema provides, such as format examples or usage context. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '标记 Agent 任务失败' (Mark Agent task as failed) states a clear verb ('标记' - mark) and resource ('Agent 任务' - Agent task), but it's somewhat vague about what 'marking as failed' entails operationally. It distinguishes from siblings like 'clawhup_agent_task_complete' by indicating failure vs. completion, but lacks specificity about the failure mechanism or system impact.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives is provided. The description implies it's for marking tasks as failed, but doesn't specify prerequisites (e.g., task must be in progress), conditions for failure, or how it differs from simply not completing a task. Sibling tools like 'clawhup_agent_task_complete' suggest a workflow, but no context is given for choosing between them.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_agent_task_getCInspect
获取 Agent 任务详情。
| Name | Required | Description | Default |
|---|---|---|---|
| taskId | Yes | 任务 ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It implies a read operation ('获取' - get) but doesn't disclose behavioral traits like whether it's safe (non-destructive), requires authentication, has rate limits, or returns structured data. For a tool with no annotations, this leaves critical gaps in understanding its behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese ('获取 Agent 任务详情'), which is appropriately sized and front-loaded. There's no wasted text, making it concise, though it could benefit from slight elaboration for clarity without losing brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a read operation with one parameter) and lack of annotations and output schema, the description is incomplete. It doesn't explain what 'details' include, potential errors, or return format, leaving the agent with insufficient context to use it effectively beyond basic parameter input.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with 'taskId' documented as '任务 ID' (task ID). The description adds no meaning beyond this, as it doesn't explain parameter usage, format, or constraints. With high schema coverage, the baseline is 3, but the description doesn't compensate with additional insights.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '获取 Agent 任务详情' (Get Agent task details) states a clear verb ('获取' - get) and resource ('Agent 任务详情' - Agent task details), but it's vague about what 'details' include and doesn't distinguish from sibling tools like 'clawhup_agent_task_list' (which likely lists tasks) or 'clawhup_agent_task_complete' (which modifies tasks). It's adequate but lacks specificity about scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a task ID), differentiate from 'clawhup_agent_task_list' for listing vs. getting details, or specify use cases like retrieving status or metadata. Without such context, an agent might misuse it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_agent_task_listCInspect
列出 Agent 任务。
| Name | Required | Description | Default |
|---|---|---|---|
| q | No | 关键词搜索 | |
| size | No | 每页条数,默认 50,最大 200 | |
| cursor | No | 游标(首页不传) | |
| projectId | No | 项目 ID(不传则查全部) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states '列出 Agent 任务' (List Agent tasks), which implies a read-only operation but doesn't disclose any behavioral traits like pagination behavior (though the schema hints at cursor-based pagination), authentication requirements, rate limits, or what constitutes an 'Agent task' in this context. For a tool with no annotations, this minimal description is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at just three Chinese characters ('列出 Agent 任务'), which is appropriately brief for a simple listing tool. It's front-loaded with the core action ('列出' - list) and avoids unnecessary elaboration. However, it's arguably too minimal given the lack of sibling differentiation and behavioral context, slightly reducing its effectiveness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (4 parameters, no output schema, no annotations, and multiple sibling tools), the description is incomplete. It doesn't explain what 'Agent tasks' are, how they differ from other task types in the system, what the output format looks like, or when to use this versus other agent task tools. While the schema covers parameters well, the description fails to provide necessary context for proper tool selection and understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with all four parameters ('q', 'size', 'cursor', 'projectId') clearly documented in the schema itself. The description adds no additional parameter information beyond what's already in the schema. According to the scoring rules, when schema_description_coverage is high (>80%), the baseline is 3 even with no param info in the description, which applies here.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '列出 Agent 任务' (List Agent tasks) is a tautology that essentially restates the tool name 'clawhup_agent_task_list' without adding meaningful differentiation. While it clearly indicates a listing operation, it doesn't specify what 'Agent tasks' are or how this differs from sibling tools like 'clawhup_agent_task_get' (which presumably retrieves a single task). The purpose is vague beyond the basic verb-noun pairing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to agent tasks (e.g., 'clawhup_agent_task_get' for single task retrieval, 'clawhup_agent_task_create' for creation), but the description offers no context about when this list tool is appropriate versus those other options. This leaves the agent without any usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_agent_task_startCInspect
启动 Agent 任务。
| Name | Required | Description | Default |
|---|---|---|---|
| taskId | Yes | 任务 ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full responsibility for behavioral disclosure but offers none. '启动' (start) implies a state-changing operation, but the description doesn't reveal whether this initiates execution, changes status, requires specific permissions, has side effects, or what happens to the task afterward. For a mutation tool with zero annotation coverage, this represents a critical information gap about its behavior and consequences.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise - a single four-character phrase in Chinese. While this represents efficient communication, it's arguably too brief given the tool's likely complexity. However, every character serves a purpose (verb + object), and there's no redundant information. The structure is front-loaded but lacks any elaboration that might be necessary for clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a mutation tool (implied by 'start') with no annotations and no output schema, the description is inadequate. It doesn't explain what 'starting' means operationally, what state changes occur, what permissions are required, or what the expected outcome is. The single parameter is well-documented in the schema, but the core behavioral context is missing for what appears to be a state transition operation in a workflow system.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100% with the single parameter 'taskId' documented as '任务 ID' (task ID). The description doesn't add any parameter information beyond what's already in the schema. Since schema coverage is high, the baseline score of 3 is appropriate - the schema does the parameter documentation work, and the description neither compensates nor detracts.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '启动 Agent 任务' (Start Agent task) is a tautology that essentially restates the tool name 'clawhup_agent_task_start' without adding meaningful clarification. It doesn't specify what 'starting' entails operationally or what resource is being acted upon beyond the generic 'Agent task' mentioned in the name. While it identifies the basic action, it fails to distinguish this tool from sibling tools like 'clawhup_agent_task_create' or 'clawhup_agent_task_complete'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple related agent task tools (create, get, list, complete, fail, wait), but the description doesn't indicate whether this should be used after creation, instead of creation, or in what workflow sequence. No prerequisites, timing considerations, or exclusion criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_agent_task_waitCInspect
Agent 任务进入等待人工确认状态。
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | 等待原因 | |
| taskId | Yes | 任务 ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states the outcome (task enters waiting state) but doesn't describe what this means operationally: Does it pause task execution? Send notifications? Require specific permissions? How long can it wait? What happens to associated resources? For a state-changing tool with zero annotation coverage, this is insufficient behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient Chinese sentence that directly states the tool's purpose. It's appropriately sized for what it conveys, though it could be more front-loaded with additional context. No wasted words or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a state-changing tool with no annotations and no output schema, the description is incomplete. It doesn't explain what 'waiting for manual confirmation' entails operationally, what permissions are required, what the expected response looks like, or how this differs from other task state transitions. Given the complexity of task management and lack of structured metadata, more context is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters (taskId and reason). The description doesn't add any parameter semantics beyond what's in the schema - it doesn't explain what constitutes a valid taskId, provide examples of waiting reasons, or clarify parameter interactions. Baseline 3 is appropriate when schema does the documentation work.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Agent 任务进入等待人工确认状态' (Agent task enters waiting for manual confirmation state) states what the tool does - it transitions an agent task to a waiting state requiring human confirmation. However, it's somewhat vague about the exact action (is it a state change command? a notification?) and doesn't clearly distinguish from sibling tools like clawhup_agent_task_complete or clawhup_agent_task_fail which also manage task states.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites (e.g., task must be in a specific state), when this is appropriate versus completing/failing a task, or what triggers the need for human confirmation. The agent must infer usage from the name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_append_session_messageDInspect
写入会话消息。
| Name | Required | Description | Default |
|---|---|---|---|
| role | No | user | |
| content | Yes | ||
| modelName | No | ||
| sessionId | Yes | ||
| modelProvider | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. '写入' (write) implies a mutation operation, but the description doesn't disclose whether this requires specific permissions, what happens if the session doesn't exist, whether the operation is idempotent, or what the response looks like. For a mutation tool with zero annotation coverage, this is a significant gap in behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (three Chinese characters), which could be efficient if it were informative. However, this brevity comes at the cost of under-specification—it doesn't provide enough information to be truly helpful. While front-loaded, it lacks the necessary detail to earn a higher score for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
This is a mutation tool with 5 parameters, 0% schema description coverage, no annotations, and no output schema. The description provides only a basic verb+resource statement without explaining behavior, parameters, usage context, or return values. For a tool of this complexity with no structured documentation support, the description is completely inadequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning none of the 5 parameters have descriptions in the schema. The tool description provides no information about any parameters—not explaining what 'sessionId', 'content', 'role', 'modelName', or 'modelProvider' represent or how they should be used. With 0% coverage and no parameter information in the description, this fails to compensate for the documentation gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '写入会话消息' (write session message) is a tautology that essentially restates the tool name 'clawhup_append_session_message' in Chinese. It specifies the verb 'write' and resource 'session message' but lacks specificity about what 'append' means operationally and doesn't distinguish this tool from sibling tools like 'clawhup_get_session_messages' or 'clawhup_create_session'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to sessions (e.g., 'clawhup_get_session_messages', 'clawhup_create_session', 'clawhup_patch_session'), but the description offers no context about when this append operation is appropriate versus creating a new session or retrieving existing messages.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_create_benchmark_runDInspect
创建评测任务。
| Name | Required | Description | Default |
|---|---|---|---|
| models | Yes | ||
| promptId | Yes | ||
| sessionId | No | ||
| variables | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states '创建评测任务' (Create benchmark task), which implies a write operation but offers no details on permissions, side effects, rate limits, or what the creation entails. This is insufficient for a tool with 4 parameters and no output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single phrase '创建评测任务' (Create benchmark task), which is concise and front-loaded. However, it is overly brief to the point of under-specification, as it lacks necessary details for effective tool use, making it inefficient rather than appropriately sized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (4 parameters with nested objects, 0% schema coverage, no annotations, no output schema), the description is completely inadequate. It doesn't explain what the tool does beyond the name, provide parameter meanings, describe behavior, or guide usage, failing to meet minimal contextual needs for a creation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 0%, meaning none of the 4 parameters (promptId, models, sessionId, variables) are documented in the schema. The description adds no information about these parameters, failing to compensate for the lack of schema documentation. This leaves the agent with no understanding of what inputs are needed or their purposes.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '创建评测任务' (Create benchmark task) restates the tool name 'clawhup_create_benchmark_run' without providing specific details about what the benchmark run entails. It doesn't distinguish this tool from sibling tools like 'clawhup_get_benchmark_result' or 'clawhup_patch_benchmark_item', nor does it clarify what resources are involved beyond the generic term 'benchmark task'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no mentions of prerequisites, context, or exclusions, and it doesn't reference sibling tools like 'clawhup_get_benchmark_result' for retrieving results or 'clawhup_patch_benchmark_item' for updates, leaving the agent with no usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_create_material_libraryDInspect
创建素材库。
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | ||
| isDefault | No | ||
| description | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. The description only states '创建素材库' (Create material library), which implies a write/mutation operation but doesn't disclose any behavioral traits such as permissions required, whether it's idempotent, what happens on failure, rate limits, or what the output looks like. It lacks essential context for a creation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single phrase '创建素材库', which is overly concise to the point of under-specification. While it has no wasted words, it lacks necessary detail and structure—it doesn't front-load key information or provide any explanatory context, making it ineffective despite its brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a creation operation with 3 parameters), no annotations, 0% schema description coverage, and no output schema, the description is completely inadequate. It doesn't cover purpose, usage, behavior, parameters, or output, leaving the agent with insufficient information to invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description provides no information about parameters. With 3 parameters (name, isDefault, description), 0% schema description coverage, and no output schema, the description fails to add any meaning beyond the schema. It doesn't explain what 'name', 'isDefault', or 'description' represent, their constraints, or how they affect the creation process.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '创建素材库' (Create material library) restates the tool name 'clawhup_create_material_library' in Chinese, making it a tautology. It doesn't specify what 'material library' means in this context or what resources it creates, nor does it differentiate from sibling tools like 'clawhup_list_material_libraries' or 'clawhup_delete_material_library'. The purpose is vague beyond the basic verb+noun translation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
There is no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., permissions), when not to use it, or how it relates to sibling tools like 'clawhup_list_material_libraries' for viewing or 'clawhup_patch_material_library' for updates. The description provides zero contextual usage information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_create_promptDInspect
创建 Prompt。
| Name | Required | Description | Default |
|---|---|---|---|
| tags | No | ||
| title | Yes | ||
| content | No | ||
| outputType | No | text | |
| description | No | ||
| timeoutSeconds | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states '创建 Prompt。' (Create Prompt), which implies a write/mutation operation but provides no information about permissions required, side effects, rate limits, error conditions, or what happens upon successful creation. For a creation tool with 6 parameters and no annotation coverage, this is completely inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise ('创建 Prompt。') but this represents under-specification rather than effective brevity. A single sentence with minimal content fails to provide necessary information. While technically brief, it doesn't earn its place by adding value beyond what's already obvious from the tool name.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
This is a creation/mutation tool with 6 parameters, no annotations, no output schema, and 0% schema description coverage. The description provides only the most basic action statement without explaining what a 'Prompt' is, what parameters mean, what happens after creation, or how this differs from related tools. For this level of complexity with no structured documentation support, the description is completely inadequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description mentions no parameters at all. With 6 parameters in the input schema and 0% schema description coverage, the description fails to compensate for the complete lack of parameter documentation. Parameters like 'outputType' with an enum, 'timeoutSeconds' with constraints, and 'title' (required) are completely unexplained in the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '创建 Prompt。' (Create Prompt) is a tautology that essentially restates the tool name 'clawhup_create_prompt' in Chinese. While it indicates the tool creates something called a 'Prompt', it doesn't specify what a 'Prompt' is in this context or what resources are involved. It doesn't distinguish this tool from sibling tools like 'clawhup_create_session' or 'clawhup_create_material_library' beyond the object type.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to prompts (e.g., 'clawhup_update_prompt', 'clawhup_delete_prompt', 'clawhup_execute_prompt', 'clawhup_list_my_prompts'), but the description doesn't indicate when creation is appropriate versus updating, executing, or listing prompts. No context, prerequisites, or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_create_sessionDInspect
创建会话。
| Name | Required | Description | Default |
|---|---|---|---|
| title | Yes | ||
| promptId | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description '创建会话' (Create session) implies a write/mutation operation but provides no details about permissions required, whether the session is persistent, what happens on failure, or any rate limits. For a mutation tool with zero annotation coverage, this lack of behavioral context is a critical gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While the description is extremely brief ('创建会话'), this is a case of under-specification rather than effective conciseness. The single phrase does not earn its place by adding value; it merely repeats the tool name. A truly concise description would still provide essential context in minimal words, which this does not achieve.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation operation with 2 parameters), lack of annotations, 0% schema description coverage, and no output schema, the description is completely inadequate. It fails to explain what the tool does beyond its name, provides no parameter guidance, no behavioral context, and no usage instructions. This leaves the agent with insufficient information to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning neither parameter ('title' and 'promptId') is documented in the schema. The description adds no information about these parameters—it doesn't explain what 'title' represents, what format it should have, what 'promptId' refers to, or whether it's optional. With 2 parameters completely undocumented, the description fails to compensate for the schema's deficiencies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '创建会话' (Create session) is a tautology that merely restates the tool name 'clawhup_create_session'. It provides no additional specificity about what kind of session is being created, what resources are involved, or how it differs from sibling tools like 'clawhup_get_session' or 'clawhup_list_sessions'. The description fails to distinguish this tool's purpose from related operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description offers no guidance on when to use this tool versus alternatives. There is no mention of prerequisites, appropriate contexts, or comparisons to sibling tools such as 'clawhup_get_session' for retrieval or 'clawhup_list_sessions' for listing. The agent receives no help in determining when this creation tool is the correct choice.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_delete_generationDInspect
删除生成记录。
| Name | Required | Description | Default |
|---|---|---|---|
| generationId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. '删除生成记录' only states the action without any information about permissions required, whether deletion is permanent or reversible, what happens to related data, error conditions, or response format. For a destructive operation with zero annotation coverage, this is dangerously inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is maximally concise - a single Chinese phrase that directly states the tool's action. There's no wasted language or unnecessary elaboration. While this conciseness comes at the cost of completeness, the structure is perfectly efficient for what it does convey.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
This is a destructive deletion tool with no annotations, no output schema, and a single undocumented parameter. The description provides only the most basic action statement without any context about what 'generation record' means in this system, what the consequences of deletion are, or how to properly use the tool. For a mutation operation, this level of documentation is completely inadequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning the single required parameter 'generationId' has no documentation in the schema. The description provides absolutely no information about this parameter - not what it represents, where to find it, format requirements, or constraints. The description fails to compensate for the complete lack of schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '删除生成记录' (delete generation record) is a tautology that essentially restates the tool name 'clawhup_delete_generation' in Chinese. It specifies the verb 'delete' and resource 'generation record', but doesn't distinguish this tool from sibling deletion tools like 'clawhup_delete_material_item' or 'clawhup_delete_session'. The purpose is clear at a basic level but lacks differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides absolutely no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, appropriate contexts, or when other tools might be more suitable. With multiple deletion tools in the sibling list, this is a significant gap that leaves the agent guessing about proper application.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_delete_material_itemDInspect
删除素材项。
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It only states the action 'delete' without any details on permissions required, whether deletion is permanent or reversible, side effects, error conditions, or response format. For a destructive operation with zero annotation coverage, this is critically inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single Chinese phrase '删除素材项' (Delete material item), which is extremely concise with zero wasted words. It's front-loaded with the core action, though this brevity comes at the cost of completeness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a destructive delete operation), lack of annotations, no output schema, and 0% parameter schema coverage, the description is severely incomplete. It doesn't address behavioral aspects, parameter meaning, usage context, or return values, making it inadequate for safe and effective tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 1 parameter (itemId) with 0% description coverage, and the tool description provides no information about parameters. It doesn't explain what 'itemId' represents, its format, how to obtain it, or validation rules. With low schema coverage, the description fails to compensate, leaving the parameter undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '删除素材项' (Delete material item) restates the tool name 'clawhup_delete_material_item' in Chinese, making it a tautology. It specifies the verb 'delete' and resource 'material item', but doesn't distinguish from sibling tools like 'clawhup_delete_material_library' or 'clawhup_delete_prompt' beyond the resource name. This is minimal differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. There are multiple deletion tools in the sibling list (e.g., delete_material_library, delete_prompt, delete_session), but the description offers no context on prerequisites, when deletion is appropriate, or what happens after deletion. This leaves the agent with no usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_delete_material_libraryDInspect
删除素材库。
| Name | Required | Description | Default |
|---|---|---|---|
| libraryId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. '删除素材库' only states the action without any information about permissions required, whether deletion is permanent or reversible, what happens to contained items, rate limits, or expected response format. For a destructive operation with zero annotation coverage, this is completely inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—a single Chinese phrase—with zero wasted words. It's front-loaded with the core action, though this brevity comes at the cost of completeness. For conciseness alone, it's optimal.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a deletion operation, lack of annotations, 0% schema coverage, no output schema, and multiple sibling deletion tools, the description is severely incomplete. It doesn't address behavioral risks, parameter meaning, usage context, or expected outcomes. This is inadequate for safe and effective tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning the single required parameter 'libraryId' has no documentation in the schema. The description provides no information about parameters whatsoever—it doesn't mention 'libraryId', what it represents, format expectations, or how to obtain valid values. With low schema coverage and no parameter details in the description, this fails to compensate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '删除素材库' (delete material library) is a tautology that restates the tool name 'clawhup_delete_material_library' in Chinese. While it identifies the verb ('delete') and resource ('material library'), it doesn't provide any additional specificity about what deletion entails or how it differs from sibling tools like 'clawhup_delete_material_item' or 'clawhup_delete_session'. The purpose is minimally stated but lacks differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple deletion tools in the sibling list (e.g., 'clawhup_delete_material_item', 'clawhup_delete_prompt'), but the description doesn't indicate when material library deletion is appropriate versus deleting individual items or other resources. No context, prerequisites, or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_delete_promptDInspect
删除 Prompt。
| Name | Required | Description | Default |
|---|---|---|---|
| promptId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It only states the action '删除' (delete) without disclosing behavioral traits such as whether deletion is permanent, requires permissions, has side effects, or returns confirmation. This is inadequate for a destructive operation with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise ('删除 Prompt。') but under-specified, not efficient. It consists of two words and a period, lacking necessary details for a deletion tool. This brevity results in insufficient information rather than effective structure.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's destructive nature, no annotations, 0% schema coverage, no output schema, and multiple sibling deletion tools, the description is completely inadequate. It fails to provide essential context about behavior, parameters, or usage, leaving critical gaps for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description adds no information about the single required parameter 'promptId'. It does not explain what a promptId is, how to obtain it, or its format (e.g., numeric ID). The description fails to compensate for the lack of schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '删除 Prompt。' is a tautology that restates the tool name 'clawhup_delete_prompt' in Chinese ('删除' means 'delete'), providing no additional clarity about what a 'Prompt' resource is or what deletion entails. It distinguishes from siblings like 'clawhup_create_prompt' by the verb, but lacks specificity about the resource scope or effects.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. There are sibling tools like 'clawhup_delete_session' and 'clawhup_delete_material_item', but the description offers no context on prerequisites, conditions, or distinctions between deletion operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_delete_sessionDInspect
删除会话。
| Name | Required | Description | Default |
|---|---|---|---|
| sessionId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. '删除会话' only indicates a deletion action without specifying whether this is permanent or reversible, what permissions are required, whether it affects associated data, or what happens on success/failure. For a destructive operation with zero annotation coverage, this is critically insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While technically concise with just three characters, this is under-specification rather than effective conciseness. The description fails to provide necessary information that would help an AI agent understand and use the tool correctly. Every sentence should earn its place, but here the single phrase doesn't earn its place by adding meaningful value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive deletion tool with no annotations, no output schema, and a required parameter with 0% schema coverage, the description is completely inadequate. It doesn't explain what a session is, what deletion entails, what the parameter means, or what to expect as a result. Given the complexity implied by the sibling tools (which include session management operations), this description leaves critical gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning the single required parameter 'sessionId' has no documentation in the schema. The description provides no information about this parameter - not what a sessionId is, how to obtain one, what format it expects (the schema shows type: number but no context), or valid ranges. With low coverage and no compensation in the description, this scores poorly.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '删除会话' (delete session) is a tautology that merely restates the tool name 'clawhup_delete_session' in Chinese. It doesn't specify what resource is being deleted (e.g., a chat session, workflow session, or data session) or provide any distinguishing context from sibling tools like 'clawhup_delete_generation' or 'clawhup_delete_prompt'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple deletion tools in the sibling list (delete_generation, delete_material_item, delete_material_library, delete_prompt, delete_share), but no indication of what makes a 'session' different or when deletion is appropriate versus other operations like 'clawhup_patch_session'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_doc_add_relationCInspect
添加文档关联(基于/替代/实现/参考)。
| Name | Required | Description | Default |
|---|---|---|---|
| docId | Yes | 源文档 ID | |
| relType | Yes | 关联类型: based_on / supersedes / implements / references | |
| targetId | Yes | 目标文档 ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool adds a relation, implying a write/mutation operation, but doesn't specify permissions required, whether it's idempotent, what happens on duplicate relations, or error conditions. The description lacks details on side effects, rate limits, or response format, which are critical for a mutation tool with no structured safety hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that front-loads the core action. It wastes no words, though it could be more structured (e.g., separating purpose from examples). The brevity is appropriate given the straightforward tool name, but it risks under-specification rather than true conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 3 parameters, no annotations, and no output schema, the description is incomplete. It doesn't cover behavioral aspects like error handling, permissions, or what the tool returns. The schema handles parameter documentation well, but the description fails to compensate for the lack of annotations and output schema, leaving gaps in understanding how to use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear parameter descriptions in Chinese (e.g., '源文档 ID' for docId). The description adds minimal value beyond the schema by listing relation types in parentheses, but doesn't explain their semantics or usage scenarios. Since the schema already documents parameters thoroughly, the baseline score of 3 is appropriate, as the description doesn't significantly enhance understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '添加文档关联(基于/替代/实现/参考)' states the action (add document relation) and lists relation types, providing a basic purpose. However, it's vague about what 'document relation' means operationally and doesn't distinguish this tool from potential siblings like clawhup_doc_update or clawhup_doc_create, which might also involve document relationships. The Chinese text adds specificity through the parenthetical but remains somewhat abstract.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., existing documents), exclusions, or compare it to sibling tools like clawhup_doc_update that might handle similar functions. Usage is implied through the action but without contextual boundaries, leaving the agent to guess based on the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_doc_createCInspect
创建产研文档(PRD/设计/计划/决策/测试/发布/手册/复盘)。Spec 请用 clawhup_doc_create_spec。
| Name | Required | Description | Default |
|---|---|---|---|
| code | No | 文档编码(可选,空则自动生成) | |
| type | Yes | 文档类型: prd / design / adr / plan / testplan / release / runbook / postmortem(spec 用 clawhup_doc_create_spec) | |
| title | Yes | 文档标题 | |
| topic | No | 功能主题标签(如 user-auth) | |
| milestone | No | 迭代版本号(如 v1.0) | |
| projectId | Yes | 项目 ID | |
| description | No | 文档描述 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states '创建' (create) which implies a write/mutation operation, but doesn't mention permissions, side effects, error conditions, or what happens on success (e.g., returns a document ID). This leaves significant gaps in understanding how the tool behaves beyond the basic action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise—just two short sentences that directly state the purpose and reference another tool. There's no fluff or redundancy, making it efficient. However, the second sentence about 'Spec' feels slightly tacked on without clear integration, preventing a perfect score.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a creation tool with 7 parameters, no annotations, and no output schema, the description is insufficient. It doesn't cover behavioral aspects like permissions, side effects, or return values, nor does it explain the relationship with sibling tools. Given the complexity and lack of structured data, more context is needed for the agent to use this tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all 7 parameters with clear descriptions. The description adds no additional parameter information beyond what's in the schema, such as format examples or constraints. This meets the baseline for high schema coverage but doesn't enhance understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('创建产研文档') and the resource type (PRD/design/plan/decision/test/release/manual/postmortem), making the purpose specific and understandable. However, it doesn't explicitly differentiate from sibling tools like clawhup_doc_create_spec or clawhup_doc_update, which would be needed for a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like clawhup_doc_create_spec or clawhup_doc_update. It mentions 'Spec 请用 clawhup_doc_create_spec' which hints at an alternative for spec creation, but doesn't explain the relationship or when to choose one over the other, leaving usage unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_doc_create_specCInspect
创建 Spec 规格文档(需指定编码和级别)。
| Name | Required | Description | Default |
|---|---|---|---|
| code | Yes | Spec 编码(唯一) | |
| title | Yes | Spec 标题 | |
| topic | No | 功能主题标签 | |
| milestone | No | 迭代版本号 | |
| projectId | Yes | 项目 ID | |
| specLevel | Yes | Spec 级别: L0/L1/L2/L3 | |
| description | No | Spec 描述 | |
| parentDocId | No | 父 Spec ID(L2 必须指定 L1,L3 必须指定 L2) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states this is a creation operation ('创建'), implying it's a write/mutation tool, but doesn't mention permissions, side effects, error conditions, or what happens on success/failure. For a mutation tool with 8 parameters and no annotation coverage, this is insufficient behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that conveys the core action and key requirements. There's no wasted language, though it could be slightly more informative given the tool's complexity. The structure is front-loaded with the main purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a creation tool with 8 parameters, no annotations, and no output schema, the description is inadequate. It doesn't explain what happens after creation (e.g., returns a document ID), doesn't mention validation rules beyond what's in the schema, and provides no error handling information. The agent would struggle to use this tool correctly without additional context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so all parameters are documented in the schema. The description adds minimal value by mentioning that code and level must be specified, which aligns with the required parameters in the schema. However, it doesn't provide additional context about parameter interactions (e.g., how parentDocId relates to specLevel) beyond what the schema already states.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('创建' - create) and resource ('Spec 规格文档'), making the purpose evident. However, it doesn't differentiate from sibling tools like 'clawhup_doc_create' or 'clawhup_spec_create', which appear to perform similar creation operations. The mention of '需指定编码和级别' (need to specify code and level) adds specificity but not enough to distinguish from alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'clawhup_doc_create' or 'clawhup_spec_create'. It mentions that code and level must be specified, which is a requirement but not usage context. There's no information about prerequisites, dependencies, or typical scenarios for invoking this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_doc_getBInspect
获取文档详情(含正文内容和关联文档)。
| Name | Required | Description | Default |
|---|---|---|---|
| docId | Yes | 文档 ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description mentions it retrieves document details including body content and related documents, which implies a read-only operation, but it doesn't explicitly state whether this requires authentication, has rate limits, or what happens on errors. For a tool with no annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and front-loaded: a single sentence in Chinese that directly states the tool's purpose. There's no wasted verbiage or unnecessary elaboration, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single required parameter, no output schema, no annotations), the description is minimally adequate. It explains what the tool does but lacks details on behavioral aspects like error handling or return format. For a simple retrieval tool, this might suffice, but it doesn't provide a complete picture for optimal agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'docId' described as '文档 ID' (document ID). The description doesn't add any additional parameter information beyond what the schema provides, such as format examples or constraints. With high schema coverage, the baseline score of 3 is appropriate as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: '获取文档详情(含正文内容和关联文档)' translates to 'Get document details (including body content and related documents).' This specifies the verb (get/retrieve) and resource (document details) with additional scope information. However, it doesn't explicitly differentiate from sibling tools like 'clawhup_doc_list' or 'clawhup_doc_get' might be implied, but no direct comparison is made.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention when this tool is appropriate compared to other document-related tools like 'clawhup_doc_list' (for listing documents) or 'clawhup_doc_create' (for creating documents). There's no context about prerequisites or typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_doc_listCInspect
列出项目文档,支持按类型(prd/design/spec/adr/plan/testplan/release/runbook/postmortem)和功能主题筛选。
| Name | Required | Description | Default |
|---|---|---|---|
| q | No | 关键词搜索,模糊匹配 title / code | |
| size | No | 每页条数,默认 50,最大 200 | |
| type | No | 文档类型筛选: prd / design / spec / adr / plan / testplan / release / runbook / postmortem | |
| topic | No | 功能主题筛选(如 user-auth, model-gateway) | |
| cursor | No | 游标(首页不传) | |
| projectId | Yes | 项目 ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions filtering but doesn't describe pagination behavior (implied by 'cursor' parameter), rate limits, authentication requirements, or what the return format looks like. For a list operation with 6 parameters, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and key filtering capabilities. There's no wasted verbiage, though it could be slightly more structured by separating purpose from filtering details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a list tool with 6 parameters and 100% schema coverage but no output schema, the description is minimally adequate. It covers what the tool does but lacks behavioral context (pagination, return format) and usage guidance. The absence of annotations and output schema means the description should do more to compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 6 parameters thoroughly. The description adds minimal value by mentioning type and topic filtering, but doesn't provide additional semantic context beyond what's in the schema. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb '列出' (list) and resource '项目文档' (project documents), making the purpose explicit. It also specifies filtering capabilities by type and topic, which adds useful detail. However, it doesn't differentiate from sibling tools like 'clawhup_doc_get' or 'clawhup_spec_list', which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'clawhup_doc_get' (for single document retrieval) or 'clawhup_spec_list' (for listing specs specifically). It mentions filtering capabilities but doesn't clarify usage context or prerequisites beyond implied project context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_doc_updateCInspect
更新文档的描述或正文内容。
| Name | Required | Description | Default |
|---|---|---|---|
| docId | Yes | 文档 ID | |
| description | No | 文档描述 | |
| contentTemplate | No | 文档正文内容(Markdown) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. While '更新' (update) implies a mutation operation, the description doesn't specify whether this requires specific permissions, whether the update is partial or complete, what happens to existing content not mentioned in parameters, or what the response looks like. For a mutation tool with zero annotation coverage, this leaves significant behavioral questions unanswered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that gets straight to the point. There's no wasted language or unnecessary elaboration. However, it could be slightly more specific about what '文档' (document) refers to in this context, given the sibling tools include various document-related operations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 3 parameters, no annotations, and no output schema, the description is inadequate. It doesn't explain the update behavior (partial vs. complete), doesn't mention error conditions or permissions required, and provides no information about what the tool returns. The agent would need to guess about important behavioral aspects of this update operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters clearly documented in the schema itself. The description mentions '描述或正文内容' (description or body content) which maps to the 'description' and 'contentTemplate' parameters, but adds no additional semantic context beyond what the schema already provides. The baseline score of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states '更新文档的描述或正文内容' which translates to 'Update document description or body content' - this clearly indicates the verb (update) and resource (document). However, it doesn't distinguish this from sibling tools like 'clawhup_doc_create' or 'clawhup_doc_get', nor does it specify what type of document is being updated. The purpose is understandable but lacks differentiation from related tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, when this should be used instead of creating a new document, or how it relates to other document operations like 'clawhup_doc_create' or 'clawhup_doc_get'. The agent must infer usage from the tool name and description alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_execute_promptCInspect
执行 ClawHup Prompt 生成内容。ClawHup 会自动选择最佳模型。需要配置 CLAWHUP_API_KEY。
| Name | Required | Description | Default |
|---|---|---|---|
| promptId | Yes | Prompt ID | |
| variables | No | Prompt 所需的变量键值对 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions automatic model selection and API key requirement, which adds some context, but fails to describe critical behaviors: whether this is a read-only or mutation operation, expected response format, error handling, rate limits, or any side effects. For a tool that likely generates content (implying potential mutation or external calls), this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is brief and to the point, consisting of two sentences that cover the main action and a key requirement. There's no unnecessary fluff, and it's front-loaded with the primary purpose. However, it could be slightly more structured by explicitly separating usage notes from core functionality.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (a content generation tool with no annotations and no output schema), the description is incomplete. It lacks details on what the tool returns, error conditions, authentication specifics beyond mentioning the API key, and how it differs from sibling generation tools. This leaves the agent with insufficient information to use the tool effectively in context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters (promptId and variables) well-documented in the schema. The description doesn't add any parameter-specific information beyond what the schema already provides (e.g., it doesn't explain how variables map to the prompt or provide examples). This meets the baseline for high schema coverage but doesn't enhance understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('执行' meaning execute) and resource ('ClawHup Prompt 生成内容' meaning ClawHup Prompt to generate content), providing a specific verb+resource combination. However, it doesn't explicitly differentiate from sibling tools like 'clawhup_generate' or 'clawhup_generate_stream' which also appear to generate content, so it doesn't reach the highest score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides minimal guidance: it mentions that ClawHup automatically selects the best model and requires CLAWHUP_API_KEY configuration. However, it doesn't explain when to use this tool versus alternatives (e.g., 'clawhup_generate' or 'clawhup_generate_stream'), nor does it specify prerequisites or exclusions. This leaves significant gaps in usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_generateDInspect
按 Prompt 生成内容。
| Name | Required | Description | Default |
|---|---|---|---|
| promptId | Yes | ||
| provider | No | ||
| modelName | No | ||
| sessionId | No | ||
| variables | No | ||
| materialLibraryIds | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but provides none. 'Generate content' implies a write operation, but the description doesn't disclose whether this creates new resources, what permissions are required, whether it's idempotent, rate limits, or what the response looks like. For a tool with 6 parameters and complex nested objects, this lack of behavioral context is severely inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at just 5 Chinese characters. While this represents severe under-specification rather than ideal conciseness, from a pure length perspective it's minimal and front-loaded. There's no wasted space or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (6 parameters including nested objects), complete lack of annotations, 0% schema description coverage, and no output schema, the description is completely inadequate. It doesn't explain what the tool does beyond the most basic level, provides no behavioral context, and offers no parameter guidance. For a generation tool with multiple configuration options, this description leaves the agent guessing about fundamental aspects of tool usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, meaning none of the 6 parameters have any documentation in the schema. The description mentions 'Prompt' which might relate to 'promptId', but doesn't explain what a promptId is, how to obtain one, or what the other 5 parameters do (provider, modelName, sessionId, variables, materialLibraryIds). The description fails to compensate for the complete lack of schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '按 Prompt 生成内容' (Generate content according to Prompt) is a tautology that essentially restates the tool name 'clawhup_generate' in different words. It doesn't specify what kind of content is generated, what resources are involved, or how this differs from sibling tools like 'clawhup_execute_prompt' or 'clawhup_generate_stream'. The description lacks specificity about the verb and resource involved.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides absolutely no guidance on when to use this tool versus alternatives. With multiple sibling tools that appear related to prompt execution and generation (clawhup_execute_prompt, clawhup_generate_stream), the agent has no indication of when this specific tool is appropriate versus those alternatives. There's no mention of prerequisites, constraints, or typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_generate_streamDInspect
流式生成并返回聚合结果。
| Name | Required | Description | Default |
|---|---|---|---|
| promptId | Yes | ||
| provider | No | ||
| modelName | No | ||
| sessionId | No | ||
| variables | No | ||
| materialLibraryIds | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure but offers minimal information. It mentions 'streaming' and 'aggregated results' but doesn't explain what streaming entails (real-time chunks? progressive output?), what 'aggregated' means (combined from multiple sources? summarized?), authentication requirements, rate limits, side effects, or error behavior. For a 6-parameter tool with complex inputs, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise - a single Chinese sentence. While this is efficient and front-loaded, it's arguably too brief given the tool's complexity. However, within its limited scope, it wastes no words and gets straight to the point about streaming generation with aggregation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex tool with 6 parameters (including nested objects), no annotations, no output schema, and 0% schema description coverage, the description is completely inadequate. It doesn't explain what the tool actually does beyond the name, provides no parameter guidance, no behavioral context, and no differentiation from related tools. The agent would struggle to use this correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% (no parameter descriptions in schema), and the tool description provides absolutely no information about any of the 6 parameters (promptId, provider, modelName, sessionId, variables, materialLibraryIds). The description doesn't explain what these parameters mean, their relationships, or how they affect the streaming generation process. With 0% schema coverage, the description fails completely to compensate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '流式生成并返回聚合结果' (streaming generation and returns aggregated results) is vague and tautological - it essentially restates the name 'clawhup_generate_stream' without specifying what is being generated, what resources are involved, or how it differs from sibling tools like 'clawhup_generate' or 'clawhup_execute_prompt'. It lacks specific verb+resource clarity and sibling differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple related tools (clawhup_generate, clawhup_execute_prompt, clawhup_create_prompt) but no indication of when this streaming generation approach is appropriate versus batch generation or other methods. No context, prerequisites, or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_get_benchmark_resultDInspect
查询评测结果。
| Name | Required | Description | Default |
|---|---|---|---|
| runId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. '查询评测结果' only indicates a query action without any details about permissions, rate limits, error conditions, or what the return data looks like. For a tool with no annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—a single phrase in Chinese. While it may be too brief for clarity, it's structured without redundancy and gets straight to the point. Every word serves a purpose, making it efficient in terms of word count.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (a query tool with no annotations, no output schema, and low parameter coverage), the description is incomplete. It doesn't explain what benchmark results are, how they're structured, or provide any context for interpretation. The agent would struggle to use this tool effectively without additional information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 1 required parameter (runId) with 0% description coverage. The description adds no information about what runId represents, its format, or where to obtain it. With low schema coverage and no compensatory details in the description, parameter understanding is severely lacking.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '查询评测结果' (Query benchmark results) states a general action but is vague about what specifically is being queried. It doesn't clearly distinguish this tool from sibling tools like 'clawhup_create_benchmark_run' or 'clawhup_patch_benchmark_item', which also deal with benchmark operations. While it indicates a read operation, it lacks specificity about the resource scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple benchmark-related tools (e.g., create_benchmark_run, patch_benchmark_item), but no indication of prerequisites, typical workflows, or distinctions between them. This leaves the agent guessing about appropriate usage contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_get_generationDInspect
查询生成详情。
| Name | Required | Description | Default |
|---|---|---|---|
| generationId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but offers none. It doesn't indicate whether this is a read-only operation, what permissions might be required, whether it has rate limits, what format the returned details take, or any error conditions. The single sentence provides no behavioral context beyond the basic query action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is maximally concise - a single Chinese sentence with no wasted words. While this results in severe under-specification, from a pure conciseness perspective it's perfectly efficient with every character serving the core function statement.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given a tool with 1 parameter (0% schema coverage), no annotations, no output schema, and no behavioral context, the description is completely inadequate. It provides only the most basic function statement without explaining what 'generation' means in this system, what details are returned, how to use the parameter, or when this tool is appropriate versus alternatives.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, with one required parameter 'generationId' documented only as a number type. The description adds absolutely no semantic information about what generationId represents, how to obtain it, what format it should be in, or what happens if an invalid ID is provided. This leaves the parameter completely undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '查询生成详情' (Query generation details) is a tautology that essentially restates the tool name 'clawhup_get_generation' in Chinese. It doesn't specify what 'generation' refers to in this context (e.g., AI generation, data generation, or process generation), nor does it distinguish this tool from sibling tools like 'clawhup_get_session' or 'clawhup_get_prompt_version' that also retrieve details about different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, appropriate contexts, or comparison to sibling tools like 'clawhup_get_session' or 'clawhup_generate' that might be related. The agent receives zero usage direction beyond the basic function implied by the name.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_get_prompt_versionCInspect
查询 Prompt 单个版本详情。
| Name | Required | Description | Default |
|---|---|---|---|
| promptId | Yes | ||
| versionId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states this is a query operation ('查询'), implying it's likely read-only and non-destructive, but doesn't explicitly confirm this or describe any behavioral traits like authentication requirements, rate limits, error conditions, or what '详情' (details) includes. For a tool with zero annotation coverage, this leaves significant gaps in understanding how it behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise - a single Chinese sentence that directly states the tool's purpose. There's zero wasted verbiage, and it's front-loaded with the core action. For a simple query tool, this brevity is appropriate and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 2 parameters with 0% schema coverage, no annotations, no output schema, and multiple sibling tools in the prompt domain, the description is insufficiently complete. It doesn't explain what '详情' includes, how this differs from other prompt tools, or provide any parameter context. For a tool that likely returns structured version metadata, more guidance would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description mentions no parameters, while the schema has 2 required parameters (promptId and versionId) with 0% schema description coverage. The description doesn't explain what these IDs represent, how to obtain them, or their format (though they're numeric). With low schema coverage, the description fails to compensate, leaving parameters undocumented beyond their basic types.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '查询 Prompt 单个版本详情' clearly states the purpose as querying/retrieving details of a single prompt version. It specifies the verb ('查询' - query) and resource ('Prompt 单个版本详情' - single prompt version details), making the intent unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'clawhup_get_prompt_versions' (plural versions) or 'clawhup_prompt_detail' (general prompt detail), which would be needed for a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are several related prompt tools in the sibling list (e.g., 'clawhup_get_prompt_versions', 'clawhup_prompt_detail', 'clawhup_execute_prompt'), but the description doesn't indicate this is specifically for retrieving metadata of a single historical version rather than current details or multiple versions. No usage context or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_get_prompt_versionsCInspect
查询 Prompt 版本列表。
| Name | Required | Description | Default |
|---|---|---|---|
| promptId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It implies a read-only query operation ('查询'), but doesn't specify permissions, rate limits, pagination, or return format. For a tool with no annotation coverage, this leaves significant gaps in understanding its behavior and constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the tool's purpose with zero waste. It's appropriately sized and front-loaded, making it easy to parse quickly without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a query operation with one parameter), lack of annotations, no output schema, and 0% schema coverage, the description is incomplete. It doesn't provide enough context for an agent to understand how to use it effectively, missing details on parameters, behavior, and output expectations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds no parameter information beyond what the schema provides. With 0% schema description coverage and one required parameter ('promptId'), the description doesn't explain what 'promptId' is, its format, or how it relates to the query. This fails to compensate for the low schema coverage, leaving the parameter undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '查询 Prompt 版本列表' (Query Prompt version list) clearly states the action (query/list) and resource (Prompt versions), but it's vague about scope and doesn't differentiate from sibling tools like 'clawhup_get_prompt_version' (singular) or 'clawhup_prompt_detail'. It specifies the verb and resource but lacks precision on what 'list' entails compared to similar tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. With siblings like 'clawhup_get_prompt_version' (singular), 'clawhup_prompt_detail', and 'clawhup_list_my_prompts', the description offers no context on selection criteria, prerequisites, or exclusions, leaving usage ambiguous.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_get_sessionCInspect
查询单个会话。
| Name | Required | Description | Default |
|---|---|---|---|
| sessionId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states '查询' (query) which implies a read operation, but doesn't disclose behavioral traits like whether it requires authentication, rate limits, error handling, or what the return format looks like. For a tool with zero annotation coverage, this leaves significant gaps in understanding its behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise with a single phrase ('查询单个会话') that front-loads the core action. There's zero wasted text, making it efficient for quick scanning, though this brevity contributes to gaps in other dimensions.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a read operation with one parameter), lack of annotations, no output schema, and 0% schema description coverage, the description is incomplete. It doesn't provide enough context for an agent to reliably invoke the tool, missing details on parameters, return values, and usage scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds no meaning beyond the input schema. Schema description coverage is 0%, and the description doesn't explain the 'sessionId' parameter (e.g., what it is, where to find it, format expectations). With one required parameter undocumented in both schema and description, this fails to compensate for the coverage gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '查询单个会话' (Query a single session) states the action (query) and resource (session), but is vague about scope and doesn't differentiate from sibling tools like 'clawhup_get_session_messages' or 'clawhup_list_sessions'. It provides basic purpose but lacks specificity about what constitutes a 'session' or what information is retrieved.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., needing a valid sessionId), when to choose this over 'clawhup_list_sessions' for multiple sessions, or any constraints. Usage is implied from the name but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_get_session_messagesCInspect
查询会话消息列表。
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | ||
| sessionId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It implies a read-only operation ('查询' means query), but doesn't disclose behavioral traits like pagination (limit parameter), ordering, error conditions, or authentication needs. For a tool with parameters and no annotations, this is inadequate transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It's front-loaded and appropriately sized for its content, though brevity contributes to gaps in other dimensions.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 2 parameters, 0% schema coverage, no annotations, and no output schema, the description is incomplete. It doesn't explain return values (e.g., message format), error handling, or usage context, leaving significant gaps for agent understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It mentions '会话消息列表' (session message list), hinting at sessionId and possibly limit, but doesn't explain parameter meanings (e.g., sessionId format, limit's role in pagination). With 2 parameters (1 required), this adds minimal value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '查询会话消息列表' (Query session message list) states the verb (query) and resource (session messages), providing a basic purpose. However, it's vague about scope (e.g., all messages or filtered) and doesn't differentiate from sibling tools like 'clawhup_get_session' or 'clawhup_list_sessions', which might retrieve session metadata rather than messages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description lacks context about prerequisites (e.g., needing a valid sessionId) or exclusions (e.g., not for creating messages). With many sibling tools for sessions and messages, this omission leaves the agent without usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_get_variable_suggestionsCInspect
获取 Prompt 变量建议值。
| Name | Required | Description | Default |
|---|---|---|---|
| q | No | ||
| name | Yes | ||
| limit | No | ||
| promptId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It only states the action ('获取' - get), implying a read operation, but doesn't disclose behavioral traits like whether it requires authentication, has rate limits, returns structured data, or involves network calls. For a tool with 4 parameters and no annotations, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the tool's purpose. There's no wasted words or unnecessary elaboration, making it appropriately concise. However, it lacks front-loading of critical details like parameter context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 4 parameters (2 required), 0% schema coverage, no annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns (suggestion format, structure), how parameters interact, or error conditions. For a suggestion-retrieval tool with multiple inputs, more context is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so parameters (promptId, name, q, limit) are completely undocumented in the schema. The description doesn't add any parameter semantics - it doesn't explain what promptId refers to, what 'name' represents (variable name?), what 'q' is for (query?), or how 'limit' affects results. It fails to compensate for the schema gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '获取 Prompt 变量建议值' (Get Prompt variable suggestions) states the basic purpose - retrieving suggestions for variables in a prompt. It includes a verb ('获取' - get) and resource ('Prompt 变量建议值' - prompt variable suggestions), but it's vague about what these suggestions are (e.g., possible values, examples, completions) and doesn't differentiate from sibling tools like clawhup_execute_prompt or clawhup_prompt_detail that also work with prompts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites (e.g., needing a promptId), use cases (e.g., when building prompts with variables), or comparison to sibling tools like clawhup_execute_prompt (which might execute prompts with variables) or clawhup_get_prompt_version (which might retrieve prompt details).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_list_material_itemsCInspect
查询素材库中的素材项。
| Name | Required | Description | Default |
|---|---|---|---|
| libraryId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states it's a query operation ('查询'), implying read-only behavior, but doesn't specify whether it returns all items, supports filtering/pagination, requires authentication, has rate limits, or what the output format looks like. For a read operation with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the tool's purpose without unnecessary words. It's appropriately sized for a simple query tool and front-loads the core action. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 1 required parameter with 0% schema coverage, no annotations, and no output schema, the description is incomplete. It doesn't explain the parameter, behavioral details, or return values. For a query tool that likely returns a list of items, more context about output format, filtering, or pagination would be helpful. The description alone is insufficient for confident tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning the single parameter 'libraryId' has no documentation in the schema. The description doesn't mention this parameter at all, providing no semantic information about what 'libraryId' represents, how to obtain it, or its format. With low schema coverage, the description fails to compensate for the documentation gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '查询素材库中的素材项' (Query material items in the material library) states a clear verb ('查询' - query) and resource ('素材项' - material items) with a scope ('素材库中' - in the material library). However, it doesn't distinguish this tool from its sibling 'clawhup_list_material_libraries' which likely lists libraries rather than items within a specific library. The purpose is clear but lacks sibling differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites (like needing a libraryId), when this tool is appropriate versus other listing tools (e.g., 'clawhup_list_material_libraries'), or any exclusion criteria. The agent must infer usage from the parameter name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_list_material_librariesCInspect
列出素材库。
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. '列出' (list) implies a read-only operation, but it doesn't disclose any behavioral traits like pagination, sorting, filtering capabilities, authentication requirements, rate limits, or what format the returned data takes. For a list operation with zero annotation coverage, this leaves significant gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise ('列出素材库' - 4 characters), which could be seen as efficient. However, it's arguably under-specified rather than appropriately concise, as it provides minimal information beyond the tool name. It's front-loaded but lacks substance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (0 parameters, no annotations, no output schema), the description is incomplete. While the purpose is clear at a basic level, it doesn't provide enough context about what 'material libraries' are, how results are structured, or any operational constraints. For even a simple list tool, more context would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters with 100% schema description coverage (empty schema). The description doesn't need to add parameter semantics since there are no parameters to document. A baseline of 4 is appropriate as the description doesn't contradict the schema and the schema fully covers the parameter space.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '列出素材库' (List material libraries) is a direct translation of the tool name 'clawhup_list_material_libraries', making it tautological. It specifies the verb 'list' and resource 'material libraries', but doesn't distinguish from sibling tools like 'clawhup_list_material_items' or provide any additional context about what 'material libraries' are in this system.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. With sibling tools like 'clawhup_create_material_library', 'clawhup_delete_material_library', and 'clawhup_patch_material_library', the description doesn't indicate this is for retrieval rather than modification, nor does it mention any prerequisites or contextual constraints.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_list_my_promptsCInspect
列出当前用户的 Prompt。
| Name | Required | Description | Default |
|---|---|---|---|
| outputType | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states this is a list operation, implying it's likely read-only and non-destructive, but doesn't confirm this or describe other behaviors like pagination, sorting, or error handling. The description adds minimal value beyond the basic action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the tool's purpose without any unnecessary words. It's appropriately sized and front-loaded with the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a list operation with one parameter), lack of annotations, and no output schema, the description is incomplete. It doesn't explain what 'prompts' are in this context, how results are returned, or the purpose of the 'outputType' parameter, leaving significant gaps for an AI agent to understand and use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description mentions no parameters, while the input schema has one parameter ('outputType') with an enum but 0% schema description coverage. Since no parameters are described, the baseline is 4, but this is reduced to 3 because the schema provides some structure (enum values) that the description doesn't explain or contextualize.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('列出' meaning 'list') and resource ('当前用户的 Prompt' meaning 'current user's prompts'), providing a specific verb+resource combination. However, it doesn't distinguish this tool from sibling tools like 'clawhup_search_prompts' or 'clawhup_prompt_detail', which might offer similar functionality with different scopes or filters.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'clawhup_search_prompts' for filtered searches or 'clawhup_prompt_detail' for detailed views, nor does it specify any prerequisites or contexts for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_list_sessionsDInspect
列出会话。
| Name | Required | Description | Default |
|---|---|---|---|
| page | No | ||
| size | No | ||
| promptId | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description '列出会话' only states the action without any information about permissions, rate limits, pagination behavior (implied by parameters but not explained), or what the output contains. For a list operation with parameters, this lack of behavioral context is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—a single phrase—with no wasted words. It's front-loaded and to the point, though this brevity comes at the cost of completeness. For conciseness alone, it's optimal, but it lacks the necessary detail for effective tool use.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (a list operation with three parameters), no annotations, no output schema, and 0% schema description coverage, the description is completely inadequate. It doesn't explain what sessions are, how to use the parameters, what the output looks like, or any behavioral aspects. The tool is left largely undocumented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 0%, meaning none of the three parameters (page, size, promptId) are documented in the schema. The description adds no information about these parameters—it doesn't mention pagination, filtering by promptId, or default values. With low coverage and no compensation in the description, this leaves parameters entirely unexplained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '列出会话' (List sessions) states the basic action but is overly terse. It provides a verb ('list') and resource ('sessions'), which is better than a tautology, but it lacks specificity about what 'sessions' are in this context and doesn't differentiate from sibling tools like 'clawhup_get_session' or 'clawhup_get_session_messages'. The purpose is clear at a surface level but insufficiently detailed.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple session-related tools (e.g., 'clawhup_get_session', 'clawhup_create_session', 'clawhup_delete_session'), but the description doesn't indicate that this is for listing multiple sessions rather than retrieving a single one or performing other operations. No context or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_patch_benchmark_itemCInspect
更新评测项评分。
| Name | Required | Description | Default |
|---|---|---|---|
| runId | Yes | ||
| itemId | Yes | ||
| qualityScore | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but only states the action ('update') without disclosing behavioral traits. It doesn't mention whether this is a safe operation, if it requires specific permissions, what happens on success/failure, or any side effects. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's appropriately sized for a simple update operation and front-loads the core action. Every word earns its place, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutation with 3 required parameters), lack of annotations, 0% schema description coverage, and no output schema, the description is incomplete. It doesn't provide enough context for safe and effective use, missing details on parameters, behavior, and integration with sibling tools in the benchmark domain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the schema provides no parameter descriptions. The description adds no information about parameters beyond what's implied by the tool name. It doesn't explain what 'runId', 'itemId', or 'qualityScore' represent, their relationships, or the 1-5 scoring range indicated in the schema. The description fails to compensate for the lack of schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '更新评测项评分' (Update benchmark item score) clearly states the action (update) and target (benchmark item score), but it's vague about scope and doesn't distinguish from sibling tools like 'clawhup_patch_material_item' or 'clawhup_patch_session' that also perform patch operations. It specifies what it does but lacks detail about what a 'benchmark item' entails.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. There's no mention of prerequisites (e.g., existing benchmark runs), related tools like 'clawhup_create_benchmark_run' or 'clawhup_get_benchmark_result', or any context about typical workflows. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_patch_material_itemDInspect
更新素材项。
| Name | Required | Description | Default |
|---|---|---|---|
| key | No | ||
| type | No | ||
| value | No | ||
| itemId | Yes | ||
| weight | No | ||
| enabled | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but offers none. '更新素材项' (Update material item) implies a mutation operation but doesn't disclose permission requirements, whether the update is partial or complete, what happens to unspecified fields, whether changes are reversible, or any rate limits. For a mutation tool with 6 parameters and no annotation coverage, this is critically insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is maximally concise with just four characters in Chinese ('更新素材项'), which translates to 'Update material item'. There's no wasted language or unnecessary elaboration. However, this conciseness comes at the cost of being severely under-specified for a tool with 6 parameters and mutation behavior.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 6 parameters, 0% schema description coverage, no annotations, and no output schema, the description is completely inadequate. It doesn't explain what a 'material item' is, what fields can be updated, what the update operation entails, or what the tool returns. The agent would be operating blindly with this level of documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage for all 6 parameters, the description provides absolutely no information about parameter meanings or usage. It doesn't mention that 'itemId' is required or explain what 'key', 'type', 'value', 'weight', or 'enabled' represent in the context of updating a material item. The description fails to compensate for the complete lack of schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '更新素材项' (Update material item) is a tautology that essentially restates the tool name 'clawhup_patch_material_item'. It provides a generic verb+resource but lacks specificity about what a 'material item' is or what aspects can be updated. Compared to sibling tools like 'clawhup_add_material_item' and 'clawhup_delete_material_item', it doesn't clearly differentiate the patch operation from add or delete operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, when this tool is appropriate versus 'clawhup_add_material_item' or 'clawhup_delete_material_item', or any contextual constraints. The agent receives no usage direction beyond the generic update action implied by the name.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_patch_material_libraryDInspect
更新素材库。
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | ||
| isDefault | No | ||
| libraryId | Yes | ||
| description | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It only states '更新素材库' (Update material library), which implies a mutation but gives no details on permissions, side effects, rate limits, or response behavior. This is inadequate for a tool with potential destructive implications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single phrase '更新素材库', which is overly concise to the point of under-specification. It lacks necessary details and structure, making it inefficient rather than appropriately brief.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (4 parameters, 1 required, no output schema, no annotations), the description is completely inadequate. It doesn't explain what the tool does beyond the name, provide parameter context, or address behavioral aspects, leaving significant gaps for the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning none of the 4 parameters (name, isDefault, libraryId, description) are documented in the schema. The description adds no information about these parameters, failing to compensate for the lack of schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '更新素材库' (Update material library) restates the tool name 'clawhup_patch_material_library' in Chinese, making it a tautology. It doesn't specify what aspect of the material library is updated or how it differs from sibling tools like 'clawhup_update_prompt' or 'clawhup_patch_material_item', which have similar naming patterns.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no mentions of prerequisites, context, or comparisons to sibling tools such as 'clawhup_create_material_library' or 'clawhup_delete_material_library', leaving the agent with no usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_patch_sessionDInspect
更新会话。
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | ||
| pinned | No | ||
| status | No | ||
| sessionId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states '更新会话' (Update session), which implies a mutation operation but reveals nothing about permissions required, whether changes are reversible, rate limits, error conditions, or what the response looks like. For a tool with 4 parameters and no output schema, this lack of behavioral information is a critical gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with just two characters ('更新会话'), which is front-loaded and wastes no words. However, this brevity comes at the cost of being under-specified rather than efficiently informative. While structurally minimal, it lacks the substance needed for effective tool selection.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (4 parameters, 1 required), 0% schema description coverage, no annotations, and no output schema, the description is completely inadequate. It doesn't explain what the tool does beyond the basic verb, provide parameter meanings, describe behavioral traits, or differentiate from siblings. For a mutation tool with multiple inputs, this level of incompleteness is unacceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 0%, meaning none of the 4 parameters (sessionId, title, pinned, status) are documented in the schema. The description adds no information about what these parameters mean, their expected formats, or how they affect the session update. For example, it doesn't clarify if 'status' has specific allowed values or what 'pinned' controls. The description fails to compensate for the complete lack of schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '更新会话' (Update session) is a tautology that essentially restates the tool name 'clawhup_patch_session' without providing meaningful context. While it indicates the verb (update) and resource (session), it doesn't specify what aspects of a session can be updated or how this differs from sibling tools like 'clawhup_update_prompt' or 'clawhup_update_prompt_content'. The description lacks the specificity needed for clear differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools that might handle session-related operations (e.g., 'clawhup_create_session', 'clawhup_delete_session', 'clawhup_get_session'), but the description doesn't indicate whether this is for partial updates, full updates, or specific use cases. No prerequisites, exclusions, or contextual advice are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_project_listBInspect
列出所有项目。
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. '列出所有项目' implies a read-only operation that returns all projects, but it doesn't specify critical details like pagination, sorting, rate limits, authentication requirements, or error conditions. For a list operation with zero annotation coverage, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence ('列出所有项目') that directly states the tool's purpose with zero wasted words. It's front-loaded and perfectly sized for a simple list tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (0 parameters, no output schema), the description is minimally complete but lacks depth. It doesn't address behavioral aspects (e.g., return format, pagination) or usage context, which are important even for simple tools. With no annotations and no output schema, more detail would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0 parameters with 100% coverage, so no parameter documentation is needed. The description doesn't add parameter details, which is appropriate. A baseline of 4 is given because the schema fully handles parameters, and the description doesn't need to compensate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '列出所有项目' (List all projects) clearly states the verb ('list') and resource ('projects'), making the purpose immediately understandable. It distinguishes from siblings like 'clawhup_doc_list' or 'clawhup_spec_list' by specifying 'projects' as the resource. However, it doesn't explicitly differentiate from potential project-related siblings (none are listed), so it's not a perfect 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., authentication), when it's appropriate (e.g., for browsing vs. searching), or what to use instead (e.g., if you need filtered results). With many sibling tools present, this lack of context is a significant gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_prompt_detailAInspect
获取 ClawHup Prompt 详情,包含所需变量(参数)说明。执行前先调用此工具了解参数。
| Name | Required | Description | Default |
|---|---|---|---|
| promptId | Yes | Prompt ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states this is a retrieval operation ('获取' - get) which implies it's likely read-only, but doesn't explicitly confirm this or mention other behavioral traits like authentication requirements, rate limits, or error conditions. The description adds some context about its preparatory role, but lacks comprehensive behavioral disclosure for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately concise with two clear sentences that each serve distinct purposes: the first states what the tool does, the second provides usage guidance. There's no wasted language or redundancy. It could potentially be slightly more structured by separating the purpose and guidance more explicitly, but it's efficiently written.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter retrieval tool with no annotations and no output schema, the description provides adequate but minimal context. It explains the purpose and when to use it, but doesn't describe what the returned '详情' (details) include or the format of the response. Given the tool's apparent simplicity and the schema's good coverage, the description is complete enough for basic understanding but lacks output information that would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with one parameter (promptId) fully documented in the schema. The description mentions '包含所需变量(参数)说明' (including required variable/parameter explanations), which aligns with but doesn't add meaningful semantics beyond what the schema already provides. Since schema coverage is high, the baseline score of 3 is appropriate - the description doesn't compensate but doesn't need to.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: '获取 ClawHup Prompt 详情,包含所需变量(参数)说明' (Get ClawHup Prompt details, including required variable/parameter explanations). It specifies the verb ('获取' - get/retrieve) and resource ('ClawHup Prompt 详情' - ClawHup Prompt details), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'clawhup_get_prompt_version' or 'clawhup_get_prompt_versions', which appear to serve similar retrieval functions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: '执行前先调用此工具了解参数' (Call this tool before execution to understand parameters). This clearly indicates when to use this tool (as a prerequisite step before executing other operations that require parameter knowledge) and implies its preparatory nature. The guidance is specific and actionable.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_rewrite_generationDInspect
改写已生成内容。
| Name | Required | Description | Default |
|---|---|---|---|
| action | Yes | ||
| provider | No | ||
| modelName | No | ||
| sessionId | No | ||
| instruction | No | ||
| generationId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. The description says nothing about whether this is a read-only or destructive operation, what permissions might be required, how the rewrite affects the original content, what the response format looks like, or any rate limits. For a tool with 6 parameters that presumably modifies content, this lack of behavioral information is severely inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While technically concise with just 5 Chinese characters, this is a case of severe under-specification rather than effective conciseness. The single phrase doesn't provide enough information to be useful, making it inefficient despite its brevity. Every sentence should earn its place, but this description doesn't earn its place by providing meaningful guidance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (6 parameters, no output schema, no annotations), this description is completely inadequate. It doesn't explain what the tool does beyond the name, provides no parameter guidance, no behavioral context, and no differentiation from sibling tools. For a content modification tool with multiple parameters, this minimal description fails to provide the necessary context for an agent to use it effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage and 6 parameters (2 required), the description provides zero information about any parameters. The schema shows parameters like 'generationId', 'action' (with enum values), 'provider', 'modelName', 'sessionId', and 'instruction', but the description doesn't mention any of them. This leaves the agent completely in the dark about what inputs are needed and how they affect the rewrite operation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '改写已生成内容' (Rewrite generated content) is a tautology that essentially restates the tool name 'clawhup_rewrite_generation'. It provides a generic verb+object but lacks specificity about what kind of content is being rewritten or how it differs from sibling tools like 'clawhup_generate' or 'clawhup_update_prompt_content'. The purpose is vague and doesn't distinguish this tool's unique function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides absolutely no guidance on when to use this tool versus alternatives. With numerous sibling tools like 'clawhup_generate', 'clawhup_update_prompt_content', and 'clawhup_reproduce_share', the agent has no indication of when rewriting generated content is appropriate versus creating new content or modifying prompts. This leaves the agent guessing about the tool's intended context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_rollback_prompt_versionCInspect
回滚 Prompt 版本。
| Name | Required | Description | Default |
|---|---|---|---|
| promptId | Yes | ||
| versionId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. 'Rollback' implies a destructive mutation that reverts to a previous state, but the description doesn't specify whether this requires special permissions, what happens to the current version, if the rollback is reversible, or any side effects. It lacks critical details for a mutation operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with just one phrase ('回滚 Prompt 版本'), which is front-loaded and wastes no words. For its limited content, it's efficiently structured, though this conciseness comes at the cost of completeness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a mutation tool with no annotations, 0% schema coverage, two required parameters, and no output schema, the description is completely inadequate. It doesn't explain the rollback process, success conditions, error cases, or what the tool returns. The agent lacks essential context to use this tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate for two undocumented parameters (promptId and versionId). The description mentions 'Prompt version' which hints at versionId, but doesn't explain what promptId represents, the format of these IDs, or their relationship. It adds minimal value beyond the bare schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '回滚 Prompt 版本' (Rollback Prompt version) is a tautology that essentially restates the tool name 'clawhup_rollback_prompt_version' in Chinese. It specifies the verb 'rollback' and resource 'Prompt version', but doesn't distinguish from siblings like 'clawhup_update_prompt' or 'clawhup_save_prompt_version' that also modify prompts. The purpose is clear at a basic level but lacks differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no indications of prerequisites, conditions for rollback, or comparisons to sibling tools like 'clawhup_update_prompt' or 'clawhup_get_prompt_versions'. This leaves the agent with no context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_save_prompt_versionDInspect
保存 Prompt 版本。
| Name | Required | Description | Default |
|---|---|---|---|
| tags | No | ||
| title | No | ||
| content | Yes | ||
| promptId | Yes | ||
| outputType | No | ||
| description | No | ||
| changeSummary | No | MCP save version | |
| timeoutSeconds | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. The description '保存 Prompt 版本' (Save Prompt version) implies a write operation that creates or persists data, but it doesn't specify whether this is a destructive action, what permissions are required, what happens on success/failure, or any rate limits. For a mutation tool with zero annotation coverage, this lack of behavioral information is a significant gap that could lead to misuse.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise ('保存 Prompt 版本。') but this is under-specification rather than effective brevity. It's a single short sentence that fails to convey necessary information, making it inefficient rather than well-structured. While it's front-loaded (the only sentence), it doesn't earn its place by adding value beyond the tool name.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (8 parameters, 2 required), zero schema description coverage, no annotations, and no output schema, the description is completely inadequate. It doesn't explain what the tool does beyond the name, provides no parameter guidance, offers no behavioral context for a mutation operation, and gives no indication of return values or error conditions. This leaves the agent with insufficient information to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 8 parameters with 0% description coverage, meaning none of the parameters have descriptions in the schema. The tool description provides no information about any parameters—it doesn't mention 'promptId', 'content', 'tags', 'title', or other fields. With 0% schema coverage and no parameter guidance in the description, the agent has no semantic understanding of what these parameters mean or how to use them correctly.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '保存 Prompt 版本' (Save Prompt version) is a tautology that essentially restates the tool name 'clawhup_save_prompt_version' in Chinese. It doesn't specify what 'saving a prompt version' entails operationally or what resource it acts upon beyond the obvious. While it indicates a verb ('保存' - save) and resource ('Prompt 版本' - prompt version), it lacks specificity about what constitutes a prompt version or how this differs from related tools like 'clawhup_update_prompt' or 'clawhup_create_prompt'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are multiple sibling tools related to prompts (e.g., 'clawhup_create_prompt', 'clawhup_update_prompt', 'clawhup_get_prompt_version', 'clawhup_rollback_prompt_version'), but the description doesn't clarify when saving a version is appropriate versus creating a new prompt or updating an existing one. No context, prerequisites, or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_search_promptsBInspect
搜索 ClawHup 平台上的 AI Prompt。支持关键词搜索、按类型/标签筛选。
| Name | Required | Description | Default |
|---|---|---|---|
| tag | No | 标签筛选 | |
| limit | No | 返回数量,默认 10 | |
| query | No | 搜索关键词 | |
| outputType | No | 输出类型筛选 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions the tool '支持关键词搜索、按类型/标签筛选' (supports keyword search and filtering by type/tag), which gives some behavioral context about filtering capabilities. However, it lacks details on permissions, rate limits, pagination (beyond the 'limit' parameter), or what the return format looks like (no output schema). For a search tool with zero annotation coverage, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('搜索 ClawHup 平台上的 AI Prompt') and adds supporting details ('支持关键词搜索、按类型/标签筛选'). There is no wasted text, and it's appropriately sized for a search tool with clear parameters.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (search with filtering), 100% schema coverage, no annotations, and no output schema, the description is adequate but has gaps. It covers the purpose and basic capabilities, but lacks behavioral context (e.g., permissions, rate limits) and output details, which are important for a search operation. It's minimally viable but not fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with all parameters documented in the schema (query for keywords, tag for filtering, limit for result count, outputType for type filtering). The description adds minimal value by mentioning '关键词搜索、按类型/标签筛选' (keyword search and filtering by type/tag), which loosely maps to 'query', 'tag', and 'outputType', but doesn't provide additional semantics beyond what the schema already states. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('搜索' meaning 'search') and resource ('ClawHup 平台上的 AI Prompt'), specifying it searches for AI Prompts on the ClawHup platform. It distinguishes itself from siblings like 'clawhup_prompt_detail' (which gets details) and 'clawhup_list_my_prompts' (which lists without search), but doesn't explicitly name these alternatives, keeping it at 4 instead of 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by mentioning support for keyword search and filtering by type/tag, suggesting it's for finding prompts with specific criteria. However, it doesn't explicitly state when to use this tool versus alternatives like 'clawhup_list_my_prompts' or 'clawhup_prompt_detail', nor does it provide exclusions or prerequisites, making it only implied guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_spec_createCInspect
创建新的迭代 Spec。
| Name | Required | Description | Default |
|---|---|---|---|
| code | Yes | Spec 编码(唯一) | |
| title | Yes | Spec 标题 | |
| milestone | No | 迭代版本号 | |
| projectId | Yes | 项目 ID | |
| specLevel | Yes | Spec 级别: L0/L1/L2/L3 | |
| description | No | Spec 描述 | |
| parentDocId | No | 父 Spec ID(L2 必须指定 L1,L3 必须指定 L2) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states '创建新的' (create new), implying a write operation, but doesn't cover critical aspects like permissions needed, whether the creation is idempotent, error handling, or what the response looks like (since there's no output schema). This leaves significant gaps for an agent to understand the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese ('创建新的迭代 Spec。') that directly states the tool's purpose without any fluff or redundancy. It's appropriately sized and front-loaded, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a creation tool with 7 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain behavioral traits, usage context, or return values, leaving the agent with insufficient information to invoke the tool effectively beyond basic parameter passing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with all 7 parameters well-documented in the schema (e.g., 'specLevel' as 'L0/L1/L2/L3', 'parentDocId' dependencies). The description adds no additional parameter information beyond the schema, so it meets the baseline of 3 where the schema does the heavy lifting without extra value from the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('创建新的' meaning 'create new') and resource ('迭代 Spec' meaning 'iteration Spec'), making the purpose evident. However, it doesn't differentiate from sibling tools like 'clawhup_doc_create_spec' or 'clawhup_spec_update', which could cause confusion about when to use this specific tool versus alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'clawhup_doc_create_spec' or 'clawhup_spec_update'. The description lacks context about prerequisites, such as needing an existing project or parent spec for certain levels, which is only hinted at in the schema but not in the description itself.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_spec_getCInspect
获取 Spec 详情(含正文内容)。
| Name | Required | Description | Default |
|---|---|---|---|
| specId | Yes | Spec ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves details including body content, which implies a read-only operation, but doesn't specify if it requires authentication, has rate limits, returns structured data, or handles errors. For a tool with no annotation coverage, this leaves significant gaps in understanding its behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the purpose. It's front-loaded with the core action and includes the key detail about body content. There's no wasted verbiage, making it appropriately concise for a simple retrieval tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 required parameter, no nested objects) and high schema coverage, the description is minimally adequate. However, with no annotations and no output schema, it fails to address behavioral aspects like return format or error handling. It meets basic needs but lacks depth for full contextual understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'specId' documented as 'Spec ID'. The description adds no additional parameter semantics beyond this, such as format examples or constraints. Since schema coverage is high, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '获取 Spec 详情(含正文内容)' clearly states the action (获取/get) and resource (Spec), specifying that it retrieves details including body content. It distinguishes from sibling tools like clawhup_spec_list (which lists specs) and clawhup_spec_create/update (which modify specs). However, it doesn't explicitly contrast with clawhup_doc_get, which might retrieve document details, leaving some ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a valid specId), exclusions, or comparisons to siblings like clawhup_spec_list for listing multiple specs or clawhup_doc_get for document details. Usage is implied by the name and description but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_spec_listCInspect
列出项目的 Spec 规格文档。
| Name | Required | Description | Default |
|---|---|---|---|
| category | No | 分类筛选: baseline / iteration | |
| projectId | Yes | 项目 ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It only states the action ('list') without mentioning whether this is a read-only operation, if it requires authentication, what the output format might be, or any rate limits. This is inadequate for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the tool's purpose without any wasted words. It is appropriately sized and front-loaded, making it easy to understand at a glance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (a list operation with filtering), lack of annotations, and no output schema, the description is incomplete. It doesn't explain what 'Spec 规格文档' entails, how results are returned, or any behavioral traits, making it insufficient for effective tool selection and invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so the schema already documents both parameters ('projectId' and 'category'). The description adds no additional meaning or context beyond what the schema provides, such as explaining what 'baseline / iteration' means for the category. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('列出' meaning 'list') and the resource ('Spec 规格文档' meaning 'Spec specification documents'), providing a specific verb+resource combination. However, it doesn't differentiate from sibling tools like 'clawhup_doc_list' or 'clawhup_spec_get', which could also list documents, so it doesn't reach the highest score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no explicit instructions on context, prerequisites, or comparisons to sibling tools such as 'clawhup_spec_get' or 'clawhup_doc_list', leaving usage unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_spec_updateCInspect
更新 Spec 的描述或正文内容。
| Name | Required | Description | Default |
|---|---|---|---|
| specId | Yes | Spec ID | |
| status | No | Spec 状态: confirmed / frozen / implemented | |
| description | No | Spec 描述 | |
| contentTemplate | No | Spec 正文内容 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but only states it's an update operation without behavioral details. It doesn't disclose whether this is a partial or full update, if it's idempotent, what permissions are needed, or what happens on success/failure. For a mutation tool, this leaves critical gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence in Chinese that directly states the tool's function without any fluff. It's appropriately sized and front-loaded with the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 4 parameters, no annotations, and no output schema, the description is insufficient. It doesn't cover behavioral aspects like side effects, error handling, or return values, leaving the agent with incomplete context for safe and effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds minimal value by implying 'specId' is required and that 'description' and 'contentTemplate' correspond to the fields mentioned, but doesn't explain relationships or constraints beyond what's in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '更新 Spec 的描述或正文内容' clearly states the action (update) and target (Spec description or content), making the purpose immediately understandable. It doesn't differentiate from sibling tools like 'clawhup_spec_create' or 'clawhup_spec_get', but it's not vague or tautological.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, when to choose this over 'clawhup_spec_create' for new specs or 'clawhup_doc_update' for broader updates, or any constraints like required permissions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_update_generation_visibilityCInspect
更新生成记录公开状态。
| Name | Required | Description | Default |
|---|---|---|---|
| isPublic | Yes | ||
| generationId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states '更新' (update), implying a mutation operation, but doesn't specify whether this requires authentication, what permissions are needed, if it's idempotent, or what happens on failure (e.g., error handling). For a mutation tool with zero annotation coverage, this lack of behavioral details is a significant gap that could lead to misuse.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence in Chinese ('更新生成记录公开状态。') that directly states the action. It's front-loaded with the core purpose and avoids unnecessary words. However, while efficient, this brevity contributes to the lack of detail in other dimensions, making it somewhat under-specified rather than optimally concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (a mutation tool with 2 parameters), no annotations, 0% schema description coverage, and no output schema, the description is incomplete. It doesn't explain the domain (e.g., what a 'generation record' is), behavioral traits, parameter meanings, or expected outcomes. For a tool that modifies data, this level of documentation is inadequate and leaves too much for the agent to guess.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 0%, meaning neither parameter ('generationId' and 'isPublic') has descriptions in the schema. The tool description doesn't add any parameter-specific information—it doesn't explain what 'generationId' refers to (e.g., an ID from another operation) or what 'isPublic' means in context (e.g., visibility to other users). With two undocumented parameters, the description fails to compensate for the schema's lack of documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '更新生成记录公开状态' (Update generation record public status) clearly states the verb (update) and resource (generation record public status), but it's somewhat vague about what 'generation record' refers to specifically. It doesn't distinguish this tool from sibling tools like 'clawhup_update_prompt' or 'clawhup_update_prompt_content', which also perform updates. The purpose is understandable but lacks specificity about the domain context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no indications of prerequisites, constraints, or comparisons to sibling tools (e.g., 'clawhup_get_generation' for reading or 'clawhup_create_generation_share' for sharing). Without such context, an agent must infer usage from the tool name and parameters alone, which is insufficient for optimal selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_update_promptCInspect
更新 Prompt 元信息。
| Name | Required | Description | Default |
|---|---|---|---|
| tags | No | ||
| title | No | ||
| promptId | Yes | ||
| outputType | No | ||
| description | No | ||
| timeoutSeconds | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. '更新' (update) implies a mutation operation, but the description doesn't specify whether this requires authentication, what permissions are needed, if it's idempotent, or what happens on failure (e.g., error handling). It also doesn't mention side effects like versioning or whether changes are reversible. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with a single sentence '更新 Prompt 元信息' (Update Prompt metadata), which is front-loaded and wastes no words. While it may be under-specified, it's not verbose or poorly structured—every word serves the core purpose without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (a mutation tool with 6 parameters, 1 required), no annotations, no output schema, and 0% schema description coverage, the description is incomplete. It doesn't provide enough context for the agent to understand how to use the tool effectively, what to expect in return, or handle edge cases. The description should explain more about the update operation, parameter interactions, and expected outcomes to be considered complete for this context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 0%, meaning none of the 6 parameters (promptId, tags, title, outputType, description, timeoutSeconds) are documented in the schema. The description '更新 Prompt 元信息' (Update Prompt metadata) only vaguely hints at what might be updated (metadata), but it doesn't explain any parameters, their purposes, or how they relate to 'metadata'. For example, it doesn't clarify that 'promptId' is required to identify the prompt, or what 'outputType' with its enum values means in context. The description fails to compensate for the lack of schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '更新 Prompt 元信息' (Update Prompt metadata) states a clear verb ('更新' - update) and resource ('Prompt 元信息' - Prompt metadata), which is better than a tautology. However, it's somewhat vague about what 'metadata' specifically entails compared to sibling tools like 'clawhup_update_prompt_content' (which updates content) or 'clawhup_prompt_detail' (which retrieves details). It doesn't explicitly distinguish itself from these siblings, leaving ambiguity about scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are several related tools (e.g., 'clawhup_update_prompt_content', 'clawhup_prompt_detail', 'clawhup_create_prompt'), but the description doesn't mention any of them or clarify the specific context for updating metadata versus other operations. This lack of differentiation leaves the agent without explicit usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clawhup_update_prompt_contentDInspect
更新 Prompt 模板内容。
| Name | Required | Description | Default |
|---|---|---|---|
| content | Yes | ||
| promptId | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It only states '更新 Prompt 模板内容' (Update Prompt template content), which implies a mutation operation but provides no details about permissions required, whether changes are reversible, rate limits, error conditions, or what happens to existing content. This is inadequate for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at just 6 Chinese characters ('更新 Prompt 模板内容'), which translates to 'Update Prompt template content'. It's front-loaded with the core action and resource. While it's under-specified, every character serves a purpose without waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is completely inadequate for a mutation tool with 2 required parameters, 0% schema description coverage, no annotations, and no output schema. It doesn't explain what the tool does beyond the obvious, provides no parameter guidance, no behavioral context, and no differentiation from similar tools. This leaves the agent with insufficient information to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning neither parameter (promptId, content) has descriptions in the schema. The description adds no information about what these parameters mean, their format, constraints, or examples. For a tool with 2 required parameters and no schema documentation, this represents a significant gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description '更新 Prompt 模板内容' (Update Prompt template content) is a tautology that restates the tool name 'clawhup_update_prompt_content' in Chinese. It specifies the verb 'update' and resource 'Prompt template content', but doesn't differentiate from sibling tools like 'clawhup_update_prompt' or 'clawhup_prompt_detail'. The purpose is clear but lacks distinction from alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context, or exclusions. With sibling tools like 'clawhup_update_prompt' and 'clawhup_prompt_detail', there's no indication of how this tool differs or when it should be selected.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!