DialogBrain
Server Details
AI-powered unified inbox with MCP tools for managing conversations, contacts, and knowledge across WhatsApp, Telegram, Instagram, Email, and LinkedIn.
- Status: Healthy
- Transport: Streamable HTTP
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.1/5 across 99 of 119 tools scored. Lowest: 2.9/5.
Most tools have clearly distinct purposes with detailed descriptions. However, the sheer number (119) introduces some potential confusion, especially between agent_handoff and agents_ask, though they serve different roles. Overall, an agent can reliably distinguish tools.
All tools follow a consistent snake_case pattern with a noun_verb structure (e.g., agents_create, messages_send, files_upload). No mixed conventions or unpredictable naming; the pattern is maintained across all 119 tools.
A surface of 119 tools is extremely large for any server. Even for a comprehensive platform like DialogBrain, it overwhelms both agents and users. Typical well-scoped servers expose 3-15 tools; this is far beyond that, making navigation and tool selection impractical.
The tool surface covers a vast range of functionalities (agents, knowledge, messaging, contacts, LinkedIn, etc.) with few obvious gaps. Missing features like contact deletion or thread archiving are minor; overall, the domain appears well-covered.
Available Tools
131 tools

agent_handoff
Delegate a multi-step task (research, composing messages, booking, scheduling) to the full agentic planner. Use when a user ask needs more than a direct answer. Returns final_answer for you to narrate in one short sentence. Do NOT re-trigger the same handoff if the tool_result has status timeout or error — acknowledge and offer to retry.
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Override the escalation model. Omit (recommended) to use the calling agent's configured model from settings; falls back to claude-sonnet-4-6 when no agent context. Ignored when `agent_id` is set — the target agent uses its own stored model. | |
| agent_id | No | Optional ID of another agent in the same workspace to delegate the task to. When set, the target agent runs with ITS OWN prompt, tools, and model; `task_description` becomes its user query. Spawns a new trace linked back to this trace via parent_trace_id (visible in the admin lineage card). Omit to run a sub-loop on the calling agent (default behaviour). | |
| task_description | Yes | Plain-language description of what the planner should accomplish. Include everything the planner needs: the user's goal, constraints, and any context already gathered in this voice call. | |
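
To make the call shape concrete, here is a hedged sketch using the Python `mcp` SDK over Streamable HTTP. The endpoint URL and the task text are placeholders, and reading `final_answer` out of the text content blocks is an assumption based on the description above, not a published output schema.

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

SERVER_URL = "https://example.com/mcp"  # placeholder DialogBrain endpoint

async def main() -> None:
    # Open a Streamable HTTP transport, then an MCP session on top of it.
    async with streamablehttp_client(SERVER_URL) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "agent_handoff",
                arguments={
                    # model and agent_id omitted: run a sub-loop on the
                    # calling agent with its configured model (the default).
                    "task_description": (
                        "Find a free 30-minute slot next week and draft an "
                        "invite to the guest; they prefer mornings."
                    ),
                },
            )
            # The final_answer is assumed to arrive as text content.
            for block in result.content:
                if getattr(block, "text", None):
                    print(block.text)

asyncio.run(main())
```

Per the description, a timeout or error status in the result should be acknowledged rather than retried with an identical handoff.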
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Given no annotations, the description fully compensates: discloses return value (`final_answer`), error handling (do not re-trigger), delegation behavior with `agent_id` (spawns new trace), and model override fallback logic.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with purpose, then usage notes and error handling. No redundant information, every sentence serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (multi-step delegation, return format, error scenarios), the description covers all necessary aspects. No output schema is provided, but the return value is explained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. Description adds valuable context: explains model override behavior when `agent_id` is set, describes trace linking for `agent_id`, and clarifies task_description should include all context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool delegates multi-step tasks to a planner and is for complex asks. It distinguishes from sibling tools by its unique purpose but does not explicitly differentiate from all siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use (when more than a direct answer is needed) and provides a clear don't-re-trigger condition for timeout/error results. Does not list alternative tools but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_add_file
Attach a file to this agent's private knowledge (agent-specific files, not shared with other agents).
Workflow:
1. Upload the file with files_upload (pass source_url for remote files)
2. Index it with files_ingest (pass the file_id)
3. Call this tool with agent_id + file_id
Returns chunk_count — shows 0 while still processing. Call agents.list_files later to see the final chunk count once indexing completes.
| Name | Required | Description | Default |
|---|---|---|---|
| file_id | Yes | file_id returned by files_upload or files_ingest | |
| agent_id | Yes | ID of the agent to attach the file to | |
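
The three-step workflow maps onto three sequential tool calls. A sketch assuming an initialized `session` (see the agent_handoff example above) and assuming each tool returns a single JSON text block; the field names follow the parameter docs, not a published output schema.

```python
import json

def _body(result):
    # Assumption: each tool returns one JSON-encoded text block.
    return json.loads(result.content[0].text)

async def attach_remote_file(session, agent_id: str, source_url: str) -> int:
    uploaded = _body(await session.call_tool(
        "files_upload", arguments={"source_url": source_url}))
    ingested = _body(await session.call_tool(
        "files_ingest", arguments={"file_id": uploaded["file_id"]}))
    attached = _body(await session.call_tool(
        "agents_add_file",
        arguments={"agent_id": agent_id, "file_id": ingested["file_id"]}))
    # chunk_count is 0 while indexing runs; check agents_list_files later.
    return attached.get("chunk_count", 0)
```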
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses that files are agent-specific and not shared, and explains the async behavior where chunk_count returns 0 until indexing completes. This adds meaningful behavioral context beyond the schema, though it could be more comprehensive (e.g., permissions, idempotency).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a numbered workflow and clear sections. It is concise (5 sentences) with no filler, front-loads the purpose, and every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with only 2 parameters and no output schema, the description is remarkably complete. It covers prerequisites, workflow, return value behavior (chunk_count 0 while processing), and follow-up actions, leaving no significant gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters described. The description adds workflow context (e.g., file_id from upload/ingest) but does not significantly enhance the parameter meanings beyond the schema. Per rubric, baseline is 3 for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Attach a file to this agent's private knowledge (agent-specific files, not shared with other agents).' This uses a specific verb ('attach') and resource ('file to agent's knowledge'), and distinguishes it from siblings like agents_remove_file and agents_list_files.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a detailed 3-step workflow (upload, index, attach) and notes that chunk_count shows 0 while processing, with a follow-up suggestion to use agents.list_files later. It implicitly tells when not to use (must have uploaded and ingested first) and gives explicit steps, fully guiding the agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_approve_draft
Approve a pending agent draft and send the message.
The draft will be sent to the conversation it was generated for. You can optionally edit the text before sending.
Use this when user says:
'Approve this draft'
'Send this reply'
'Approve and send'
'Looks good, send it'
IMPORTANT: This will send a message to a real person.
| Name | Required | Description | Default |
|---|---|---|---|
| draft_id | Yes | ID of the draft to approve | |
| edited_text | No | Optional edited response text (if user wants to modify before sending) | |
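
Because approval sends a message to a real person, a thin wrapper that demands explicit confirmation is a reasonable client-side pattern. A sketch assuming an initialized `session`; the draft ID is a placeholder supplied by the caller.

```python
async def approve_draft(session, draft_id: str,
                        edited_text: str | None = None,
                        confirmed: bool = False):
    # Sends a real message once approved, so require explicit opt-in.
    if not confirmed:
        raise ValueError("approval sends to a real person; confirm first")
    args = {"draft_id": draft_id}
    if edited_text is not None:
        args["edited_text"] = edited_text  # replaces the generated text
    return await session.call_tool("agents_approve_draft", arguments=args)
```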
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full responsibility. It discloses that the action sends a message to a real person and that text can be optionally edited. This is a critical behavioral trait for an irreversible action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with only 4 sentences plus a list of examples. It is well-structured, front-loading the main purpose, and every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with 2 parameters and no output schema, the description covers purpose, usage triggers, and a critical warning, making it complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema provides full coverage with descriptions for both parameters. The description adds context by mentioning optional editing, but does not significantly enhance understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Approve a pending agent draft and send the message.' It uses specific verbs and resources, and the list of example user phrases helps distinguish this from siblings like agents_reject_draft.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly lists example user phrases that trigger use, and it warns about sending to a real person. However, it does not explicitly state when not to use the tool or mention alternatives beyond the implied rejection sibling.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_ask
Send a message to an AI agent and get its response.
The agent runs with its configured prompt, tools, and knowledge. Use this to test agents or have them process a task.
Returns: {status: 'replied'|'silent', response_text, messages[], full_reply, model_used, tokens_*, send_mode, execution_mode}. messages[] carries each messages.send invocation the agent made (text, subject, reply_to_message_id, timestamp, message_id, attachments=[{file_id,name,mime}]). full_reply concatenates text only — attachment-only sends show up in messages but not full_reply. status='silent' iff both response_text is empty AND messages is empty.
Execution may take 10-60s depending on agent complexity.
| Name | Required | Description | Default |
|---|---|---|---|
| message | Yes | Message/goal to send to the agent | |
| agent_id | Yes | ID of the AI agent to ask | |
| send_mode | No | Send mode for the agent run: 'draft' = create drafts, 'auto' = send directly. Defaults to the agent's configured default_send_mode. Does NOT change execution_mode — that is fixed by the agent's config. | |
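
A sketch of calling the tool and handling the documented return shape, including the silent case. Assumes an initialized `session` and that the result arrives as one JSON text block; execution may take 10-60s, so a generous client timeout is advisable.

```python
import json

async def ask_agent(session, agent_id: str, message: str) -> str | None:
    result = await session.call_tool(
        "agents_ask",
        arguments={
            "agent_id": agent_id,
            "message": message,
            "send_mode": "draft",  # create drafts rather than send directly
        },
    )
    reply = json.loads(result.content[0].text)  # assumed JSON text block
    if reply["status"] == "silent":
        # silent iff response_text is empty AND messages[] is empty
        return None
    # full_reply concatenates text only; attachment-only sends appear in
    # messages[] but not in full_reply.
    return reply["full_reply"]
```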
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses execution time (10-60s), explains the return value in detail, and describes the agent's behavior (runs with configured prompt, tools, knowledge). It lacks explicit safety/permission info but is otherwise transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is somewhat lengthy but well-structured: purpose first, then return format details. The output details are dense, but each earns its place; little could be cut without losing information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description provides a complete return format, including edge cases like 'silent' status and attachment-only sends. It covers timing, parameters' effect on output, and execution mode. Thorough and sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so parameters are already described. The description adds value by clarifying send_mode's default behavior and its independence from execution_mode, going beyond the schema. No additional semantics needed for message and agent_id.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool sends a message to an AI agent and gets its response, and distinguishes it from sibling tools like agent_handoff by specifying 'test agents or have them process a task'. It uses specific verbs and resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this to test agents or have them process a task', providing clear usage context. It does not explicitly mention when not to use or list alternatives, but the sibling names imply distinction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_create
Create a new AI agent in the workspace.
Execution modes:
ai_assisted (default, recommended): Two-phase AI — fast pre-classifier (Haiku) for keyword filtering and simple replies, then full AI with tools for complex messages. Best for: auto-replies, group monitoring, keyword-based filtering.
agentic: Autonomous multi-step agent with planning and tool execution. Best for: complex scheduled tasks, multi-step automation.
rule_based: Simple pattern matching without AI.
For keyword filtering: use ai_assisted mode + set keywords in trigger conditions (free, deterministic) and/or auto_reply_rules (smart, LLM-based) via agents.update.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Name of the AI agent (1-100 characters) | |
| prompt_id | No | ID of the prompt to assign to this agent | |
| send_mode | No | Default send mode: 'auto' or 'draft' (default) | |
| description | No | Optional description of what this agent does | |
| execution_mode | No | Execution mode: 'rule_based', 'ai_assisted' (default), 'agentic', 'claude_channels', or 'voice' | |
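
A sketch of creating a keyword-filtering agent along the recommended path (ai_assisted mode, keywords wired up afterwards via agents_update). Assumes an initialized `session`; the name and description are placeholders.

```python
async def create_keyword_agent(session, name: str):
    result = await session.call_tool(
        "agents_create",
        arguments={
            "name": name,                     # 1-100 characters
            "execution_mode": "ai_assisted",  # two-phase AI (the default)
            "send_mode": "draft",             # review drafts before sending
            "description": "Auto-replies to keyword-matched messages",
        },
    )
    # Keywords go into trigger conditions and/or auto_reply_rules via
    # agents_update, per the keyword-filtering guidance above.
    return result
```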
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It explains the behavior of execution modes (two-phase AI, autonomous planning, simple pattern matching). However, it omits authorization needs, rate limits, side effects, and return value. Since no output schema exists, return format is unclear.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is fairly concise and front-loads the main purpose. The bullet-like list of execution modes uses clear formatting. Minor redundancy exists (e.g., mentioning default in both description and schema) but overall efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (5 params, 2 enums, no output schema), the description covers the main purpose, execution modes, and parameter context well. Missing return value and error handling, but the richness of mode explanations compensates.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds substantial value beyond the schema. It elaborates on execution modes with detailed explanations and best-use scenarios, which are not present in the schema's enum descriptions. This helps the agent select the correct mode.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Create a new AI agent in the workspace.' It specifies the action (create), resource (AI agent), and scope (workspace). This distinguishes it from sibling tools like agents_update (update) and agents_delete (delete).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit best-use context for each execution mode (e.g., 'Best for: auto-replies, group monitoring, keyword-based filtering' for ai_assisted). It also gives specific advice for keyword filtering to use ai_assisted mode. However, it does not explicitly state when not to use this tool or compare to alternatives like agents_add_file.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_delete
Permanently delete an AI agent.
WARNING: This cannot be undone. The agent and all its triggers will be removed.
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | Yes | ID of the agent to delete | |
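
Given that deletion is irreversible and cascades to triggers, a guard that forces explicit confirmation is a sensible client-side pattern. Assumes an initialized `session`.

```python
async def delete_agent(session, agent_id: str, confirmed: bool = False):
    # Permanent: removes the agent and all of its triggers.
    if not confirmed:
        raise ValueError("refusing to delete without explicit confirmation")
    return await session.call_tool("agents_delete",
                                   arguments={"agent_id": agent_id})
```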
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Given no annotations, the description bears full responsibility for behavioral disclosure. It explicitly warns that deletion is permanent and cannot be undone, and that all triggers will be removed, which are critical side effects. Minor omissions like return format or permissions are acceptable for a straightforward delete tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—two sentences with no filler. It front-loads the action and follows with a clear warning, making it easy for an agent to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete operation with a single parameter, the description covers purpose, irreversibility, and cascading deletion of triggers. It is adequate but could briefly mention the return value (e.g., success confirmation) for full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the single parameter is already well-documented. The description does not add extra meaning beyond the schema's 'ID of the agent to delete,' so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Permanently delete an AI agent,' using a specific verb and resource. It distinguishes this tool from siblings like agents_create, agents_get, agents_update, and agents_trigger_delete, which handle different operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description lacks guidance on when to use this tool versus alternatives. It does not mention prerequisites, confirmation steps, or conditions under which deletion is appropriate, leaving the agent without clear decision support.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_get
Get detailed information about a specific AI agent.
Returns full agent config including:
Execution configuration
Tool configuration
Knowledge configuration
Escalation configuration
Triggers list
Knowledge collections
Custom AI instructions (prompt_text)
Auto-reply rules override (auto_reply_rules)
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | Yes | ID of the AI agent to fetch | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry full burden for behavioral disclosure. It describes the return data but does not disclose behavioral traits such as permissions required, rate limits, or whether the operation is safe (non-destructive). The agent is left unaware of any side effects or access constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, with a brief introductory sentence followed by a structured bullet list. Every sentence adds value, though the list could be slightly more compact. No extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, no output schema), the description adequately explains the return value with a detailed list. However, it lacks any information about error conditions or access control, which would improve completeness. Overall, it is sufficient for a basic fetch operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the single parameter 'agent_id' is described as 'ID of the AI agent to fetch'. The description does not add additional meaning beyond the schema, so it meets the baseline without enhancement.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get detailed information about a specific AI agent' with a specific verb and resource. The bullet list of returned fields further clarifies the scope. Among siblings, this tool is clearly distinct as a read operation, unlike agents_create, agents_update, or agents_delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for fetching full agent details by ID but does not explicitly state when to use it versus alternatives like agents_list or agents_trace_get. There is no guidance on prerequisites or exclusions, leaving the agent to infer context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_silence
End this turn without sending any message. Use when the thread is owned by a human operator after job.escalate, when the guest is self-resolving, when the message is a duplicate, or for observation-only turns. Calling this tool is the ONLY correct way to stay silent — narrated silence text (e.g. '(Staying silent…)', 'Internal:…') would be delivered to the guest verbatim.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | Yes | Free-form explanation for admin audit. Stored in trace_tool_executions.tool_params (ClickHouse String; reason filters are scan-only). | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description covers the key behavioral trait: ending the turn silently. It warns that narrated silence would be sent verbatim. It does not mention any side effects or required permissions, but for a simple silent action, the transparency is sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no redundancy. The first sentence states the primary action, and the second provides use cases and a critical warning. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, no output schema, no nested objects), the description is complete. It covers purpose, usage conditions, and parameter behavior. No missing details are apparent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'reason' has a description that adds value beyond the schema: it explains the field is free-form, stored for audit in trace_tool_executions.tool_params, and that reason filters are scan-only. Schema coverage is 100%, but the extra context justifies a higher score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'End this turn without sending any message.' It specifies the verb (end) and resource (turn), and the phrase 'ONLY correct way to stay silent' distinguishes it from potentially confusing alternatives like narrated silence text.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit usage guidance is provided: 'Use when the thread is owned by a human operator after job.escalate, when the guest is self-resolving, when the message is a duplicate, or for observation-only turns.' It also warns against using narrated silence text, making the conditions for use very clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_list
List all AI agents configured in the workspace.
Returns agents with their basic info, trigger count, and knowledge collection count.
Use this to:
See all configured AI agents
Filter by enabled/disabled status
Get agent IDs for further operations
| Name | Required | Description | Default |
|---|---|---|---|
| enabled | No | Filter by enabled status (true = enabled only, false = disabled only, omit = all) | |
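
A sketch of the "get agent IDs for further operations" use case. Assumes an initialized `session`, a JSON text block in the result, and an `id` field per agent; all three are assumptions, since no output schema is published.

```python
import json

async def enabled_agent_ids(session) -> list[str]:
    result = await session.call_tool(
        "agents_list",
        arguments={"enabled": True},  # omit to list all agents
    )
    agents = json.loads(result.content[0].text)  # assumed JSON list
    return [a["id"] for a in agents]  # assumed field name
```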
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It implies a read-only list operation but does not explicitly state no side effects. It describes the output structure, which adds some transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences plus a bullet list, front-loaded with main purpose, concise and no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers essential aspects: what it does, what it returns, how to filter. Lacks mention of pagination or limits, but for a simple list tool it is mostly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%. The description mentions filtering by enabled status but does not add significant meaning beyond what the schema already provides. Baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists all AI agents in the workspace, specifying the returned data (basic info, trigger count, knowledge count). It distinguishes from siblings like agents_get (single agent) and agents_list_drafts (drafts).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit use cases: see all agents, filter by enabled/disabled status, get IDs for further operations. It does not explicitly state when not to use it, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_list_drafts
List pending agent drafts awaiting approval.
Shows drafts that have been generated by AI agents but not yet sent. Each draft includes:
Thread/conversation info
Trigger message (what prompted the reply)
Generated response text
Creation time and expiration
Use this when user asks:
'Show pending agent drafts'
'What messages are waiting for approval?'
'List drafts to approve'
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of drafts to return | |
| thread_id | No | Filter by specific thread ID (optional) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses that drafts have expiration, creation time, and are unsent, but does not mention any safety or authorization needs. Given the read-only nature of listing, it is adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and well-structured: a clear purpose line, bullet points of included fields, and example queries. No fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains the return fields (thread info, trigger, response, creation time, expiration) well, compensating for the lack of output schema. It does not cover pagination or ordering, but is otherwise complete for a list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage for both parameters (limit and thread_id). The description adds no new meaning beyond what is already in the schema, so baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'List pending agent drafts awaiting approval' and provides example user queries. It unambiguously identifies the tool's function and distinguishes it from siblings like agents_approve_draft and agents_reject_draft.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this when user asks...' and lists specific queries. It implies context for use, though it does not explicitly state when not to use it or name alternatives beyond the general sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_list_files
List files directly attached to this agent (agent-specific files, not shared collections).
Returns file_id, title, status, and chunk_count for each file. chunk_count shows how many indexed chunks were created — 0 means the file is still processing.
Use agents.add_file to attach a new file, or agents.remove_file to detach one.
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | Yes | ID of the agent whose files to list | |
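
Since chunk_count stays at 0 while a file is still processing, a small polling loop is the natural follow-up to agents_add_file. A sketch assuming an initialized `session`; the JSON shape and field names (`file_id`, `chunk_count`) are assumptions drawn from the description.

```python
import asyncio
import json

async def wait_for_indexing(session, agent_id: str, file_id: str,
                            attempts: int = 10, delay_s: float = 5.0) -> int:
    # chunk_count == 0 means the file is still processing, so poll.
    for _ in range(attempts):
        result = await session.call_tool(
            "agents_list_files", arguments={"agent_id": agent_id})
        files = json.loads(result.content[0].text)  # assumed JSON list
        match = next((f for f in files if f["file_id"] == file_id), None)
        if match and match["chunk_count"] > 0:
            return match["chunk_count"]
        await asyncio.sleep(delay_s)
    return 0  # still processing after all attempts
```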
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes return fields (file_id, title, status, chunk_count) and explains that chunk_count=0 means still processing. No annotations provided, so this is valuable behavioral disclosure. Does not explicitly state read-only, but it's implied by 'list'.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences plus return field explanation and usage note. Front-loaded with primary function, no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one param, no output schema), the description covers purpose, scope, return fields, and related tools. It is complete for an agent to select and use this tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter agent_id, so baseline score applies. The description does not add additional meaning or context beyond the schema's description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it lists files attached to a specific agent, distinguishing from shared collections. The verb 'list' and resource 'files attached to this agent' are explicit and differentiate from sibling tools like collections_list_files.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides context that it lists agent-specific files (not shared collections) and mentions related tools for add/remove (agents.add_file, agents.remove_file). Lacks explicit when-not or alternative comparisons, but the distinction is clear enough for typical use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_prompt_history
List past versions of an agent's prompt_text. Every edit to the agent's prompt is snapshotted to an append-only table — use this tool to browse history, find a prior known-good version, and copy it into agents.prompt_restore.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max versions to return (1-200, default 50) | |
| agent_id | Yes | ID of the agent | |
| before_version | No | Cursor: return versions strictly below this version_number | |
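
The before_version cursor supports walking the full history in pages. A sketch assuming an initialized `session` and a JSON list of versions carrying `version_number`; both are assumptions, since no output schema is published.

```python
import json

async def full_prompt_history(session, agent_id: str) -> list[dict]:
    versions: list[dict] = []
    cursor = None
    while True:
        args = {"agent_id": agent_id, "limit": 200}
        if cursor is not None:
            args["before_version"] = cursor  # strictly below this version
        result = await session.call_tool("agents_prompt_history",
                                         arguments=args)
        page = json.loads(result.content[0].text)  # assumed JSON list
        if not page:
            return versions
        versions.extend(page)
        cursor = min(v["version_number"] for v in page)
```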
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description states that 'every edit to the agent's prompt is snapshotted to an append-only table,' disclosing the immutable, append-only nature of the history. It implies read-only behavior, which is sufficient given no annotations. No mention of auth needs or rate limits, but acceptable for a browsing tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, followed by usage guidance. Every word is meaningful; no redundancy. Efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema is provided, and the description does not specify the return format (e.g., fields returned like version_number, prompt_text, timestamp). It mentions 'list past versions' but lacks details on what agents can expect, which is important for an API tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions (agent_id, limit, before_version). The description adds minimal extra semantics beyond contextualizing the parameters as part of a browsing history. Baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'List past versions of an agent's `prompt_text`.' It uses a specific verb (list) and resource (past versions of prompt_text), and differentiates from the sibling tool `agents.prompt_restore` by mentioning the intended workflow.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a clear use case: 'use this tool to browse history, find a prior known-good version, and copy it into `agents.prompt_restore`.' It gives context for when to use it but does not explicitly state when not to use it or list alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_prompt_restore
Restore a past version of an agent's prompt_text by version_number. Creates a new version pointing at the restored content — history is preserved. Use agents.prompt_history first to find the version_number you want.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional: why this restore is happening (shows up in history UI) | |
| agent_id | Yes | ID of the agent | |
| version_number | Yes | The version_number to restore (get it from agents.prompt_history) | |
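
A sketch of the documented history-then-restore flow, assuming an initialized `session`; the version number would come from agents_prompt_history, and the reason text is a placeholder.

```python
async def restore_prompt(session, agent_id: str, version_number: int):
    # Non-destructive: creates a new version pointing at the restored
    # content, so history is preserved.
    return await session.call_tool(
        "agents_prompt_restore",
        arguments={
            "agent_id": agent_id,
            "version_number": version_number,
            "reason": "roll back to last known-good prompt",  # history UI
        },
    )
```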
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that restore creates a new version and preserves history, indicating non-destructive behavior, which is important since no annotations are provided.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three well-structured, front-loaded sentences without any extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, outcome, and prerequisite adequately for this simple restore tool, with no missing elements given the lack of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Adds context beyond the schema: explains reason parameter's purpose ('shows up in history UI') and that version_number comes from prompt_history.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it restores a past version of an agent's prompt_text by version_number, distinguishing it from sibling tools like agents_prompt_history and prompts_prompt_restore.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly instructs to first use agents.prompt_history to obtain the version_number, providing clear guidance on prerequisite steps.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_reject_draft
Reject a pending agent draft without sending.
The draft will be marked as rejected and won't be sent. Use this when the generated response isn't appropriate.
Use this when user says:
'Reject this draft'
'Don't send this'
'Cancel this reply'
'Delete this draft'
'This response is wrong'
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional reason for rejection (for logging/feedback) | |
| draft_id | Yes | ID of the draft to reject | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that the draft will be marked as rejected and won't be sent, which is basic behavioral information. However, without annotations, it lacks details on permanence, permissions, or side effects. The description is adequate for a simple action but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is extremely concise: two short sentences for the main function and a bullet list of use cases. No redundant information. Front-loaded with the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and the simple nature of the tool (reject a draft), the description covers the purpose and usage adequately. Could mention what the tool returns or any confirmation, but this is not critical for a basic rejection action.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage with descriptions for both parameters. The description does not add additional meaning or clarify usage beyond what the schema provides. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the action (reject) and resource (pending agent draft). Distinguishes itself from sibling tools like agents_approve_draft by its focus on rejection. The verb and resource are specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit example user phrases that trigger use, such as 'Reject this draft' and 'Don't send this'. It also gives a general condition ('when the generated response isn't appropriate'). Does not mention alternatives like agents_approve_draft, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_remove_file
Remove a file from this agent's private knowledge.
The file itself is not deleted — it's just detached from this agent. Use agents.list_files to find the file_id to remove.
| Name | Required | Description | Default |
|---|---|---|---|
| file_id | Yes | ID of the file to detach (from agents.list_files) | |
| agent_id | Yes | ID of the agent to remove the file from | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description clarifies a key behavioral trait: the file is detached, not deleted. However, with no annotations provided, it lacks details on permissions, side effects, or immediate impact on the agent's knowledge, which would make it more transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long, front-loads the purpose, and every sentence adds value. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple detach operation, the description covers purpose, behavioral nuance, and a prerequisite. It lacks error handling or post-condition details, but these are not critical for this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already has 100% description coverage for both parameters. The description adds marginal value by referencing agents.list_files for file_id, but does not provide additional semantic meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool removes a file from an agent's private knowledge. It distinguishes itself from siblings like agents_add_file by specifying detachment rather than deletion, and references agents.list_files for finding the file_id.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides a direct usage hint to use agents.list_files to obtain the file_id. While it does not explicitly state when not to use this tool, the context of removing a file is straightforward and the sibling tools (add, list) make the usage clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_simulate_inbound
Replay an inbound message on a thread through the real trigger pipeline and return what would have happened. The router auto-picks the winning enabled agent + trigger by priority/specificity (same logic as production). By default send_mode='draft' so no real message is sent; pass send_mode='auto' on a test account to let the matched agent actually deliver (drafts get overwritten by the next draft, so 'auto' is the only way to verify Telegram/email delivery end-to-end).
Use to verify routing for a thread: which agent answers, which trigger wins, or — when nothing matches — the structured skip reason. Pass blockchain_tx_data instead of message_text to simulate a blockchain:transfer event on the thread.
Returns: {matched: true, matched_agent: {id, name, execution_mode}, matched_trigger: {id, trigger_type, conditions, specificity_score}, routing_reason, response_text, messages[], execution_mode, send_mode, model_used, tokens_input, tokens_output, latency_ms, rag_queries_made, rag_results_used} on a hit, or {matched: false, skip_reason, simulator_warnings} on a miss.
| Name | Required | Description | Default |
|---|---|---|---|
| send_mode | No | How the matched agent should deliver its reply. 'draft' (default, safe) creates a draft only — no real send, no idempotency key. 'auto' lets the agent deliver through the channel adapter exactly as it would in production — use this on a test account to verify Telegram/email delivery end-to-end. Drafts get overwritten by the next draft on the thread, so 'auto' is required when you want to see the message persisted. | draft |
| thread_id | Yes | Thread ID to route the simulated event from. Must belong to the API key's workspace. | |
| message_text | No | Inbound message body to simulate. Defaults to '[MCP simulation test]' when omitted. | |
| blockchain_tx_data | No | When set, simulate a blockchain:transfer event instead of a channel:message:new event. Expected keys: chain, to_address / from_address, tx_hash. | |
| attachment_file_ids | No | Optional list of workspace file IDs to attach to the simulated inbound message — same shape as a real Telegram message with image/document attachments. Use this to test agent behavior on incoming messages that carry images (e.g. logos for invoices) or documents the agent must reference. File IDs must belong to the API key's workspace. | |
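
A sketch of the routing-verification use case, assuming an initialized `session` and a JSON text block shaped like the documented return; the thread ID and message are placeholders.

```python
import json

async def verify_routing(session, thread_id: str) -> None:
    result = await session.call_tool(
        "agents_simulate_inbound",
        arguments={
            "thread_id": thread_id,
            "message_text": "Do you have availability this weekend?",
            # send_mode omitted: defaults to 'draft', so nothing is sent.
        },
    )
    sim = json.loads(result.content[0].text)  # assumed JSON body
    if sim["matched"]:
        print(f"agent={sim['matched_agent']['name']} "
              f"trigger={sim['matched_trigger']['trigger_type']} "
              f"reason={sim['routing_reason']}")
    else:
        print(f"no match: {sim['skip_reason']}")
```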
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It explains that send_mode defaults to 'draft' (no real send, no idempotency key) while 'auto' delivers through the channel adapter, and that routing uses production logic. It also details both hit and miss return structures, including skip_reason and simulator_warnings.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is well-structured with a front-loaded purpose paragraph, followed by usage guidance and return details. While it contains many details, each sentence serves a purpose. Slightly lengthy but justified by complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, description provides full return structures for both hit and miss. It covers edge cases (skip reason, warnings) and multiple simulation types. For a 4-parameter simulation tool, this is exceptionally complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. Description adds value by explaining thread_id must belong to workspace, message_text default, blockchain_tx_data expected keys, and attachment_file_ids shape/usage (e.g., for logos or documents). This meaningfully extends the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool replays an inbound message through the real trigger pipeline to test routing. It distinguishes itself by defaulting to 'draft' mode (no real send unless 'auto' is explicitly requested) and mentions two simulation types (message_text or blockchain_tx_data). No sibling tool duplicates this functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use to verify routing for a thread' and mentions when no match returns skip reason. Does not explicitly state when not to use it, but the context (simulation vs real send) is implied. Could be improved by naming alternatives like agents_ask or actual send tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_task_complete
Report that a Claude Code agent task has been completed. Call this when you finish processing an agent_task from DialogBrain.
| Name | Required | Description | Default |
|---|---|---|---|
| success | Yes | Whether the task completed successfully | |
| summary | No | Brief summary of what was done | |
| trace_id | Yes | Trace ID from the agent task event |
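As a minimal illustration, a completion report is just three fields. The values below are placeholders, and the `session` object is the one from the earlier connection sketch.

```python
async def report_done(session) -> None:
    # Hypothetical completion report for a finished agent_task.
    await session.call_tool(
        "agents_task_complete",
        {
            "trace_id": "trace_01ABC",  # placeholder; comes from the task event
            "success": True,
            "summary": "Drafted and sent the follow-up email to the lead.",
        },
    )
```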
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must carry burden. It states it 'reports completion' but doesn't disclose side effects, idempotency, or return behavior. Minimal transparency, but acceptable for a simple reporting tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no redundant information. Front-loaded with action and resource.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with 3 params, no output schema, and clear purpose, the description is adequate. Could mention if it is a one-way notification, but not critical.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. Description adds no extra context beyond what the schema already provides for each parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'Report that a Claude Code agent task has been completed' – clear verb and resource. It distinguishes from sibling tools like agents_create or agents_get by focusing on completion reporting.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Call this when you finish processing an agent_task from DialogBrain' – clear when to use. Does not mention when not to use or alternatives, but context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_trace_get
Fetch the full execution detail for a single trace — tool executions, events timeline, LLM call spans (with error_message on failures).
Use after agents.traces_list identifies a specific trace of interest (failed run, slow run, unexpected outcome).
By default LLM system_prompt and prompt_messages are stripped — set include_llm_bodies=true to fetch them when diagnosing prompt engineering issues (emits a WARNING audit log). Set full=true to disable all field truncation. completion_text on failed LLM calls is always returned (capped at 8 KB).
| Name | Required | Description | Default |
|---|---|---|---|
| full | No | Disable all field truncation. Escape hatch for a human operator. | |
| agent_id | Yes | Expected agent_id — used for scope validation. Mismatch returns not_found. | |
| trace_id | Yes | Trace identifier returned by agents.traces_list. | |
| include_llm_bodies | No | Include system_prompt and prompt_messages in LLM spans. Audited at WARNING level. |
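A plausible diagnostic flow, sketched under the same assumptions as the earlier connection example (placeholder IDs, `session` already initialized): fetch the stripped trace first, and opt into LLM bodies only when prompt debugging requires it, since that emits a WARNING audit log.

```python
async def inspect_trace(session, agent_id: int, trace_id: str):
    # Default fetch: system_prompt and prompt_messages are stripped.
    trace = await session.call_tool(
        "agents_trace_get",
        {"agent_id": agent_id, "trace_id": trace_id},
    )

    # Escalate only when diagnosing prompt engineering issues; the server
    # records a WARNING-level audit log entry for this call.
    trace_with_prompts = await session.call_tool(
        "agents_trace_get",
        {"agent_id": agent_id, "trace_id": trace_id, "include_llm_bodies": True},
    )
    return trace, trace_with_prompts
```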
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, description carries full burden and excels: discloses default stripping of LLM bodies, that include_llm_bodies=true emits WARNING audit log, full=true disables truncation, and failed LLM completion_text is always returned (capped). Also mentions agent_id mismatch returns not_found.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, no redundancy. First sentence defines purpose, second gives usage, third explains defaults and flags with side effects. Efficient and well-organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description explains the return only through error_message and completion_text; it covers key behavioral aspects and parameter details but lacks the full return format, which would aid agent understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (baseline 3). Description adds value: for include_llm_bodies, adds 'when diagnosing prompt engineering issues'; for full, adds 'Escape hatch for a human operator'; for trace_id, specifies it comes from agents_traces_list; for agent_id, notes 'scope validation'. Adds behavioral context beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'Fetch' and resource 'full execution detail for a single trace', enumerating contents like tool executions, events timeline, LLM call spans with error_message. Distinguishes from sibling agents_traces_list by specifying it gets one trace's details.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises to use after agents_traces_list identifies a trace of interest (failed, slow, unexpected). Provides context for when to use include_llm_bodies and full flags. No explicit exclusions or alternatives, but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_traces_list
List recent execution traces for an agent — the same data as /admin/requests, scoped to one agent and readable by an LLM.
Use this when an agent call timed out, drafted the wrong response, or you want to know which tool/LLM call burned the latency. Pair with agents.trace_get for full detail on a specific trace.
Filters: status, success, source (single value or comma-separated: agent,voice), date_from/date_to (ISO-8601), pagination via limit/offset.
Returns returned_count, dropped_on_page (should be 0 — positive means the backend agent_id predicate let something through), and has_more. Edge case: a raw page of all-dedup-dropped rows yields returned_count=0, has_more=true; re-call with offset += limit.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max rows per page (1–100). | |
| offset | No | Rows to skip for pagination. | |
| source | No | Filter by trace source. Single value or comma-separated, e.g. 'agent,voice'. Values: agent / auto_reply / agentic / outreach / voice. Note: source='agent' also matches voice traces today (known upstream bug). | |
| status | No | Filter by status. | |
| date_to | No | ISO-8601 upper bound on created_at. | |
| success | No | Filter to succeeded (true) or failed (false) runs only. | |
| agent_id | Yes | Agent ID to pull traces for (must belong to your workspace). | |
| date_from | No | ISO-8601 lower bound on created_at, e.g. '2026-04-10T00:00:00Z'. |
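The dedup edge case above implies a specific pagination loop: keep advancing by `limit` whenever `has_more` is true, even if a page returns zero rows. A sketch under the same client assumptions as before; the `structuredContent` access is an assumption about the client's result shape.

```python
async def list_failed_runs(session, agent_id: int) -> list[dict]:
    """Page through failed traces, tolerating all-dedup-dropped pages."""
    limit, offset, traces = 50, 0, []
    while True:
        page = await session.call_tool(
            "agents_traces_list",
            {
                "agent_id": agent_id,
                "success": False,  # failed runs only
                "limit": limit,
                "offset": offset,
            },
        )
        data = page.structuredContent or {}  # result shape assumed
        traces.extend(data.get("traces", []))
        if not data.get("has_more"):
            return traces
        # A page can return returned_count=0 with has_more=true when every
        # raw row was dedup-dropped; the fix is simply offset += limit.
        offset += limit
```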
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavior: pagination edge case (returned_count=0, has_more=true when rows deduplicate), known upstream bug (source='agent' also matches voice), and filter details (comma-separated values, ISO-8601 dates).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-organized: first sentence states purpose and scope, then usage scenarios, then filter details, then return fields and edge cases. Every sentence adds value; no fluff or repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description comprehensively covers filters, pagination behavior, edge cases, and known bugs. It explains return fields adequately for the tool's purpose without overloading.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds extra context: explains filter usage (e.g., comma-separated source values, ISO-8601 format for dates), pagination parameters (limit/offset), and the meaning of return fields (returned_count, dropped_on_page, has_more). This goes beyond the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists recent execution traces for an agent, scoped to one agent, and explains it is the same data as /admin/requests, readable by an LLM. It distinguishes from siblings like agents.trace_get and agents_traces_stats by specifying use cases and pairing recommendations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly states when to use the tool (e.g., agent call timed out, wrong draft, latency debugging) and recommends pairing with agents.trace_get. While it does not explicitly list when not to use it, the context is clear and practical.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_traces_stats
Aggregated trace statistics for one agent over the last N days — total runs, success rate, avg duration, error breakdown, top tools used, runs-per-day histogram.
Use this when you want a bird's-eye view of an agent's health before diving into individual traces with agents.traces_list / agents.trace_get. Scoped to the target agent (exact match, no substring bleed). days is capped at 30 — matches the ClickHouse request_traces TTL.
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | Rolling window in days (1–30). | |
| agent_id | Yes | Agent ID to compute stats for (must belong to your workspace). |
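A one-call health check might look like this; same assumed `session`, placeholder agent ID.

```python
async def agent_health(session, agent_id: int):
    # Bird's-eye view first; drill into individual traces only if the
    # success rate or error breakdown looks off. The server caps days at 30.
    return await session.call_tool(
        "agents_traces_stats",
        {"agent_id": agent_id, "days": 14},
    )
```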
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description adds behavioral context: days capped at 30 due to ClickHouse TTL, exact match on agent_id. It implies read-only statistics but doesn't explicitly state permissions or side-effects, which is a minor gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first lists all statistics offered, second provides usage guidance and constraints. No redundant words, clearly structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description lists the statistics but doesn't detail the exact structure (e.g., histogram format, error breakdown keys). However, it is reasonably complete for a stats tool, especially with the TTL and scoping details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters. The description adds value beyond schema: 'exact match, no substring bleed' for agent_id and 'capped at 30 — matches ClickHouse request_traces TTL' for days.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides aggregated trace statistics (total runs, success rate, avg duration, error breakdown, top tools, histogram) for one agent, distinguishing it from sibling trace tools by framing it as a bird's-eye view.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says to use this for a bird's-eye view before diving into individual traces with agents_traces_list/agents_trace_get, and notes the exact match scoping and 30-day cap due to TTL.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_trigger_create
Create a new trigger for an AI agent.
Triggers determine when the agent activates.
Trigger types:
- incoming_message: Activates on new incoming messages
- schedule: Activates on a schedule
- webhook: Activates on webhook events
- event: Activates on system events
| Name | Required | Description | Default |
|---|---|---|---|
| enabled | No | Whether the trigger is enabled (default: true) | |
| agent_id | Yes | ID of the agent to create a trigger for | |
| priority | No | Trigger priority — lower numbers run first (default: 100) | |
| send_mode | No | Send mode override for this trigger (null = inherit from agent) | |
| conditions | No | Trigger conditions (JSON). Supported fields for incoming_message: - keywords: ["pricing","demo"] — message must contain keyword(s) (free, no LLM cost) - keyword_match: "any" (default, OR) or "all" (AND) - channel_types: ["telegram","whatsapp","livechat_voice","twilio_voice","telegram_voice","voice",...] — filter by channel. For voice, use EITHER the three per-channel keys (scoped) OR "voice" alone (wildcard matching all three) — mixing them is redundant. Per-channel keys: "livechat_voice" (web widget), "twilio_voice" (PSTN inbound), "telegram_voice" (Telegram p2p calls) - context_types: ["dm","group","channel","livechat"] — filter by chat type - group_mode: "mentions_only" or "questions" — for group chats - channel_account_ids: ["123"] — restrict to specific accounts - folder_ids: [5,10] — restrict to threads in folders - ai_tag_ids: [1,2] — restrict to threads with AI tags - ai_filter_ids: [1,2] — semantic intent filters (message matched via embedding similarity, works in noisy groups) - ai_filter_mode: "any" (default, OR) or "all" (AND) — how multiple AI filters combine - ai_filters: [{id: 1}, {name: "...", description: "..."}] — shorthand: reference existing by id or create inline (calls Voyage embedding API). If a filter with the same name already exists, it is reused by id. Prefer referencing existing filters by id when available. Use ai_filters.create + ai_filters.test for fine-tuning before assigning. - contact_states: ["active"] — filter by contact state - cooldown_seconds: 30 — min gap between runs per thread - max_runs_per_thread_per_hour: 5 — rate limit Supported fields for job_completed (proactive callback when a delegated job finishes): - source_agent_id: <int> — fire only when this agent's job completed - source_agent_slug: <str> — alternate to source_agent_id - job_type: "agentic_session" — match a specific job type (default: any) - outcome: ["completed"] | ["escalated"] | ["completed","escalated"] — default ["completed"] - min_duration_seconds: <int> — skip very-short jobs (noise filter) - thread_filter: {thread_ids: [<int>...]} — restrict to specific threads | |
| thread_ids | No | Restrict this trigger to specific threads (chats) by their numeric thread IDs. When set, the trigger only fires for messages in these threads. Maps to conditions.thread_filter.thread_ids. | |
| trigger_type | Yes | Type of trigger: 'incoming_message', 'incoming_call', 'voice_transcript', 'schedule', 'webhook', 'event', 'blockchain_event', or 'job_completed' |
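To make the conditions object concrete, here is a hypothetical payload for a keyword-scoped incoming_message trigger. Every value is an assumption; the field names mirror the conditions documentation above.

```python
# Fire on pricing keywords in Telegram/WhatsApp DMs, rate-limited per thread.
pricing_trigger = {
    "agent_id": 7,                        # placeholder agent
    "trigger_type": "incoming_message",
    "priority": 100,                      # specialised/topical agent
    "conditions": {
        "keywords": ["pricing", "demo"],
        "keyword_match": "any",           # OR semantics (the default)
        "channel_types": ["telegram", "whatsapp"],
        "context_types": ["dm"],
        "cooldown_seconds": 30,           # min gap between runs per thread
        "max_runs_per_thread_per_hour": 5,
    },
}
# await session.call_tool("agents_trigger_create", pricing_trigger)
```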
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full behavioral disclosure burden. It explains trigger types and condition semantics in detail, including how fields like keyword_match and channel_types work. However, it does not mention side effects (e.g., immediate activation, dependencies on agent state) or the response structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear header, bullet points for trigger types, and a detailed conditions section. It is front-loaded with the main purpose. However, the conditions section is lengthy and could be more compact.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (7 parameters, nested conditions), the description covers the main functionality and condition details. However, it omits the return value (e.g., created trigger details) and does not mention prerequisites beyond the schema's required fields.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds significant value by explaining the conditions object in depth, including supported fields, defaults, and usage guidelines (e.g., for voice channels and AI filters). This far exceeds the schema's brief description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Create a new trigger for an AI agent' and lists trigger types and their purposes. The verb 'create' is specific, and the resource 'trigger' is well-defined, distinguishing it from sibling tools like agents_trigger_update or agents_trigger_delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context on what the tool does and the types of triggers, but does not explicitly state when to use it over alternatives (e.g., updating an existing trigger). No 'when not to use' or alternative recommendations are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_trigger_delete
Delete a trigger from an AI agent.
WARNING: This cannot be undone.
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | Yes | ID of the agent that owns this trigger | |
| trigger_id | Yes | ID of the trigger to delete |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the burden. It correctly warns that deletion is irreversible, which is critical for a destructive operation. However, it does not disclose prerequisites, side effects, or return behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences. The action is front-loaded, and the warning follows directly. No unnecessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete tool with two parameters and no output schema, the description covers the essential purpose and a key behavioral aspect (irreversibility). It could mention the return value or confirmation, but overall it is sufficiently complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already provides clear descriptions for both parameters (agent_id and trigger_id). The description adds no additional meaning beyond the schema, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action: 'Delete a trigger from an AI agent.' Uses a specific verb and resource, and distinguishes from sibling tools like agents_trigger_create and agents_trigger_update.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
There is no guidance on when to use this tool versus alternatives. A warning about irreversibility is given, but no mention of when to prefer this over other actions (e.g., disabling a trigger).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_trigger_update
Update an existing AI agent trigger.
All parameters are optional — only provided fields will be updated.
| Name | Required | Description | Default |
|---|---|---|---|
| enabled | No | Enable or disable this trigger | |
| agent_id | Yes | ID of the agent that owns this trigger | |
| priority | No | Trigger priority — lower numbers run first | |
| send_mode | No | New send mode override | |
| conditions | No | New trigger conditions (replaces existing). Same fields as trigger_create: keywords, keyword_match, channel_types, context_types, group_mode, channel_account_ids, folder_ids, ai_tag_ids, ai_filter_ids, ai_filter_mode, ai_filters: [{id: 1}, {name: "...", description: "..."}] — shorthand: reference existing by id or create inline (calls Voyage embedding API). If a filter with the same name already exists, it is reused by id. contact_states, cooldown_seconds, max_runs_per_thread_per_hour | |
| thread_ids | No | Restrict this trigger to specific threads (chats) by their numeric thread IDs. When set, merged into conditions.thread_filter.thread_ids. If conditions is also provided, thread_ids is merged into it. | |
| trigger_id | Yes | ID of the trigger to update | |
| trigger_type | No | New trigger type |
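A sketch of a partial update under the same assumptions; note the schema's warning that `conditions` replaces the stored conditions rather than merging into them.

```python
async def pause_trigger(session, agent_id: int, trigger_id: int):
    # Only the provided fields change; omitted fields keep their values.
    return await session.call_tool(
        "agents_trigger_update",
        {"agent_id": agent_id, "trigger_id": trigger_id, "enabled": False},
    )
```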
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry full burden. It only states update and partial update behavior, omitting details on idempotency, side effects (e.g., conditions replacement), authorization needs, or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with clear purpose and key usage note. No fluff, front-loaded, and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 8 parameters (including nested objects) and no output schema, the description is adequate but incomplete. It does not specify return values or behavior on failure, which would aid an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. The description adds value by clarifying that only provided fields are updated, which goes beyond the schema's individual parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Update an existing AI agent trigger,' specifying the action and resource. It is distinct from sibling tools like agents_trigger_create and agents_trigger_delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description notes that all parameters are optional, implying partial updates, but does not explicitly contrast with create or delete, nor does it provide when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agents_update
Update an existing AI agent's configuration.
All parameters are optional — only provided fields will be updated.
Use this to:
- Enable or disable an agent
- Change agent name or description
- Assign or detach a prompt
- Change default send mode
- Replace knowledge collections
- Update agent status
- Change agent priority for trigger matching (lower number = higher priority)
- Override which tools the agent can/can't call on triggered runs
- Override which context sections (situation, communication style, job state, conversation history, thread summary) the agent receives
- Opt into boilerplate prompt sections (safety guidelines, data confidentiality, factual accuracy) — all default OFF
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | New name for the agent | |
| model | No | Canonical source for which LLM the agent runs on. To switch models pass JUST this — do NOT also rewrite prompt_text (any 'duty model' section in the prompt is stale doc, not the config). | |
| status | No | Agent status: 'active', 'paused', or 'archived' | |
| enabled | No | Enable or disable the agent | |
| agent_id | Yes | ID of the agent to update | |
| priority | No | Agent priority for trigger matching. LOWER number = HIGHER priority (wins tiebreaks). Typical range 1-100. Fallback auto-reply agents use 10; specialised/topical agents use 100. When two agents match the same incoming message, the one with the lower priority number fires. | |
| prompt_id | No | Prompt ID to assign (null to detach) | |
| send_mode | No | Default send mode: 'auto' or 'draft' | |
| fast_model | No | Model for the fast-path responder (voice, text auto-reply, agent executor). Defaults to claude-haiku-4-5-20251001 when unset. Non-Anthropic models (deepseek-chat, gpt-4.1-nano, kimi-k2.6) do NOT use BYOK today — they use the system API key + credits. Pass null to revert to default. | |
| api_surface | No | OpenAI HTTPS endpoint for this agent's LLM calls (Phase 3a). 'chat_completions' (default, also when null) routes to /v1/chat/completions. 'responses' routes to /v1/responses — required for OpenAI native server tools (web_search, code_interpreter, image_generation, input_file PDFs). Capability still wins: agents whose tool list triggers the server_tool_responses_api substitution always route to Responses regardless of this setting. Ignored on non-OpenAI models (Anthropic, DeepSeek, Moonshot). | |
| description | No | New description for the agent | |
| prompt_text | No | DESTRUCTIVE — REPLACES the entire system prompt. Pass ONLY when the user explicitly asks to edit/rewrite the prompt. To READ the prompt use prompts.get. When updating other fields (model, name, …) OMIT this. To append, prompts.get first then concatenate. Pass null to revert to the linked template. | |
| voice_tools | No | Allow-list of tool IDs usable in voice mode (e.g. ['calls.end']). Empty list [] = explicit no-tools allow-list. Omit leaves unchanged. MCP cannot null-clear — use REST to revert to inherit from agent allowed_tools. | |
| denied_tools | No | Block-list of tool IDs the agent must not call on triggered runs. Applied after allowed_tools and default visibility. Empty list [] = clear the block-list. | |
| allowed_tools | No | Explicit allow-list of tool IDs this agent can call on triggered runs (e.g. ['messages.send', 'agents.handoff']). Empty list [] = clear the allow-list and fall back to system defaults. When set, only these tools (minus denied_tools) are exposed to the agent. Does NOT affect the My AI dropdown path. | |
| execution_mode | No | Execution mode: 'agentic', 'ai_assisted', 'rule_based', 'claude_channels', or 'voice' | |
| voice_greeting | No | Opening line the agent speaks when the call connects. Pass an empty string "" to clear. Omit or null leaves unchanged. | |
| voice_stt_model | No | Speech-to-text model: 'flux' (LLM-powered end-of-turn) or 'nova-3' (silence-based). Flux is more responsive; nova-3 is the fallback when your Deepgram plan lacks Flux. | |
| voice_tts_speed | No | TTS playback speed multiplier (0.5-2.0, default 1.0). Yandex/OpenAI/Cartesia only — ignored for Deepgram. | |
| voice_tts_voice | No | TTS voice id — provider-specific (e.g. 'aura-2-thalia-en' for Deepgram, 'alloy' for OpenAI, 'alena' for Yandex, Cartesia voice UUID). Pass null to revert to provider default. | |
| auto_reply_rules | No | Plain-English rules injected into the fast model's system prompt as a `## Rules` block. No reserved keywords — the fast model reads them as guidance and decides per turn whether to reply directly or escalate to the main model for tools. Example: '- If the user greets, reply "Hi! How can I help?"\n- If the user asks what you can do, reply with a 1-sentence summary\n- If the question needs live data (prices, stock, booking), escalate' Engagement filtering (SKIP) belongs in trigger `conditions` (keywords, ai_filters, channel_types, cooldown), NOT here — if a message should be ignored the trigger shouldn't have fired. Pass null to clear. | |
| voice_max_tokens | No | Max TTS tokens per voice reply (40-200, default 100). Lower = snappier, higher = more detail. | |
| include_job_state | No | Include current job state (active job context, tasks, notes) in the agent's prompt. | |
| include_situation | No | Include situation context (channel, sender info, trigger type) in the agent's prompt. | |
| voice_stt_keyterms | No | Domain-vocab bias for STT — names, product SKUs, etc. Passed verbatim as repeated `&keyterm=<w>` query params. Works on both Nova-3 and Flux. Prefer short phrases over full sentences. Empty list [] = no bias. Omit leaves unchanged. | |
| voice_stt_language | No | STT language hint. 'multi' (default) enables code-switching; singletons like 'en', 'ru', 'es' give higher accuracy when the caller language is known. Use 'multi' for bilingual callers. | |
| voice_tts_language | No | TTS language code, BCP-47 lite e.g. 'en', 'es', 'pt-BR' (Cartesia only, default 'en'). | |
| voice_tts_provider | No | Text-to-speech provider: 'deepgram' (default, Aura-2 EN-only), 'openai' (multilingual), 'yandex' (best Russian), or 'cartesia' (Sonic-3 ultra-low TTFB). | |
| voice_primary_model | No | Primary LLM for voice turns (e.g. 'gpt-4.1-mini', 'claude-haiku-4-5-20251001'). gpt-4.1-nano is too weak for reliable turn tracking; mini is the recommended floor. Pass null to revert to default. | |
| fast_prompt_override | No | Full fast-path prompt override. Placeholders substituted via .replace(): {message}, {history}, {rules}, {tools}, {output_contract}. agent.prompt_text is NOT injected into fast_prompt_override — include it yourself if you want it. Pass null to clear. | |
| voice_filler_enabled | No | Emit 'thinking' filler audio while tools run so the caller hears life on the line (default true). | |
| voice_max_tool_calls | No | Max tool calls per voice turn (0-10, default 3). | |
| voice_thinking_texts | No | Pool of phrases spoken while the agent sets up the turn before calling the LLM (e.g. ['Hmm', 'So', 'One sec']). Pre-rendered to PCM at call start; one is picked at random per turn so the agent doesn't repeat the same word. Pass [] to clear. Omit or null leaves unchanged. | |
| include_learned_style | No | Include learned communication style (per-contact tone, dormancy state) in the agent's prompt. | |
| include_thread_summary | No | Include condensed summary of older thread messages in the agent's prompt. | |
| include_factual_accuracy | No | Inject the Factual Accuracy block (~100 tokens, generic anti-hallucination rules) into the system prompt. Default OFF — skip if you write domain-specific accuracy rules in Instructions. Agentic mode only. | |
| knowledge_collection_ids | No | Replace all knowledge collections with these IDs (empty list = clear all) | |
| include_safety_guidelines | No | Inject the generic Safety Guidelines block (~80 tokens) into the system prompt. Default OFF — enable only if you don't already write safety rules in your Instructions. Agentic mode only. | |
| include_tool_call_history | No | Include the agent's own tool calls and results from the last 3 runs on this thread, compacted to IDs + top hits (~200-1000 tokens). Lets the agent recall file IDs, search hits, and decisions it already made across turns. Default ON. Agentic mode only. | |
| voice_endpointing_min_delay | No | Silence after end-of-utterance before agent replies (0.1-2.0s, default 0.3). Higher = fewer false interrupts; lower = snappier. | |
| voice_preemptive_generation | No | Speculatively start the LLM on STT partials so the agent begins responding before end-of-utterance. Matches LiveKit stock template. Default true. | |
| include_conversation_history | No | Include recent messages from this thread (up to 20) in the agent's prompt. | |
| include_data_confidentiality | No | Inject the Data Confidentiality block (~250 tokens, cross-contact PII isolation + prompt-injection defense) into the system prompt. Recommended for multi-tenant workspaces. Default OFF. Agentic mode only. | |
| voice_interruption_min_duration | No | Min caller speech duration to interrupt the agent (0.1-1.5s, default 0.25). Higher = ignore short fillers like 'uh-huh'. |
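The model field's warning deserves a concrete shape: switching models is a single-field update, with prompt_text deliberately omitted. Same assumed `session`, placeholder values.

```python
async def switch_model(session, agent_id: int, model: str):
    # Per the field docs: pass JUST `model`; never resend prompt_text
    # alongside it, since prompt_text REPLACES the entire system prompt.
    return await session.call_tool(
        "agents_update",
        {"agent_id": agent_id, "model": model},
    )
```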
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries full burden. It is highly transparent, noting that all parameters are optional and only provided fields are updated, and includes warnings for destructive actions like prompt_text replacement.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear opening sentence and bulleted list of use cases. It is appropriately sized for a complex tool with 43 parameters, with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity and lack of output schema, the description is highly complete, covering all update scenarios and parameter behaviors. It effectively compensates for missing annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds value by grouping use cases and explaining parameter intent (e.g., priority explanation), going beyond the schema's per-parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Update an existing AI agent's configuration.' It provides a specific verb ('Update') and resource ('AI agent'), distinguishing it from siblings like agents_create or agents_delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description lists many use cases with bullet points, providing clear context on when to use the tool. However, it does not explicitly mention when not to use it or compare with alternatives, though the sibling list offers implicit differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_filters_create
Create a new AI filter for semantic intent-based message matching.
AI filters use vector embeddings (via Voyage AI) to detect whether an incoming message matches a specific intent or topic. The filter's description is embedded as a reference vector at creation time. When a message arrives, its embedding is compared against this reference using cosine similarity.
The description field is the most important part — it becomes the reference embedding that all incoming messages are compared against. Write it as a clear statement of what kind of messages should match:
- 'Customer asking about pricing, subscription plans, or billing'
- 'User reporting a bug, crash, or unexpected behavior in the product'
- 'Inbound sales lead expressing interest in purchasing or trialing'
The threshold controls sensitivity: 0.5 is a balanced default, lower values (0.3) cast a wider net, higher values (0.8) require closer matches.
Note: This tool calls the Voyage AI embedding API to generate the reference vector.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Filter name — a short, human-readable label (max 100 chars) | |
| threshold | No | Cosine similarity threshold for a message to be considered a match. Range 0.1–1.0. Default 0.50. Lower values (e.g. 0.3) are more permissive and catch more messages. Higher values (e.g. 0.8) require closer semantic similarity. | |
| description | Yes | Reference text that defines what messages should match this filter. This text is embedded as a vector and used for cosine similarity comparison against all incoming messages. Be specific and descriptive — the quality of this text directly determines filter accuracy. E.g. 'Customer asking about pricing, subscription costs, or billing issues'. Max 500 chars. |
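A hypothetical creation payload; the description text is the part that becomes the reference embedding, so it is written to read like the messages it should catch.

```python
billing_filter = {
    "name": "Billing questions",  # short human-readable label
    "description": (
        "Customer asking about pricing, subscription plans, or billing issues"
    ),
    "threshold": 0.5,  # balanced default; 0.3 casts wider, 0.8 is stricter
}
# await session.call_tool("ai_filters_create", billing_filter)
```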
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Since no annotations are provided, the description carries the full burden. It discloses that the tool calls the Voyage AI embedding API to generate the reference vector, which is valuable. However, it does not mention cost, rate limits, authentication needs, or failure behavior. The threshold behavior is well explained, but details about the creation process (e.g., whether the filter activates immediately) are missing.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured: a concise summary, then technology explanation, usage tips, threshold guidance, and a note about the external API call. It is informative without being overly verbose. A slight trimming of some sentences could improve conciseness, but overall it is efficient for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has three parameters (all well-documented in schema) and no output schema, the description covers purpose, mechanism, parameter importance, and external API dependency. However, it does not explain the return value (likely a filter identifier) or state whether creation is asynchronous or synchronous. Prerequisites (e.g., API key) are not mentioned. For a creation tool, it is fairly complete but leaves some gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema already describes all three parameters (name, threshold, description) with 100% coverage. The description adds significant value by stressing the importance of the description field, providing concrete examples, and explaining threshold values and their impact. This goes beyond the schema and helps the agent craft proper inputs.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: creating an AI filter for semantic intent-based message matching. It explains the underlying technology (vector embeddings via Voyage AI) and how the filter works (embedding description as reference vector, cosine similarity comparison). This distinguishes it from sibling tools like ai_filters_list, ai_filters_update, which have different operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides good usage guidance: it explains when to create a filter, emphasizes the importance of the description field with examples, and advises on threshold tuning (0.5 default, lower for wider net, higher for precision). However, it does not explicitly state when not to use it or compare to alternatives like ai_filters_test or ai_filters_update, which would improve clarity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_filters_delete
Permanently delete an AI filter.
When to use:
- User wants to remove a filter they no longer need
This action cannot be undone. Any triggers that reference this filter by ID will no longer match it — review and update those triggers after deletion.
| Name | Required | Description | Default |
|---|---|---|---|
| filter_id | Yes | ID of the filter to delete |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It clearly states the action is permanent ('cannot be undone') and warns about triggers that reference the filter by ID no longer matching, disclosing important side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise with two short paragraphs, front-loading the purpose and then providing necessary warnings. Every sentence adds value with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete operation with one required parameter and no output schema, the description is complete. It covers the action, irreversibility, and impact on triggers, which is sufficient context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already has a description for 'filter_id' as 'ID of the filter to delete,' achieving 100% coverage. The description adds no further parameter details, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with 'Permanently delete an AI filter,' which uses a specific verb and resource, clearly distinguishing it from sibling tools like create, list, or update.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states 'When to use: - User wants to remove a filter they no longer need,' giving clear context for usage. It does not provide explicit when-not-to-use or alternative tools, but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_filters_list
List all AI filters for the current workspace.
AI filters are semantic intent-based message filters that use embeddings (vector representations) to detect whether an incoming message matches a specific intent or topic. Unlike keyword filters, they understand meaning: 'I need help with my order' and 'my package hasn't arrived' both match a 'shipping support' filter even without shared keywords.
Each filter stores a reference embedding of its description. When a message arrives, its embedding is compared via cosine similarity against the filter's reference vector. If the similarity exceeds the threshold, the filter matches.
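For intuition, the matching rule reduces to a cosine-similarity threshold test. The sketch below is illustrative only: the toy vectors stand in for real Voyage embeddings, which have far higher dimensionality.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings of the filter description and
# an incoming message; a real filter compares Voyage vectors instead.
reference = [0.20, 0.70, 0.10]
message = [0.25, 0.65, 0.05]

matched = cosine_similarity(reference, message) >= 0.5  # filter threshold
```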
When to use:
- Check which semantic filters already exist before creating a new one
- Get filter IDs for use in trigger conditions
- Review thresholds and active status of existing filters
Returns all filters with id, name, description, threshold, and is_active.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Since no annotations are provided, the description carries full burden. It correctly implies a read-only operation ('list all'), explains the embedding-based matching mechanics, and specifies the return fields. It could mention potential limitations like pagination but is transparent overall.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear purpose, technical explanation, usage guidance, and return value specification. Every sentence is informative and non-redundant. It is concise yet comprehensive.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description fully explains the return fields (id, name, description, threshold, is_active). It provides sufficient context about the tool's role within the AI filtering system, making it complete for an agent to understand and use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has zero parameters, and schema coverage is 100%. Per the guidelines, 0 parameters baseline is 4. The description adds context about what the tool returns and how filters work, but no parameter documentation is needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The first sentence clearly states the tool lists AI filters for the workspace. It then explains what AI filters are (semantic/intent-based) and how they differ from keyword filters, making the tool's purpose specific and distinct.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'When to use' section explicitly lists three concrete scenarios: checking existing filters before creation, getting IDs for triggers, and reviewing thresholds/active status. This provides strong guidance for when this tool is appropriate versus sibling tools like ai_filters_create.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_filters_test
Test a message against an AI filter to check whether it would match.
This tool embeds the provided message using Voyage AI and computes the cosine similarity between the message vector and the filter's stored reference vector. It returns the similarity score, whether the message would match (similarity >= threshold), and the filter's threshold value.
Use this to:
- Verify a filter works as intended before using it in a trigger
- Tune the threshold by testing borderline messages
- Debug why a message did or did not match a filter in production
Returns: {similarity: float, matched: bool, threshold: float}
Note: This tool calls the Voyage AI embedding API to embed the test message.
| Name | Required | Description | Default |
|---|---|---|---|
| message | Yes | The message text to test. This is embedded and compared against the filter's reference vector via cosine similarity. | |
| filter_id | Yes | ID of the filter to test against |
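Threshold tuning is naturally a loop over borderline messages. A sketch under the same client assumptions as earlier; the `structuredContent` access and the sample texts are assumptions.

```python
async def tune_threshold(session, filter_id: int, samples: list[str]) -> None:
    # Probe borderline messages and eyeball where the scores cluster
    # before adjusting the filter's threshold.
    for text in samples:
        result = await session.call_tool(
            "ai_filters_test",
            {"filter_id": filter_id, "message": text},
        )
        data = result.structuredContent or {}  # {similarity, matched, threshold}
        print(f"{data.get('similarity', 0.0):.2f}"
              f"  matched={data.get('matched')}  :: {text}")
```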
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully bears the responsibility of disclosing behavior. It reveals that the tool calls the Voyage AI embedding API, computes cosine similarity, and returns similarity score, match boolean, and threshold. It does not mention rate limits, authentication needs, or potential side effects, but as a read-only test tool, the disclosed information is sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: four sentences, a bulleted list, and a return format specification. It is front-loaded with the core purpose and efficiently organized, with no extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema, the description provides the return format and explains the entire testing process. It covers the tool's scope adequately. Minor omissions like error handling or prerequisite (filter must exist) are not critical for a well-defined test tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description adds technical context for the 'message' parameter (embedding and comparison), but the schema already describes both parameters adequately. The added value is marginal, not enough to raise the score above baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool tests a message against an AI filter to check for a match, explains the embedding and cosine similarity process, and uses specific verb+resource ('test message against filter'). It distinguishes itself from sibling tools like ai_filters_create by focusing on testing rather than CRUD operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides three specific use cases: verify filter, tune threshold, debug. However, it lacks explicit guidance on when not to use this tool (e.g., for creating or updating filters) and does not compare it to sibling testing-like tools such as agents_simulate_inbound. Nevertheless, the listed use cases offer clear context for invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_filters_update (A)
Update an existing AI filter's name, description, threshold, or active state.
When to use:
User wants to rename a filter
User wants to refine the filter description to improve match accuracy
User wants to adjust the similarity threshold (higher = stricter matching)
User wants to enable or disable a filter without deleting it
Provide only the fields you want to change. At least one field is required.
Note: If the description is changed, this tool calls the Voyage AI embedding API to re-generate the reference vector with the new description text.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | New filter name (max 100 chars, optional) | |
| filter_id | Yes | ID of the filter to update | |
| is_active | No | Enable (true) or disable (false) the filter | |
| threshold | No | New cosine similarity threshold. Range 0.1–1.0. Optional. | |
| description | No | New reference description text. If changed, the Voyage AI embedding API is called to re-generate the reference vector. Max 500 chars. Optional. |
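A hypothetical pair of argument payloads illustrating the partial-update pattern (the filter_id value is invented):

```python
# Hypothetical payloads for ai_filters_update; send only the fields to change.
raise_threshold = {
    "filter_id": "flt_123",   # invented ID for illustration
    "threshold": 0.9,         # stricter matching (allowed range 0.1-1.0)
}

# Changing the description triggers a Voyage AI call to re-embed the reference vector.
refine_description = {
    "filter_id": "flt_123",
    "description": "Messages reporting billing errors or duplicate charges",
}
```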
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses a key behavioral trait: changing the description calls the Voyage AI embedding API to re-generate the reference vector. With no annotations provided, this adds important context. It does not mention other potential side effects or prerequisites, but the mutation is clearly stated.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and well-structured: a single opening sentence followed by bullet-point use cases and a note. Every sentence adds value with no filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a partial-update tool with no output schema and no annotations, the description covers the purpose, usage scenarios, and a key side effect (API call). It is sufficiently complete for an agent to understand when and how to use the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage for all parameters. The description summarizes the fields but does not add meaning beyond what the schema already provides (e.g., ranges, optionality). Hence, it meets the baseline of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Update' and the resource 'AI filter', and lists the specific attributes (name, description, threshold, active state). It distinguishes from sibling tools like create, delete, list, and test.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'When to use' section provides explicit scenarios (rename, refine description, adjust threshold, enable/disable). It also notes that at least one field must be provided. However, it does not mention when not to use or suggest alternatives, which keeps it from a perfect score.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_tags_add_to_thread (A)
Apply one or more AI tags to a thread (manually).
When to use:
User wants to label a conversation with one or more tags
User asks to categorize or tag a thread
Provide the thread_id (integer) and an array of tag_ids to apply. If a tag is already applied, it is updated to is_manual=true.
| Name | Required | Description | Default |
|---|---|---|---|
| tag_ids | Yes | Array of tag IDs to apply (1–20 IDs) | |
| thread_id | Yes | ID of the thread to tag |
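A hypothetical call illustrating the two parameters (the IDs are invented):

```python
# Hypothetical arguments for ai_tags_add_to_thread (IDs invented for illustration).
args = {
    "thread_id": 4711,     # integer thread ID
    "tag_ids": [12, 34],   # 1-20 IDs; re-applying an existing tag sets is_manual=true
}
```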
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that already-applied tags are updated to is_manual=true, but it lacks detail on error handling, permissions, and other side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise at five lines, with a front-loaded action and a clear structure. No superfluous content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
There is no output schema, but the tool is simple, with two required parameters. The description covers the action and a key behavioral detail; it is adequate for the tool's complexity, though it omits error scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the description adds minimal value: it repeats parameter names and types, and the 1-20 ID constraint already appears in the schema description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Apply one or more AI tags to a thread (manually)' with a specific verb and resource, distinguishing it from siblings like ai_tags_remove_from_thread.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides an explicit 'When to use' section with two scenarios. It does not state when not to use the tool or mention alternatives, but it guides usage effectively.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_tags_create (A)
Create a new AI tag (automatic message filter).
AI tags are lightweight classifiers that run on every incoming message. When a message matches the tag's description/criteria, the thread is automatically labelled — so AI agents can cheaply pre-filter threads instead of running full LLM analysis on everything. Good descriptions are the key: they tell the classifier exactly when to apply this tag.
When to use:
User wants to auto-classify incoming messages (e.g. bug reports, sales leads, support requests)
User wants to reduce AI agent costs by pre-filtering threads by topic or intent
Tips for the description field:
Be specific: 'Messages reporting errors, crashes, or unexpected behavior in the product'
Include examples of what qualifies and what doesn't
Limit: 20 active personal tags / 50 active team tags.
| Name | Required | Description | Default |
|---|---|---|---|
| icon | No | Emoji icon for the tag (max 10 chars, optional) | |
| name | Yes | Tag name (max 100 chars) | |
| color | No | Tailwind color key for the tag badge. Allowed: amber, blue, green, red, purple, yellow, slate | |
| description | No | Classifier prompt: describe exactly when this tag should be applied to a thread. The more specific, the better the auto-classification accuracy. E.g. 'Messages reporting software errors, crashes, or unexpected behavior'. Max 500 chars. |
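Following the tips above, a hypothetical creation payload with a deliberately specific classifier description:

```python
# Hypothetical arguments for ai_tags_create. The description doubles as the
# classifier prompt, so it states what qualifies and what does not.
args = {
    "name": "Bug report",
    "icon": "🐞",
    "color": "red",
    "description": (
        "Messages reporting errors, crashes, or unexpected behavior in the product. "
        "Does NOT include feature requests or general how-to questions."
    ),
}
```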
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It discloses that AI tags are lightweight classifiers that run on every incoming message, that labels are applied automatically, and mentions limits (20 personal/50 team). Does not mention immediate activation or side effects, but the creation action is straightforward.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured: starts with a one-liner, then explains the concept, provides usage guidance, tips, and limits. It is front-loaded with the main action and every sentence adds unique value without being verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (4 parameters, no output schema), the description adequately covers purpose, usage, and parameter tips. It lacks explicit return value information, but for a creation tool, the behavior is clear enough for an AI agent to invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all 4 parameters. The description adds value by explaining the 'description' parameter as a classifier prompt with tips and examples, going beyond the schema. Icon and color parameters are not elaborated beyond schema, but the overall parameter information is clear.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with 'Create a new AI tag (automatic message filter)', clearly stating the verb and resource. It explains what AI tags are and distinguishes from sibling tools like ai_tags_delete, ai_tags_list, and ai_filters_create by emphasizing auto-classification of incoming messages to pre-filter threads.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes a 'When to use' section explicitly stating two use cases: auto-classifying incoming messages and reducing AI agent costs. Provides tips for the description field but lacks explicit when-not-to-use or alternative tools, though the context implies this is for lightweight classification rather than complex filtering.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_tags_delete (A)
Delete a personal AI tag. All thread associations are removed automatically.
When to use:
User wants to permanently remove a tag they no longer need
This cannot be undone. Threads are NOT deleted — they just lose this tag.
| Name | Required | Description | Default |
|---|---|---|---|
| tag_id | Yes | ID of the tag to delete |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description fully discloses key behaviors: automatic removal of thread associations, irreversibility, and that threads are not deleted. This meets the full burden for transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise: four short sentences convey purpose, usage context, and behavioral notes. No filler; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple tool (one parameter, no output schema) and full schema coverage, the description adequately covers purpose and side effects. It omits return-value information, but that is not critical for a delete operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (tag_id described as 'ID of the tag to delete'). The description adds no additional context about the parameter, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete a personal AI tag') and resource (tag). It distinguishes itself from sibling tools like ai_tags_add_to_thread and ai_tags_create by specifying deletion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides a 'When to use' context ('user wants to permanently remove a tag'), but lacks explicit alternatives or when-not-to-use guidance. The behavioral notes (automatic thread-association removal, irreversibility) help, but there is no direct comparison with sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_tags_list (A)
List all personal AI tags.
AI tags are automatic message filters: the system runs a lightweight classifier on every incoming message and applies matching tags to threads. This lets AI agents skip expensive full analysis on most messages — they only act on threads that match relevant tags, dramatically cutting LLM costs.
When to use:
Check which auto-classification filters exist before creating one
Get tag IDs for add_to_thread / remove_from_thread
See how many threads each tag currently matches
Returns all tags with thread counts (non-archived, included threads only).
No parameters
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Given no annotations, the description fully explains behavior: lightweight classifier runs on every incoming message, tags are applied automatically, and it returns thread counts. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with separate sections, front-loaded with the main action. A few sentences could be trimmed, but overall efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description explains return values (tags with thread counts) and provides sufficient context about the AI tags feature, making it complete for a list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters, so schema coverage is 100%. Description adds value by explaining the return data (tags with thread counts) and the broader context of the system.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'List all personal AI tags' and explains the purpose of AI tags as automatic message filters, distinguishing it from sibling tools like ai_tags_create or ai_tags_delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit 'When to use' bullets: check filters before creating, get tag IDs for add/remove, and see thread counts. Lacks explicit when-not-to-use but is clear in context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_tags_remove_from_thread (A)
Remove a specific AI tag from a thread.
When to use:
User wants to un-label or remove a specific tag from a conversation
User wants to correct an incorrectly applied tag
Provide both thread_id and tag_id.
| Name | Required | Description | Default |
|---|---|---|---|
| tag_id | Yes | ID of the tag to remove | |
| thread_id | Yes | ID of the thread to remove the tag from |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must bear the burden. It states the core behavior but does not disclose side effects, permissions, or error conditions. It is adequate for a simple removal but lacks detail.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise, with a clear structure: main action followed by a bulleted 'When to use' list. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple removal tool, the description covers purpose, usage context, and inputs. Lacks mention of return behavior or error cases, but overall sufficient given the simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. The description adds no extra meaning beyond the schema, only reinforcing the need for both IDs. Baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Remove') and the resource ('a specific AI tag from a thread'). It distinguishes from sibling tools like ai_tags_add_to_thread (opposite) and ai_tags_delete (deletes tag definition, not association).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'When to use' section provides explicit contexts: un-labeling or correcting tags. It requires both IDs. However, it does not mention when not to use or direct to alternatives like ai_tags_delete for deleting the tag entirely.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ai_tags_update (A)
Update an existing personal AI tag's name, description, icon, color, or active state.
When to use:
User wants to rename a tag
User wants to change a tag's icon, color, or description
User wants to enable or disable a tag
Provide only the fields you want to change. At least one field is required.
| Name | Required | Description | Default |
|---|---|---|---|
| icon | No | New emoji icon (max 10 chars, optional) | |
| name | No | New tag name (max 100 chars, optional) | |
| color | No | New color key. Allowed: amber, blue, green, red, purple, yellow, slate | |
| tag_id | Yes | ID of the tag to update | |
| is_active | No | Enable (true) or disable (false) the tag | |
| description | No | New LLM hint (max 500 chars; empty string clears it, optional) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool updates tags and lists fields, implying partial update. But it lacks details on side effects, error conditions, permissions, or what happens if tag_id is invalid. Basic behavior is clear but incomplete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two clear paragraphs. The first states the action and updatable fields, the second provides usage scenarios. No superfluous words; structure is efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately covers purpose and usage, and it explains partial updates and the required-field rule. It could also mention that tag_id comes from ai_tags_list or similar, but overall it is fairly complete for an update tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema descriptions cover all parameters (100% coverage). The description adds value by grouping fields and emphasizing partial update ('Provide only the fields you want to change'), which reinforces the optional nature beyond the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool updates a personal AI tag's name, description, icon, color, or active state. It lists specific use cases, clearly distinguishing it from siblings like ai_tags_create (create) and ai_tags_delete (delete).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a clear 'When to use' list covering rename, change icon/color/description, and enable/disable. It notes that only fields to change should be provided and at least one field is required. However, it does not explicitly exclude scenarios or mention alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calendar_check_availability (A)
Check when you have free time in Google Calendar. Shows busy periods and free slots in a given time range. Useful for finding meeting times or checking schedule conflicts.
| Name | Required | Description | Default |
|---|---|---|---|
| end_time | No | End date/time to check availability (YYYY-MM-DD or ISO 8601). Defaults to end of start_time day, or 7 days from now. | |
| start_time | No | Start date/time to check availability (YYYY-MM-DD or ISO 8601). Defaults to start of today. | |
| calendar_id | No | Calendar ID to check. Defaults to primary calendar. | primary |
| working_hours_only | No | If true, only show free slots during working hours (9 AM - 6 PM). Default: false (show all free time). | |
| min_duration_minutes | No | Minimum duration in minutes for free slots. Filters out short gaps. Default: 30 minutes. |
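A hypothetical availability check combining the parameters above (dates invented):

```python
# Hypothetical arguments: free slots of at least 30 minutes during working
# hours over a three-day window on the primary calendar.
args = {
    "start_time": "2024-12-16",
    "end_time": "2024-12-19",
    "working_hours_only": True,    # restrict to 9 AM - 6 PM
    "min_duration_minutes": 30,    # filter out short gaps
}
```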
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the burden. It describes the basic behavior (showing busy and free periods) and hints at filtering via parameters, but it lacks details on timezone handling, error states, and output structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three short sentences, front-loaded with the purpose. No unnecessary words; highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema is provided, and the description does not specify the format of returned data (e.g., list of time blocks, objects with start/end). This is a significant gap for a tool with 5 optional parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All five parameters have schema descriptions, so the tool description adds no additional meaning. The baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool checks free time in Google Calendar, showing busy periods and free slots, and distinguishes it from sibling calendar tools like calendar_list_events by its focus on availability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly says the tool is 'Useful for finding meeting times or checking schedule conflicts', providing clear use context. However, it gives no alternatives and no when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calendar_create_event (B)
Create a new event in Google Calendar. Specify the title, start time, end time, and optionally invite attendees. Use ISO 8601 format for dates (e.g., 2024-12-15T14:00:00).
| Name | Required | Description | Default |
|---|---|---|---|
| end | No | Event end time in ISO 8601 format. If not provided, defaults to 1 hour after start. Also accepts 'end_time' as alias. | |
| start | No | Event start time in ISO 8601 format (e.g., 2024-12-15T14:00:00). Also accepts 'start_time' as alias. | |
| title | No | Alias for summary - event title. | |
| summary | No | Event title/summary. Required. Also accepts 'title' as alias. | |
| end_time | No | Alias for end - event end time. | |
| location | No | Event location (physical address or virtual meeting link). | |
| timezone | No | Timezone for the event (e.g., 'America/New_York', 'UTC'). | |
| attendees | No | List of attendee email addresses to invite. | |
| start_time | No | Alias for start - event start time in ISO 8601 format. | |
| calendar_id | No | Calendar ID to create event in. Defaults to primary calendar. | primary |
| description | No | Event description/notes. | |
| add_google_meet | No | If true, automatically creates a Google Meet link for the event. | |
| conference_data | No | Conference data for Google Meet. Alternative to add_google_meet flag. |
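A hypothetical creation payload (emails and times invented) using the canonical parameter names rather than their aliases:

```python
# Hypothetical arguments for calendar_create_event; ISO 8601 times as required.
args = {
    "summary": "Q1 planning sync",
    "start": "2024-12-15T14:00:00",
    "end": "2024-12-15T15:00:00",   # omit to default to start + 1 hour
    "timezone": "America/New_York",
    "attendees": ["ana@example.com", "ben@example.com"],
    "add_google_meet": True,        # auto-create a Google Meet link
}
```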
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses that events are created and provides date format requirements, but does not mention side effects (e.g., whether attendees are notified), required permissions, error behavior, or what happens on success (return value). This leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three short sentences, front-loaded with purpose. It is concise with no fluff, but could be slightly more structured (e.g., grouping core vs optional parameters). Reasonably efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 13 parameters and no output schema or annotations, the description is overly minimal. It lacks details on many parameters, does not mention default behaviors (e.g., calendar_id defaults to primary), and omits information about the return value. There is a contradiction: description implies title, start, end are required, but schema has no required fields.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description repeats schema content for core parameters (title, start, end) but adds no meaning beyond what the schema already provides. It does not elaborate on many other parameters (location, timezone, conference_data, etc.).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Create a new event in Google Calendar', which is a specific verb+resource. It distinguishes this tool from sibling calendar tools like calendar_update_event or calendar_list_events, as creation is explicitly mentioned.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for creating events but does not provide explicit guidance on when to use this tool versus alternatives like calendar_update_event or calendar_check_availability. No exclusion criteria or prerequisites are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calendar_delete_event (B)
Delete an event from Google Calendar. This action cannot be undone. Use with caution.
| Name | Required | Description | Default |
|---|---|---|---|
| event_id | Yes | ID of the event to delete. Required. | |
| calendar_id | No | Calendar ID containing the event. Defaults to primary. | primary |
| send_notifications | No | Whether to send cancellation notifications to attendees. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so the description must carry the full burden. It only states 'This action cannot be undone', which is critical, but omits other behavioral details such as required permissions, side effects on attendees (partially covered by schema parameter), or response behavior. This is insufficient for a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences (14 words) that front-load the primary action and the critical warning. Every word earns its place without unnecessary fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and three parameters, the description does not explain return values, error behavior, or confirmation. For a delete tool, a simple success indication would be useful, and the description lacks this completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds no additional meaning beyond what is already in the schema parameters, resulting in no extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete an event') and the resource ('Google Calendar'), distinguishing it from sibling tools like calendar_create_event and calendar_update_event.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use or alternatives are provided. The warning 'Use with caution' implies caution but does not guide the agent on when to prefer this tool over others. Since delete operations are straightforward, this is adequate but not exemplary.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calendar_list_events (A)
List events from Google Calendar. Shows upcoming events by default. Can filter by date range and search query.
| Name | Required | Description | Default |
|---|---|---|---|
| query | No | Free text search query to filter events. | |
| date_to | No | End date/time to query (YYYY-MM-DD or ISO 8601 format). Defaults to 7 days from now. Alias: time_max. | |
| date_from | No | Start date/time to query (YYYY-MM-DD or ISO 8601 format). Defaults to now. Alias: time_min. | |
| calendar_id | No | Calendar ID to list events from. Defaults to primary calendar. | primary |
| max_results | No | Maximum number of events to return. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description partially discloses behavior (default upcoming events, filtering capabilities). However, it omits details like pagination, rate limits, or error handling. It is adequate but not exhaustive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences with no redundancy. It front-loads the core action, then adds the key default and filter information efficiently.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters, no output schema, and no annotations, the description covers the main purpose and defaults. It lacks details on return format, timezone handling, or edge cases, but is sufficient for a straightforward list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the descriptions already document each parameter's purpose. The tool description adds minimal extra value beyond confirming defaults and filtering, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the verb 'list' and resource 'events from Google Calendar', clearly distinguishing it from sibling tools like create, delete, update, and check availability. It also notes default behavior and filtering options.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for viewing events with optional date range and search filters. It does not explicitly state when not to use or provide alternatives, but the context is clear enough for an agent to infer appropriate use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calendar_update_event (A)
Update an existing event in Google Calendar. Can modify title, time, location, description, and attendees. Only specified fields will be updated.
| Name | Required | Description | Default |
|---|---|---|---|
| end | No | New end time in ISO 8601 format. Optional. | |
| start | No | New start time in ISO 8601 format. Optional. | |
| summary | No | New event title/summary. Optional. | |
| event_id | Yes | ID of the event to update. Required. | |
| location | No | New event location. Optional. | |
| attendees | No | New list of attendee emails. Replaces existing attendees. | |
| calendar_id | No | Calendar ID containing the event. Defaults to primary. | primary |
| description | No | New event description. Optional. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses partial update behavior but lacks details on permissions required, error handling (e.g., event not found), or side effects beyond updating fields.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences, front-loaded with the purpose and listing the modifiable fields. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is minimally adequate for a simple CRUD tool but lacks details on required parameters (event_id implicit), default calendar_id, and return behavior. No output schema exists to compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so each parameter is documented. The description adds minimal value by categorizing the fields, but the partial-update hint is useful. The baseline score is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (update an existing event) and the resource (Google Calendar event). It lists modifiable fields, distinguishing it from sibling tools like create and delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies partial update semantics with 'Only specified fields will be updated,' but does not explicitly state when to use this tool versus alternatives, nor does it mention prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calls_get_transcript (A)
Get the structured transcript and final state of a voice call by call_id. Returns per-turn rows in chronological order, call status (active/completed/failed/abandoned), duration, and an outcome field telling whether the recipient picked up (answered/no_answer/busy/declined/failed/unknown). answered_at is non-null once the recipient picked up. Returns active turns if the call is still in progress.
| Name | Required | Description | Default |
|---|---|---|---|
| call_id | Yes | Call ID returned by calls.make in _meta.call_id. |
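Since the description notes that active calls return partial turns, an agent might poll until the call reaches a terminal state. A sketch under stated assumptions: call_tool is a generic MCP invocation helper (not part of this server), and the result is assumed to expose the documented status field as a dict key:

```python
import time

def wait_for_transcript(call_tool, call_id: str, poll_seconds: int = 5, timeout: int = 300):
    """Poll calls_get_transcript until the call leaves the 'active' state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = call_tool("calls_get_transcript", {"call_id": call_id})
        if result["status"] != "active":   # completed / failed / abandoned
            return result                  # per-turn rows, duration, outcome
        time.sleep(poll_seconds)
    raise TimeoutError(f"call {call_id} still active after {timeout}s")
```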
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully covers behavior: it returns transcript rows, call status, duration, outcome, and notes that answered_at is null until pickup. It also mentions behavior for active calls.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, with the purpose stated in the first sentence. No extraneous information, every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with one parameter and no output schema, the description adequately explains the return structure and special conditions (answered_at, active calls). It is sufficiently complete for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear description for call_id. The tool description does not add extra semantic detail beyond the schema, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves structured transcript and final state of a voice call by call_id, listing specific return fields (per-turn rows, status, duration, outcome). This is distinct from sibling tools like calls_make or calls_list_active.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions the call_id comes from calls.make, providing context on when to use this tool. Though it doesn't explicitly state when not to use it, the purpose is clear enough for differentiation among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calls_hangup (A)
Hang up an active voice call by call_id. Use after calls.make when the agent decides to terminate before the callee does, or to abort a stuck call. Idempotent: returns success if the call is already terminal.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Short internal reason for ending the call (e.g. 'campaign timeout'). Stored on voice_sessions.metadata. | |
| call_id | Yes | Call ID returned by calls.make in _meta.call_id. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the burden. It discloses idempotency and implies a write operation. However, it does not mention permissions, side effects on the call session, or success/error details beyond idempotency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences covering purpose, usage, and idempotency. No wasted words; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple hangup tool with no output schema and minimal parameters, the description covers the core context: when to use it, its idempotency, and where the parameters come from. It lacks error-behavior details but is largely sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the schema already describes both parameters well (call_id source, reason metadata storage). The description adds no substantial new meaning beyond the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Hang up'), the resource ('active voice call'), and the identifier ('by call_id'). It distinguishes from siblings like calls_make and calls_list_active by its specific termination function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'after calls.make when the agent decides to terminate before the callee does, or to abort a stuck call.' Also notes idempotent behavior. No exclusionary guidance, but sufficient for this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calls_list_active (A)
List active voice calls in this workspace. Use before calls.make on a Telegram account (only one MTProto call per account at a time) to check whether the line is free.
| Name | Required | Description | Default |
|---|---|---|---|
| channel | No | Filter by voice channel: 'telegram' or 'twilio'. | |
| channel_account_id | No | Filter by channel_account.id (the calling Telegram account or Twilio number). Combine with channel for a per-line busy check. |
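A sketch of the per-line busy check the description recommends, assuming the same generic call_tool helper and assuming the tool returns a list of active calls (the return shape is not documented):

```python
# Hypothetical busy check before calls_make on a Telegram account:
# only one MTProto call per account is allowed at a time.
def line_is_free(call_tool, account_id: str) -> bool:
    active = call_tool("calls_list_active", {
        "channel": "telegram",
        "channel_account_id": account_id,
    })
    return len(active) == 0   # assumes a list return; not documented by the tool
```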
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It indicates a read operation (listing active calls) and hints at the purpose (a line check), but it does not disclose details like return format, side effects, or permissions, so it provides little beyond the stated purpose.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: first sentence states purpose, second provides usage context. No redundant information, efficiently packed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has no output schema and only two optional parameters. The description covers when to use the tool but does not explain the return format or the structure of the list. Given the low complexity, it is adequate but not fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with both parameters described. The description adds no additional meaning beyond the schema, only hinting at Telegram context. Baseline 3 is appropriate as schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists active voice calls in the workspace, using a specific verb and resource. It distinguishes from sibling tools like calls_make or calls_hangup by focusing on listing, and the context confirms no direct sibling exists for generic call listing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use: before calls.make on a Telegram account to check line availability. This provides clear context for usage and implies a prerequisite (one call per account), giving strong guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calls_list_history (A)
Search historical voice calls in this workspace by participant name, contact_id, thread, channel, source, and/or date range. Returns one row per call (NOT per turn) with call_id, duration_seconds, outcome, direction, started_at, source, channel_label, and parent_thread_id (the originating chat thread for Telegram-group / Twilio-outbound / Meet calls). Pair with calls.get_transcript(call_id) for the full per-turn transcript. Use this instead of messages.read_history for cross-thread call queries — group calls and Meet sessions live on per-call sub-threads, not on the parent chat thread.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum calls to return (default 20, max 100). | |
| since | No | ISO date or datetime lower bound (inclusive). Default: 90 days ago. Naive timestamps are interpreted as UTC. | |
| until | No | ISO date or datetime upper bound (inclusive). Default: now. | |
| source | No | Filter by voice_sessions.source: 'telegram' (1:1 + group), 'twilio' (PSTN), 'meet' (Google Meet bot), 'livechat' (in-app voice). | |
| channel | No | Filter by message-level channel of the call thread: 'telegram' (1:1 voice or group call sub-thread), 'twilio_voice', 'meet_voice', 'livechat_voice'. | |
| thread_id | No | Restrict to calls on this thread OR with this thread as their originating parent (Telegram group → call sub-thread back-link, Twilio outbound source_thread_id back-link). | |
| contact_id | No | Filter by exact entity_id (from contacts.find). Mutually exclusive with participant_name when both target the same person. | |
| participant_name | No | Filter to calls whose parent thread has a participant matching this name (substring match against entity.title). Resolves group calls via the parent group's roster, not the per-call thread's speaker list. |
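A hypothetical query showing the intended pairing with calls_get_transcript (name and date invented):

```python
# Hypothetical arguments: recent Twilio calls with a named participant.
args = {
    "participant_name": "Alice",   # substring match against entity.title
    "since": "2024-11-15",         # naive timestamps are interpreted as UTC
    "channel": "twilio_voice",
    "limit": 20,
}
# Each returned row carries a call_id; feed it to calls_get_transcript
# for the full per-turn transcript.
```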
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses that the tool returns one row per call (not per turn), the specific fields returned, and behavioral details for several parameters (e.g., thread_id back-links, participant_name substring match via parent group roster, mutual exclusivity of contact_id and participant_name). It does not mention pagination or default limit behavior, but covers many key aspects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is fairly long but every sentence adds substantial value. It is front-loaded with the main purpose and then provides structured details. There is no redundancy or fluff. Could be slightly tighter, but overall effective.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description includes the return fields. It also references pairing with another tool for transcripts. It covers many edge cases through parameter descriptions. However, it does not address pagination, sorting, or error conditions, which would make it even more complete for a tool with 8 parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds significant value beyond schema: it explains default values for `since` and `until`, timezone handling for naive timestamps, the distinction between `source` and `channel`, the back-link behavior for `thread_id`, and the mutual exclusivity and resolution logic for `contact_id` and `participant_name`. This goes well beyond the basic schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches historical voice calls by participant name, contact_id, thread, channel, source, and date range. It specifies the return format (one row per call with specific fields) and distinguishes itself from sibling tools like `messages_read_history`.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear guidance: use this instead of `messages_read_history` for cross-thread call queries, and pair with `calls_get_transcript` for full transcripts. It explains when group calls and Meet sessions live on sub-threads, implying appropriate use cases. However, it does not explicitly state when NOT to use (e.g., for active calls).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calls_make
Place an outbound AUDIO/VOICE phone call via Twilio (PSTN) or Telegram (MTProto 1:1 call). Use this any time the user asks to 'call', 'ring', 'phone', 'dial', or have a spoken conversation. Do NOT use messages.send when the user asks to call someone — a call is real-time voice, not a text message. You conduct the conversation as the voice agent using the provided greeting and instructions.
| Name | Required | Description | Default |
|---|---|---|---|
| channel | No | Voice transport: 'twilio' (phone via PSTN — requires phone_number in E.164) or 'telegram' (MTProto 1:1 call — requires telegram_user_id, NOT a phone number or thread_id). Defaults to 'twilio'. | |
| greeting | Yes | The first sentence the agent speaks immediately when the call connects. ALWAYS provide a greeting — without it the caller hears silence. Keep it short and natural. Example: 'Hi, this is Diana calling from DialogBrain. Do you have a moment to chat?' | |
| report_back | No | When to re-invoke you after the call ends. 'on_answer' (default) = only if the call was answered, 'always' = even on missed/failed calls, 'never' = fire and forget. Transcript is always stored regardless of this setting. | |
| instructions | No | What to do during the call — objective, questions, tone. The AI generates a natural opening and guides the conversation. Example: 'Call about invoice #1234. Ask if they received it and when payment is expected. Be friendly and professional.' | |
| phone_number | No | Destination phone number in E.164 format (e.g., '+15551234567', '+66812345678'). Required when channel='twilio'. | |
| voice_agent_id | No | Override: specific voice agent to conduct the call. If omitted, uses the workspace's default voice agent. Must be an agent with execution_mode='voice'. | |
| telegram_user_id | No | Destination Telegram user ID (decimal int64 as string, e.g. '123456789'). Required when channel='telegram'. The caller account must have had prior interaction with this user — a cold contact cannot be reached via voice. | |
| channel_account_id | No | Specific calling channel_account ID. For channel='twilio' this is the Twilio number; for channel='telegram' this is the connected Telegram account. If omitted, auto-selects the first active account of the matching channel. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It explains that transcript is always stored, the re-invoke behavior via report_back, and that prior interaction is needed for Telegram calls. However, it does not mention cost implications or explicit side effects like call recording, leaving some behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose and usage guidelines. It is concise relative to the complexity of the tool but could be slightly shorter by reducing redundancy with schema descriptions. Overall, it is well-structured and informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (8 parameters, two channels, no output schema), the description covers the main points: when to use, channel requirements, post-call behavior, and default settings. It could mention error handling or call duration, but it is largely complete for the agent to invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the baseline is 3. The tool description adds context beyond the schema by explaining the overall flow, default channel, and the requirement for prior Telegram interaction. This adds semantic value, justifying a 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it places outbound audio/voice calls via Twilio or Telegram. It specifies the verb 'Place', the resource 'outbound AUDIO/VOICE phone call', and distinguishes from messages.send. It also notes that the caller conducts the conversation as the voice agent, giving the tool a specific and clear purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to use (user asks to call, ring, phone, dial, spoken conversation) and when not (do not use messages.send). It also gives context about defaults and requirements, such as channel-specific parameters, making it easy for the agent to decide when this tool is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calls_wait
Block until a voice call ends (status changes from 'active') or timeout elapses. Returns ended=true with final state when the call has ended; ended=false on timeout (re-issue to keep waiting). The returned state includes outcome so callers can branch on pickup vs. no-answer (answered/no_answer/busy/declined/failed/unknown). Default timeout 90s; cap 110s — bounded by nginx proxy_read_timeout 120s on /mcp.
| Name | Required | Description | Default |
|---|---|---|---|
| call_id | Yes | Call ID returned by calls.make in _meta.call_id. | |
| timeout_seconds | No | Max seconds to wait. Default 90, cap 110 (bounded below nginx 120s proxy_read_timeout). On expiry returns ended=false with status='active' so the caller can re-issue to keep waiting. |
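Together with calls_make above, this yields a simple place-then-poll pattern. The sketch below is illustrative: `call_tool(name, arguments)` stands in for your MCP client's invoke helper, and the exact nesting of `_meta.call_id` in the calls_make result is assumed from the parameter docs.

```python
def call_and_wait(call_tool, phone: str) -> dict:
    """Place a Twilio call, then block on calls_wait until it ends."""
    made = call_tool("calls_make", {
        "channel": "twilio",
        "phone_number": phone,  # E.164, e.g. '+15551234567'
        "greeting": "Hi, this is Diana calling from DialogBrain. "
                    "Do you have a moment to chat?",
        "instructions": "Ask about invoice #1234 and when payment is expected.",
        "report_back": "never",  # we poll with calls_wait instead
    })
    call_id = made["_meta"]["call_id"]  # nesting assumed from the docs

    # Cap 110s keeps each wait under the nginx 120s proxy_read_timeout;
    # on timeout the tool returns ended=false and we simply re-issue,
    # exactly as the description instructs.
    while True:
        state = call_tool("calls_wait",
                          {"call_id": call_id, "timeout_seconds": 110})
        if state["ended"]:
            return state  # branch on state["outcome"]: answered, busy, ...
```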
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses blocking behavior, timeout handling, return of ended flag and outcome, and the underlying nginx limit. No hidden behaviors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four tight sentences, front-loaded with the core action, no extraneous words. Every sentence adds critical information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but the description adequately explains return values (ended flag, outcome, final state) and polling semantics. Covers timeout handling and system constraints. Complete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds significant value: specifies call_id source (_meta.call_id), default timeout 90, cap 110, and behavior on expiry (ended=false with status='active'). Deepens understanding beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool blocks until a voice call ends or timeout elapses, with specific verb 'block' and resource 'voice call'. It distinguishes from siblings like calls_hangup or calls_get_transcript by focusing on waiting/polling behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use (after a call is made to wait for completion), what to do on timeout (re-issue), and mentions default/cap. It lacks explicit exclusions or alternatives, but the polling pattern is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collections_add_file
Add a file to a knowledge collection.
The file must be uploaded and indexed first (files_upload + files_ingest). If the file was previously removed, it is re-enabled.
| Name | Required | Description | Default |
|---|---|---|---|
| file_id | Yes | ID of the file to add (from files_upload) | |
| collection_id | Yes | ID of the collection |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses the re-enable behavior and implies mutation. Lacks details on permissions or side effects but is sufficient for a simple add.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three short sentences with no fluff. Front-loaded with the main action, followed by prerequisite and nuance. Efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema and description does not mention return value. For a simple add tool, it is adequate but could be more complete by indicating what the response will be.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage with descriptions. The description does not add additional meaning beyond the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Add a file to a knowledge collection.' It provides a specific verb (add) and resource (knowledge collection), and differentiates from siblings like collections_remove_file.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly mentions prerequisites (file must be uploaded and indexed) and a behavioral nuance (re-enables removed files). While it doesn't explicitly state when not to use, the context is clear and helpful.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collections_assign_agent
Assign a knowledge collection to an AI agent.
Once assigned, the agent's knowledge.query will automatically scope RAG search to files in its assigned collections.
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | Yes | ID of the AI agent | |
| collection_id | Yes | ID of the collection to assign |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, and the description fails to disclose whether the assignment is additive or overwriting, any prerequisites (e.g., agent must exist), or potential errors. The burden is on the description, but it only states the positive effect.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: one for purpose, one for consequence. No redundant words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple assignment tool with two required ID parameters and no output schema, the description covers the main outcome and effect on agent behavior. Lacks mention of reversibility, but a sibling tool exists for that.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. The description adds no extra meaning beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Assign a knowledge collection to an AI agent' with a specific verb and resource, and distinguishes from its sibling 'collections_unassign_agent'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides context on the effect (scopes RAG search) and implies use for setting agent knowledge, but does not explicitly state when not to use or mention alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collections_create
Create a named knowledge collection.
Collections group files for RAG search. After creating, add files with collections.add_file and assign to agents with collections.assign_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Collection name (must be unique per user) | |
| description | No | Optional description of the collection |
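The create → add_file → assign_agent workflow named in the description looks like this in practice. A minimal sketch, assuming the files were already pushed through files_upload + files_ingest, a generic `call_tool(name, arguments)` helper, and an illustrative `collection_id` result key:

```python
def build_agent_knowledge(call_tool, agent_id: str, file_ids: list[int],
                          name: str = "support-docs") -> int:
    """Create a collection, attach ingested files, assign it to an agent."""
    created = call_tool("collections_create", {
        "name": name,  # must be unique per user
        "description": "Docs the support agent may cite",
    })
    collection_id = created["collection_id"]  # result key assumed

    for file_id in file_ids:
        # Adding a previously removed file simply re-enables it.
        call_tool("collections_add_file", {
            "collection_id": collection_id,
            "file_id": file_id,
        })

    # From here the agent's knowledge.query scopes RAG search to the
    # files in this collection.
    call_tool("collections_assign_agent", {
        "agent_id": agent_id,
        "collection_id": collection_id,
    })
    return collection_id
```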
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It fails to disclose behavioral traits such as idempotency, error handling (e.g., duplicate name), or required permissions. The description only states the action without any behavioral context beyond the basic operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at two sentences, with the core action in the first sentence and context/next steps in the second. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with 2 parameters and no output schema, the description provides enough context: what a collection is, its purpose (RAG search), and the workflow after creation. However, it lacks any mention of potential errors or behavior under failure, which would make it more complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (both parameters have descriptions). The description does not add any additional meaning beyond what is already in the schema. The constraint 'must be unique per user' for name is already in the schema's description. Baseline 3 is appropriate as the schema fully documents the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Create a named knowledge collection' with a specific verb and resource. It explains the purpose of collections (for RAG search) and distinguishes from sibling tools like collections_add_file and collections_assign_agent by mentioning them explicitly.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a clear context: create a collection, then use collections.add_file and collections.assign_agent for subsequent steps. It implicitly guides when to use this tool (when a new collection is needed) but does not explicitly state when not to use it or provide alternative tools for other scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collections_delete
Delete a knowledge collection.
If the collection is assigned to agents, prompts, or channels, pass force=true to delete anyway. CASCADE removes all assignments automatically.
| Name | Required | Description | Default |
|---|---|---|---|
| force | No | Force delete even if collection is in use (default: false) | |
| collection_id | Yes | ID of the collection to delete |
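The force flag suggests a try-then-escalate pattern rather than forcing by default. A sketch; since the description does not specify how an in-use collection is reported back, the error handling below is an assumption:

```python
def delete_collection(call_tool, collection_id: int, confirm_force: bool = False):
    """Delete a collection, escalating to force=true only when confirmed."""
    try:
        return call_tool("collections_delete",
                         {"collection_id": collection_id})
    except Exception:
        # How "in use" is signalled (error vs. status field) is
        # undocumented; treating it as a raised error is an assumption.
        if not confirm_force:
            raise  # surface to the user: collection is still assigned
        # force=true deletes anyway; CASCADE drops all assignments.
        return call_tool("collections_delete",
                         {"collection_id": collection_id, "force": True})
```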
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that deleting a collection may be blocked if it is assigned, and that force=true overrides. No annotations are provided, so the description carries the burden; it could mention irreversibility or required permissions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficient sentences. First sentence states purpose, second adds critical usage condition. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a delete tool but lacks details on return values, idempotency, or permissions. Without annotations or output schema, more completeness would help.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds context for the force parameter but introduces the unexplained term 'CASCADE', which is not in the schema. Slightly confusing in that respect, but mostly clarifying.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Delete a knowledge collection,' using a specific verb and resource. Distinguishes from sibling tools like collections_create and collections_list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides guidance on when to use force=true (when assigned). Implicitly indicates default behavior (cannot delete if in use). Lacks explicit mention of alternatives like unassigning first.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collections_list
List all knowledge collections in the workspace.
Collections are named groups of files used for RAG search. Auto-created collections (per-agent, per-prompt) are hidden by default.
| Name | Required | Description | Default |
|---|---|---|---|
| include_inactive | No | Include inactive collections (default: false) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description adds value by disclosing that auto-created collections are hidden by default. It lacks permissions or side-effect details but is sufficient for a read-only list tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at three short sentences, front-loading the main action and avoiding any unnecessary words or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (one boolean parameter, no output schema), the description fully covers the necessary context: what the tool does, the default behavior, and what collections are.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description adds 'auto-created collections hidden by default' context but does not significantly enhance parameter meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly and specifically states the tool lists knowledge collections, defines what collections are, and notes that auto-created ones are hidden by default, distinguishing it from siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context by explaining the purpose and default behavior, but does not explicitly mention when to use this tool versus alternatives like collections_list_files.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collections_list_files
List all files in a knowledge collection with their indexing status and chunk counts. Each returned file has a file_id (integer) that can be passed to messages.send as attachments=[file_id] to send the file to a contact, or to files.read to read its text content.
| Name | Required | Description | Default |
|---|---|---|---|
| collection_id | Yes | ID of the collection |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavior. It mentions the output includes indexing status and chunk counts, but does not state whether the tool is read-only, requires specific permissions, has pagination, or any rate limits. The lack of such details is a significant gap for a tool with no annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long, front-loaded with the main purpose, and contains no unnecessary words. Every sentence provides value: the first states the core function, the second adds practical usage information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that there is no output schema, the description should elaborate on the return structure. It only mentions 'indexing status and chunk counts' and hints at the file_id usage. This is adequate but lacks details about the shape of the response (e.g., array of objects with specific fields), leaving some ambiguity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The only parameter, collection_id, is already described in the schema as 'ID of the collection'. The description adds no additional meaning beyond that. Since schema coverage is 100%, baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists files in a knowledge collection with indexing status and chunk counts, and distinguishes from siblings like collections_add_file or collections_list. It also mentions the file_id field and how it can be used with messages.send or files.read, adding specific value.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use the tool (listing files in a collection) but provides no explicit guidance on when not to use it or alternatives. Sibling tools like collections_list or files_read are not mentioned as alternatives, leaving the agent to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collections_remove_file
Remove a file from a knowledge collection.
The file itself is not deleted — only the collection membership is removed.
| Name | Required | Description | Default |
|---|---|---|---|
| file_id | Yes | ID of the file to remove | |
| collection_id | Yes | ID of the collection |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses that the file is not deleted, only membership is removed. However, without annotations, it could provide more, such as permissions or side effects (e.g., what happens if file is in no collection).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with the action. Every word earns its place; no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple removal tool with two required params and no output schema, the description is largely complete. It could mention the return value but is adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters described. The description adds no further semantics beyond the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Remove a file from a knowledge collection') and the resource, and distinguishes itself from deletion of the file itself. Among siblings, 'collections_add_file' is the counterpart.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool vs alternatives (e.g., files_delete). It does not mention when not to use or provide context for choosing this tool over other file-related tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
collections_unassign_agent
Remove a knowledge collection from an AI agent.
The collection and its files are not deleted — only the agent assignment is removed.
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | Yes | ID of the AI agent | |
| collection_id | Yes | ID of the collection to unassign |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description bears the burden. It discloses non-destructive behavior: collection and files remain. No mention of permissions or side effects, which is acceptable for a simple unassignment.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with the action. Every word adds value, no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Simple tool with no output schema. Description covers what it does and what it does not do. Missing return value or result type, but contextually adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions. The description does not add extra meaning beyond the schema; baseline applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action: 'Remove a knowledge collection from an AI agent.' It distinguishes from deletion by noting collection and files are not deleted. The sibling tool 'collections_assign_agent' implies the inverse. Specific verb+resource.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use or when-not-to-use. The description clarifies that only the assignment is removed, which helps avoid misuse, but no alternatives or prerequisites are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contacts_add_channel
🔗 Link a new channel identity (email, phone, LinkedIn, etc.) to an existing contact.
When to use:
User learns a contact's email or phone and wants to save it
User wants to link a LinkedIn/Instagram profile to an existing contact
Adding a second channel for an existing person
Requires contact_id (entity_id) from contacts.find.
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Email address, phone number, or username for this channel | |
| channel | Yes | Channel type to add | |
| contact_id | Yes | entity_id from contacts.find | |
| display_name | No | Optional display label for this identity |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description must carry the burden. It describes the action but does not disclose behavior on duplicate channels (e.g., overwrite vs. error) or authentication requirements, leaving gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short paragraphs with clear headings and bullet points. Every sentence adds value, no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple add-channel tool, the description covers purpose, usage, and prerequisite. Missing behavior on duplicates is a minor gap, but overall sufficient given schema coverage and tool simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema already describes all parameters. The description adds no additional meaning beyond the prerequisite for contact_id, resulting in a baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action: 'Link a new channel identity ... to an existing contact.' It uses a specific verb and resource, and distinguishes itself from sibling tools like contacts_find and contacts_update.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly lists three 'When to use' scenarios and a prerequisite ('Requires contact_id from contacts.find'), providing clear context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contacts_discover
Search for a contact on a live channel (Telegram, WhatsApp, etc.) before adding them. Use this to look up a person by username or phone number before calling contacts.sync.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Username, phone, or name to search for | |
| channel | Yes | Channel name: telegram, whatsapp, etc. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the burden. It indicates a search (read) operation but does not disclose details like error handling or auth. Adequate for a simple search.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences perfectly front-loaded and concise: purpose first, usage guidance second. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given low complexity (2 params, no output schema), description covers purpose, usage, and parameter hints adequately. Lacks output format but still sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds example usage ('username or phone number') but not critical new info beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'Search for' and resource 'contact on a live channel', and distinguishes from sibling contacts_sync by saying 'before calling contacts.sync'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use: 'before calling contacts.sync' and how: 'by username or phone number'. Lacks explicit exclusions but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contacts_find
👤 Search for contacts in your address book by name or username.
When to use:
User asks 'find contact X' or 'who is Y?'
User wants to know someone's username or ID
Before sending a message to verify contact exists
To get contact's channel reference for messaging
Examples: ❓ User: 'find contact named [name]' → contacts_find(query='[name]', limit=5)
❓ User: 'who is [full name]?' → contacts_find(query='[full name]', limit=1)
❓ User: 'search for @username' → contacts_find(query='username', limit=10)
Returns: name, username, channel, channel_ref, similarity_score, match_type. Plus:
entity_id: local DB key — pass to contacts.profile. Null for live-discovered contacts (skip contacts.profile for those).
telegram_user_id (when channel='telegram'): the Telegram user ID — pass to calls.make / messages.send. NOT entity_id.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return | |
| query | Yes | Name or username to search for (supports partial matches) | |
| channel | No | Filter by channel (telegram, email, whatsapp, sms, slack) |
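The Returns notes above carry real routing logic: entity_id feeds contacts.profile but is null for live-discovered contacts, while Telegram messaging and calling take telegram_user_id. A sketch of that branching, assuming a generic `call_tool` helper and a list-of-dicts result shape:

```python
def resolve_contact(call_tool, query: str):
    """Find a contact and route its IDs to the right follow-up tools."""
    matches = call_tool("contacts_find", {"query": query, "limit": 5})
    if not matches:
        return None
    best = matches[0]

    # entity_id is the local DB key for contacts.profile, but it is
    # null for live-discovered contacts, so skip the lookup then.
    profile = None
    if best.get("entity_id") is not None:
        profile = call_tool("contacts_profile",
                            {"contact_id": best["entity_id"]})

    # For Telegram, calls_make / messages_send take telegram_user_id,
    # NOT entity_id.
    telegram_id = (best.get("telegram_user_id")
                   if best.get("channel") == "telegram" else None)
    return best, profile, telegram_id
```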
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, but the description explains return fields, special cases like entity_id null for live-discovered contacts, and telegram_user_id usage. It sufficiently covers behavioral traits for a read-only search tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with sections and front-loaded purpose, but somewhat verbose with redundancy (e.g., return fields repeated from schema). Could be trimmed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description fully explains return values and special fields. It covers all necessary context for usage, including entity_id handling and channel specifics.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameter descriptions, and the description adds value by showing concrete usage examples (e.g., different query formats, limit usage) beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches for contacts by name or username, with examples differentiating it from sibling tools like contacts_profile or contacts_sync.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use scenarios and examples, but lacks explicit when-not-to-use guidance. Still, the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contacts_profile
👤 Get full profile for a contact: all channel identities, notes, role, capabilities, birthday.
When to use:
After contacts.find to get complete info about a specific person
To see all channels a contact is reachable on
To read notes, role, or capabilities for a contact
Requires contact_id (entity_id) from contacts.find.
| Name | Required | Description | Default |
|---|---|---|---|
| contact_id | Yes | entity_id from contacts.find |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It describes the tool as a read operation ('Get full profile') and lists returned data. Lacks mention of edge cases or error handling, but is transparent about the operation and outputs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Very concise: first line states purpose, then bullet points for usage, then requirement. No extraneous information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains what is returned. It also mentions prerequisite (contacts.find). The tool is simple and the context is complete for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only parameter contact_id has a schema description ('entity_id from contacts.find') which is reinforced and expanded by the description requiring it from contacts.find. Schema coverage is 100%, and description adds value by clarifying the source of the ID.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the action (Get full profile for a contact) and lists the data it returns (all channel identities, notes, role, capabilities, birthday). Distinguishes itself from sibling contacts.find by specifying it provides 'complete info' about a specific person.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use the tool: after contacts.find, to see all channels, to read notes/role/capabilities. Also provides a requirement: contact_id from contacts.find. No alternatives mentioned but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contacts_sync
Add a discovered contact and open a conversation thread. Returns thread_id for the new conversation. Call contacts.discover first to verify the contact exists.
| Name | Required | Description | Default |
|---|---|---|---|
| channel | Yes | Channel name: telegram, whatsapp, etc. | |
| identifier | Yes | Username or phone number to add |
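Paired with contacts_discover above, this gives a verify-then-add flow. A minimal sketch, assuming a generic `call_tool` helper and that a thread_id key appears in the result as the description states:

```python
def add_and_open_thread(call_tool, channel: str, identifier: str) -> str:
    """Verify a contact exists on a live channel, then sync it."""
    found = call_tool("contacts_discover", {
        "channel": channel,     # e.g. 'telegram'
        "query": identifier,    # username or phone number
    })
    if not found:
        raise LookupError(f"{identifier!r} not found on {channel}")

    synced = call_tool("contacts_sync", {
        "channel": channel,
        "identifier": identifier,
    })
    return synced["thread_id"]  # documented return; key name assumed
```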
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses the return (thread_id) and the prerequisite, but with no annotations, it lacks detail on side effects (e.g., whether it can create duplicate threads, if it modifies existing contacts, or permissions needed). Behavior is partially transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each adding unique value: purpose, return value, and prerequisite. No unnecessary words, tightly structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with only two parameters and no output schema, the description covers purpose, return, and prerequisite. However, it could be more complete by clarifying behavior for existing contacts (e.g., does it open an existing thread or always create new?). Slight gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with descriptions for both parameters, so the description adds no extra meaning beyond restating the schema fields. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('add a discovered contact and open a conversation thread') and specifies the return value ('Returns thread_id for the new conversation'). It distinguishes from siblings like contacts_discover by noting it as a prerequisite.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells the agent to call contacts.discover first, providing a clear sequential guideline. However, it doesn't specify when not to use this tool or mention alternative tools for related operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
contacts_update
✏️ Update a contact's profile: name, notes, role, capabilities, birthday, preferred channel.
When to use:
User wants to add notes about a contact
User wants to set/update role or capabilities for a contact
User wants to rename a contact or update birthday
Requires contact_id (entity_id) from contacts.find. At least one optional field must be provided.
| Name | Required | Description | Default |
|---|---|---|---|
| role | No | Contact role (e.g. developer, client, partner). Empty string clears role. | |
| notes | No | Free-text notes/context about this contact. Empty string clears notes. | |
| contact_id | Yes | entity_id from contacts.find | |
| birthday_day | No | Birth day 1-31 (must be set together with birthday_month) | |
| capabilities | No | List of capabilities (e.g. ['backend', 'design']) | |
| display_name | No | New display name (max 255 chars) | |
| birthday_year | No | Birth year 1900-2100 (optional, standalone) | |
| birthday_month | No | Birth month 1-12 (must be set together with birthday_day) | |
| preferred_channel | No | Preferred channel for contacting this person |
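The parameter constraints (paired birthday fields, empty strings as clears) are easy to get wrong, so here is one valid call. Illustrative only, via a generic `call_tool` helper:

```python
def update_contact(call_tool, contact_id: int):
    """One contacts_update call exercising the documented constraints."""
    return call_tool("contacts_update", {
        "contact_id": contact_id,   # entity_id from contacts_find
        "notes": "Met at FOSDEM; prefers async email.",
        "role": "",                 # empty string clears the role
        # birthday_day and birthday_month must be set together;
        # birthday_year may stand alone.
        "birthday_day": 14,
        "birthday_month": 3,
    })
```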
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It indicates mutation ('update') and a prerequisite, but fails to mention side effects such as whether missing optional fields are left unchanged or overwritten, whether the update is partial or full, or what the response looks like. The schema describes clearing behavior for some fields (empty strings), but this is not echoed in the description, leaving behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: a single-line summary with an emoji, followed by bullet-point use cases and a prerequisite note. Every sentence serves a purpose with no redundancy, and the structure is front-loaded with the main action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (9 parameters, no output schema, no annotations), the description covers core aspects (what, when, prerequisite, constraint) but omits return value, error conditions, and detailed field behavior. It is adequate for basic use but leaves information gaps that an agent may need, preventing a higher score.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so each parameter already has a detailed description. The tool description adds a high-level list of updatable fields and the prerequisite for contact_id, but does not clarify constraints like the mutual requirement of birthday_day and birthday_month, or the clearing behavior of empty strings for role/notes. This adds limited value beyond the schema, resulting in a baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Update a contact's profile' and lists specific fields (name, notes, role, capabilities, birthday, preferred channel). It distinguishes itself from sibling tools like contacts_find (find), contacts_profile (view), and contacts_sync (sync) as the dedicated update tool, leaving no ambiguity about its purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit 'When to use' scenarios (add notes, set/update role/capabilities, rename, update birthday) and important prerequisites (requires contact_id from contacts.find, at least one optional field). It does not list alternatives or exclusions, but the guidance is clear and sufficient for typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
documents_create
Generate a document (PDF / PPTX / DOCX / HTML) from markdown content authored by you.
REQUIRED parameters:
title: Short human-readable title.
content_markdown: The body. Slides separated by `---` on its own line at the top level (Marp rule). Tables, code, lists, footnotes, definition lists, and `{.section-header}` class attrs all parse.
format: "document" (single flowing body) or "presentation" (slides).
output_type: "pdf", "pptx", "docx", or "html".
Optional:
theme: "default" | "corporate" | "minimal" | "pitch" | "invoice" | "contract" | "cinema" | "editorial" (default "default"). cinema/editorial are presentation-only (engine=marp).
language: BCP-47 tag (default "en"). Drives font fallback for Cyrillic/CJK/Arabic content.
engine: "marp" | "weasyprint". For format=presentation PDF/HTML only. Default "marp" (designer-grade Chromium renderer with full CSS3, web layout, and {.cover}/{.hero}/{.split}/{.stats}/{.dark} layout classes). Pass "weasyprint" for the legacy print-CSS path. Rejected for format=document or output_type=pptx.
DELIVERY CONTRACT (CRITICAL):
After this tool returns a file_id, deliver the file by calling `messages.send(attachments=[file_id], text="<short caption>")`. Do NOT embed the file_id in a markdown link, a `sandbox:` URL, or `/api/files/<id>/download` text — those render as plain text on the recipient's channel, not as a file attachment. The attachments parameter is the ONLY way the file actually attaches.
CONVENTIONS:
Two-column slide: wrap with `::: cols\n::: col\n…\n:::\n::: col\n…\n:::\n:::`.
Speaker notes (presentations only): `::: notes\n…\n:::` at the end of a slide block. NOT `<!-- ... -->` (comments are escaped, not captured).
Section header slide: `{.section-header}` on its own line directly above the heading. Block-attr form, not inline.
Images: only `file:NNN` references (workspace file_id), `data:image/...` URIs, or hosts in DOCUMENTS_MEDIA_URL_ALLOWLIST. Other URLs are dropped with `[image removed]`.
LAYOUT CLASSES (engine=marp only — ignored under engine=weasyprint):
`{.cover}` — title-slide layout (centered headings, gradient background).
`{.hero image="file:NNN"}` — full-bleed background image with dark overlay and white headline.
`{.split image="file:NNN"}` — 50/50 image left, content (heading/bullets) right.
`{.stats}` — 3-up KPI cards: each card is `### big-number` followed by a one-line label paragraph.
`{.dark}` / `{.invert}` — per-slide dark mode override.
Both `image="file:NNN"` and `image=file:NNN` are accepted (quoted or unquoted). Place the class line on its own row directly above the slide content.
Format × output_type rules:
`document` + `pptx` is rejected — set `format=presentation` or pick pdf/docx/html.
`theme=invoice/contract` + `output_type=pptx` silently uses the default PPTX master.
For theme="invoice", every invoice MUST include a "Total" row whose value equals sum(line items) + tax (within ±0.01). The renderer fails closed on missing or mismatched totals.
EXEMPLAR — invoice (English):
Invoice INV-{YYYYMMDD-HHMMSS}
From: {Issuer Legal Name}, {Address}, {Tax ID}
To: {Customer Name}, {Customer Address}, {Customer Tax ID}
Issue date: {YYYY-MM-DD}
Due date: {YYYY-MM-DD}

| Description | Qty | Unit price | Total |
|---|---|---|---|
| {Service 1} | 1 | 1500.00 | 1500.00 |
| {Service 2} | 2 | 500.00 | 1000.00 |

Subtotal: USD 2500.00
Tax (20%): USD 500.00
Total: USD 3000.00
Payment: {bank details OR crypto wallet — never both}
EXEMPLAR — invoice (Russian):
Счёт-фактура № INV-{YYYYMMDD-HHMMSS}
От: {Юридическое название организации}, {Адрес}, ИНН {Tax ID}
Кому: {Название клиента}, {Адрес клиента}, ИНН {Tax ID}
Дата: {YYYY-MM-DD}
Срок оплаты: {YYYY-MM-DD}

| Описание | Кол-во | Цена | Сумма |
|---|---|---|---|
| {Услуга 1} | 1 | 1500.00 | 1500.00 |
| {Услуга 2} | 2 | 500.00 | 1000.00 |

Подытог: USD 2500.00
НДС (20%): USD 500.00
Итого: USD 3000.00
Реквизиты: {банковские реквизиты ИЛИ криптокошелёк — не оба сразу}
EXEMPLAR — contract (English):
Service Agreement
Between: {Provider Legal Name}, {Address} ("Provider")
And: {Client Legal Name}, {Address} ("Client")
Effective date: {YYYY-MM-DD}
1. Scope of services
{Concise description of what Provider agrees to deliver.}
2. Term
This Agreement begins on the Effective date and continues until {termination condition or end date}.
3. Compensation
Client pays Provider {amount and currency} according to {payment schedule}.
4. Confidentiality
Both parties agree to keep proprietary information of the other party confidential during and after the term of this Agreement.
5. Termination
Either party may terminate with {N} days' written notice.
6. Governing law
{Jurisdiction}.
Provider: ____________________    Client: ____________________
{Provider signatory name}         {Client signatory name}
EXEMPLAR — contract (Russian):
Договор оказания услуг
Между: {Юридическое название Исполнителя}, {Адрес} ("Исполнитель")
И: {Юридическое название Заказчика}, {Адрес} ("Заказчик")
Дата вступления в силу: {YYYY-MM-DD}
1. Предмет договора
{Краткое описание услуг, которые Исполнитель обязуется оказать.}
2. Срок действия
Договор вступает в силу с указанной даты и действует до {условие прекращения или дата окончания}.
3. Стоимость и порядок оплаты
Заказчик оплачивает услуги Исполнителя в размере {сумма и валюта} в порядке {график платежей}.
4. Конфиденциальность
Стороны обязуются сохранять конфиденциальность сведений, полученных в ходе исполнения настоящего Договора, в течение срока его действия и после его прекращения.
5. Расторжение
Любая из сторон вправе расторгнуть Договор, направив письменное уведомление не менее чем за {N} дней.
6. Применимое право
{Юрисдикция}.
Исполнитель: ____________________    Заказчик: ____________________
{ФИО подписанта Исполнителя}         {ФИО подписанта Заказчика}
| Name | Required | Description | Default |
|---|---|---|---|
| theme | No | Visual theme. invoice/contract trigger the corresponding exemplar styling. | default |
| title | Yes | Short human-readable title for the document. | |
| engine | No | PDF/HTML engine for presentations. 'marp' (default for format=presentation) renders via headless Chromium with full CSS3, web fonts, and layout classes (.cover, .hero, .split, .stats, .dark). 'weasyprint' is the legacy renderer. Rejected for output_type=pptx (always python-pptx for editable text — use output_type=pdf or html, or remove the engine parameter). Rejected for format=document (always weasyprint). | |
| format | Yes | 'document' for a single flowing body, 'presentation' for slides. | |
| language | No | BCP-47 language tag (e.g. 'en', 'ru', 'zh', 'ja'). Drives font fallback for non-Latin scripts. | en |
| output_type | Yes | Renderer target: 'pdf', 'pptx', 'docx', or 'html'. | |
| content_markdown | Yes | Markdown body authored by the agent. Slides separated by '---' on its own top-level line. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully carries the burden. It discloses critical behaviors: image restrictions, delivery contract, invoice total validation, formatting conventions for slides, and rules for themes. This is exceptionally transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear sections (REQUIRED, Optional, DELIVERY CONTRACT, CONVENTIONS, etc.) but runs long because of the extensive exemplars. Every section adds value, yet the sheer length costs some conciseness even though the material stays well organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (6 parameters, no output schema, many edge cases), the description is remarkably complete. It covers the return contract, formatting conventions, image restrictions, error cases (document+pptx), and provides exemplars for invoices/contracts. No missing critical information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Although schema coverage is 100%, the description adds significant meaning beyond the schema: it explains theme triggers for invoice/contract styling, language for font fallback, content_markdown slide separation using '---', format/presentation distinctions, and output_type restrictions. This greatly aids correct parameter usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Generate a document (PDF / PPTX / DOCX / HTML) from markdown content authored by you.' It identifies the specific verb (generate), resource (document), and supported formats, distinguishing it clearly from sibling tools (no other document creation tool present).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes explicit delivery instructions (DELIVERY CONTRACT) and format/output_type rules (e.g., document+pptx rejected). It provides context on when to use different formats and themes but does not explicitly mention when not to use this tool or alternative tools (though no direct alternatives exist among siblings).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
files_get_base64
Download one or more files server-side and return their content as base64-encoded strings. Use this to inspect images, PDFs, or any binary file attached to messages when you cannot access presigned S3 URLs directly. Supports up to 5 files per call, max 15 MB each. For large files, batch in groups of 1-2 to avoid oversized responses.
| Name | Required | Description | Default |
|---|---|---|---|
| file_ids | Yes | List of file IDs to fetch as base64 (max 5). Get IDs from files.info or message attachment_file_ids. |
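A hedged sketch of the batching guidance, assuming a hypothetical `call_tool(name, arguments)` client helper and an assumed response shape:

```python
import base64

# Batch large files in groups of 1-2 per the description; max 5 ids, 15 MB each.
# call_tool is a hypothetical MCP client helper; the response shape
# {"files": [{"file_id": ..., "base64": ...}]} is an assumption.
def fetch_files(call_tool, file_ids: list[int], batch_size: int = 2) -> dict[int, bytes]:
    contents: dict[int, bytes] = {}
    for i in range(0, len(file_ids), batch_size):
        batch = file_ids[i:i + batch_size]
        result = call_tool("files_get_base64", {"file_ids": batch})
        for item in result["files"]:
            contents[item["file_id"]] = base64.b64decode(item["base64"])
    return contents
```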
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses the server-side operation, base64 output, file limits, and batching guidance. No contradictions with the missing annotations. It could state explicitly whether the operation is read-only, but overall transparency is good.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three succinct sentences: purpose, use case, constraints. No redundant information; each sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema or annotations, the description covers purpose, usage context, parameter source, and limitations. Sufficient for correct invocation without additional details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema already describes file_ids thoroughly. Description adds source for IDs (files.info or attachment_file_ids) and reinforces max limit, providing value beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool downloads files and returns base64-encoded content, with specific use cases (inspecting images, PDFs) and distinguishes from siblings like files_read or files_info by mentioning server-side download and inability to access presigned URLs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides when to use (when cannot access S3 URLs) and constraints (max 5 files, 15 MB each, batching advice). Lacks explicit when not to use or alternative tools, but context implies alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
files_info
Get metadata and download URLs for files by their IDs.
When to use:
After messages_read_history returns attachment_file_ids
To get a presigned download URL to read a received file
Returns: filename, mime_type, byte_size, download_url (1-hour presigned URL).
| Name | Required | Description | Default |
|---|---|---|---|
| file_ids | Yes | List of file IDs (max 20) |
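A small sketch of resolving ids to metadata, under the same hypothetical `call_tool` helper and an assumed response shape:

```python
# Resolve attachment ids to metadata and presigned URLs (valid ~1 hour).
# call_tool is a hypothetical MCP client helper; the "files" response key
# is an assumption based on the "Returns:" line above.
def print_file_info(call_tool, file_ids: list[int]) -> None:
    info = call_tool("files_info", {"file_ids": file_ids})  # max 20 ids
    for f in info["files"]:
        print(f["filename"], f["mime_type"], f["byte_size"], f["download_url"])
```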
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries full burden. It discloses return fields: filename, mime_type, byte_size, download_url with a 1-hour expiry, which is good transparency for a metadata tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences front-loaded with purpose, followed by usage context and return description. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given low complexity, complete schema coverage, and no output schema, the description covers purpose, usage, parameters, and return value sufficiently.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter (file_ids) with 100% schema coverage; description adds no new meaning beyond what is in the schema (list of integer IDs, max 20). Baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Get metadata and download URLs for files by their IDs.' The verb 'Get' and resource 'files' are specific, and the tool is distinct from sibling tools like files_get_base64 (base64 content) and files_read (content).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: after messages_read_history returns attachment_file_ids and to get a presigned download URL. No alternatives or exclusions are mentioned, but for a simple retrieval tool, this guidance is adequate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
files_ingest
Save and index a file into the knowledge base. Use this when the user asks to save, store, or remember a document. The file will be processed (OCR if needed) and indexed for future search.
| Name | Required | Description | Default |
|---|---|---|---|
| tags | No | Optional list of tags for categorization (e.g., ['presentation', 'dextrade']). | |
| title | No | Human-readable title for the file (e.g., 'Project Presentation', 'Q1 Report'). If not provided, uses original filename. | |
| file_id | Yes | ID of the file to ingest (from attachment_file_ids in context). | |
| thread_id | No | Optional thread ID to associate the file with. If not provided, uses context thread. | |
| description | No | Optional description of the file contents. |
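A minimal sketch of a typical ingest call, with the same hypothetical `call_tool` helper; the title, tags, and description values are illustrative:

```python
# Save an attachment into the knowledge base so it becomes searchable.
# call_tool is a hypothetical MCP client helper; the file_id comes from
# attachment_file_ids in the conversation context.
def remember_file(call_tool, file_id: int) -> None:
    call_tool("files_ingest", {
        "file_id": file_id,
        "title": "Q1 Report",  # illustrative title
        "tags": ["report", "q1"],
        "description": "Quarterly revenue report shared by the client.",
    })
```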
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description partially discloses behaviors (OCR processing, indexing) but omits prerequisites, side effects, or limits like file size or required upload step.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficient sentences front-loading purpose and usage, with no redundant words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks return value details, error conditions, and explicit prerequisites (e.g., file must be uploaded first). Adequate but not fully complete for a 5-parameter tool with no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds no per-parameter meaning beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs 'save and index' and clearly identifies the resource as 'file into the knowledge base', distinguishing it from siblings like files_upload or files_read.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('when the user asks to save, store, or remember a document'), but does not mention when not to use or provide alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
files_read
Read the contents of an attached file directly. Use this when the user asks 'what is in this file?' or 'read this document'. Works for text files (.txt, .md, .json, code files, etc.) and PDFs (returns OCR-extracted text after files.ingest). For images, use files.get_base64.
| Name | Required | Description | Default |
|---|---|---|---|
| file_id | Yes | ID of the file to read (from attachment_file_ids in context). | |
| encoding | No | Text encoding to use (default: utf-8). | utf-8 |
| max_chars | No | Maximum characters to return (default: 10000). Use smaller values for large files. | |
| summarize | No | If true, generate AI summary instead of returning raw content. Use for 'summary', 'summarize', 'краткое содержание' requests. |
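A short sketch showing the capped-read and summarize paths, again via the hypothetical `call_tool` helper:

```python
# Read an attachment's text, capped, or ask for an AI summary instead.
# call_tool is a hypothetical MCP client helper.
def inspect_file(call_tool, file_id: int, want_summary: bool = False):
    return call_tool("files_read", {
        "file_id": file_id,
        "max_chars": 4000,          # keep responses small for large files
        "summarize": want_summary,  # True for "summarize this" requests
    })
```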
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses supported file types, that PDFs return OCR text after ingest, and that images are not handled. Does not explicitly state read-only nature, but it's implied. Good coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with no wasted words. Core purpose, usage, and file type info front-loaded. Excellent conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read tool with 4 params and no output schema, the description covers file types, prerequisites, and alternatives. Could mention return format (raw text or summary), but not critical. Overall complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed param descriptions. Description adds value by linking the 'summarize' parameter to user requests for summaries, and implies encoding and max_chars usage. Good complement.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool reads file contents, specifies supported file types, and distinguishes from sibling tools like files.get_base64 for images. It includes example user requests as usage cues.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use (user asks 'what is in this file?') and when not to (for images, use files.get_base64). Also notes prerequisite for PDFs (needs files.ingest first).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
files_upload
Upload a file to DialogBrain and get a file_id for use in messages_send.
When to use:
User wants to send a file/image to a contact
Before calling messages_send with an attachment
Returns: file_id (integer) to pass to messages_send attachments parameter.
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | Optional display title | |
| content | No | Base64-encoded file bytes. Either content OR source_url is required. | |
| filename | No | Filename with extension (e.g. 'photo.png') | upload |
| mime_type | No | MIME type (e.g. 'image/png', 'application/pdf') | application/octet-stream |
| source_url | No | Public URL to fetch file from. Either content OR source_url is required. |
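A sketch of the upload-then-send flow described above, assuming the hypothetical `call_tool` helper and a `thread_id` routing parameter:

```python
# Upload from a public URL (base64 `content` is the alternative), then attach.
# call_tool is a hypothetical MCP client helper; the URL is a placeholder.
def send_brochure(call_tool, thread_id: int) -> None:
    upload = call_tool("files_upload", {
        "source_url": "https://example.com/brochure.pdf",
        "filename": "brochure.pdf",
        "mime_type": "application/pdf",
    })
    call_tool("messages_send", {
        "thread_id": thread_id,
        "text": "Brochure attached.",
        "attachments": [upload["file_id"]],
    })
```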
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries behavioral burden. It mentions upload and return of file_id but does not disclose limitations (e.g., file size limits, authentication needs, side effects). Adds some value but lacks depth for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Short, front-loaded description with bullet points. Every sentence adds value. No fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
There is no output schema, but the description states that a file_id is returned. It lacks info on error handling, supported MIME types, and size limits. Adequate for basic use but incomplete for edge cases.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers all 5 parameters with descriptions (100% coverage). Description does not add beyond schema; it only mentions the return file_id. Baseline 3 is appropriate since schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states 'Upload a file to DialogBrain and get a file_id for use in messages_send.' Clearly specifies verb (upload), resource (file), and outcome (file_id). Distinguishes from siblings like files_get_base64 and files_read by focusing on upload and subsequent messaging use.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use: 'User wants to send a file/image to a contact' and 'Before calling messages_send with an attachment.' Provides clear context, though it could mention when not to use (e.g., for other file operations).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
folders_create
📁 Create a new inbox folder to organize threads.
When to use:
User wants to create a folder to group related conversations
User wants to organize threads by topic, project, or contact type
After creating a folder, use threads.update with folder_id to move threads into it.
| Name | Required | Description | Default |
|---|---|---|---|
| icon | No | Emoji icon for the folder (max 10 chars, optional) | |
| name | Yes | Folder name (max 100 chars) |
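A minimal sketch of the create-then-move flow the description points to; the `folder_id` response key and the `threads_update` argument names are assumptions:

```python
# Create a folder, then move a thread into it, as the description suggests.
# call_tool is a hypothetical MCP client helper; the folder_id response key
# and the threads_update argument names are assumed.
def file_thread(call_tool, thread_id: int) -> None:
    folder = call_tool("folders_create", {"name": "Sales leads", "icon": "📁"})
    call_tool("threads_update", {
        "thread_id": thread_id,
        "folder_id": folder["folder_id"],
    })
```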
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the burden. It indicates a mutation (creates a resource) and implies that a folder_id is returned. However, it does not disclose potential side effects, error conditions, or permissions needed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise: 5 lines with an emoji, verb, bullet points, and a forward-reference. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks explicit return value description but implies folder_id via post-creation hint. References a related tool (threads.update). For a simple create tool, this is reasonably complete given the context of sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema covers both parameters with descriptions (100% coverage). The description adds no additional semantics beyond the schema, such as examples or format constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool creates a new inbox folder to organize threads, using the verb 'create' and resource 'inbox folder'. It is distinct from siblings like folders_delete (delete) and threads_update (move threads).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear 'when to use' scenarios (user wants to create a folder, organize threads) and a post-creation hint to use threads.update. However, it does not explicitly state when not to use it or list alternatives beyond the sibling reference.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
folders_delete
🗑️ Delete an inbox folder. Threads inside become unfiled (not deleted).
When to use:
User wants to remove a folder they no longer need
User wants to clean up their inbox organization
Threads inside the folder are NOT deleted — they simply move back to the inbox.
| Name | Required | Description | Default |
|---|---|---|---|
| folder_id | Yes | ID of the folder to delete |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses key behavioral trait: threads become unfiled but are not deleted. This is essential for understanding the tool's impact.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Short, uses emoji for visual cue, two clear sentences plus bullet list. Every sentence adds value; no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 1 parameter, no output schema, and no annotations, description fully covers the tool's behavior including side effects on threads. Complete for an agent to understand usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has one parameter with description. Schema coverage is 100%, so baseline 3. Description adds no extra meaning beyond the schema; it does not specify format or validation rules.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states deletion of an inbox folder and explains consequence for threads (unfiled, not deleted). Distinguishes from siblings like folders_create.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides two when-to-use scenarios: removing unneeded folders and cleaning up inbox organization. Also clarifies that threads inside are not deleted.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
group_discovery_add
Add a specific group to your discovery list by @username or invite link (t.me/...).
When to use:
You already know the group's @username or invite link
Adding a known group without searching
Returns: group metadata including id, title, member_count.
| Name | Required | Description | Default |
|---|---|---|---|
| link | Yes | The group's @username or invite link (e.g. '@phuket' or 't.me/...') | |
| channel | Yes | Channel the group is on (e.g. 'telegram') |
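A tiny sketch of adding a known group, using the hypothetical `call_tool` helper and the example values from the parameter table; the response keys follow the "Returns:" line:

```python
# Register a group you already know by @username or invite link.
# call_tool is a hypothetical MCP client helper.
def add_known_group(call_tool) -> int:
    group = call_tool("group_discovery_add", {
        "channel": "telegram",
        "link": "@phuket",  # example from the parameter table
    })
    return group["id"]
```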
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, description carries full burden. It discloses return value (group metadata including id, title, member_count) which is helpful. However, it does not specify failure modes or idempotency behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is concise: action, when-to-use, and returns. Every sentence adds value, and it is front-loaded with the core purpose. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple add tool with two parameters and no output schema, the description fully covers purpose, usage context, parameter semantics, and return value. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds meaning by clarifying 'link' accepts @username or t.me/... links and 'channel' is the platform like 'telegram'. This goes beyond the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action: add a specific group to your discovery list via @username or invite link. The verb 'add' and resource 'discovery list' are specific, and it distinguishes from sibling tool 'group_discovery_search' which is for searching.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes a 'When to use' section specifying preconditions: you already know the group's @username or invite link, and you are adding a known group without searching. This provides clear context but does not explicitly mention when not to use or alternative tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
group_discovery_join
Join a group and start syncing its messages to your inbox. The group must be in your discovery list (use group_discovery.search or group_discovery.add first).
What this does:
Joins the group on Telegram (or other channel)
Creates a thread in your inbox for syncing messages
Optionally enables AI auto-reply drafts
Returns: success, thread_id, auto_reply_enabled.
| Name | Required | Description | Default |
|---|---|---|---|
| group_id | Yes | ID of the discovered group (from group_discovery.search or group_discovery.list) | |
| enable_auto_reply | No | Enable AI auto-reply drafts for messages in this group. Drafts can be reviewed and sent manually. Default: true. |
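A short sketch of joining with auto-reply drafts disabled, via the hypothetical `call_tool` helper:

```python
# Join a discovered group but keep AI auto-reply drafts off.
# call_tool is a hypothetical MCP client helper.
def join_quietly(call_tool, group_id: int) -> int:
    joined = call_tool("group_discovery_join", {
        "group_id": group_id,
        "enable_auto_reply": False,  # default is true
    })
    return joined["thread_id"]
```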
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description bears full burden. It discloses key behaviors: joining the group, creating a thread, optional auto-reply. It also lists return values. However, it does not mention potential side effects like notification noise or irreversibility, which would elevate transparency further.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a prerequisite sentence and a bullet list of actions. It is concise with no unnecessary words; the bullet formatting is a minor cosmetic quibble. Overall, it efficiently conveys all the necessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 2 parameters and no output schema, the description covers inputs, actions, and return values. It explains the prerequisite and optional behavior. It could mention error scenarios (e.g., group not found or already joined) to be fully complete, but it is adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so schema already describes both parameters. The description adds value by linking group_id to discovery results, explaining the effect of enable_auto_reply, and indicating return fields (thread_id, auto_reply_enabled). This goes beyond the schema's static descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'join' and resource 'group', and details the actions: joins group, creates thread, optionally enables auto-reply. It distinguishes from sibling tools by specifying the prerequisite that the group must be in the discovery list, contrasting with search/add/list tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit prerequisite guidance: 'The group must be in your discovery list (use group_discovery.search or group_discovery.add first).' This tells agents when to use this tool, though it does not explicitly state when not to use it among alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
group_discovery_list
List groups you've found and joined in this workspace.
Lifecycle values:
discovered: found but not yet evaluated
bookmarked: saved for later
monitored: joined and actively syncing messages
dismissed: hidden
By default, dismissed groups are excluded. Returns: id, title, member_count, lifecycle, scan_status, overall_score.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results (1-100, default 20) | |
| offset | No | Pagination offset (default 0) | |
| channel | No | Filter by channel (e.g. 'telegram'). Optional. | |
| lifecycle | No | Filter by state: discovered, bookmarked, monitored (=joined/syncing), dismissed. Optional. | |
| min_score | No | Minimum overall score (0.0-1.0). Optional. |
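A one-call sketch of a filtered listing, using the hypothetical `call_tool` helper:

```python
# List only joined/syncing groups above a quality bar.
# call_tool is a hypothetical MCP client helper.
def monitored_groups(call_tool):
    return call_tool("group_discovery_list", {
        "lifecycle": "monitored",  # joined and actively syncing
        "min_score": 0.6,
        "limit": 50,
    })
```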
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden. It explains the lifecycle states and that dismissed groups are excluded by default, and lists return fields. This gives sufficient behavioral context for a list tool, though it does not mention auth needs or rate limits, which are less critical here.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, starting with the main action and then providing lifecycle details and defaults. Each sentence adds necessary information without redundancy. It is efficiently structured and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool, the description covers all important aspects: what it lists, return fields, lifecycle filters, default exclusion, and pagination parameters are implied by schema. No gaps are evident given the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by explaining the meaning of lifecycle states (e.g., 'discovered: found but not yet evaluated') and the default exclusion of dismissed, which goes beyond the schema's enum list. This enriches parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists groups the user has found and joined in the workspace. It specifies the verb 'list' and the resource, and contrasts with sibling tools like group_discovery_search by implying a broad listing vs filtering. The returned fields are listed, providing clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus sibling tools like group_discovery_search or group_discovery_scan. The description only states what it does, but does not exclude cases where other tools might be more appropriate, leaving the agent without decision support.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
group_discovery_preview_messages
Read recent public messages from a group without joining it. Only works for groups where can_preview_history=true.
Use this to manually evaluate message quality before deciding to join. For an automated quality score, use group_discovery.scan instead.
Returns: list of recent messages with sender, text, date, is_reply.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Number of recent messages to fetch (1-100, default 20) | |
| group_id | Yes | ID of the discovered group (from group_discovery.search or group_discovery.list) |
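A small sketch of manual evaluation via preview, assuming the hypothetical `call_tool` helper and the response fields named in the "Returns:" line:

```python
# Eyeball recent messages before joining; works only when
# can_preview_history=true for the group.
# call_tool is a hypothetical MCP client helper; the "messages" key is assumed.
def preview_group(call_tool, group_id: int) -> None:
    preview = call_tool("group_discovery_preview_messages", {
        "group_id": group_id,
        "limit": 30,
    })
    for m in preview["messages"]:
        print(m["date"], m["sender"], m["text"][:80])
```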
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses the precondition (can_preview_history=true) and the return fields. However, it does not specify behavior when the condition fails (e.g., error response), which is a minor gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three terse sentences: action, usage guidance with alternative, return format. No redundant text, all sentences earn their place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a two-parameter read tool with no output schema, the description covers purpose, precondition, return format, and sibling differentiation. It lacks explicit mention of error handling or message ordering, but remains largely complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage with descriptions for both parameters. The description adds no new semantic information beyond what the schema already provides, so baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Read recent public messages from a group without joining it' with specific verb and resource. It distinguishes itself from the sibling tool group_discovery.scan by specifying manual evaluation vs automated scoring.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use ('manually evaluate message quality before deciding to join') and provides an alternative ('For an automated quality score, use group_discovery.scan instead'). This gives clear context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
group_discovery_scan
Scan a group to evaluate its quality before joining. Fetches recent messages, analyzes activity, spam, and engagement, then returns a quality score and plain-English verdict.
When to use:
After finding groups with group_discovery.search
Before deciding which groups to join
Returns: overall_score (0-1), is_disqualified, disqualify_reasons, individual scores, and a verdict string.
| Name | Required | Description | Default |
|---|---|---|---|
| group_id | Yes | ID of the discovered group (from group_discovery.search or group_discovery.list) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description fully carries burden. It discloses that the tool fetches recent messages, analyzes activity/spam/engagement, and returns scores and a verdict. No destructive side effects are indicated, and no contradiction with missing annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: four sentences plus bullet points. It front-loads the main action and every sentence adds value. No redundant or missing information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description lists return fields (overall_score, is_disqualified, etc.), providing completeness. It covers purpose, usage, parameters, and returns, making it sufficient for an agent to understand and invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has one parameter with 100% coverage description. The description adds value by specifying that group_id comes from group_discovery.search or group_discovery.list, linking it to the workflow. This is helpful beyond the schema's description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Scan a group to evaluate its quality before joining.' It uses a specific verb (scan, evaluate) and resource (group). It distinguishes from siblings by specifying it's for evaluation before joining, as opposed to search, list, add, join, or preview tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit 'When to use' section provides clear context: after group_discovery.search and before deciding to join. This guides the agent on workflow placement and alternative tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
group_discovery_search
Search for public groups or channels by topic on Telegram (or other channels). Returns matching groups with title, member count, and whether messages can be previewed.
When to use:
Finding groups related to a topic or niche
Building a list of groups for outreach or monitoring
After searching, use group_discovery.scan to evaluate quality before joining.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return (1-50, default 20) | |
| channel | Yes | Channel to search on (e.g. 'telegram') | |
| keywords | Yes | Search keywords or phrase (e.g. 'crypto trading signals') |
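Putting the documented workflow together, a hedged sketch of search, scan, and shortlist, with the hypothetical `call_tool` helper and assumed response keys:

```python
# Search -> scan -> shortlist, following the workflow in the description.
# call_tool is a hypothetical MCP client helper; the "groups" response key
# is an assumption, while is_disqualified and overall_score come from
# the scan tool's "Returns:" line.
def shortlist_groups(call_tool, topic: str, bar: float = 0.7) -> list[int]:
    found = call_tool("group_discovery_search", {
        "channel": "telegram",
        "keywords": topic,
        "limit": 10,
    })
    keep = []
    for g in found["groups"]:
        scan = call_tool("group_discovery_scan", {"group_id": g["id"]})
        if not scan["is_disqualified"] and scan["overall_score"] >= bar:
            keep.append(g["id"])
    return keep  # candidates for group_discovery_join
```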
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations present, so description carries full burden. It does not disclose behavioral details like rate limits, authentication, scope (only public groups), error handling, or performance characteristics. The description is minimal on behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is concise with a clear structure: action, output, when to use, next steps. No redundant sentences. Front-loaded with key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 3 parameters, no output schema, and many sibling tools, the description covers purpose, usage, and return fields adequately. Lacks behavioral details but is otherwise complete for a search tool. Mentioning output fields compensates for missing output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with parameter descriptions already clear. The description adds no additional semantics beyond restating the purpose. Baseline 3 appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches for public groups/channels by topic and specifies the return fields (title, member count, previewability). It distinguishes from sibling group_discovery tools by mentioning using scan after search, but could more explicitly differentiate from other search tools like web_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides a 'When to use' section with two concrete use cases and suggests using group_discovery.scan after searching. No explicit when-not-to-use or comparison to siblings, but the guidelines are helpful for typical scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
images_generate
Generates a PNG image from a text prompt using Gemini 2.5 Flash Image. Returns a file_id consumable by messages.send(attachments=[...]) and other file-aware tools. Supports up to 3 reference image file_ids for subject-consistent edits and composition. Latency: ~8-10s per image. Output: 1024×1024 PNG.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Text description of the image to generate (3-4000 chars). | |
| aspect_ratio | No | Output aspect ratio. | 1:1 |
| reference_file_ids | No | Optional list of up to 3 file_ids whose images should be used as visual references (for edits, subject consistency, or composition). Files must be image MIME types (image/png, image/jpeg, image/webp, image/gif). |
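A sketch of generate-and-deliver for images, assuming the hypothetical `call_tool` helper and a `thread_id` routing parameter:

```python
# Generate an image (~8-10s latency) and deliver it as a real attachment.
# call_tool is a hypothetical MCP client helper; thread_id is assumed.
def send_illustration(call_tool, thread_id: int) -> None:
    img = call_tool("images_generate", {
        "prompt": "Flat illustration of a unified inbox spanning five channels",
        "aspect_ratio": "1:1",
    })
    call_tool("messages_send", {
        "thread_id": thread_id,
        "text": "Here is the illustration.",
        "attachments": [img["file_id"]],
    })
```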
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden for behavioral disclosure. It discloses latency (~8-10s per image) and output dimensions (1024x1024 PNG), which is helpful, but does not mention potential rate limits, content safety, or error handling. For a generative tool, this is adequate but not thorough.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, each carrying distinct information: main action and output, reference image support, and performance characteristics. It is front-loaded with the primary purpose and avoids any redundant or wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description covers the core behavior, input constraints (implied by schema), reference image limits, latency, and output format. It could be more explicit about the full response structure (e.g., is there a status? any metadata?) but is otherwise complete for an image generation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so all parameters have descriptions. The description adds value beyond the schema by explaining the purpose of reference_file_ids (subject-consistent edits and composition) and the latency implications. However, it does not add new details for prompt or aspect_ratio beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates a PNG image from a text prompt using Gemini 2.5 Flash Image, specifies the return format (file_id), and distinguishes from sibling tools like images_search which search for existing images rather than generate new ones.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use the tool (generating images from text prompts, with optional reference images) and how the output integrates with other tools (messages.send). However, it does not explicitly contrast with alternatives like images_search or mention when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
images_search
Searches images in this workspace by visual content using vector embeddings (Voyage multimodal-3). Pass a text description; returns ranked file_ids with cosine scores and presigned download URLs. Up to 50 results.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max number of results. | |
| query | Yes | Text description of what you're looking for (3-4000 chars). | |
| mime_type | No | Optional — restrict to a specific image MIME (e.g. "image/png"). Filter is applied after RAG (same caveat as collection_id). | |
| collection_id | No | Optional — restrict to images attached to this collection. Filter is applied after RAG, so you may get fewer than `limit` results; pass a larger limit to broaden if needed. | |
| score_threshold | No | Minimum cosine similarity (0.0 returns all, higher = stricter). |
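A sketch illustrating the post-RAG filter caveat, via the hypothetical `call_tool` helper; the collection id is illustrative:

```python
# Because collection/MIME filters run after RAG, over-fetch with a larger
# limit so post-filtering still leaves enough hits.
# call_tool is a hypothetical MCP client helper.
def find_whiteboard_shots(call_tool):
    return call_tool("images_search", {
        "query": "whiteboard photo of the Q1 roadmap",
        "collection_id": 7,       # illustrative collection id
        "limit": 30,              # broader than needed on purpose
        "score_threshold": 0.4,
    })
```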
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description must disclose behavior. It explains the search mechanism, output format, and result limit (up to 50). It implies a read-only operation but does not explicitly state no side effects or permissions required. The disclosure is adequate for a search tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (three short sentences) and front-loaded with key information. Every word adds value; no redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters, no output schema, and no annotations, the description covers the core functionality, output format, and limit. It omits details like URL expiration or post-processing steps, but the schema descriptions handle parameter specifics. Overall, it is complete enough for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds no new parameter-specific info beyond what the schema already provides (e.g., limit, query, mime_type, collection_id, score_threshold). The mention of 'up to 50 results' and 'vector embeddings' is context about the tool, not parameter details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches images by visual content using vector embeddings, specifies the model (Voyage multimodal-3), and outlines the output (ranked file_ids, cosine scores, presigned download URLs). It is distinct from sibling tools like workspace_search or web_search by focusing on image content search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context: pass a text description to find images. It implicitly advises when to use (when needing image search by content) but lacks explicit guidance on when not to use or alternatives. However, the context is sufficient for a typical use case.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
job_complete (A)
Mark the job as completed. This sanitizes PII from the context and records a completion summary. Use when all tasks in the job are done.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | No | The ID of the job to complete | |
| summary | No | Brief summary of what was accomplished |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden of behavioral disclosure. It mentions sanitizing PII and recording a summary, which are key side effects. However, it does not specify whether the action is reversible, any authorization requirements, or rate limits, leaving gaps in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences totaling about 25 words. The first states the core function; the second discloses the key side effects; the third gives a usage condition. No filler or repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool is simple, with 2 optional parameters and no output schema. The description covers purpose and when to use, but lacks information on the return value, the implications of omitting job_id (which may be logically required even though the schema marks it optional), and integration with related tools (e.g., `job_read_context`). Given no annotations, the description could be more thorough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage for both parameters (`job_id` and `summary`), so baseline is 3. The description adds minimal extra meaning (e.g., 'completion summary' relates to `summary`), but does not provide details beyond what the schema already states.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (mark job completed) and resource (job), and adds specific details about sanitizing PII and recording a summary. However, it does not explicitly differentiate from sibling tools like `agents_task_complete` or `job_escalate`, which may cause confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool ('Use when all tasks in the job are done'), providing clear context. However, it does not mention when NOT to use it or list alternative tools such as `job_escalate` for partial completion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
job_escalate (A)
Escalate the job to a human. Use when you cannot resolve an issue, someone is not responding, or a situation requires human judgment.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | No | The ID of the job to escalate | |
| reason | Yes | Why escalation is needed |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses the action of escalating to a human, but lacks details on side effects (e.g., job state changes, notifications). With no annotations provided, more transparency would be beneficial, but the description is adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no fluff, front-loaded with the action. Every word adds value, making it highly concise and structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description provides enough context for a simple escalation tool, explaining when to use it. Lacks details on return value or confirmation, but that is acceptable given no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema covers both parameters with clear descriptions, and the description adds high-level context for the 'reason' parameter. Baseline 3 is appropriate given full schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Escalate' and the resource 'job', and distinguishes from sibling tools like job_complete and job_update_context by indicating a specific action for human involvement.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use the tool (cannot resolve, no response, human judgment needed). Does not mention alternatives or when not to use, but the context is clear for a simple escalation action.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
job_read_context (A)
Read the current job context. Returns the full state of your active job including assignments, escalations, and any data you previously stored.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | No | The ID of the job to read |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It describes the return content but does not clarify what happens when job_id is omitted (it is optional per the schema) or what authentication is required. Adequate but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with action, no redundant information. Efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but the description explains the return value (the job's full state). It omits what happens when the optional job_id is left out, but is overall sufficient for a simple read tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema fully covers the single parameter ('The ID of the job to read'). The description adds no meaning beyond the schema, so baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states verb 'Read' and resource 'job context', specifies return value (full state including assignments, escalations, stored data). Distinguishes from siblings like job_update_context and job_complete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies the tool is for reading context but lacks explicit guidance on when to use it versus alternatives, or on prerequisites such as job activation. No mention of when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
job_update_context (A)
Update the job context by merging new data. Existing keys are preserved unless explicitly overwritten. Use this to record progress, update assignment statuses, or store intermediate results.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | No | The ID of the job to update | |
| updates | Yes | Key-value pairs to merge into job context |
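The three job-context tools compose into a read, merge, complete sequence. A minimal sketch of the payloads, assuming the same call_tool session pattern as the images_search example; all IDs and context keys are invented.

```python
# Invented job ID and context keys, shown only to illustrate payload shape.
read_args = {"job_id": "job_123"}          # job_read_context

update_args = {                            # job_update_context
    "job_id": "job_123",
    # Merge semantics: keys absent from `updates` survive untouched,
    # keys present here overwrite the stored values.
    "updates": {"assignment_status": "drafted", "draft_id": "msg_42"},
}

complete_args = {                          # job_complete
    "job_id": "job_123",
    # Recorded as the completion summary; PII in the context is sanitized.
    "summary": "Drafted and sent the follow-up message.",
}
```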
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the merge behavior: 'Existing keys are preserved unless explicitly overwritten.' This is a key behavioral trait beyond the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three short sentences, each adding value. The first two state the action and the merge behavior; the third gives usage examples. No redundant words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple merge tool with no output schema and no annotations, the description covers the essential purpose and behavior. It could mention that it's a mutation (implied) or error conditions, but it's adequately complete for its complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% for both parameters. The description adds the merge semantics, explaining how the 'updates' object interacts with existing context, which goes beyond the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Update the job context by merging new data.' The verb 'update' and resource 'job context' are explicit. Distinguishes from siblings like job_read_context and job_complete by focusing on context updates.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases: 'record progress, update assignment statuses, or store intermediate results.' While it doesn't explicitly state when not to use it or mention alternatives, the context is clear enough for most usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
kg_find_entity (A)
Find an entity by name in the Knowledge Graph.
USE WHEN the user mentions a person, project, or company by name and you need:
- To resolve a name to an entity_id for subsequent queries
- 'Кто работает над X?' ("Who is working on X?") → find X first
- 'Расскажи про Y' ("Tell me about Y") → find Y first
RETURNS entity_id for use in kg.get_relationships or kg.explore. ALWAYS use this as the FIRST step in KG query chains.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Entity name to search for. Can be in any language (Russian, English, etc.) - transliteration is automatic. | |
| limit | No | Maximum results to return (1-10). Default: 5 | |
| entity_type | No | Filter by entity type: - 'person': People, contacts - 'project': Projects, tasks - 'organization': Companies, teams - 'event': Meetings, deadlines - 'topic': Discussion topics - 'workspace': User's own facts (my/our company) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Despite no annotations, the description implies a read-only lookup. It adds useful behavioral details like automatic transliteration and language support, but does not explicitly state the operation is non-destructive or provide error behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise: three short, front-loaded paragraphs. Every sentence does a job: purpose, usage guidance, return value, and chaining instruction. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool is simple; description covers name resolution, return value (entity_id), and chaining behavior. Lacks details on pagination or error handling, but is sufficient given the tool's straightforward nature and the absence of an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the description does not add significant value beyond what is already in the schema. It repeats the schema's note about language flexibility for `name` but does not further clarify parameter usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Find an entity by name in the Knowledge Graph' and explicitly positions it as the first step in KG queries, distinguishing it from siblings like kg_get_relationships and knowledge_query.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use scenarios (e.g., user mentions a person/project/company), example queries with translation, and directs to use entity_id in subsequent tools. States 'ALWAYS use this as the FIRST step.'
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
kg_get_relationships (A)
Get relationships for a specific entity from Knowledge Graph.
USE WHEN:
- 'Кто работает над X?' ("Who is working on X?") - filter by works_on
- 'С кем общался Y?' ("Who did Y communicate with?") - filter by discussed_with
- 'Кто из компании Z?' ("Who is from company Z?") - filter by member_of
- 'Что связано с W?' ("What is connected to W?") - no filter, get all
REQUIRES: entity_id from a previous kg.find_entity step. Use: {{step_N.entity_id}} where N is the find_entity step number.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum relationships to return (1-50). Default: 20 | |
| direction | No | Relationship direction: - 'outgoing': Entity → Others - 'incoming': Others → Entity - 'both': All relationships (default) | both |
| entity_id | Yes | Entity ID from kg.find_entity step. Use {{step_N.entity_id}} reference. | |
| relation_types | No | Filter by relationship types (optional): People: works_on, works_for, member_of, manages, knows, client_of, provides_service Communication: discussed_with, participated_in, mentioned_in Org/Project: developed_by, funded_by, partnered_with, integrates_with, depends_on, part_of Document: issued_by, issued_to, signed_by, authored_by Other: uses, located_in, about, follows, owns, related_to |
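The REQUIRES note fixes the call order: resolve the name, then query relationships. A sketch of the paired payloads, using the {{step_N.entity_id}} template syntax the description prescribes; the name and filter choices are invented.

```python
# Step 1 - kg_find_entity: resolve a name to an entity_id.
find_args = {
    "name": "Acme Corp",            # any language; transliteration is automatic
    "entity_type": "organization",
    "limit": 5,
}

# Step 2 - kg_get_relationships: reference step 1's result via the
# {{step_N.entity_id}} placeholder rather than a hard-coded ID.
relationships_args = {
    "entity_id": "{{step_1.entity_id}}",
    "relation_types": ["works_on", "member_of"],  # omit to fetch all types
    "direction": "both",
    "limit": 20,
}
```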
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It implies a read-only query but does not explicitly state that it does not modify data. The parameter descriptions (limit, direction, relation_types) add some behavioral context, but overall transparency is adequate but not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and well-structured with clear headings (USE WHEN, REQUIRES). Every sentence adds value, and the format is easy to parse for an AI agent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 4 parameters and no output schema, the description covers usage context, parameter semantics, and prerequisites. It does not describe the return format, but this is not critical for an agent to invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds significant semantic context beyond the schema, especially for relation_types by linking them to user intents (e.g., 'Кто работает над X?' ("Who is working on X?") → works_on). It also reinforces the entity_id requirement with a concrete reference syntax.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get relationships for a specific entity from Knowledge Graph' and provides specific use case examples with Russian phrases and filter types. It effectively distinguishes itself from sibling tools like kg_find_entity by focusing on relationships retrieval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes a 'USE WHEN' section with concrete scenarios and explicit requirement for entity_id from kg.find_entity. However, it does not mention when not to use this tool or provide alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
knowledge_query (A)
Answer questions using knowledge base (uploaded documents, handbooks, files).
Use for QUESTIONS that need an answer synthesized from documents or messages. Returns an evidence pack with source citations, KG entities, and extracted numbers.
Modes:
- 'auto' (default): Smart routing — works for most questions
- 'rag': Semantic search across documents & messages
- 'entity': Entity-centric queries (e.g., 'Tell me about [entity]')
- 'relationship': Two-entity queries (e.g., 'How is [entity A] related to [entity B]?')
Examples:
- 'What did we discuss about the budget?' → knowledge.query
- 'Tell me about [entity]' → knowledge.query mode=entity
- 'How is [A] related to [B]?' → knowledge.query mode=relationship
NOT for finding/listing files, threads, or links — use workspace.search for that.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Query mode: - 'auto' (default): Smart routing based on question - 'rag': Pure semantic search with KG boost - 'entity': GraphRAG for entity queries - 'relationship': Two-entity relationship query - 'graph': Direct KG traversal only | auto |
| style | No | Answer style: concise, detailed, or bullet | concise |
| date_to | No | Filter messages until this date (ISO format: YYYY-MM-DD). | |
| file_ids | No | Specific file IDs to search within (for pinned files) | |
| question | Yes | The question to answer from user's knowledge base. Required even for entity queries. | |
| date_from | No | Filter messages from this date (ISO format: YYYY-MM-DD). Use for time-based queries like 'this week', 'last month'. | |
| thread_id | No | Limit search to a specific thread/chat | |
| query_type | No | Query classification hint. Skips internal AI analysis when provided. | |
| entity_name | No | Entity name for entity/graph modes (optional for auto mode) | |
| max_sources | No | Maximum number of sources to consider (1-10) | |
| entity_names | No | Both entity names for relationship queries. | |
| person_names | No | Person names mentioned in the query. Used for keyword fallback when RAG misses. | |
| search_keywords | No | Key search terms to enhance RAG matching. | |
| needs_aggregation | No | True if query asks for totals/sums/counts. | |
| include_relationships | No | Include KG relationships in answer (default: true for entity mode) |
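A sketch of two representative payloads, one per mode family; the questions, dates, and entity names are invented.

```python
# Time-bounded question left to 'auto' routing. All values invented.
budget_args = {
    "question": "What did we discuss about the budget?",
    "date_from": "2025-01-01",   # ISO dates back "this week"-style phrasing
    "date_to": "2025-01-31",
    "style": "bullet",
}

# Explicit two-entity relationship query: set the mode and both names.
relationship_args = {
    "question": "How is Acme related to Globex?",
    "mode": "relationship",
    "entity_names": ["Acme", "Globex"],
}
```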
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Describes return format: 'evidence pack with source citations, KG entities, and extracted numbers.' Explains modes and their behavior. Does not disclose any potentially destructive actions, but being a query tool, that is expected.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections: purpose, usage, return info, modes, examples, exclusions. Front-loaded with key information. No redundant sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (15 parameters, no output schema), the description provides sufficient context including examples, mode explanations, and an explicit exclusion. It enables an agent to correctly select and invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed descriptions for each parameter. The description adds value by explaining modes with examples and providing context on when to use each parameter, such as date_from for time-based queries.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Answer questions using knowledge base'. It distinguishes itself from the sibling workspace.search by explicitly stating 'NOT for finding/listing files, threads, or links — use workspace.search for that.'
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit usage guidance: 'Use for QUESTIONS that need an answer synthesized from documents or messages.' Also specifies when not to use and directs to an alternative: 'NOT for finding/listing files, threads, or links — use workspace.search for that.' Examples illustrate appropriate use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_add_comment (A)
Add a comment to a LinkedIn post. Use post_id from search results or thread data.
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Comment text to post | |
| post_id | Yes | LinkedIn post/activity ID (from search results or thread metadata) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. It only states 'Add a comment', which implies a write operation, but does not mention side effects, rate limits, authentication needs, or error conditions. The lack of behavioral context is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two concise sentences, front-loaded with the purpose. Every word adds value, with no unnecessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with two simple parameters and no output schema, the description is mostly complete. It explains the action and provides a source for the ID. It could mention potential failure or return behavior, but the overall context is adequate for a straightforward operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Both parameters have descriptive schemas (100% coverage). The description adds context for post_id by specifying its source, but does not elaborate on text constraints or formatting. This adds marginal value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Add' and the resource 'comment to a LinkedIn post'. It is distinct from sibling tools, which include other LinkedIn actions but no other comment-adding tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a hint on where to obtain the post_id ('from search results or thread data'), which aids usage. However, it does not explicitly state when to use this tool versus alternatives or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_get_company (A)
Get a LinkedIn company profile by company ID or vanity name. Returns company name, description, industry, size, and other details.
| Name | Required | Description | Default |
|---|---|---|---|
| identifier | Yes | Company ID or vanity name |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It only lists returned fields but does not disclose any behavioral traits such as authentication requirements, rate limits, or what happens if the identifier is invalid. The description is minimal.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: the first states the purpose and method, the second lists return fields. No extraneous words, front-loaded, and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple get tool with one parameter and no output schema, the description is adequate but leaves 'other details' vague. It does not specify whether all company details are returned or only a subset.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'identifier' is described in the schema as 'Company ID or vanity name', and the tool description repeats this. Since schema coverage is 100%, the description adds no new meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb 'Get' and resource 'company profile', clearly identifies the two ways to specify the company (ID or vanity name), and distinguishes it from sibling tools like linkedin_get_profile.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use it (when company details are needed) but provides no explicit guidance on alternatives or when not to use it. For a simple tool this is acceptable but not exemplary.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_get_profile (A)
Get a LinkedIn user profile by ID, public identifier (vanity name), or profile URL. Returns name, headline, location, and other profile information.
| Name | Required | Description | Default |
|---|---|---|---|
| identifier | Yes | LinkedIn member ID, public identifier (vanity name), or full profile URL |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full responsibility for behavioral transparency. It indicates a read operation (Get a profile) and lists returned fields, but does not disclose potential rate limits, authentication requirements, or the scope of data (e.g., public vs. private profiles). This is a moderate gap for a typical API tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, directly stating the purpose and key return fields. No extraneous information or repetition. Efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (one parameter, no output schema), the description adequately covers what the tool does and what it returns. However, it could mention edge cases (e.g., profile not found) or the format of the return data more explicitly. Overall, it is sufficiently complete for its complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema's description for 'identifier' already covers the same information as the tool description (ID, vanity name, URL). Since schema coverage is 100%, the description adds no new semantic value beyond what the schema provides. Baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves a LinkedIn user profile using an ID, public identifier, or URL, and lists key returned fields. It distinguishes itself from siblings like linkedin_get_company or linkedin_search by focusing on a single user profile.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains how to identify the profile (by ID, vanity name, or URL) but does not provide guidance on when to use this tool over alternatives like linkedin_search, which might also find profiles. No explicit 'when not to use' or prerequisites are mentioned, but for a simple get operation it is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_invite (A)
Send a connection invitation to a LinkedIn user. Optionally include a personalized message (max 300 characters). Rate limited: LinkedIn allows 80-100 invitations per day, max 200 per week.
| Name | Required | Description | Default |
|---|---|---|---|
| message | No | Optional personalized invitation message (max 300 characters) | |
| provider_id | Yes | LinkedIn provider ID of the person to invite (from search results or profile) |
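Given the 300-character cap and daily quota, a defensive sketch that validates the note length before sending; the provider ID is a placeholder to be taken from linkedin_search or linkedin_get_profile results.

```python
invite_args = {
    "provider_id": "ACoAAB-placeholder",  # from search results or a profile
    "message": "Hi Dana - enjoyed your talk on agent tooling. Happy to connect!",
}

# LinkedIn rejects notes over 300 characters, so fail fast locally,
# and remember the platform-side quota: ~80-100 invites/day, 200/week.
assert len(invite_args["message"]) <= 300, "invitation message too long"
```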
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses rate limiting behavior but lacks details on error handling, success confirmation, or permission requirements. The message length constraint is already in the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences cover the purpose, the optional message, and the rate limit, with no redundant information. Highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 params, no output schema), the description is largely complete. Details on error handling or on what happens when the rate limit is exceeded would improve it, but are not required.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. The description adds context for provider_id (from search results or profile) and reiterates message max length. While it does not significantly expand on the schema, the provider_id source is helpful.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a connection invitation') and the resource ('to a LinkedIn user'). It is distinct from sibling LinkedIn tools such as linkedin_add_comment or linkedin_get_profile.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit rate limit guidance (80-100 per day, max 200 per week) to inform usage. However, does not specify when not to use the tool or mention alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_list_connections (B)
List your LinkedIn connections, sorted by most recently added.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum connections to return | |
| cursor | No | Pagination cursor from previous response |
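The cursor parameter implies the usual fetch-until-exhausted loop. A sketch written against any call_tool(name, arguments) callable such as the SDK session shown earlier; because the tool publishes no output schema, the response keys (items, cursor) are assumptions.

```python
from typing import Any, Callable

def list_all_connections(
    call_tool: Callable[[str, dict[str, Any]], dict[str, Any]],
) -> list[dict[str, Any]]:
    """Drain linkedin_list_connections by following pagination cursors.

    Assumes each response is a dict with 'items' and an optional
    'cursor' key; the real field names may differ.
    """
    connections: list[dict[str, Any]] = []
    cursor: str | None = None
    while True:
        args: dict[str, Any] = {"limit": 50}
        if cursor:
            args["cursor"] = cursor          # cursor from the prior response
        page = call_tool("linkedin_list_connections", args)
        connections.extend(page.get("items", []))
        cursor = page.get("cursor")
        if not cursor:                       # no cursor: last page reached
            return connections
```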
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It only mentions sorting by most recently added, omitting behavioral traits like authentication requirements, rate limits, or that it is a read-only operation. More detail is needed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that is concise and front-loaded with the action and resource. Every word is functional with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with two parameters and no output schema, the description is adequate but never mentions that results are paginated or how the cursor is used. It covers the basics but lacks completeness for optimal usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds no additional meaning beyond the schema's parameter descriptions. It does not elaborate on how 'limit' or 'cursor' affect results.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists LinkedIn connections, sorted by most recently added. It uses a specific verb and resource, and it distinguishes itself from sibling tools like linkedin_get_profile or linkedin_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide any guidance on when to use this tool versus alternatives, nor does it mention any exclusions or prerequisites. For a simple list tool, some context about pagination or typical use cases would help.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_list_invitations_sent (A)
List your pending sent connection invitations on LinkedIn.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum invitations to return | |
| cursor | No | Pagination cursor from previous response |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description must carry disclosure. It states the action is 'list' (a read) and scopes it to pending sent invitations, but does not detail pagination behavior or rate limits. Adequate but not exhaustive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no filler. Efficiently communicates the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with 2 parameters and no output schema, the description provides the essential purpose; the 'pending sent' scope is already explicit. It could additionally mention the response format. Minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters (limit max 100, cursor for pagination). Description adds no extra meaning beyond schema, so baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states verb 'List' and resource 'pending sent connection invitations' with scope 'on LinkedIn'. Distinguishes from sibling tools like linkedin_invite (send) and linkedin_list_connections (accepted).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. No mention of when not to use it or what other tools cover related actions like accepted connections.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_list_reactions (A)
List all reactions (likes, celebrates, etc.) on a specific LinkedIn post.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum reactions to return | |
| post_id | Yes | LinkedIn post/activity ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It lists reactions but does not disclose behavioral traits like authentication requirements, rate limits, or handling of invalid post IDs. Minimal behavioral insight.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is front-loaded and concise. No wasted words; every part serves the purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with 2 parameters and no output schema, the description is adequate. It could mention pagination or default limit, but the schema covers limit details. Overall sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with both parameters having descriptions. The description adds no additional meaning beyond what the schema already provides, so baseline of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action (list all reactions), the resource (specific LinkedIn post), and scope (reactions like likes, celebrates). Distinguishes from sibling tools like linkedin_add_comment or linkedin_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description implies when to use (to get reactions on a post) but does not explicitly state when not to use or mention alternative tools. No usage guidance beyond the basic purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_raw_request (A)
Send an arbitrary LinkedIn API request via Unipile's magic route. Only GET and POST methods are allowed. WARNING: This bypasses structured rate limiting and can perform destructive actions. Use this only when no other LinkedIn tool covers the needed functionality.
| Name | Required | Description | Default |
|---|---|---|---|
| body | No | Request body (for POST requests) | |
| method | No | HTTP method (only GET and POST allowed) | GET |
| request_url | Yes | Target LinkedIn API endpoint URL | |
| query_params | No | URL query parameters |
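A sketch of a deliberately read-only payload for this escape hatch; the endpoint path is invented, not a documented route.

```python
# Staying on GET keeps the call non-destructive; reach for POST only when
# no dedicated linkedin_* tool covers the operation.
raw_args = {
    "method": "GET",
    "request_url": "https://www.linkedin.com/voyager/api/example",  # invented
    "query_params": {"count": "10"},
}
```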
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, description carries full burden. It warns about bypassing structured rate limiting and performing destructive actions, which is critical behavioral disclosure. However, it does not detail response format or success/failure behavior, but the key risks are covered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four short sentences: purpose, allowed methods, a destructive-action warning, and usage guidance. Front-loaded, no fluff; every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters and no output schema, the description covers purpose, allowed methods, and destructive risk. It does not explain authentication or response format, but for a raw request tool, these are often implied. Sufficient for most usage scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. Description reinforces allowed methods and adds warnings, but the schema already describes body, method, request_url, and query_params adequately. Description adds marginal value over schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Send an arbitrary LinkedIn API request via Unipile's magic route', identifying the verb and resource. It distinguishes from siblings by warning to use only when no other LinkedIn tool covers the needed functionality, and sibling list includes many specific LinkedIn tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use this only when no other LinkedIn tool covers the needed functionality' and restricts methods to GET and POST. This provides clear context for when to use, though no explicit exclusion list for when-not-to-use beyond the generic warning.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_search (A)
Search LinkedIn for people, companies, jobs, or posts. Supports filtering by keywords, location, industry, network distance, and more. Use linkedin.search_filters first to resolve filter keywords to LinkedIn parameter IDs.
| Name | Required | Description | Default |
|---|---|---|---|
| api | No | LinkedIn product to search with | classic |
| url | No | Direct LinkedIn search URL (alternative to keyword/filter search) | |
| role | No | Role/title filter | |
| limit | No | Maximum results to return | |
| category | No | What to search for | people |
| industry | No | Industry filter IDs | |
| keywords | No | Search keywords | |
| location | No | Location filter IDs (use linkedin.search_filters to resolve) | |
| has_job_offers | No | Filter for people with job offers | |
| network_distance | No | Connection degree: F=1st, S=2nd, O=3rd+ |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must convey behavioral traits. It implies a read-only search operation but does not explicitly state non-destructiveness, authentication needs, or rate limits. The description is adequate but could be more explicit about side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences covering purpose, filtering, and a prerequisite. Every sentence earns its place with no redundancy. The description is front-loaded with the main action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a search tool with 10 parameters and no output schema, the description adequately covers the core functionality and a key usage hint. However, it omits details like pagination behavior or result format, which would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, but the description adds value by explaining that filter parameters (industry, location) should use IDs from linkedin.search_filters. This provides meaningful context beyond the schema's descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Search LinkedIn for people, companies, jobs, or posts' with a specific verb and resource. It lists filtering capabilities and distinguishes itself from sibling tools like linkedin_get_profile by being the general search entry point.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises using linkedin.search_filters first to resolve filter keywords, providing a clear prerequisite. While it doesn't list alternative tools for when not to use it, the context of sibling tools makes the purpose distinct.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_search_filters (A)
Get LinkedIn search filter parameter IDs. LinkedIn uses internal IDs instead of text for search filters (location, industry, etc.). Call this before linkedin.search to resolve filter keywords to their LinkedIn parameter IDs.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Filter category to resolve (e.g. LOCATION, INDUSTRY, SKILL) | |
| limit | No | Max results per filter category | |
| keywords | Yes | Keywords to resolve to parameter IDs (e.g. 'Thailand' for LOCATION) |
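The prescribed order is resolve IDs first, then search. A sketch of the two payloads; the resolved location ID is a placeholder, and the list-vs-scalar shapes of the filter fields are assumptions.

```python
# Step 1 - linkedin_search_filters: turn a keyword into LinkedIn's internal ID.
filter_args = {
    "type": "LOCATION",
    "keywords": "Thailand",
    "limit": 5,
}

# Step 2 - linkedin_search: plug the resolved ID into the location filter.
search_args = {
    "category": "people",
    "keywords": "machine learning engineer",
    "location": ["103588929"],      # placeholder ID from step 1's response
    "network_distance": ["S"],      # S = 2nd-degree connections
    "limit": 25,
}
```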
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It describes the tool's behavior (resolve filter keywords to IDs) but does not mention potential errors, rate limits, or side effects. Adequate but could add more context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences that are front-loaded with key information. Every sentence adds value without extraneous detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple lookup tool with 3 params and no output schema, the description is fairly complete. It explains the 'why' and 'when', though it could mention the format of results (list of parameter IDs). Still, sufficient for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 3 params with descriptions (100% coverage). The description adds context about the purpose (resolving to parameter IDs) but does not add meaning beyond the schema descriptions. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: getting LinkedIn search filter parameter IDs. It explains why (LinkedIn uses internal IDs) and when to use it (before linkedin.search). This distinguishes it from sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool ('before linkedin.search') and what it does. It does not say when not to use it or list alternatives, but given its specific helper role, the usage is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
linkedin_update_profileAInspect
Update the authenticated user's own LinkedIn profile. Supports adding/editing experience entries (role, company, skills, dates). Also supports updating location. Headline, summary, education are NOT supported by the API.
| Name | Required | Description | Default |
|---|---|---|---|
| location | No | Location to set on profile (requires LinkedIn location ID) | |
| experience | No | Add or edit a professional experience entry |
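A minimal sketch of an update payload, assuming the nested `experience` object accepts the role/company/skills/dates fields the description names — the nested schema is not shown here, so those field names are assumptions:

```python
# Hypothetical payload; the nested experience field names are assumptions from the description.
args = {
    "experience": {
        "role": "Backend Engineer",          # assumed field name
        "company": "Acme Corp",              # assumed field name
        "skills": ["Python", "PostgreSQL"],  # assumed field name
        "start_date": "2023-01",             # assumed field name
    },
    # "location" would take a LinkedIn location ID resolved via linkedin_search_filters.
}
```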
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided; description carries full burden. It accurately describes adding/editing experience and updating location, but does not disclose potential side effects (e.g., whether existing experience is replaced or appended) or authentication details beyond 'authenticated user'. Acceptable but not thorough.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Short, front-loaded sentences: the opening states the main action, and the rest list supported and unsupported operations. Every word adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and moderate complexity (nested objects, no required params), the description covers primary use cases and limitations. Lacks details on error handling or partial update behavior, but sufficient for most agents.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds meaning by explaining 'omit id to add new' and 'include id to edit', and clarifies that location requires a LinkedIn location ID obtainable via search_filters. This aids agent understanding beyond schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Update the authenticated user's own LinkedIn profile' and lists specific supported operations (experience, location) and unsupported fields (headline, summary, education). This distinguishes it from sibling tools like linkedin_get_profile and linkedin_add_comment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description explicitly indicates when to use (updating experience/location) and what is not supported. It also references linkedin.search_filters for location ID lookup. There is no explicit when-not-to-use guidance or list of alternatives, but the scope is well-defined.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
messages_deleteAInspect
Delete a message from a thread. Supports Telegram, WhatsApp, and other connected channels. Note: Some channels have time limits on message deletion.
| Name | Required | Description | Default |
|---|---|---|---|
| thread_id | Yes | Thread/channel ID containing the message | |
| message_id | Yes | ID of the message to delete |
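A minimal illustrative call payload (all IDs are hypothetical):

```python
# Illustrative arguments for messages_delete.
args = {
    "thread_id": "telegram:123456",  # thread/channel containing the message
    "message_id": "987654",          # message to delete (channel time limits may apply)
}
```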
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description only states that the tool deletes a message and notes time limits. It does not disclose permissions, irreversibility, error handling, or return behavior, leaving significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two sentences: the first states the core action, and the second adds a relevant note. No unnecessary words or repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple deletion tool with no output schema, the description covers the essential purpose and a key constraint (time limits). It does not mention permissions or error cases, but given the tool's simplicity, it is reasonably complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with self-explanatory parameter descriptions. The tool description does not add any additional meaning or context to the parameters beyond what is already in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete a message from a thread') and specifies supported channels (Telegram, WhatsApp, etc.), making the tool's purpose immediately obvious and distinct from siblings like messages_send or messages_read_history.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a practical usage note about time limits on deletion, which guides when the tool can be used, but does not explicitly state when to use it over alternatives or which contexts to avoid. The note is helpful but not exhaustive.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
messages_forwardAInspect
Forward a message from one thread to another. Supports native Telegram forwarding (preserves original sender attribution) and text-based forwarding for cross-channel scenarios.
| Name | Required | Description | Default |
|---|---|---|---|
| dest_thread_id | No | Destination thread to forward into. Provide at least one of dest_thread_id or recipient_name. To forward into the active conversation, pass the current thread_id. (If both are provided, dest_thread_id wins and recipient_name is ignored.) | |
| recipient_name | No | Name of person to forward to (channel auto-resolved). Provide at least one of dest_thread_id or recipient_name. Use only when forwarding to a different contact than the current conversation. | |
| source_thread_id | Yes | Thread containing the message to forward (e.g., 'telegram:123456' or numeric DB ID) | |
| source_message_id | Yes | ID of the message to forward |
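The dest_thread_id-vs-recipient_name choice is easiest to see as two alternative argument sets (all IDs and names are hypothetical):

```python
# Forward into a known thread; dest_thread_id wins if both targets are given.
by_thread = {
    "source_thread_id": "telegram:123456",  # thread holding the original message
    "source_message_id": "42",
    "dest_thread_id": "telegram:654321",
}

# Or let the server resolve the channel from a contact name.
by_name = {
    "source_thread_id": "telegram:123456",
    "source_message_id": "42",
    "recipient_name": "Jane Smith",  # hypothetical contact
}
```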
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided. The description adds behavioral context about preserving sender attribution in native forwarding and cross-channel capability, but does not disclose other traits like whether the original message is affected, auth requirements, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences, front-loaded with the core verb and resource, no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the main functionality but lacks details on return values or side effects, especially since there is no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema descriptions cover all parameters (100% coverage), so baseline is 3. The description does not add new meaning beyond the schema, only summarizes the overall behavior.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool forwards a message between threads, with two modes (native and text-based), distinguishing it from sending new messages or reading history.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not explicitly state when to use this tool vs alternatives, nor when to choose native vs text-based forwarding. Some guidance is implicit in the schema (dest_thread_id vs recipient_name), but the description itself lacks explicit usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
messages_read_historyAInspect
Read messages from a conversation thread. Use text_contains to find specific messages by content. Returns the most recent messages, including sender info and timestamps.
Voice calls: each row carries a meta object with allowlisted keys (event_type ∈ 'call_started'|'call_ended'|null, source ∈ 'voice_transcript'|null, call_id, speaker_display_name, duration_seconds, outcome, direction) plus per-message channel. To find calls without scanning every row, use calls.list_history instead.
Usage:
- Get thread_id from threads.list first, OR
- Use contact_name to auto-resolve thread_id

Examples:
- User: 'show me messages from chat with [contact]' → read_history(contact_name='[contact]', limit=10)
- User: 'last 5 messages from thread 571' → read_history(thread_id=571, limit=5)
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of messages to return (default: 10, max: 100) | |
| offset | No | Number of messages to skip (for pagination, default: 0) | |
| thread_id | No | Thread ID to read messages from (e.g., '571' or 'telegram:571'). Optional if contact_name provided. | |
| contact_name | No | Contact/thread name to search for (optional if thread_id provided). Example: 'Jane Smith', 'John Doe' | |
| text_contains | No | Filter: only return messages containing this text (case-insensitive substring match) | |
| include_outgoing | No | Include messages sent by you (default: true) |
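The two lookup styles from the usage notes, expressed as argument dicts; the values mirror the description's own examples, and the text filter is illustrative:

```python
# Resolve the thread by contact name, or address it directly by ID.
by_contact = {"contact_name": "Jane Smith", "limit": 10}
by_thread = {"thread_id": "571", "limit": 5, "text_contains": "invoice"}  # substring filter
```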
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes that it returns recent messages with sender info and timestamps, supports filtering and pagination. No annotations, so description carries full burden; lacks mention of error handling or read-only nature but is sufficiently transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with usage steps and examples, every sentence adds value, and it is appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 6 parameters, no output schema, and alternative identification methods, the description covers usage, filtering, pagination, and examples. Omits error cases but is largely complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and description adds value by explaining the interplay of thread_id and contact_name, usage patterns, and filtering capabilities beyond schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool reads messages from a conversation thread, distinguishes from siblings like messages_send and messages_delete, and mentions an alternative approach using text_contains.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit steps (get thread_id from threads.list or use contact_name) and examples, but does not explicitly state when not to use this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
messages_sendAInspect
Send a message to a thread, channel, or contact. Supports Telegram, Email, LinkedIn, and other connected channels. For LinkedIn posts (comment_thread kind), this posts a comment on the post. Can automatically resolve recipients and channels when not specified. Can send files/images/documents as attachments — pass attachments=[file_id, ...] with integer file IDs obtained from collections.list_files, workspace.search, or files.search. text is optional when attachments are provided.
| Name | Required | Description | Default |
|---|---|---|---|
| cc | No | Email addresses to CC (carbon copy). Only for email channel. | |
| bcc | No | Email addresses to BCC (blind carbon copy). Only for email channel. | |
| text | No | Message text to send. Optional if attachments provided. | |
| format | No | Message format | text |
| silent | No | Send without notification | |
| channel | No | Channel hint (e.g. 'telegram'). Required when using recipient_username. Only 'telegram' is currently accepted for handle-based routing. | |
| subject | No | Email subject line. Required for new emails, optional for replies (auto-generates 'Re: ...'). | |
| thread_id | No | Target thread. OMIT to reply in the same chat you received the triggering message from — the backend defaults to the current thread. Pass an explicit value ONLY to reply in a DIFFERENT thread, and only use: (a) a numeric DB thread id from threads.list / workspace.search, or (b) a channel_ref like 'telegram:-12345'. NEVER use a chat-type word (dm, group, channel, livechat) — those are category labels from the SITUATION block, not ids. | |
| attachments | No | Array of integer file IDs to send as attachments (images, documents, any files). Get file IDs from collections.list_files (field `file_id`), workspace.search with scope=['files'] (field `file_id`), or files.search. Example: [302237]. The file must already exist in the workspace (status=ready) — no separate upload step needed. When attachments are provided, `text` becomes optional (a caption can be included alongside). | |
| recipient_name | No | Name of person to send to (e.g., 'Jane', 'John'). Tool will auto-resolve channel. Optional if thread_id provided. | |
| recipient_email | No | Email address to send to (e.g., 'john@example.com'). Creates new email thread. Only for email channel. | |
| recipient_username | No | Telegram @handle (e.g. '@smartdeveloper' or 'smartdeveloper'). Resolves or opens a DM without needing a DB thread_id. Requires channel='telegram'. Only Telegram supported in this release. | |
| reply_to_message_id | No | ID of message to reply to (optional) |
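Two common shapes, sketched from the parameter table — the contact name and file ID are illustrative (the file ID reuses the schema's own example):

```python
# Reply in the triggering thread: omit thread_id and let the backend default apply.
reply_in_place = {"text": "On it — will confirm shortly."}

# Send a file to a named contact; with attachments, text becomes an optional caption.
with_attachment = {
    "recipient_name": "John Doe",  # hypothetical contact; channel auto-resolved
    "attachments": [302237],       # integer file ID from collections.list_files et al.
    "text": "Here is the signed contract.",
}
```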
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description discloses key behaviors: automatic recipient/channel resolution, attachment handling (integer file IDs, and the requirement that files already exist in the workspace with status=ready), optional text with attachments, and specific LinkedIn comment behavior. It also warns against using category labels for thread_id. This is thorough, though it does not mention auth requirements or destructive potential (likely non-destructive).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single well-structured paragraph that front-loads the main purpose, then covers channel specifics, attachment handling, and thread_id guidance. Every sentence adds value without redundancy. It is appropriately sized for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 13 parameters and no output schema, the description covers the core functionality and parameter nuances well. It explains automatic resolution, attachment requisites, and thread_id semantics. However, it does not mention the return value (e.g., message ID) which would be helpful for follow-up actions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage, but the description adds significant value beyond schema descriptions. For thread_id, it gives explicit instructions on when to omit vs. pass values and which values are valid. For attachments, it explains how to obtain file IDs. For recipient_username, it notes the Telegram-only requirement. These details raise the score above baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Send a message to a thread, channel, or contact' and specifies supported channels (Telegram, Email, LinkedIn, etc.). It also distinguishes special behavior for LinkedIn posts (comment_thread kind). This differentiates it from siblings like messages_forward or messages_delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use the tool, including automatic recipient resolution and details on thread_id usage ('OMIT to reply in the same chat', 'Pass an explicit value ONLY to reply in a DIFFERENT thread'). However, it does not explicitly compare to alternative message tools or state when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notes_deleteBInspect
Delete a note by ID from the target notebook. Same identity rules as notes.save — agents can only delete from their own notebook.
| Name | Required | Description | Default |
|---|---|---|---|
| note_id | Yes | ID of the note to delete | |
| target_agent_id | No | Target notebook. In agent mode optional (defaults to your own); required from MCP. Agents cannot target other agents' notebooks. |
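A minimal sketch; the IDs are hypothetical, and target_agent_id only matters when calling from MCP:

```python
# Illustrative arguments for notes_delete (ID types are not specified, strings assumed).
args = {"note_id": "123", "target_agent_id": "57"}
```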
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses identity rules but omits side effects, irreversibility, or error behavior (e.g., what if note_id not found).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no redundancy; the key information is front-loaded and each sentence serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, so description should cover return behavior. It does not mention what happens on success or failure, leaving gaps for a deletion operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds context about identity rules but does not enhance parameter understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete a note by ID from the target notebook') and distinguishes it from siblings like notes_save and notes_search by focusing on deletion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions identity rules and ownership constraints but does not explicitly contrast with other note tools or say when to prefer an alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notes_recallAInspect
Recall notes from your notebook. By default returns only your own notes (all scopes, newest first). Pass filter_agent_id= to read another agent's notebook, or filter_agent_id="all" (or "*") to read across every agent in the workspace. Pass scope to narrow to global/thread/person. Each result includes agent_id and agent_name of the author.
| Name | Required | Description | Default |
|---|---|---|---|
| key | No | Recall a specific note by key | |
| limit | No | Max notes (default 20, max 50). Newest first. | |
| scope | No | Optional filter: global | thread | person. Omit for all scopes. | |
| scope_ref_id | No | Filter by specific thread_id or person_id | |
| filter_agent_id | No | Optional. Omit to read only your own notes. Pass a numeric agent_id as a string (e.g. "57") to read another agent's notebook (read-only). Pass "all" or "*" to read across all agents in the workspace. |
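The three access patterns the description names, as illustrative argument sets (agent and ref IDs are hypothetical):

```python
# Default: your own notebook, all scopes, newest first.
own_notes = {"limit": 20}

# Read another agent's notebook (read-only).
other_notebook = {"filter_agent_id": "57"}

# Every agent in the workspace, narrowed to one thread's notes.
workspace_wide = {"filter_agent_id": "all", "scope": "thread", "scope_ref_id": "571"}
```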
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden. It discloses default behavior, filter_agent_id semantics, and result contents (agent_id, agent_name). It does not explicitly state it is read-only, but the description implies no modification.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences, front-loaded with main purpose, and efficiently covers key details without extraneous information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema is provided, so the description should cover return structure. While it mentions each result includes agent_id and agent_name, it does not specify other fields like key, content, or timestamps. The schema parameter descriptions fill some gaps, but overall completeness is moderate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds value by explaining default behavior and result structure beyond schema descriptions. It clarifies filter_agent_id usage and includes example values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool recalls notes from the notebook. It specifies default behavior (own notes, all scopes, newest first) and distinguishes from siblings like notes_search by focusing on recall by key and filtering by agent/scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear usage context: when to use defaults, how to read another agent's notebook, and how to filter by scope. It does not explicitly list alternatives but implies which sibling tools to use for other operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notes_saveAInspect
Save a note to your notebook. In agent mode writes to your own notebook by default; agents cannot write to other agents' notebooks. In MCP mode target_agent_id is required. If a note with the same key and scope already exists, it will be updated. Use scope to organize: 'global' for general knowledge, 'thread' for thread-specific context, 'person' for contact-specific info.
| Name | Required | Description | Default |
|---|---|---|---|
| key | Yes | Short identifier for this note | |
| scope | Yes | Scope of the note | |
| value | Yes | The note content | |
| pinned | No | Pin this note so it's always loaded first. Default false. | |
| scope_ref_id | No | Reference ID — thread_id (for scope=thread) or person_id (for scope=person). Required for thread/person scope. In MCP mode (no thread context), must be passed explicitly. | |
| target_agent_id | No | Target notebook. In agent mode optional (defaults to your own); required from MCP. Agents cannot target other agents' notebooks. | |
| expires_in_hours | No | Auto-delete after N hours. Omit for permanent notes. |
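A sketch of a person-scoped save using the upsert semantics the description states; the IDs and values are hypothetical:

```python
# Saving again with the same key + scope updates the existing note (upsert).
args = {
    "key": "preferred_language",  # short identifier
    "scope": "person",            # global | thread | person
    "scope_ref_id": "842",        # person_id, required for person scope (hypothetical)
    "value": "Prefers replies in German.",
    "expires_in_hours": 720,      # optional auto-delete; omit for a permanent note
}
```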
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the burden. It discloses mode-specific behavior, update semantics, and scope organization. However, it doesn't mention error conditions or return values (no output schema).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with 4 sentences, each adding value. It is front-loaded with the main purpose and uses no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 7 parameters and no output schema, the description covers key behaviors and mode differences. It could mention persistence guarantees or size limits, but overall it is sufficient for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds significant meaning beyond the schema: update behavior on duplicate key+scope, scope usage guidance, and mode-specific target_agent_id requirement. This complements the 100% schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool saves a note to a notebook, differentiating it from siblings like notes_delete, notes_recall, and notes_search. It specifies the action and resource, and mentions update behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides usage context (agent vs MCP mode, scope organization) but does not explicitly contrast with sibling tools. It implies when to use this tool for creating/updating notes, but could be more direct.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notes_searchAInspect
Full-text search in your notebook. By default searches only your own notes. Pass filter_agent_id= to search another agent's notebook, or "all" (or "*") for workspace-wide. Or list all notes for a person/thread by scope_ref_id.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 10, max 50) | |
| query | No | Text to search for in note keys and values. Optional if scope_ref_id is provided. | |
| scope | No | Limit search to scope | |
| scope_ref_id | No | Filter by specific thread_id or person_id. If provided without query, lists all notes for that ref. | |
| filter_agent_id | No | Optional. Omit to search only your own notes. Pass a numeric agent_id as a string (e.g. "57") to search another agent's notebook (read-only). Pass "all" or "*" to search across all agents in the workspace. |
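Two illustrative argument sets covering both modes the description names (query text and ref ID are hypothetical):

```python
# Full-text search in your own notes.
text_search = {"query": "invoice", "limit": 10}

# Omit query and pass scope_ref_id to list all notes for that person/thread.
list_for_ref = {"scope": "person", "scope_ref_id": "842"}
```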
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses default scope, read-only nature when searching other agents' notes, and the ability to list notes without a query. This adequately informs the agent of behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, each carrying essential information. No wasted words. Front-loaded with purpose, then expands on customization options.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (5 parameters, no output schema), the description covers all needed aspects: purpose, usage, parameter details, and edge cases. The agent can correctly invoke the tool without additional context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by explaining defaults, acceptable values, and interactions between parameters (e.g., query optional with scope_ref_id, filter_agent_id accepts 'all' or '*').
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs full-text search in notebooks, with specific filtering capabilities. It distinguishes from sibling tools like notes_recall, notes_delete, notes_save by focusing on search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains default behavior (own notes), how to search other agents' notebooks or workspace-wide, and how to list all notes for a person/thread. It provides concrete usage examples with parameters, making it clear when to use each option.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
prompts_getAInspect
Get full content of a prompt template: system instructions (prompt_text) and auto-reply rules.
Run prompts.list first to find the prompt_id.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt_id | Yes | ID of the prompt template to fetch |
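A minimal sketch of the two-step flow; the ID is hypothetical:

```python
# prompts.list supplies the ID; prompts_get then fetches the full template.
args = {"prompt_id": "12"}  # hypothetical ID obtained from prompts.list
```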
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided; the description relies on the word 'Get' to imply a read operation but does not explicitly confirm idempotency, permissions, or error handling. For a simple retrieval, this is adequate but not thorough.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences efficiently convey the tool's purpose and a key prerequisite. Every sentence adds value with no redundancy or unnecessary detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, no output schema), the description covers the essential aspects: what is retrieved and how to get the ID. It lacks details on invalid IDs and response structure, but such omissions are common and expected for a simple retrieval tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter prompt_id is fully described in the schema (100% coverage). The description adds the usage hint about prompts.list but does not elaborate on parameter constraints or formatting beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves full content of a prompt template, including system instructions and auto-reply rules. It distinguishes from siblings like prompts_list (which only lists) and prompts_update (which modifies).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises running prompts.list first to obtain the prompt_id, providing a clear prerequisite. However, it does not discuss when to avoid using this tool or compare alternatives such as prompts_get vs prompts_prompt_history.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
prompts_listAInspect
List all prompt templates in this workspace.
Returns id + name + description + category so you know which prompt_id to use in prompts.get or prompts.update.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
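A hedged sketch of how the result feeds the next call; the row shape follows the stated return fields (id, name, description, category), and all values are invented for illustration:

```python
# Hypothetical result rows, shaped per the stated return fields.
templates = [
    {"id": 12, "name": "Support triage", "description": "...", "category": "ops"},
]
prompt_id = templates[0]["id"]  # feeds prompts.get or prompts.update
```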
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations given, so description carries full burden. It discloses the tool lists all prompts and returns specific fields. Lacks mention of potential performance or pagination, but for a simple list tool this is acceptable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no waste, front-loaded with action and resource, then explains relevance. Each sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters, no output schema, and simplicity of the tool, the description fully covers what an agent needs: what it lists and how to use the result. No significant gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has no parameters, so coverage is trivially complete. Baseline 3 applies: the description adds no parameter information, but none is needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the verb 'List' and the resource 'all prompt templates in this workspace', and specifies the returned fields (id, name, description, category) which distinguish it from sibling tools like prompts_get that retrieve a single prompt.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly connects to subsequent tool use: 'so you know which prompt_id to use in prompts.get or prompts.update'. While it doesn't mention when not to use, the context is clear enough for an agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
prompts_prompt_historyAInspect
List past versions of a prompt template's prompt_text. Every edit is snapshotted to an append-only table — use this to browse history and find a version_number for prompts.prompt_restore.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max versions to return (1-200, default 50) | |
| prompt_id | Yes | ID of the prompt template | |
| before_version | No | Cursor: return versions strictly below this version_number |
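Cursor-based paging, sketched with hypothetical values:

```python
# Page backwards through history; before_version returns versions strictly below it.
first_page = {"prompt_id": "12", "limit": 50}
next_page = {"prompt_id": "12", "limit": 50, "before_version": 37}  # versions < 37
```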
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the append-only nature of the table, indicating no deletions. The 'List' verb implies read-only behavior, but no explicit readOnlyHint is given. Overall adequate transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no wasted words. The action is front-loaded, and the context about snapshots and restore linkage is included efficiently.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers purpose, linkage to restore tool, and high-level behavior. Given no output schema, it misses describing the return format or ordering. The cursor parameter (before_version) is described in schema but not repeated in description. Still, it is reasonably complete for a list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description adds no additional meaning beyond what is in the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List past versions'), specifies the resource ('prompt_text of a prompt template'), and distinguishes itself from sibling tools by mentioning the version_number is for use with prompts.prompt_restore.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly guides the agent to use this tool to browse history and find a version_number for the restore tool. It implicitly distinguishes from siblings, but could be more explicit about not using this for the current prompt text (use prompts_get).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
prompts_prompt_restoreAInspect
Restore a past version of a prompt template by version_number. Creates a new version pointing at the restored content — history is preserved. Fans out to every agent using this template without a per-agent override; the response includes affected_agents as a receipt of the fan-out.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional: why this restore is happening (shows up in history UI) | |
| prompt_id | Yes | ID of the prompt template | |
| version_number | Yes | The version_number to restore (get it from prompts.prompt_history) |
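An illustrative restore call; the IDs are hypothetical:

```python
# Restore version 36 of template 12; a new version is created on top, history preserved.
args = {
    "prompt_id": "12",
    "version_number": 36,                  # from prompts.prompt_history
    "reason": "Rolling back tone change",  # optional, shown in history UI
}
```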
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully covers behavioral traits: it creates a new version, preserves history, and fans out to all agents. It also mentions the response includes affected_agents. This is comprehensive for a tool without annotations, though it could detail permissions or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences long, each serving a clear purpose: action, side effect, and additional detail. It is concise and front-loaded with the key purpose, though it could be slightly more streamlined.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description mentions that the response includes affected_agents, which is useful. It also explains the fan-out effect. This covers most necessary context for a restore operation, though it omits details on error handling or rate limits.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All three parameters have descriptions in the schema (100% coverage). The description adds only minor context (e.g., 'get version_number from prompts.prompt_history'), which is helpful but does not significantly enhance parameter understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool restores a past version of a prompt template by version_number. It distinguishes itself from siblings like prompts_prompt_history (which lists history) and prompts_update (which modifies current version) by emphasizing historical preservation and fan-out behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context about when to use the tool—when restoring a version that affects all agents using the template. However, it does not explicitly state when not to use it or mention alternative tools, so it loses a point for missing exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
prompts_updateAInspect
Update a prompt template's name, system instructions, or auto-reply rules.
Changes affect every agent using this template, unless the agent has its own override (set via agents.update → prompt_text).
All parameters except prompt_id are optional — only provided fields are updated.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | New name for the prompt template | |
| prompt_id | Yes | ID of the prompt template to update | |
| description | No | New description for the prompt template | |
| prompt_text | No | The AI system prompt: persona, tone, rules, behavior. | |
| auto_reply_rules | No | Pre-classifier rules that run BEFORE the main AI. Format: bullet list of conditions → actions (SKIP / SIMPLE_REPLY / SEARCH / CALENDAR). Pass null to clear. |
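A sketch of a partial update; the ID and text are illustrative:

```python
# Only the provided fields change; prompt_id is the sole required argument.
args = {
    "prompt_id": "12",
    "prompt_text": "You are a concise support agent. Always confirm order numbers.",
    "auto_reply_rules": None,  # pass null/None to clear the pre-classifier rules
}
```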
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses that changes affect every agent using the template unless overridden, and that only provided fields are updated. This covers major side effects, though immediate effect or caching is not mentioned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences with front-loaded purpose. Each sentence adds value: purpose, global impact, optionality. No waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and moderate complexity, the description covers purpose, side effects, and optionality. Could mention error handling or validation, but it is sufficiently complete for an AI agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description reiterates optionality and maps high-level fields to parameters but adds no new semantic details beyond the schema's own descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool updates a prompt template and lists specific fields (name, system instructions, auto-reply rules). It distinguishes from agents.update by mentioning agent-specific overrides, differentiating it from siblings like agents_update.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the tool is for updating a prompt template and mentions the alternative agents.update for per-agent overrides. However, it does not explicitly say when not to use this tool, but the context is clear enough for an AI agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reminder_cancelBInspect
Cancel an active reminder by its trigger ID.
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | No | Agent ID (required when calling from MCP; ignored in agentic mode). | |
| trigger_id | Yes | ID of the reminder to cancel |
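A minimal sketch; the IDs are hypothetical, and the trigger ID would come from reminder_list:

```python
# Illustrative arguments for reminder_cancel.
args = {"trigger_id": "321", "agent_id": "57"}  # agent_id only required from MCP
```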
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description does not disclose behavioral traits such as whether cancelling an already cancelled or non-existent reminder is idempotent, what side effects occur, or if it requires specific permissions. The term 'active' implies it only works on active reminders, but this is not elaborated.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that is concise and front-loaded with the key action and required parameter. Every word is necessary, with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple cancellation tool with only two parameters and no output schema, the description covers the core functionality. It could mention return values or error conditions, but it is reasonably complete given the tool's simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the schema itself sufficiently describes the parameters. The description adds no additional meaning beyond what is in the schema, thus baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (cancel) and the resource (reminder) with the required identifier (trigger ID). The verb alone separates it from siblings like reminder_set and reminder_list, though the description never references them explicitly.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is given on when to use this tool versus alternatives. It does not mention prerequisites (e.g., obtaining the trigger ID from listing reminders) or scenarios where cancellation might not be appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reminder_listAInspect
List your active reminders (both one-time and recurring).
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 20) | |
| agent_id | No | Agent ID (required when calling from MCP; ignored in agentic mode). | |
| thread_id | No | Filter by thread | |
| include_fired | No | Include already-fired one-time reminders (default false) |
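An illustrative call narrowing to one thread (the thread ID is hypothetical):

```python
# List active reminders for a thread, including one-time reminders that already fired.
args = {"thread_id": "571", "include_fired": True, "limit": 20}
```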
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description indicates this is a read-only listing operation (non-destructive). The word 'list' implies safety, though an explicit statement that the call is safe and idempotent would be better.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, front-loaded sentence that efficiently conveys purpose with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with no output schema, the description adequately indicates the return is a list of active reminders. However, it lacks any mention of ordering or pagination, which would be useful but are not critical.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage, so baseline is 3. Description adds no additional meaning beyond the schema parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'list', resource 'reminders', and scope 'active (both one-time and recurring)'. It differentiates from sibling tools reminder_set and reminder_cancel by specifying list operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Usage is implied by the description, but no explicit when-to-use, when-not-to-use, or alternative guidance is given; the agent must infer from sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reminder_setBInspect
Schedule a reminder. One-time reminders fire at a specific datetime. Recurring reminders fire on a schedule (daily, weekly, every N days, or every N minutes). Optionally scope to a thread or target another agent.
| Name | Required | Description | Default |
|---|---|---|---|
| time | No | Time of day HH:MM for daily/weekly/every_n_days (e.g. '09:00'). Required for daily/weekly/every_n_days. | |
| reason | Yes | What this reminder is for (you'll see this when it fires) | |
| agent_id | No | Agent ID (required when calling from MCP; ignored in agentic mode). | |
| datetime | No | ISO datetime for one_time (e.g. '2026-04-01T09:00:00+03:00'). Required for one_time. | |
| timezone | No | IANA timezone (e.g. 'Europe/Moscow'). Defaults to UTC. | |
| thread_id | No | Optional thread ID to scope the reminder to. Omit for workspace-level reminders. | |
| days_of_week | No | Days for weekly: 0=Mon, 1=Tue, 2=Wed, 3=Thu, 4=Fri, 5=Sat, 6=Sun. Required for weekly. | |
| interval_days | No | For every_n_days: fire every N days (min 2). | |
| schedule_type | Yes | one_time = fires once at datetime. daily = fires daily at time. weekly = fires on specific days_of_week at time. every_n_days = fires every N days at time. interval = fires every N minutes. | |
| interval_minutes | No | For interval: fire every N minutes (5-1440). | |
| target_agent_slug | No | Optional: activate a different staff member instead of yourself when the reminder fires. |
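Because each schedule_type requires a different subset of fields, a small validation helper clarifies the interactions. This is a sketch derived only from the parameter table above; build_reminder is a hypothetical name, and nothing here calls the server.

```python
# Sketch: which fields each schedule_type needs, per the table above.
def build_reminder(schedule_type: str, reason: str, **kw) -> dict:
    required = {
        "one_time": ["datetime"],
        "daily": ["time"],
        "weekly": ["time", "days_of_week"],
        "every_n_days": ["time", "interval_days"],
        "interval": ["interval_minutes"],
    }
    if schedule_type not in required:
        raise ValueError(f"unknown schedule_type: {schedule_type}")
    missing = [f for f in required[schedule_type] if f not in kw]
    if missing:
        raise ValueError(f"{schedule_type} requires {missing}")
    if schedule_type == "interval" and not 5 <= kw["interval_minutes"] <= 1440:
        raise ValueError("interval_minutes must be 5-1440")
    if schedule_type == "every_n_days" and kw["interval_days"] < 2:
        raise ValueError("interval_days must be at least 2")
    return {"schedule_type": schedule_type, "reason": reason, **kw}

# Fires every Monday and Friday at 09:00 Moscow time (0=Mon, 4=Fri).
print(build_reminder(
    "weekly", "Send weekly status update",
    time="09:00", days_of_week=[0, 4], timezone="Europe/Moscow",
))
```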
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully disclose behavior. It covers basic scheduling but omits side effects, required permissions, and any confirmation of success. The description is too minimal for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four short sentences, front-loaded with the purpose, and contains no redundant information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 11 parameters and no output schema, the description adequately covers schedule types and optional scoping. It could mention default timezone behavior, but the schema descriptions cover the rest. Overall complete for a creation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds no new parameter information beyond summarizing schedule types and optional scoping, which is already in parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool schedules reminders and distinguishes one-time from recurring types. It does not explicitly differentiate from sibling tools reminder_cancel and reminder_list, which would elevate the score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like reminder_cancel or reminder_list. It does not specify prerequisites or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
system_sleepAInspect
Pause execution for a given number of seconds (max 30). Use when you need to wait for an external process to complete before retrying — e.g. message sync, backfill, or API propagation. Total sleep per run is capped at 60 seconds.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Why you are waiting (logged for debugging) | |
| seconds | Yes | Number of seconds to sleep (1-30) |
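The two documented limits interact: 30 seconds per call, 60 seconds per run. A short sketch, assuming only those numbers, shows how a longer wait would have to be split; the planning helper is illustrative, and the server enforces the real caps.

```python
# Split a desired wait into calls that respect the documented limits:
# 1-30 s per call, 60 s total per run.
def plan_sleeps(total_wait: int, per_call_max: int = 30, run_budget: int = 60) -> list:
    total_wait = min(total_wait, run_budget)  # the run budget wins
    calls = []
    while total_wait > 0:
        chunk = min(per_call_max, total_wait)
        calls.append({"seconds": chunk, "reason": "waiting for message sync"})
        total_wait -= chunk
    return calls

print(plan_sleeps(45))  # two calls: 30 s then 15 s, within the 60 s budget
```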
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It mentions a cap on total sleep per run (60 seconds) and per-call max (30 seconds), but does not describe error behavior, idempotency, or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise with three short sentences. Information is front-loaded: action, limit, use case. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity, the description covers purpose, usage, and key constraints. Lacks return value or error info, but sleep operations are straightforward.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds minimal value beyond schema: 'reason' for debugging, 'seconds' range implied by max 30. Could add more detail on parameter constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb and resource ('Pause execution') and clearly states the action. It distinguishes itself from sibling tools by focusing on timing delays.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'wait for an external process to complete before retrying' with examples like message sync and API propagation. Lacks explicit when-not-to-use, but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tasks_createCInspect
Create a new task in your to-do list.
| Name | Required | Description | Default |
|---|---|---|---|
| title | Yes | Task title | |
| due_at | No | ISO datetime when task is due (e.g. '2026-03-31T15:00:00') | |
| agent_id | No | Agent ID whose tasks to access. Required when calling from MCP. | |
| due_date | No | Date when task is due (e.g. '2026-03-31'). Use with due_time or alone. | |
| due_time | No | Time when task is due (e.g. '15:00'). Used with due_date. | |
| priority | No | Task priority (default: medium) | |
| thread_id | No | Related thread ID | |
| description | No | Detailed description | |
| assigned_to_contact_id | No | Contact ID if assigned to someone |
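The schema offers two ways to express a deadline: a full ISO due_at, or due_date with an optional due_time. A hedged sketch of both payload shapes, with placeholder values:

```python
# Two ways to set a deadline, per the table above. Values are
# placeholders; the server may normalize them differently.
task_with_iso = {
    "title": "Prepare Q2 report",
    "due_at": "2026-03-31T15:00:00",  # single ISO datetime field
    "priority": "high",
}
task_with_date_and_time = {
    "title": "Call the venue",
    "due_date": "2026-03-31",  # a date may also stand alone
    "due_time": "15:00",       # only meaningful together with due_date
}
print(task_with_iso)
print(task_with_date_and_time)
```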
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description does not disclose any behavioral traits beyond the obvious 'create' action. With no annotations, the agent is left uninformed about side effects, required permissions, or what happens on success/failure. The description adds no value beyond the tool's name.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single clear sentence with no wasted words. It is front-loaded and immediately understandable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite good schema coverage, the description lacks any mention of return values or output behavior. With no output schema, the agent would benefit from knowing what the tool returns (e.g., the created task object). The minimal description does not fill this gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Since the input schema covers 100% of parameters with descriptions, the description does not need to add more. Baseline 3 applies because the schema already does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('create') and the resource ('task in your to-do list'), providing a clear purpose. However, it does not differentiate this tool from sibling tools like tasks_update or tasks_delete, missing an opportunity to clarify its specific role.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., when to create vs. update), nor are any prerequisites or exclusions mentioned. The description assumes the agent knows when creation is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tasks_deleteBInspect
Delete a task from your to-do list by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| task_id | Yes | ID of the task to delete | |
| agent_id | No | Agent ID whose task to delete. Required when calling from MCP. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It only says 'delete' without specifying permanence, required permissions, or side effects. For a destructive action, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, front-loaded sentence with no extraneous words. Efficiently conveys the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema is present, so the description should clarify return behavior (e.g., success confirmation) and the required agent_id for MCP calls. Both are missing, leaving the agent uninformed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The description adds no extra meaning beyond what the schema provides, placing it at baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and the resource ('a task from your to-do list'), and includes the identifier ('by its ID'). It distinguishes from siblings like tasks_create and tasks_list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use delete vs update, nor any prerequisites or conditions. The description does not mention that agent_id is required when calling from MCP, which is important context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tasks_listAInspect
List your tasks, or another agent's tasks (read-only) using from_agent_id. Use filters to narrow results.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 20) | |
| status | No | ||
| overdue | No | If true, only return tasks past due_at that are not done | |
| agent_id | No | Agent ID whose tasks to list. Required when calling from MCP. | |
| thread_id | No | Filter by related thread | |
| from_agent_id | No | List tasks of another agent (read-only). Omit to list your own. | |
| assigned_to_contact_id | No | Filter by assigned contact |
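Two illustrative filter combinations, assuming only the table above; AGENT_OTHER is a placeholder ID. The second payload shows the read-only view of another agent's tasks via from_agent_id.

```python
# Only overdue, not-done tasks of the calling agent.
my_overdue = {"overdue": True, "limit": 20}

# Another agent's tasks, read-only; omit from_agent_id to list your own.
peer_tasks = {"from_agent_id": "AGENT_OTHER", "limit": 10}

print(my_overdue)
print(peer_tasks)
```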
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. It mentions 'read-only' for another agent's tasks but does not clarify whether listing one's own tasks is also read-only or has any side effects. It fails to explicitly confirm non-destructiveness.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: two sentences achieving clarity without waste. Every word serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having 7 parameters and no output schema, the description does not explain return format, pagination, ordering, or error conditions. It is incomplete for a tool with this complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 86%, so the schema already documents most parameters. The description adds context about from_agent_id and filters, but no new semantic meaning beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists tasks, including the ability to list another agent's tasks (read-only). It effectively distinguishes from sibling tools like tasks_create or tasks_delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides guidance on when to use the tool, highlighting the use of from_agent_id for read-only access to other agents' tasks and mentioning filters. However, it does not explicitly state when not to use it or direct to alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tasks_updateAInspect
Update an existing task. Set status='done' to complete it, 'cancelled' to cancel. Use summary for completion notes.
| Name | Required | Description | Default |
|---|---|---|---|
| due_at | No | ISO datetime | |
| status | No | ||
| summary | No | Completion note (stored when marking done) | |
| task_id | Yes | ID of the task to update | |
| agent_id | No | Agent ID whose task to update. Required when calling from MCP. | |
| priority | No | ||
| description | No |
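A sketch of the two state transitions the description calls out, with a placeholder task ID; per the schema, summary is stored only when marking a task done.

```python
# Complete a task with a completion note.
complete_task = {
    "task_id": "TASK_789",  # placeholder ID
    "status": "done",
    "summary": "Report sent to the client on 2026-03-30",
}

# Cancel a task instead of completing it.
cancel_task = {"task_id": "TASK_789", "status": "cancelled"}

print(complete_task)
print(cancel_task)
```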
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden but only mentions updating status and summary. It fails to disclose mutation behavior, permissions required, or whether partial updates overwrite unmentioned fields.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three short sentences with a front-loaded purpose and actionable details. No redundant or extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool has 7 parameters, no output schema, and no annotations. Description is too minimal; it does not explain return values, error handling, or behavior for partial updates.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is low (57%), and description adds meaning for status and summary (e.g., 'summary for completion notes'), but ignores due_at, priority, description, agent_id, and task_id parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Update an existing task.' This specific verb+resource combination distinguishes it from sibling tools like tasks_create, tasks_delete, and tasks_list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides specific guidance on using status and summary fields (e.g., 'Set status='done' to complete it'), but lacks explicit when-to-use or when-not-to-use instructions compared to alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
threads_listAInspect
List conversation threads with previews and metadata. Use before messages.read_history to resolve thread_id. Returns: id, title, last message, timestamp, unread count.
| Name | Required | Description | Default |
|---|---|---|---|
| kind | No | Thread type. | |
| limit | No | Maximum threads to return. | |
| order | No | Sort order. | desc |
| channel | No | Filter by channel. | |
| order_by | No | Sort field. | last_message_at |
| only_unread | No | Only threads with unread messages. | |
| include_archived | No | Include archived threads. | |
| participant_name | No | Filter threads by participant name. | |
| max_inactive_days | No | User sent a message within the last N days (recently active). | |
| min_inactive_days | No | User's last outgoing message older than N days (dormant). | |
| user_sent_message | No | Only threads the user has sent at least one message in. | |
| min_last_message_days | No | Last message (from anyone) older than N days. | |
| participant_contact_id | No | entity_id from contacts.find — returns all threads where this contact is an active participant. |
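The inactivity filters are the least obvious part of this schema. Here is a sketch of two contrasting queries, assuming only the parameter descriptions above: one for dormant threads worth re-engaging, one for recently active unread ones.

```python
# Dormant: the user has written before, but their last outgoing
# message is at least 30 days old; newest first.
dormant = {
    "user_sent_message": True,
    "min_inactive_days": 30,
    "order_by": "last_message_at",
    "order": "desc",
}

# Recently active and unread: user sent something within the last 7 days.
recent_unread = {
    "only_unread": True,
    "max_inactive_days": 7,
    "limit": 50,
}

print(dormant)
print(recent_unread)
```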
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description bears full burden. It specifies return fields (id, title, last message, timestamp, unread count) and implies a read-only operation. Could be enhanced by mentioning pagination or real-time nature, but sufficient for a list tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two essential sentences plus a return list. No fluff, front-loaded with purpose, every sentence adds value. Excellent conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 13 optional parameters and no output schema, the description covers the main use case and return format. It lacks mention of filtering capabilities, but the schema descriptions compensate. Mostly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (every parameter has a description), so the baseline is 3. The description adds no additional parameter information, which is acceptable as the schema already handles it.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists conversation threads with previews and metadata, and explicitly ties it to resolving thread_id for messages.read_history. This distinguishes it from sibling tools like threads_update.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides a specific usage guideline ('Use before messages.read_history to resolve thread_id'), indicating a clear workflow. However, it does not mention when to avoid using this tool or alternative listing tools, so not a 5.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
threads_updateAInspect
✏️ Update a conversation thread: rename it, add notes/description, or move to a folder.
When to use:
- User wants to rename a chat or group
- User wants to add notes/context about a conversation
- User wants to organize threads into folders
For DM threads, renaming also updates the linked contact's display name by default. Requires thread_id from threads.list.
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | New title for the thread (max 255 chars) | |
| folder_id | No | Move thread to this folder (null removes from folder) | |
| thread_id | Yes | Thread ID from threads.list | |
| description | No | AI context / notes for this thread. Empty string clears description. | |
| update_contact | No | For DM threads, also rename the linked contact (default: true) |
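Because update_contact defaults to true, renaming a DM thread silently renames the linked contact as well. A sketch of the payload that avoids that side effect; the thread ID is a placeholder resolved via threads.list.

```python
# Rename only the thread, keeping the linked contact's display name.
rename_thread_only = {
    "thread_id": "THREAD_456",     # placeholder; resolve via threads.list
    "title": "Acme - onboarding",  # max 255 chars per the schema
    "update_contact": False,       # override the default (true) contact rename
}
print(rename_thread_only)
```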
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that renaming a DM thread updates the linked contact's display name by default, and that update_contact controls this. No annotations are provided, and the description could also mention permissions, idempotency, or failure cases.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with a concise summary and bullet point list for usage. Could be slightly more terse without losing clarity, but overall efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers main operations, dependencies, and a side effect (DM contact rename). Lacks discussion of error scenarios or permissions, but adequate for a simple update tool with no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and description largely repeats schema descriptions. Adds minor context (e.g., max chars for title) but does not significantly extend meaning beyond input schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'update' and resource 'conversation thread', and lists specific actions (rename, add notes/description, move to folder). It distinguishes from sibling tools by detailing DM thread behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes a 'When to use' section with three specific scenarios and mentions dependency on threads.list. However, it does not explicitly state when not to use or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
vision_queryAInspect
Look at the screen currently being shared in a meeting and answer a question about it. Returns a natural-language answer based on the visual content. Use ONLY when the user explicitly asks about the screen/slide/document being shown.
| Name | Required | Description | Default |
|---|---|---|---|
| question | Yes | Question about the shared screen. | |
| image_b64 | No | Base64-encoded JPEG image of the screen-share frame. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full weight. It discloses the behavior (returns a natural-language answer based on visual content) and implies a read-only operation. Minor omission: no mention of what happens if no screen is being shared, but the description is still transparent enough for typical use.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences long, with no unnecessary words. Each sentence adds crucial information: what the tool does, what it returns, and when to use it. Highly efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple query tool with two well-documented parameters and no output schema, the description provides all necessary context. It explains the purpose, usage constraint, and return type (natural-language answer). No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameter descriptions in the schema already cover the meaning. The description adds no extra semantic value beyond what's in the schema's property descriptions; it restates them. Baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's action: 'Look at the screen currently being shared in a meeting and answer a question about it.' This specific verb-resource combination distinguishes it from any sibling tools, none of which involve screen sharing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use: 'Use ONLY when the user explicitly asks about the screen/slide/document being shown.' This provides clear usage context without needing to mention alternatives, as no sibling serves a similar purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
web_fetchAInspect
Fetches a single URL and returns its content. Use this when you have a specific URL in mind — for example, after web.search returns a link you want to read, or when the user pastes a URL.
Modes (extract):
- 'auto' (default): picks the right mode based on the response content type.
- 'markdown': for HTML pages; returns cleaned markdown plus the page title.
- 'text': for JSON/XML/plaintext APIs; returns the raw decoded body.
- 'file': for images, PDFs, audio, video, archives, or any binary — ingests the bytes into the user's file storage and returns a file_id you can pass to messages.send (to send as an attachment), agents.add_file (to add to agent knowledge), or files.read.
Use web.fetch (not files.upload) when you need the file_id immediately for the next tool call — files.upload(source_url=…) is async and won't have the file ready in the same turn.
Use web.search (not web.fetch) when you don't have a specific URL yet and need to find one.
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to fetch (http or https). Must be publicly reachable. | |
| extract | No | How to handle the response: 'auto' (default), 'markdown' (HTML → markdown), 'text' (raw body), or 'file' (ingest as binary, return file_id). | auto |
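A sketch of one payload per explicit extract mode, with placeholder URLs; extract='auto' (the default) would pick among these based on the response content type.

```python
# HTML page -> cleaned markdown plus the page title.
read_article = {"url": "https://example.com/post", "extract": "markdown"}

# JSON/XML/plaintext API -> raw decoded body.
call_api = {"url": "https://api.example.com/data.json", "extract": "text"}

# Binary (a PDF here) -> ingested into file storage, returns a file_id
# usable by messages.send, agents.add_file, or files.read.
ingest_pdf = {"url": "https://example.com/spec.pdf", "extract": "file"}

print(read_article, call_api, ingest_pdf, sep="\n")
```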
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It explains extract modes and their outputs (markdown: cleaned markdown+title; text: raw body; file: returns file_id with usage hints). However, it misses potential error behaviors, timeouts, or limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a one-line summary followed by mode details and usage guidance. Every sentence adds value, though it could be slightly more concise without losing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately covers output formats for each mode (markdown: content+title, text: raw body, file: file_id) and distinguishes from siblings web_search and files_upload. Lacks error handling details but is sufficient for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. The description adds value by detailing what each extract mode returns (e.g., markdown gives title, text raw body, file provides file_id and usage) beyond the schema's enum descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Fetches') and resource ('a single URL') and immediately distinguishes from sibling tools like web_search and files_upload by specifying when to use this tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use (specific URL in mind, after web.search returns a link, or user pastes a URL) and when not to use (use web.search when no specific URL, use web.fetch not files.upload for immediate file_id). Provides clear alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
web__local_searchAInspect
Multi-source web research with citations. Returns a synthesized answer with numbered [^1] markers and a citations array of {url, title, snippet, index}. Use for evidence-backed synthesis (competitive analysis, regulatory summary, whitepaper section). For quick fact lookups use web.search instead.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Research question. Specific scoped questions outperform vague keywords. | |
| language | No | Search language hint (BCP-47, e.g. 'en', 'ru'). Defaults to 'en'. The synthesis output language matches the query language regardless. | en |
| num_sources | No | How many top search results to fetch and synthesize (1-4, default 4). Lower = faster + cheaper, higher = more comprehensive. |
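A sketch of a single research call, assuming only the parameter table; the query text is a placeholder. num_sources encodes the documented speed-versus-coverage trade-off.

```python
# Scoped research question; 4 sources = most comprehensive (and slowest),
# 1 source = fastest and cheapest, per the num_sources description.
deep_research = {
    "query": "How do EU AI Act transparency obligations apply to chatbots?",
    "language": "en",  # BCP-47 hint; output language follows the query anyway
    "num_sources": 4,
}
print(deep_research)
```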
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully describes the output format (synthesized answer with numbered markers and citations array). It does not mention potential limitations like rate limits or latency, but gives sufficient behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences with no wasted words. The output format and citation mechanism are front-loaded, and usage guidance is immediately clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no output schema, the description explains the return structure sufficiently. Combined with schema, it covers purpose, usage, parameters, and output. No gaps for typical use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds extra guidance: query should be scoped, language synthesis behavior, and num_sources trade-off. This enriches the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs multi-source web research and returns a synthesized answer with citations. It distinguishes from the sibling web.search by noting that web.search is for quick fact lookups.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use this tool (evidence-backed synthesis like competitive analysis, regulatory summary, whitepaper section) and when not (quick fact lookups, pointing to web.search).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
web_searchAInspect
Search the web for current information, news, facts, prices, or events. Use this when the user asks about something that requires up-to-date information from the internet, or when internal knowledge base doesn't have the answer. Examples: recent news, stock prices, weather, product information, current events.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search query - what to search for on the web. | |
| num_results | No | Number of results to return (1-10). | |
| search_type | No | Type of search: 'search' for general web, 'news' for news articles. | search |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden. It states the tool searches the web but does not disclose any behavioral traits such as rate limits, authentication requirements, or return format. It is adequate for a simple search but lacks detail on side effects or limitations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the main action, includes examples, and is concise with no unnecessary words. Every sentence serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description is reasonably complete for purpose and usage but lacks information about the return format (e.g., titles, URLs, snippets). It could be more helpful by describing what the agent will receive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the description only adds value by contextualizing the parameters with examples like 'stock prices' and 'weather'. It does not provide deeper semantics beyond the schema's own descriptions, meeting the baseline expectation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's function: 'Search the web for current information, news, facts, prices, or events.' It provides specific examples and distinguishes from internal knowledge base, effectively differentiating from siblings like knowledge_query and web_fetch.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives explicit when-to-use guidance: when the user needs up-to-date internet information or the internal knowledge base falls short. It implies alternatives exist but does not name when-not-to-use cases or sibling tools like web_fetch for reading a known URL.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
widgets_createAInspect
Create a new livechat widget for your website.
The widget will be created with default settings. You can customize theme, auto-reply mode, and more.
Use this when user wants to add a chat widget to their site.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Name for the widget (e.g., 'Website Chat', 'Support Widget') | |
| position | No | Widget position on screen | bottom-right |
| header_title | No | Title shown in chat header | Chat with us |
| primary_color | No | Primary color for widget theme (hex, e.g., '#2563eb') | #2563eb |
| auto_reply_mode | No | Auto-reply mode: 'draft' (review before sending) or 'auto' (send immediately) | draft |
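A minimal creation payload under the documented defaults: only name is required, and anything omitted falls back to the defaults column above (bottom-right position, draft auto-reply, and so on).

```python
# Minimal widget: rely on defaults, override only the brand color.
new_widget = {
    "name": "Support Widget",
    "primary_color": "#1f2937",   # hex, per the schema's example format
    "auto_reply_mode": "draft",   # replies are drafted for review, not auto-sent
}
print(new_widget)
```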
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description notes that the widget is created with default settings and can be customized, but without annotations it lacks details on side effects, prerequisites, or failure modes. It provides basic transparency but is not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four short sentences with no redundancy. Every sentence adds value: purpose, default behavior, customization options, and usage context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains the purpose and when to use, but does not specify the return value (e.g., widget ID) or error conditions. Given the absence of an output schema, this is a notable gap for a creation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All 5 parameters are fully described in the input schema (100% coverage). The description adds only a vague reference to customization (theme, auto-reply mode), which does not significantly augment the schema's semantic information.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it creates a new livechat widget, mentions default settings and customization, and distinguishes from sibling tools like widgets_update and widgets_delete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this when user wants to add a chat widget to their site,' providing clear context. However, it does not mention when not to use it or alternative tools for updating or deleting widgets.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
widgets_deleteAInspect
Delete a livechat widget permanently.
This will remove the widget and its embed code will stop working. Existing chat history will be preserved.
Use this when user wants to remove a chat widget.
| Name | Required | Description | Default |
|---|---|---|---|
| widget_id | Yes | ID of the widget to delete |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses the permanent nature, effect on embed code, and preservation of chat history. It appropriately informs about the destructive behavior beyond just saying 'delete'.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences: purpose, consequence, preserved data, usage guidance. Every sentence earns its place, no fluff, front-loaded with the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete tool with one parameter and no output schema, the description covers essential behavioral and usage context. It could mention return values or error conditions, but it is largely complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a single parameter 'widget_id' described as 'ID of the widget to delete'. The description adds no additional semantic meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and the resource ('livechat widget'), with the word 'permanently' adding specificity. It is distinct from sibling tools like widgets_get or widgets_update.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this when user wants to remove a chat widget' and details consequences (embed code stops working, chat history preserved). It does not mention alternatives or when not to use, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
widgets_getAInspect
Get full configuration of a single livechat widget.
Returns all settings including theme, identification, actions, and more.
Use this when user wants to see or verify a specific widget's settings.
| Name | Required | Description | Default |
|---|---|---|---|
| widget_id | Yes | ID of the widget to retrieve |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the burden. It implies a read-only operation by saying 'Get full configuration', but does not explicitly state that no data is modified, nor does it mention any authorization or rate limits. Adequate but not thorough.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each adding distinct value: action and resource, return content, and usage guidance. No redundant or irrelevant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple get tool with one parameter and no output schema, the description adequately explains what it does, what it returns, and when to use it. No gaps are apparent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema provides a description for the single parameter 'widget_id', achieving 100% coverage. The description adds no further meaning beyond the schema, so it meets the baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it retrieves full configuration of a single widget, with specifics on returned settings (theme, identification, actions). The phrase 'Use this when user wants to see or verify a specific widget's settings' implicitly distinguishes it from sibling tools like widgets_list (list all) and widgets_update (modify).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description concludes with explicit guidance: 'Use this when user wants to see or verify a specific widget's settings.' This provides clear context for when to use the tool, though it does not mention when not to use it or point to specific alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
widgets_get_embed_codeAInspect
Get the embed code snippet for a livechat widget.
Returns HTML/JavaScript code to add to your website. The code should be placed before the closing </body> tag.
Use this when user wants to install the chat widget on their site.
| Name | Required | Description | Default |
|---|---|---|---|
| widget_id | Yes | ID of the widget to get embed code for |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the return type (HTML/JavaScript) and placement instruction, which is adequate for a read-only tool. However, it does not explicitly state that the operation is read-only or discuss any prerequisites or error conditions, leaving some behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise with four sentences: purpose, return type, placement instruction, and usage guidance. No extraneous information, and the key points are front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
There is no output schema, so the description must clarify the return value. It does so by stating 'Returns HTML/JavaScript code' and provides a placement instruction. It could mention more about prerequisites like widget existence, but overall it is sufficient for the tool's simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter 'widget_id', and the schema already describes it as 'ID of the widget to get embed code for'. The description adds no additional semantic meaning beyond the schema, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'get' and the resource 'embed code snippet for a livechat widget', and distinguishes this tool from sibling tools like 'widgets_get' or 'widgets_list' by specifying that it returns embed code for installation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this when user wants to install the chat widget on their site.', providing clear usage context. It does not list when not to use or alternatives, but the sibling tools include other widget operations, making the purpose distinct enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
widgets_listAInspect
List all livechat widgets.
Returns widgets with their configuration, embed code, and status.
Use this when user wants to see their widgets or chat widgets.
| Name | Required | Description | Default |
|---|---|---|---|
| active_only | No | Only return active widgets |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description states it returns widgets with configuration, embed code, and status, which indicates the output fields. However, it does not mention read-only nature, auth requirements, or any side effects. The behavioral coverage is adequate but minimal.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences, each serving a distinct purpose: stating action, describing output, and offering usage guidance. No redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with one optional parameter, the description adequately covers the purpose, output, and usage context. It does not mention pagination or sorting, but these are reasonable omissions for such a tool. The presence of sibling widget tools is not addressed, but that is not critical.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear description for the only parameter (active_only). The tool description does not add extra meaning beyond the schema, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'List all livechat widgets' using a specific verb and resource. It distinguishes this tool from siblings like widgets_get, which retrieves a single widget.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description advises 'Use this when user wants to see their widgets or chat widgets', which is helpful but does not explicitly exclude cases where alternatives (e.g., widgets_get) are better suited. No when-not-to-use or alternative naming is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
widgets_updateAInspect
Update an existing livechat widget configuration.
You can change name, theme, auto-reply mode, and other settings. Only provided fields will be updated.
Use this when user wants to modify their chat widget settings.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | New name for the widget | |
| position | No | Widget position on screen | |
| is_active | No | Enable or disable the widget | |
| widget_id | Yes | ID of the widget to update | |
| website_url | No | Website URL for product/site search integration | |
| calendly_url | No | Booking URL for calendar action (e.g., 'https://calendly.com/yourname') | |
| color_scheme | No | Widget color scheme. 'auto' follows the visitor's OS dark/light mode preference. | |
| header_title | No | Title shown in chat header | |
| greeting_text | No | Custom greeting message shown when visitor opens the chat (e.g., 'Hello! How can I help you today?') | |
| primary_color | No | Primary color for widget theme (hex, e.g., '#2563eb') | |
| voice_greeting | No | Spoken opening line when a visitor starts a voice call through this widget. Played via TTS before the AI model runs. Empty string disables the greeting. | |
| allowed_domains | No | List of allowed domains for the widget | |
| auto_reply_mode | No | Auto-reply mode: 'draft' or 'auto' | |
| header_subtitle | No | Subtitle shown in chat header | |
| greeting_enabled | No | Enable or disable the proactive greeting | |
| greeting_behavior | No | notification = show badge after delay; auto_open = open widget automatically after delay; on_open = greet only when visitor manually opens. | |
| enable_form_action | No | Enable or disable the contact form action button | |
| contact_form_fields | No | Fields to collect in contact form (e.g., ['name', 'email', 'phone']) | |
| enable_search_action | No | Enable or disable the search action button | |
| show_visitor_history | No | Show full chat history to returning visitors (true = show all past messages, false = current session only) | |
| identification_fields | No | Fields to require for visitor identification (e.g., ['name', 'email']) | |
| enable_calendar_action | No | Enable or disable the calendar booking action button | |
| greeting_delay_seconds | No | Delay in seconds before the proactive greeting appears (0–300). 0 = send immediately on page load. Default: 30. | |
| require_identification | No | Require visitor to identify before chatting | |
| returning_greeting_text | No | Greeting for returning visitors who already have chat history (e.g., 'Welcome back! How can I help you today?'). Falls back to greeting_text if not set. |
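To make the partial-update behavior concrete, here is a minimal sketch of a widgets_update call, assuming the official `mcp` Python SDK and its Streamable HTTP client (the transport this server advertises). The URL, widget ID, and field values are placeholders, not values from this server.

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

MCP_URL = "https://example.com/mcp"  # placeholder; use the server URL from this listing

async def main() -> None:
    # Open a Streamable HTTP transport, then an MCP session on top of it.
    async with streamablehttp_client(MCP_URL) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Partial update: only the fields passed here change; every
            # omitted field keeps its current value.
            result = await session.call_tool(
                "widgets_update",
                arguments={
                    "widget_id": "wgt_123",        # hypothetical widget ID
                    "greeting_enabled": True,
                    "greeting_behavior": "notification",
                    "greeting_delay_seconds": 10,  # 0-300; schema default is 30
                    "primary_color": "#2563eb",
                },
            )
            print(result.content)

asyncio.run(main())
```

Note that header_title, allowed_domains, and every other omitted parameter are left untouched, which is exactly the 'Only provided fields will be updated' behavior the description discloses.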
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that 'Only provided fields will be updated', indicating a partial update. However, it does not mention other behavioral aspects such as the return value, prerequisites, or failure modes, making it adequate but limited.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise with four sentences, front-loaded with the core purpose and scope. No redundant or irrelevant information is present.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the high parameter count (25), the description covers the essential aspects: partial update, usage context, and a brief list of changeable categories. It does not explicitly state the return value, but that can be inferred. Overall, it is fairly complete for a standard update tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds a high-level overview ('name, theme, auto-reply mode, and other settings') but does not provide additional meaning beyond the schema's detailed parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Update' and the resource 'existing livechat widget configuration', and provides examples of settable fields. It distinguishes from siblings like widgets_create, widgets_delete, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this when user wants to modify their chat widget settings', providing clear context for when to use it. It does not mention when not to use the tool or name alternatives, but that is acceptable given how distinct the sibling tools are.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
workspace_current
Return the workspace this MCP API key is currently routed to, with the caller's role inside it. Use this to confirm context before/after workspace.switch.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Openly describes return values (workspace and role) and implies read-only operation; sufficient given no annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two succinct sentences, front-loaded, no extraneous words—maximizes clarity in minimal space.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Fully describes the purpose and return content; although there is no output schema, the text conveys the needed information for this simple tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters, so schema coverage is trivially 100%; the baseline of 4 applies because the description adds no parameter information, and none is needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it returns the current workspace and the caller's role, distinct from sibling tools like workspace_list or workspace_switch.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly guides to use for confirming context before/after workspace.switch, providing clear when-to-use context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
workspace_list
List every workspace the caller is a member of, with is_current marking the workspace this MCP key is currently routed to. Pair with workspace.switch to change the active workspace without reconnecting.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
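As a sketch of the pairing the description suggests, the helper below assumes an already-initialized `ClientSession` like the one in the widgets_update example above. Since no output schema is published, the shape of the result is an assumption; the helper simply prints whatever text content comes back, which should include the is_current marker per workspace.

```python
from mcp import ClientSession

async def show_workspaces(session: ClientSession) -> None:
    # workspace_list takes no arguments; each returned workspace is expected
    # to carry an is_current flag marking where this MCP key is routed.
    result = await session.call_tool("workspace_list")
    for block in result.content:
        if block.type == "text":
            print(block.text)
```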
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description is straightforward: it lists workspaces with an `is_current` marker. It does not disclose permission requirements, rate limits, or the output structure beyond that marker, which is adequate for a simple list tool with no parameters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short, front-loaded sentences with no unnecessary text. Every word contributes: purpose, unique feature, and pairing guidance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero parameters and no output schema, the description fully covers what the tool does, the key output field, and how to use it in workflow. No gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has no parameters, so schema coverage is 100%. The description adds value by explaining the `is_current` field and suggesting usage with `workspace.switch`, which goes beyond the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'list' and resource 'workspaces', specifies scope 'the caller is a member of', and highlights the unique `is_current` field. It distinguishes itself from siblings like `workspace_search` and `workspace_current`.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly suggests pairing with `workspace.switch` to change active workspace without reconnecting, providing clear guidance on when to use this tool and how it relates to alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
workspace_search
Universal workspace search: files, links, threads, messages. Runs scopes in parallel and returns sectioned results. Default scope='auto' detects target from query. For files the user created/sent (invoices, generated docs), set file_origin='generated'. Use this for all workspace search; lower-level tools are internal.
| Name | Required | Description | Default |
|---|---|---|---|
| kind | No | Thread type filter. | |
| limit | No | Maximum items to return. | |
| owner | No | Link owner: 'self' (user's own) or 'contact' (from others). | |
| query | No | Freeform search query for content/semantic search. | |
| scope | No | Where to search. 'auto' (default) detects from query, or a list: ['files'], ['links'], ['threads'], ['messages'], or any combination. | auto |
| channel | No | Filter by channel (telegram, email, ...). | |
| date_to | No | ISO8601 date (YYYY-MM-DD), end bound. | |
| file_id | No | Find file by exact ID. | |
| only_dms | No | Restrict to direct messages. | |
| date_from | No | ISO8601 date (YYYY-MM-DD). Supports {{context.yesterday|today|week_start|month_ago}}. | |
| days_back | No | For unanswered threads: only include activity in last N days. | |
| link_kind | No | Filter links by type. | |
| min_score | No | Minimum semantic (cosine) relevance score, 0.0-1.0, default 0.25. Applied to both text and image RAG matches. Lower = more recall (include marginal matches); higher = stricter. Typical range 0.15-0.45. Image searches score systematically lower than text; do not raise above 0.40 for image queries unless the user explicitly asks for near-exact matches. Ignored for exact filename queries (e.g. 'report.pdf'). | |
| summarize | No | With scope=messages: also return an AI-synthesized summary of matched messages. | |
| thread_id | No | Scope message search to a specific thread ID. | |
| file_origin | No | File origin: 'generated' (created by tools), 'received' (from messages), 'uploaded' (manual). Use 'generated' for 'invoices I sent/created'. | |
| only_unread | No | Limit to threads with unread messages. | |
| contact_hint | No | Name hint to filter links associated with a specific contact. | |
| document_type | No | Filter files by document type. | |
| only_incoming | No | Only messages from others. | |
| only_outgoing | No | Only messages the user sent. | |
| attachment_ext | No | Filter by extension (pdf, md, ...). | |
| attachment_name | No | Filter by attachment/file name. | |
| only_unanswered | No | Limit to unanswered threads (last message incoming). | |
| participant_name | No | Filter by participant/contact name. With scope=threads: chats with this participant. With scope=messages: messages from this participant. | |
| min_inactive_days | No | Only threads where the user hasn't replied in N days. | |
| require_attachments | No | Only hits that include attachments. | |
| min_last_message_days | No | Only threads where the last message (from anyone) is older than N days. |
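The file_origin guidance in the description is easiest to see in an example. The hypothetical call below (same `mcp` Python SDK session assumption as above) implements 'invoices I sent': scope is pinned to files rather than left on 'auto', file_origin='generated' selects tool-created files instead of received attachments, and min_score stays at the documented 0.25 default. The query, extension filter, and limit are illustrative.

```python
from mcp import ClientSession

async def find_sent_invoices(session: ClientSession):
    # 'Invoices I sent/created' per the description: generated files,
    # not attachments received from others.
    return await session.call_tool(
        "workspace_search",
        arguments={
            "query": "invoice",        # hypothetical freeform query
            "scope": ["files"],        # pin the scope; skip 'auto' detection
            "file_origin": "generated",
            "attachment_ext": "pdf",
            "min_score": 0.25,         # documented default; lower = more recall
            "limit": 10,
        },
    )
```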
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses parallel execution, sectioned results, and the default scope='auto' detection. It does not mention potential rate limits or the output structure, but for a search tool the transparency is good.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: a handful of short sentences, each adding value, front-loaded with the main purpose. No redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 28 parameters and no output schema, the description provides a sufficient high-level overview. It explains key behaviors (parallel execution, auto scope) but could elaborate on result structure. However, the schema fills many gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% so baseline is 3. The description adds practical guidance beyond schema, e.g., explaining file_origin='generated' with a concrete example. It also explains the scope parameter's default behavior.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Universal workspace search: files, links, threads, messages' with a specific verb and resource. It differentiates from siblings by saying 'lower-level tools are internal' and 'Use this for all workspace search'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly directs to use this tool for all workspace search and warns that lower-level tools are internal. It also gives specific guidance like 'For files the user created/sent... set file_origin='generated''. While it does not enumerate alternatives, the context implies when to switch.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
workspace_switch
Re-point the active MCP API key to a different workspace. Pass exactly one of workspace_id or slug (find them via workspace.list). Takes effect on the very next tool call — no MCP reconnect, no new API key. Sequential checkpoint: do not parallelize tool calls across a switch — calls already in flight when the switch commits will run against the previous workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| slug | No | Workspace slug to switch to. Resolved within the caller's memberships, so cross-tenant slug collisions are not possible. Mutually exclusive with `workspace_id`. | |
| workspace_id | No | Numeric workspace id to switch to. Mutually exclusive with `slug`. |
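The sequential-checkpoint constraint is the part agents most easily get wrong, so here is a hedged sketch (same `mcp` Python SDK assumption as the earlier examples) that switches and then confirms strictly in sequence, following the description's own advice to verify with workspace_current.

```python
from mcp import ClientSession

async def switch_and_confirm(session: ClientSession, slug: str) -> None:
    # Pass exactly one of workspace_id or slug; slugs come from workspace_list.
    await session.call_tool("workspace_switch", arguments={"slug": slug})
    # The switch takes effect on the very next call. Do NOT run other tool
    # calls concurrently here: anything already in flight when the switch
    # commits still runs against the previous workspace.
    confirmation = await session.call_tool("workspace_current")
    print(confirmation.content)
```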
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full responsibility. It discloses that the switch takes effect on the next tool call, that no reconnect or new API key is needed, and highlights a sequential-checkpoint constraint. It lacks an explicit statement about error behavior when both parameters are passed, but is otherwise very transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences with no extraneous content. The main purpose is stated first, followed by parameter usage, behavioral effects, and a specific constraint. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, but the description sufficiently explains the effect and a key constraint (sequential checkpoint). Could be improved by noting potential error scenarios (e.g., invalid workspace), but for a simple switch tool it covers the necessary context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema covers both parameters with descriptions of mutual exclusivity. The description adds value by specifying where to find the parameters (via workspace.list) and reinforcing the mutual exclusivity with 'pass exactly one'. Schema coverage is 100%, so baseline is 3; the added context raises it to 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb and resource: 'Re-point the active MCP API key to a different workspace'. It clearly distinguishes the tool from siblings like workspace_list (list workspaces) and workspace_current (get current workspace).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit usage instructions: pass exactly one of workspace_id or slug, and references workspace.list to find them. It also includes explicit when-not-to-use guidance: do not parallelize tool calls across a switch, since calls already in flight will run against the previous workspace.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.