local-mcp
Server Details
Let ChatGPT, Claude & Cursor use your Mac: email, calendar, iMessage, Teams, files. Local, free.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: lanchuske/local-mcp-releases
- GitHub Stars: 21
- Server Listing: Local MCP
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4/5 across 181 of 181 tools scored. Lowest: 2.4/5.
Each tool is prefixed with a clear domain (chrome_, safari_, m365_, notion_, etc.), which makes its purpose immediately apparent. Even where two tools perform similar actions (e.g., list_emails vs m365_list_emails), the descriptions explicitly distinguish which app they target. No two tools appear to do the same thing.
The vast majority of tools follow a consistent verb_noun pattern with domain prefixes. Verbs like create_, list_, get_, read_, search_, delete_ are used predictably. Minor exceptions like 'daily_brief' are rare and still descriptive. The overall system is highly uniform.
With 181 tools, the server is far beyond the typical well-scoped range of 3-15. While each tool may have a specific purpose, the extreme size makes it unwieldy and suggests a lack of focus. This qualifies as an extreme mismatch given the server's apparent purpose as a local assistant.
The tool surface covers an impressively wide range of apps and operations, but notable gaps exist: Notion and Signal are read-only (no create/update), Slack lacks send capabilities, and there is no generic local file delete or rename. These missing operations mean agents will encounter dead ends for common workflows.
Available Tools
181 tools
chrome_click (Chrome Click), grade B
Clicks the first element matching a CSS selector in the current Google Chrome tab. Returns the tag name and visible text of the clicked element so you can confirm the right thing was hit. Pass wait_for_navigation: true to wait up to 3 seconds for the page to load after the click.
| Name | Required | Description | Default |
|---|---|---|---|
| nth | No | Which match to click if there are several (0-based, default 0) | |
| selector | Yes | CSS selector (e.g. 'button.primary', '#save', '[data-testid=login]') | |
| wait_for_navigation | No | Wait up to 3s for page load after click (default false) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
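As a rough illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` JSON-RPC request for `chrome_click`. The helper name `build_chrome_click_request` and the `request_id` value are assumptions made for illustration; only the tool name and the three parameter names come from the listing above.

```python
from typing import Any


def build_chrome_click_request(
    selector: str,
    nth: int = 0,
    wait_for_navigation: bool = False,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` payload for chrome_click.

    Hypothetical helper: only the tool name and parameter names are taken
    from the listing; the validation rules here are illustrative.
    """
    if not selector:
        raise ValueError("selector is required")
    if nth < 0:
        raise ValueError("nth is 0-based and must not be negative")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "chrome_click",
            "arguments": {
                "selector": selector,
                "nth": nth,
                "wait_for_navigation": wait_for_navigation,
            },
        },
    }
```

A caller would typically serialize the returned dictionary with `json.dumps` and send it over the server's Streamable HTTP transport; the transport details are left out of this sketch.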
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations providing safety context, the description carries the burden. It mentions the return value and the wait_for_navigation feature but omits important behaviors such as handling of multiple matches (despite the nth parameter), error conditions, or what happens if the click triggers navigation beyond 3 seconds. The description does not contradict annotations, but misses key behavioral details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at three sentences. The first defines the action, the second states the return value, and the third adds a key usage note. No unnecessary words or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and the presence of schema coverage and an implied output schema, the description covers the core functionality and the important navigation option. It lacks some edge-case details (e.g., element visibility, error handling) but is largely complete for a straightforward click action. The sibling tools do not require deeper elaboration.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (all parameters have descriptions), so baseline is 3. The description adds no significant meaning beyond the schema: it repeats the wait_for_navigation behavior already described in the schema, and does not elaborate on selector or nth usage. Minimal added value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool clicks the first element matching a CSS selector in the current Chrome tab and returns the tag name and visible text. This verb+resource combination is specific, but it does not explicitly differentiate from sibling tools like chrome_type or chrome_fill_form, though the purpose is still clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a useful guideline for using the wait_for_navigation option, but it does not specify when to use this tool versus alternatives like chrome_query_selector_all or chrome_evaluate_js. No when-not-to-use or explicit alternative comparisons are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
chrome_evaluate_js (Chrome Evaluate JS), grade A, read-only
Runs arbitrary JavaScript in the current Google Chrome tab and returns its result. Requires 'Allow JavaScript from Apple Events' in Chrome's View → Developer menu.
| Name | Required | Description | Default |
|---|---|---|---|
| script | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds the prerequisite setup requirement, which is useful context. It does not elaborate on potential side effects of running arbitrary JS, but the annotations cover the destructive aspect adequately.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is only two sentences: the first states the core function, the second provides a prerequisite. It is front-loaded and contains no superfluous information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple single-parameter tool with an output schema, the description covers the essential behavior and setup. It lacks mention of execution context or limitations, but the presence of an output schema reduces the need for return value details.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It mentions 'arbitrary JavaScript', which explains the purpose of the 'script' parameter, adding meaning beyond the schema's bare string type.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool runs arbitrary JavaScript in the current Chrome tab and returns the result. This specifies the verb, resource, and output, distinguishing it from sibling tools like safari_evaluate_js or chrome_click.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a prerequisite condition ('Allow JavaScript from Apple Events' enabled), which guides the user on necessary setup. However, it does not explicitly differentiate when to use this tool versus alternative Chrome actions like chrome_click or chrome_fill_form.
chrome_fill_form (Chrome Fill Form), grade A, read-only
Fills multiple form fields in one shot in the current Google Chrome tab. Pass fields as a JSON object mapping CSS selector to value.
| Name | Required | Description | Default |
|---|---|---|---|
| fields | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| result | No | |
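The assessment of this tool notes that `fields` is typed as a string holding a JSON object that maps CSS selectors to values. A minimal sketch of preparing that argument follows; the helper name `build_fill_form_arguments` and the sample selectors and values are hypothetical.

```python
import json


def build_fill_form_arguments(values_by_selector: dict[str, str]) -> dict[str, str]:
    """Serialize a selector -> value mapping into the `fields` argument.

    Hypothetical helper; assumes `fields` must be a JSON-encoded string,
    as the quality notes for this tool describe.
    """
    if not values_by_selector:
        raise ValueError("at least one field is required")
    # Serialize the mapping instead of passing a nested object, because the
    # schema reportedly declares `fields` as a plain string.
    return {"fields": json.dumps(values_by_selector)}


arguments = build_fill_form_arguments(
    {"#email": "ada@example.com", "input[name=city]": "London"}
)
```

Building the string with `json.dumps` rather than by hand keeps quoting correct for selectors that contain brackets or quotes.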
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description claims the tool fills form fields (a write operation), but the annotations include readOnlyHint: true, which strongly suggests the tool does not modify state. This is a direct contradiction. No additional behavioral context is provided.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, each necessary: first states purpose, second specifies parameter format. No redundancy or wasted words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is concise but lacks details about behavior on errors (e.g., if selectors not found) and does not mention that it contradicts the annotation. Given the presence of an output schema, some behavior might be inferred, but the annotation contradiction leaves a significant gap.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has one parameter 'fields' of type string with 0% coverage, but the description adds critical meaning: it specifies that the string should be a JSON object mapping CSS selectors to values. Without this, the agent would not know the expected format.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fills multiple form fields in one shot in the current Chrome tab, using a JSON object mapping CSS selectors to values. This distinguishes it from siblings like chrome_type (single field) and chrome_click.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies it is for filling multiple fields at once, but does not explicitly state when to use this tool vs alternatives (e.g., chrome_type for single fields) or provide any exclusions or prerequisites.
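The contradiction flagged above (a description that opens with a write verb on a tool annotated `readOnlyHint: true`) is the kind of mismatch a simple lint could catch. The sketch below is a rough heuristic under stated assumptions, not the scoring method used for this listing; the function name and the verb list are hypothetical.

```python
# Assumed list of verbs that suggest a state-changing tool; illustrative only.
WRITE_VERBS = ("fills", "sets", "clicks", "creates", "deletes", "sends", "updates", "writes")


def flags_read_only_contradiction(description: str, read_only_hint: bool) -> bool:
    """Return True when a read-only tool's description opens with a write verb."""
    words = description.strip().split(maxsplit=1)
    first_word = words[0].lower() if words else ""
    return read_only_hint and first_word in WRITE_VERBS


print(flags_read_only_contradiction(
    "Fills multiple form fields in one shot in the current Google Chrome tab.", True
))
```

A first-word check is deliberately crude: it misses descriptions that bury the write action later in the sentence, so it would only be a first pass before manual review.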
chrome_go_back (Chrome Go Back), grade B, read-only
Navigates the current Google Chrome tab back to the previous page.
| Name | Required | Description | Default |
|---|---|---|---|
| window_index | No | | 0 |
Output Schema
| Name | Required | Description |
|---|---|---|
| to | No | |
| from | No | |
| went_back | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, making the tool's safety profile clear. The description adds little behavioral detail beyond the default interpretation of 'go back': it does not say whether the tool waits for page load or how errors are handled.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (11 words, one sentence). It is front-loaded with the key action. However, it could be improved by including a brief note about the parameter.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple navigation tool, the description captures the essential purpose. However, given the presence of a parameter and an output schema (not shown), the description should explain the output behavior and the parameter's role for full completeness.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage for the single parameter `window_index`, and the description does not mention it or clarify its meaning. The description adds no value beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's action ('navigates back') and resource ('current Google Chrome tab'). It is specific and distinguishes itself from siblings like 'safari_go_back' and 'chrome_navigate'.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to use this tool versus alternatives (e.g., chrome_navigate with a back action). It does not mention prerequisites, context, or exclusions.
chrome_list_tabs (Chrome List Tabs), grade A, read-only
Lists every open tab across all Google Chrome windows with title, URL, and whether it is active.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| tabs | No | |
| count | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, indicating safe read-only behavior. The description adds value by stating the scope (all windows) and the returned fields (title, URL, active status), which goes beyond what annotations convey.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that concisely states what the tool does. Every word adds value; no extraneous information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero parameters and an existing output schema, the description fully covers the tool's behavior. It clearly states what information each tab entry includes, making it complete for an AI agent to understand and invoke.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters, and schema coverage is 100% (no params). Following guidelines, a baseline of 4 is appropriate since the description does not need to add parameter meaning.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'lists' and resource 'every open tab across all Google Chrome windows', and states the returned fields (title, URL, active status). This clearly distinguishes from sibling tools like chrome_search_tabs (which implies filtering) and safari_list_tabs (different browser).
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description clearly implies the tool is used to get a snapshot of all open Chrome tabs. While it does not explicitly state when to use versus filtering alternatives like chrome_search_tabs, the purpose is obvious given the tool's name and zero parameters.
chrome_query_selector_all (Chrome Query Selector All), grade A, read-only
Runs document.querySelectorAll in the current Google Chrome tab and returns a compact summary of each match.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 50 |
| selector | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| result | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, so the safety profile is covered. The description adds 'returns a compact summary' which hints at the output but doesn't detail limitations like the limit parameter's effect or behavior on no matches.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence that front-loads the core action and resource, with no wasted words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having an output schema, the description fails to explain the parameters and does not clarify what a 'compact summary' includes. The agent needs more context to use the tool correctly, especially regarding the limit parameter and return format.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, meaning no parameter descriptions. The description does not explain what 'selector' or 'limit' mean, forcing the agent to rely on names and defaults. This is a significant gap.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool runs document.querySelectorAll in the current Chrome tab and returns a compact summary, distinguishing it from sibling tools like chrome_click or chrome_evaluate_js.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies the context (current Google Chrome tab) but does not explicitly mention when to use this tool over alternatives like safari_query_selector_all or chrome_evaluate_js. However, the purpose is clear enough for an agent to infer.
chrome_read_tab (Chrome Read Tab), grade A, read-only
Reads the rendered text content of a Google Chrome tab. Identify the tab either by url_match (substring match against URL; first hit wins) or by window_index + tab_index (from chrome_list_tabs). Text is capped at max_bytes (default 100 KB). Pass include_html: true to also get the raw HTML source. Pass include_links: true to extract all links with their href and text. Requires 'Allow JavaScript from Apple Events' (Chrome → View → Developer); run chrome_setup_check if reads come back empty.
| Name | Required | Description | Default |
|---|---|---|---|
| max_bytes | No | Max bytes of text (and html) to return (default 102400) | |
| tab_index | No | Tab index from chrome_list_tabs (default active tab of that window) | |
| url_match | No | Substring to match against the tab URL. Takes precedence over indices. | |
| include_html | No | Also return the HTML source (default false) | |
| window_index | No | Window index from chrome_list_tabs (default 0) | |
| include_links | No | Extract all links with href + visible text (default false). Great for navigating SPAs. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| url | No | |
| html | No | |
| text | No | |
| links | No | |
| title | No | |
| truncated | No | |
| html_bytes | No | |
| link_count | No | |
| text_bytes | No | |
| links_error | No | |
| html_truncated | No | |
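The parameter table says `url_match` takes precedence over the window and tab indices. One way a client could reflect that when assembling arguments is sketched below; the helper name `build_read_tab_arguments` and its validation are assumptions, while the parameter names and defaults are taken from the table.

```python
from typing import Any, Optional


def build_read_tab_arguments(
    url_match: Optional[str] = None,
    window_index: Optional[int] = None,
    tab_index: Optional[int] = None,
    max_bytes: int = 102400,
    include_html: bool = False,
    include_links: bool = False,
) -> dict[str, Any]:
    """Assemble arguments for chrome_read_tab (hypothetical helper)."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    arguments: dict[str, Any] = {
        "max_bytes": max_bytes,
        "include_html": include_html,
        "include_links": include_links,
    }
    if url_match is not None:
        # url_match takes precedence over indices, so the indices are left
        # out rather than sent alongside a selector that would override them.
        arguments["url_match"] = url_match
        return arguments
    if window_index is not None:
        arguments["window_index"] = window_index
    if tab_index is not None:
        arguments["tab_index"] = tab_index
    return arguments
```

Dropping the indices when `url_match` is set is a design choice of this sketch: it keeps the request unambiguous for anyone reading call logs later.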
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations confirm read-only and non-destructive behavior. The description adds behavioral details: text capped at max_bytes, ability to include HTML and links, and the prerequisite. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (6 sentences) with front-loaded purpose. Each sentence adds distinct value: capability, parameter usage, tips, prerequisite. No redundant or vague statements.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 6 parameters, presence of output schema (even if not shown), and background requirement, the description covers all essential aspects: identification methods, output controls, limits, and setup prerequisites. It is fully informative for agent invocation.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions. The description adds some context: url_match is a substring match where the first hit wins, and the indices come from chrome_list_tabs. The max_bytes default (100 KB in the description, 102400 in the schema) and the 'Great for navigating SPAs' note on include_links are already in the schema. This modestly enriches understanding beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's function: 'Reads the rendered text content of a Google Chrome tab.' It specifies identification methods (url_match or window_index+tab_index) and distinguishes from sibling tools like chrome_list_tabs and chrome_navigate.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use url_match vs indices, mentions default behavior, and provides a prerequisite ('Requires 'Allow JavaScript from Apple Events'') plus troubleshooting advice ('run chrome_setup_check if reads come back empty'). However, it does not explicitly list alternatives or when not to use this tool.
chrome_search_tabs (Chrome Search Tabs), grade A, read-only
Searches the rendered text of every open Google Chrome tab for a substring. Returns each matching tab with the surrounding snippet. Useful for 'do I have a tab open with X?' across many tabs. Requires 'Allow JavaScript from Apple Events' (Chrome → View → Developer).
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Substring to search for (case-insensitive) | |
| context | No | Characters of context around each match (default 120) | |
| max_tabs | No | Max tabs to scan (default 30). Higher = slower. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| hits | No | |
| query | No | |
| scanned | No | |
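To make the `query` and `context` parameters concrete, the sketch below reproduces the described behavior locally: a case-insensitive substring search that returns each match with surrounding characters. It is an illustration only, not the server's implementation, and it assumes `context` is applied on each side of the match; the function name `find_snippets` is hypothetical.

```python
import re


def find_snippets(text: str, query: str, context: int = 120) -> list[str]:
    """Return each case-insensitive match of `query` with surrounding context."""
    if not query:
        return []
    # re.escape treats the query as a literal substring, not a pattern.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    return [
        text[max(0, match.start() - context): match.end() + context]
        for match in pattern.finditer(text)
    ]


page_text = "Invoice due Friday. Please pay the INVOICE."
print(find_snippets(page_text, "invoice", context=4))
# → ['Invoice due', 'the INVOICE.']
```

Matching on the original text with `re.IGNORECASE`, rather than lowercasing both strings first, keeps the slice offsets aligned with the source text.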
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds that it searches 'rendered text' and returns snippets, providing behavioral context beyond the annotations. It could mention performance more explicitly.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with three informative sentences and a setup requirement, each sentence earning its place. Front-loaded with the action and return value.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema for return values, the description covers purpose, usage, setup, and parameters adequately. It lacks error scenarios but is sufficient for an agent to decide when to use the tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers all parameters. Description adds default values (context=120, max_tabs=30) and a performance hint for max_tabs, enhancing understanding beyond the schema descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches rendered text of open Chrome tabs for a substring and returns matches with snippets, distinguishing it from sibling tools like chrome_list_tabs which only list tabs.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a concrete use case ('do I have a tab open with X?') and a required setup step, but does not explicitly exclude usage scenarios or compare with alternatives like chrome_list_tabs.
chrome_setup_check (Chrome Setup Check), grade A, read-only
Reports whether Google Chrome is ready for interactive tools (chrome_click, chrome_type, chrome_evaluate_js, chrome_read_tab text). Returns setup instructions if JavaScript from Apple Events is not enabled.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| tabs_open | No | |
| instructions | No | |
| ready_for_js_tools | No | |
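A client could use this tool as a gate before the JavaScript-dependent tools. The sketch below assumes a hypothetical `call_tool(name, arguments)` callable that returns the tool's structured output as a dictionary; how results are actually unwrapped depends on the MCP client in use. The field names `ready_for_js_tools` and `instructions` come from the output schema above.

```python
from typing import Any, Callable

# Hypothetical client interface: (tool name, arguments) -> structured output.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def read_tab_when_ready(call_tool: ToolCaller, url_match: str) -> dict[str, Any]:
    """Run chrome_setup_check first, then chrome_read_tab if Chrome is ready."""
    status = call_tool("chrome_setup_check", {})
    if not status.get("ready_for_js_tools"):
        # Surface the tool's own setup instructions instead of reading blindly.
        raise RuntimeError(
            status.get("instructions") or "Chrome is not ready for JS tools"
        )
    return call_tool("chrome_read_tab", {"url_match": url_match})
```

Raising with the returned instructions lets an agent relay the exact setup step to the user rather than retrying a read that would come back empty.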
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare read-only and non-destructive behavior; the description adds context about returning setup instructions if JavaScript from Apple Events is not enabled, which is useful. It doesn't mention any side effects or prerequisites like Chrome being open, but overall adds value.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences that front-load the core purpose and output. No wasted words; each sentence contributes essential information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple check tool with no parameters and an output schema, the description covers the key outputs and conditions. It could mention that Chrome must be installed, but is otherwise sufficient.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has zero parameters, so the description does not need to add parameter information. Baseline score of 4 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: checking if Google Chrome is ready for interactive tools like chrome_click and chrome_type. It specifies the exact tools it supports, making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use before interacting with Chrome via the listed tools, but does not provide explicit when-to-use or when-not-to-use guidance, nor does it differentiate from sibling tools like safari_setup_check.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
chrome_type: Chrome Type (Grade B, Read-only)
Sets the value of an input/textarea matching a CSS selector in the current Google Chrome tab and fires input/change events.
| Name | Required | Description | Default |
|---|---|---|---|
| clear | No | | true |
| value | Yes | | |
| selector | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| result | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds some context by mentioning that input/change events are fired, but it contradicts the readOnlyHint=true annotation since setting a value and firing events are clearly side effects. This contradiction reduces transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence that front-loads the action and key details. It could be slightly more structured but contains no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the main action and event firing but lacks details about the clear parameter, error handling (e.g., selector not found), and prerequisites such as an open tab. An output schema exists, so return values need not be described, but behavioral completeness is moderate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description should explain each parameter. It mentions 'CSS selector' and 'value' but does not explain the 'clear' parameter or its default behavior, leaving ambiguity.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it sets the value of an input/textarea matching a CSS selector and fires input/change events. It uses specific verb 'Sets' and resource 'input/textarea', distinguishing it from sibling tools like chrome_click or chrome_fill_form.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. Siblings include chrome_fill_form and chrome_click, but the description does not differentiate use cases or mention any prerequisites or when-not-to-use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
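As a rough illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` JSON-RPC request for chrome_type. The helper name `build_chrome_type_call` and the example selector are hypothetical. `clear` is passed explicitly because the listing documents only its default (true), not its meaning.

```python
import json


def build_chrome_type_call(selector, value, clear=True, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for chrome_type.

    Hypothetical helper: only the tool name and the three parameter
    names (selector, value, clear) come from the listing.
    """
    if not isinstance(selector, str) or not selector.strip():
        raise ValueError("selector must be a non-empty CSS selector string")
    if not isinstance(value, str):
        raise ValueError("value must be a string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "chrome_type",
            "arguments": {
                "selector": selector,
                "value": value,
                "clear": bool(clear),  # listed default is true
            },
        },
    }


if __name__ == "__main__":
    # Example selector is illustrative only.
    print(json.dumps(build_chrome_type_call("input[name='q']", "hello"), indent=2))
```

Validating the selector and value on the client side is a design choice of this sketch; the listing does not say how the server reports a selector that matches nothing.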
chrome_wait_for: Chrome Wait For (Grade A, Read-only)
Polls the current Google Chrome tab until a CSS selector appears (or its text matches, if text_match is provided). Useful after chrome_click to wait for the next page or a modal to render.
| Name | Required | Description | Default |
|---|---|---|---|
| selector | Yes | CSS selector to wait for | |
| text_match | No | Optional substring that must appear inside the matched element | |
| timeout_ms | No | Max time to wait (default 10000 = 10s, max 30000) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds polling behavior and context about use after clicks, which complements annotations without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: first defines core functionality, second provides usage context. No redundant or missing information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While annotations and the input schema are rich, the description does not mention return values or error behavior, and the output schema lists no output parameters, so the description could be more complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with descriptions. Description adds minimal extra value by reinforcing the optional nature of text_match and tying it to CSS selector logic.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it polls for a CSS selector appearance, with optional text matching. It distinguishes itself from sibling Chrome tools like chrome_click or chrome_navigate by being a wait/observer tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly suggests usage after chrome_click to wait for page or modal render. Does not provide when-not-to-use or alternatives, but the context is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
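The polling behavior described for chrome_wait_for can be pictured with a generic poll-until-timeout loop. This is a minimal sketch, not the server's implementation: `wait_for`, `clamp_timeout`, and the 100 ms polling interval are assumptions, and only the 10000 ms default and 30000 ms maximum come from the parameter table. Whether the server clamps or rejects larger timeouts is not stated; clamping is assumed here.

```python
import time

DEFAULT_TIMEOUT_MS = 10_000  # default from the parameter table
MAX_TIMEOUT_MS = 30_000      # maximum from the parameter table


def clamp_timeout(timeout_ms):
    """Keep the timeout within 0..MAX_TIMEOUT_MS (clamping is an assumption)."""
    return min(max(timeout_ms, 0), MAX_TIMEOUT_MS)


def wait_for(check, timeout_ms=DEFAULT_TIMEOUT_MS, interval_ms=100,
             sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or the timeout elapses.

    check stands in for "does the selector (and optional text) match yet?".
    Returns True on success and False on timeout.
    """
    deadline = clock() + clamp_timeout(timeout_ms) / 1000
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_ms / 1000)


if __name__ == "__main__":
    started = time.monotonic()
    # Pretend the element appears roughly 0.3 s after we start waiting.
    found = wait_for(lambda: time.monotonic() - started > 0.3, timeout_ms=2000)
    print("found" if found else "timed out")
```

The condition is always checked at least once, so a timeout of 0 still reports an element that is already present.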
complete_omnifocus_task: Complete OmniFocus Task (Grade A)
Marks an OmniFocus task as complete by task ID or name. Requires confirm=true.
| Name | Required | Description | Default |
|---|---|---|---|
| confirm | No | | |
| task_id | No | | |
| task_name | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| name | No | |
| completed | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate a write operation (readOnlyHint=false) and non-destructive (destructiveHint=false). The description adds the requirement that confirm=true must be passed, which is a behavioral constraint. However, it does not explain error handling or side effects, such as behavior when the task is already complete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short, front-loaded sentences with no unnecessary words. It efficiently conveys the action, resource, and a key requirement.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool, the description covers the core action and a critical parameter requirement. However, it lacks information about prerequisites (e.g., OmniFocus connection) or edge cases (e.g., duplicate task names). The presence of an output schema mitigates the need for return value details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description adds meaning by stating tasks can be identified 'by task ID or name', implying alternative identification. It also notes the confirm parameter must be true. However, it does not clarify that task_id and task_name are both optional in the schema, or whether they are mutually exclusive.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states 'Marks an OmniFocus task as complete by task ID or name.' Clearly identifies verb and resource, distinguishing it from sibling tools like complete_reminder. However, it does not explicitly differentiate from similar tools like todo_complete_task.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions 'Requires confirm=true' as a usage requirement but provides no explicit guidance on when to use this tool versus alternatives like search_omnifocus_tasks or complete_reminder. Usage context is implied but not fully specified.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
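Because the schema marks confirm, task_id, and task_name as optional while the description requires confirm=true and one identifier, a client may want to check arguments before calling. The sketch below is one cautious way to do that. The helper name is hypothetical, and requiring exactly one identifier is an assumption, since the listing does not say how the tool behaves when both are passed.

```python
def complete_omnifocus_task_args(task_id=None, task_name=None, confirm=False):
    """Return an arguments dict for complete_omnifocus_task.

    Assumption: exactly one of task_id / task_name should be sent.
    The listing only says the task is identified "by task ID or name".
    """
    if (task_id is None) == (task_name is None):
        raise ValueError("pass exactly one of task_id or task_name")
    if confirm is not True:
        # The description states the tool requires confirm=true.
        raise ValueError("confirm must be True to complete the task")
    args = {"confirm": True}
    if task_id is not None:
        args["task_id"] = task_id
    else:
        args["task_name"] = task_name
    return args


if __name__ == "__main__":
    # The task name is illustrative only.
    print(complete_omnifocus_task_args(task_name="Pay invoice", confirm=True))
```

Preferring task_id when it is known avoids the duplicate-name ambiguity noted in the review.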
complete_reminder: Complete Reminder (Grade A)
Marks a reminder complete in Apple Reminders (Reminders.app). Requires confirm=true. For Microsoft To Do use todo_complete_task instead.
| Name | Required | Description | Default |
|---|---|---|---|
| confirm | No | Must be true to complete | |
| reminder_id | Yes | Reminder ID from list_reminders | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations show readOnlyHint=false, so writing is expected. Description adds the requirement 'confirm=true', which is a behavioral safety measure beyond the schema. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, no wasted words, front-loaded with the core action. Every sentence serves a purpose: action, requirement, alternative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with full schema coverage and an output schema, the description covers all essential information: what it does, a key requirement, and differentiation from similar tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. The description reinforces the confirm parameter requirement but adds no additional semantic value beyond the schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb+resource: 'Marks a reminder complete in Apple Reminders (Reminders.app).' It clearly distinguishes from sibling tools like complete_omnifocus_task and todo_complete_task by specifying the app.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'Requires confirm=true' as a precondition and provides an alternative: 'For Microsoft To Do use todo_complete_task instead.' This gives clear when-to-use and when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
connect_m365_account: Connect Microsoft 365 Account (Grade A)
Connect your Microsoft 365 account. Call once to get a login code, then call again after you've authenticated at microsoft.com/devicelogin to confirm the connection.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| message | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are minimal; the description adds valuable behavior details about the two-step flow and external authentication, though no side effects are noted.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no wasted words, front-loaded with purpose and clear steps.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all needed context for a connection tool: what it does and how to use it. Output schema exists so return values need not be described.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so description adds no param info. Baseline 4 applies per guidelines for zero parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it connects a Microsoft 365 account and distinguishes it from sibling tools like disconnect_m365_account.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly describes the two-step process: call to get login code, then call again after authentication. Provides clear sequence.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
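The two-step flow for connect_m365_account can be sketched as a small driver around a tool-calling function. Both `call_tool` and `wait_for_user` are hypothetical callables supplied by the client, and the values `ok` takes on the first call are not documented in the listing, so the sketch relies only on the second call's result.

```python
def connect_m365(call_tool, wait_for_user):
    """Drive the two-call connection flow described for connect_m365_account.

    call_tool(name, arguments) -> dict with optional 'ok' and 'message' keys
    wait_for_user(message)     -> returns once the user has finished logging in
    Both callables are assumptions made for this sketch.
    """
    # First call: expected to return a login code in 'message'.
    first = call_tool("connect_m365_account", {})
    # The user authenticates at microsoft.com/devicelogin before we continue.
    wait_for_user(first.get("message", ""))
    # Second call: confirms the connection.
    second = call_tool("connect_m365_account", {})
    return bool(second.get("ok")), second.get("message", "")


if __name__ == "__main__":
    # Fake server used only to exercise the flow; its messages are invented.
    state = {"calls": 0}

    def fake_call_tool(name, arguments):
        state["calls"] += 1
        if state["calls"] == 1:
            return {"ok": False, "message": "Enter the login code shown here"}
        return {"ok": True, "message": "Connected"}

    print(connect_m365(fake_call_tool, print))
```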
connect_servicenow: Connect ServiceNow (Grade B)
Connect to your ServiceNow instance using your username and password. Credentials are stored locally and never sent to Claude's servers.
| Name | Required | Description | Default |
|---|---|---|---|
| instance | Yes | Your ServiceNow instance hostname, e.g. 'mycompany.service-now.com' or just 'mycompany' | |
| password | Yes | Your ServiceNow password | |
| username | Yes | Your ServiceNow username | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | Yes | |
| message | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds a transparency note about credentials being stored locally and not sent to servers, which is beyond the annotations. However, it does not disclose whether a session is created, if reconnection is needed, or other behavioral traits. Annotations already indicate non-read-only and non-destructive, so the description adds moderate value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: one for purpose, one for privacy. Front-loaded with the core action. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has an output schema (ok, message), but the description does not mention what the tool returns (e.g., connection status, session token). For a connection tool, more behavioral context would be helpful. Still, the output schema likely fills this gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers all 3 parameters with descriptions. The description adds no parameter-specific details beyond the schema, but includes a general privacy statement. With 100% schema coverage, the baseline is 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Connect to your ServiceNow instance using your username and password,' providing a specific verb (connect) and resource (ServiceNow instance). It distinguishes from sibling ServiceNow tools that focus on CRUD operations or search, though it does not explicitly contrast with them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is given on when to use this tool, its prerequisites (e.g., needing a valid instance), or that it should be called before other ServiceNow operations. The description lacks explicit usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_calendar_event: Create Calendar Event (Grade A)
Creates an event in the Mac's Calendar app (Calendar.app). Requires title, start_date, end_date. Optionally invite attendees by email (CalDAV/Exchange calendars only). For Microsoft 365 use m365_create_event instead.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | Event notes (optional) | |
| title | Yes | Event title | |
| confirm | No | Must be true to create the event | |
| calendar | No | Calendar name to match (optional, alternative to calendar_id) | |
| end_date | Yes | ISO 8601 datetime | |
| location | No | Location (optional) | |
| attendees | No | List of email addresses to invite (optional, CalDAV/Exchange only) | |
| start_date | Yes | ISO 8601 datetime (YYYY-MM-DDTHH:MM:SS) | |
| calendar_id | No | Calendar UUID from list_calendar_names (optional, defaults to default calendar) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| end | No | |
| start | No | |
| title | No | |
| created | No | |
| attendees_note | No | |
| attendees_requested | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds value beyond annotations: notes that attendee invitation works only on CalDAV/Exchange calendars. However, it omits the important confirm parameter requirement (must be true to create event), which is critical for behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences with a front-loaded purpose. Every part earns its place: app identification, required params, optional feature caveat, and sibling alternative. No fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (9 params, output schema), the description covers the essentials: purpose, requirements, and key limitation. Could briefly mention confirm requirement but not critical due to schema coverage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds context by reiterating required params and the attendee limitation. It enhances understanding of key parameters without overwhelming detail.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it creates events in Mac's Calendar.app, with specific verb and resource. It distinguishes from the sibling m365_create_event tool, making purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use (Mac Calendar) and directs to m365_create_event for Microsoft 365. Lacks mention of other constraints like the confirm parameter requirement, but the guidance is clear and helpful.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
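To make the date format and the confirm requirement concrete, the sketch below assembles arguments for create_calendar_event from Python datetime objects. The helper name and the example values are hypothetical. Naive datetimes are used so that the output matches the YYYY-MM-DDTHH:MM:SS shape in the parameter table; whether the tool accepts timezone offsets is not stated in the listing.

```python
from datetime import datetime


def create_calendar_event_args(title, start, end, attendees=None, calendar_id=None):
    """Return an arguments dict for create_calendar_event.

    start and end are naive datetime objects, formatted as
    YYYY-MM-DDTHH:MM:SS to match the parameter table.
    """
    if not title:
        raise ValueError("title is required")
    if end <= start:
        raise ValueError("end must be after start")
    args = {
        "title": title,
        "start_date": start.isoformat(timespec="seconds"),
        "end_date": end.isoformat(timespec="seconds"),
        "confirm": True,  # the schema says this must be true to create the event
    }
    if attendees:
        # Per the listing, invitations only work on CalDAV/Exchange calendars.
        args["attendees"] = list(attendees)
    if calendar_id:
        args["calendar_id"] = calendar_id
    return args


if __name__ == "__main__":
    # Title and times are illustrative only.
    print(create_calendar_event_args(
        "Design review",
        datetime(2025, 3, 1, 9, 0),
        datetime(2025, 3, 1, 10, 0),
    ))
```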
create_draft: Create Draft (Grade A)
Saves an email to the Mail.app Drafts folder for the user to review and send manually — never sends. Composes a new draft (pass to/subject/body), or a reply draft (pass reply_to_message_id plus body). On a multi-account Mac, pass account (an account name from list_email_accounts) or from (a sender address) to place the draft in that account's Drafts; otherwise it lands in the default account. Use this for the cautious user who wants AI-composed mail but insists on sending it themselves.
| Name | Required | Description | Default |
|---|---|---|---|
| cc | No | | |
| to | No | | |
| bcc | No | | |
| body | No | | |
| from | No | | |
| account | No | | |
| subject | No | | |
| html_body | No | | |
| reply_all | No | | false |
| reply_to_message_id | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| to | No | |
| from | No | |
| kind | No | |
| account | No | |
| subject | No | |
| saved_draft | No | |
| reply_to_message_id | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that drafts are never sent and explains account placement; annotations only provide readOnlyHint=false and destructiveHint=false, so description adds context about draft creation behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences that are front-loaded with key behavior and each adds value; could be slightly tighter but overall well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers two main use cases and account handling; output schema is present so return values don't need explanation; missing some parameter details but sufficient for correct use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description explains the roles of to, subject, body, reply_to_message_id, account, and from, covering key parameters but omitting cc, bcc, html_body, and reply_all.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool saves an email to Mail.app Drafts folder and never sends, distinguishing it from sending tools like send_email or reply_email among siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Describes when to use (cautious user wanting manual send) and how to use for new draft vs reply draft, but doesn't explicitly list alternatives or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
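The two modes in the description (a new draft with to/subject/body, or a reply draft with reply_to_message_id plus body) can be sketched as a single argument builder. The helper name and the `from_address` keyword are hypothetical. Because the schema carries no parameter descriptions, the value types are assumptions and values are passed through unchanged.

```python
def create_draft_args(body, to=None, subject=None, reply_to_message_id=None,
                      account=None, from_address=None):
    """Return an arguments dict for create_draft.

    Reply mode is chosen when reply_to_message_id is given; otherwise a new
    draft is built and both 'to' and 'subject' are required by this sketch.
    """
    if not body:
        raise ValueError("body is required in both modes")
    if reply_to_message_id is not None:
        args = {"reply_to_message_id": reply_to_message_id, "body": body}
    else:
        if not to or not subject:
            raise ValueError("a new draft needs both to and subject")
        args = {"to": to, "subject": subject, "body": body}
    # Optional routing on a multi-account Mac, per the description.
    if account is not None:
        args["account"] = account
    if from_address is not None:
        args["from"] = from_address  # 'from' is a Python keyword, hence the rename
    return args


if __name__ == "__main__":
    # Addresses and text are illustrative only.
    print(create_draft_args("Hi, notes attached.", to="a@example.com", subject="Notes"))
    print(create_draft_args("Thanks, will do.", reply_to_message_id="12345"))
```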
create_email_folder: Create Email Folder (Grade B)
Creates a new mailbox folder in Mail.app.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Folder name | |
| account | No | Account name (optional, uses default) | |
| confirm | No | Must be true to create | |
Output Schema
| Name | Required | Description |
|---|---|---|
| name | Yes | |
| created | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=false and destructiveHint=false, so the description's 'Creates' aligns. No additional behavioral traits such as permission requirements or side effects are disclosed beyond what the schema provides.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence of seven words, directly stating the tool's purpose with no wasted words. It is front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and full parameter descriptions, the description is minimally adequate. However, it lacks usage context and does not cover when or why to use this tool, leaving some gaps for a complete understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all three parameters described clearly. The description adds no extra meaning or usage examples for the parameters, so it meets the baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool creates a new mailbox folder in Mail.app, specifying the action and target application. However, it does not explicitly distinguish itself from other creation tools for different apps, though that is implicit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, nor are there any prerequisites or context for selecting this tool over other folder-related tools like move_email or list_email_folders.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_note: Create Note (Grade B)
Creates a new note in Apple Notes. Requires confirm=true to execute.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | | |
| name | Yes | | |
| folder | No | | |
| confirm | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| name | No | |
| created | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds the requirement for confirm=true, going beyond annotations. However, it doesn't disclose other behavioral traits like permissions or side effects. Annotations already indicate non-read-only and non-destructive, so the description provides marginal additional value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two sentences and no redundancy. However, it could be more structured to improve readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 4 parameters and no nested objects, the description should cover basic parameter purposes. It lacks information on name, body, and folder, making it incomplete for agent decision-making. Output schema exists but does not compensate for missing parameter guidance.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description fails to explain any of the 4 parameters except mentioning confirm. No details on name, body, or folder are given, leaving the agent dependent on parameter names alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool creates a new note in Apple Notes, using specific verb and resource. It distinguishes from siblings like update_note and read_note by its creation action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use or when-not-to-use guidance is provided. The description only mentions a mandatory parameter but does not differentiate from alternatives like update_note or search_notes.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_omnifocus_task · Create OmniFocus Task · A · Destructive
Creates a new task in OmniFocus. Requires confirm=true to execute.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | | |
| note | No | | |
| confirm | No | | |
| flagged | No | | false |
| project | No | | |
| due_date | No | | |
| defer_date | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| name | No | |
| created | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare destructiveHint=true, so the agent knows this is a mutation. The description adds the requirement for confirm=true, which is non-obvious behavioral context beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no redundancy: first states purpose, second provides a critical usage requirement. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having seven parameters and an output schema, the description only covers one parameter (confirm). It lacks context on parameter defaults (e.g., flagged defaults to false), date formats, and project usage, making it incomplete for effective tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It only mentions the confirm parameter, leaving the other six parameters (name, note, flagged, project, due_date, defer_date) unexplained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Creates') and resource ('a new task in OmniFocus'), distinguishing it from sibling tools like complete_omnifocus_task and search_omnifocus_tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly requires confirm=true for execution, providing a clear prerequisite. However, it does not discuss when to use this tool versus alternatives like create_reminder or other task creation tools in the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
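As an illustration of the parameters listed for create_omnifocus_task, the sketch below builds the JSON-RPC `tools/call` request an MCP client would send. The task name, project name, and the ISO 8601 `due_date` value are assumptions; the listing documents no date format, so verify what the server accepts before relying on one.

```python
import json
from typing import Any


def tools_call(request_id: int, tool: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP JSON-RPC 2.0 `tools/call` request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = tools_call(
    1,
    "create_omnifocus_task",
    {
        "name": "Send quarterly report",     # the only required parameter
        "project": "Finance",                # hypothetical project name
        "flagged": True,                     # listed default is false
        "due_date": "2025-07-01T17:00:00",   # ASSUMED format, not documented in the listing
        "confirm": True,                     # the description says this is needed to execute
    },
)
print(json.dumps(request, indent=2))
```

Omitting `confirm` would still produce a valid request; according to the tool description, the server would simply not execute it.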
create_referral_invites · Create Referral Invites · A
Step 2 of the colleague-invite flow. Given the recipients the user PICKED, records each invite and returns a UNIQUE referral link per person (so the user can later see who installed/activated). Call this AFTER the user chooses, then put each returned link into that person's email and send via send_email. Does NOT send anything itself. Pass lang = the language you're actually writing the invite in (the user's conversation language, e.g. "es", "en") — it's recorded with the invite.
| Name | Required | Description | Default |
|---|---|---|---|
| lang | No | ISO language of the invite you're writing (the user's conversation language, e.g. 'es', 'en'). Defaults to the Mac's language. | |
| recipients | Yes | The picked recipients. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| next | No | |
| invites | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds context beyond annotations: records invites, returns unique links per person, records language. No contradiction with annotations (readOnlyHint=false, destructiveHint=false). Full behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
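Five sentences, front-loaded with key information: the first places the tool in the invite flow, and the rest cover what it records, when to call it, what it does not do, and how to set lang. No fluff; every sentence adds value. Well-structured and easy to parse.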
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (2 parameters, a clear workflow) and the presence of an output schema, the description fully explains the tool's purpose, parameters, and return behavior. It complements the schema and annotations well.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds value for both parameters. For 'lang', explains why it's needed ('recorded with the invite') and provides usage examples. For 'recipients', contextualizes them as 'the picked recipients' aligning with the flow.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's role as 'Step 2 of the colleague-invite flow' and specifies it records invites and returns unique referral links per person. It distinguishes itself from the send_email sibling by explicitly stating it does not send anything.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit guidance: 'Call this AFTER the user chooses, then put each returned link into that person's email and send via send_email.' It also notes what the tool does NOT do ('Does NOT send anything itself'), providing clear when-to-use and when-not.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
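The two-step flow described for create_referral_invites (record the invites, then send each link via send_email) can be sketched as follows. This is only an outline under loud assumptions: `call_tool` is a hypothetical stand-in for an MCP client, and the recipient and invite field names (`email`, `link`) and the send_email parameter names (`to`, `subject`, `body`) are not documented in this listing.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def send_invites(call_tool: ToolCaller, picked: list[dict[str, str]], lang: str) -> int:
    """Record invites first, then send one email per returned link."""
    result = call_tool("create_referral_invites", {"recipients": picked, "lang": lang})
    sent = 0
    for invite in result.get("invites", []):
        # ASSUMED invite fields: "email" and "link".
        call_tool("send_email", {
            "to": invite["email"],                           # ASSUMED parameter name
            "subject": "Try Local MCP",                      # hypothetical text
            "body": f"Here is your link: {invite['link']}",  # ASSUMED parameter name
        })
        sent += 1
    return sent


# Stand-in for a real MCP client so the sketch runs offline.
calls: list[tuple[str, dict[str, Any]]] = []


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    calls.append((name, arguments))
    if name == "create_referral_invites":
        return {"invites": [
            {"email": r["email"], "link": f"https://example.invalid/r/{i}"}
            for i, r in enumerate(arguments["recipients"])
        ]}
    return {}


picked = [{"email": "ana@example.com"}, {"email": "bo@example.com"}]
print(send_invites(fake_call_tool, picked, lang="en"))  # prints 2
```

A real send_email call may also need its own confirmation flag; that tool's parameters are outside this section.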
create_reminder · Create Reminder · C
Creates a reminder in Reminders.app.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | Notes (optional) | |
| title | Yes | Reminder title | |
| confirm | No | Must be true to create | |
| due_date | No | ISO 8601 date (optional) | |
| priority | No | Priority: none, low, medium, or high (optional) | |
| list_name | No | Reminder list name (optional) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate a non-destructive write, but description omits the crucial requirement of a 'confirm' parameter being true. No disclosure of side effects or state changes beyond creation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, efficient and to the point. Could benefit from a slightly more structured format (e.g., including key requirements) but is not verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 6 parameters, including a confirm flag that must be true and optional fields like due_date and list_name, the description is too sparse. It explains neither the need for confirmation nor how options like priority work.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage with descriptions. Description adds no additional information about parameters, meeting the baseline expectation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it creates a reminder in Reminders.app, distinguishing it from list creation. Lacks explicit differentiation from sibling creation tools like todo_create_task, but the target app is specific.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives (e.g., update_reminder, complete_reminder). Does not specify prerequisites or typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
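Because the create_reminder description leaves the parameter rules to the schema, a client may want to validate arguments before calling. The sketch below checks the documented constraints (priority values, ISO 8601 due date, confirm set to true). The helper name and the sample values are hypothetical, and `datetime.fromisoformat` accepts only a subset of ISO 8601 on older Python versions.

```python
from datetime import datetime
from typing import Any, Optional

PRIORITIES = {"none", "low", "medium", "high"}  # values listed in the schema


def reminder_arguments(title: str,
                       due_date: Optional[str] = None,
                       priority: Optional[str] = None,
                       list_name: Optional[str] = None) -> dict[str, Any]:
    """Build create_reminder arguments, rejecting values the schema rules out."""
    if not title.strip():
        raise ValueError("title is required")
    if priority is not None and priority not in PRIORITIES:
        raise ValueError(f"priority must be one of {sorted(PRIORITIES)}")
    if due_date is not None:
        # Raises ValueError on malformed input; Python versions before 3.11
        # reject some valid ISO 8601 forms such as a trailing "Z".
        datetime.fromisoformat(due_date)
    args: dict[str, Any] = {"title": title, "confirm": True}
    if due_date is not None:
        args["due_date"] = due_date
    if priority is not None:
        args["priority"] = priority
    if list_name is not None:
        args["list_name"] = list_name
    return args


print(reminder_arguments("Call dentist", due_date="2025-07-01T09:00:00", priority="high"))
```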
create_reminder_list · Create Reminder List · A
Creates a new list in Apple Reminders (Reminders.app). Requires confirm=true.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Name for the new reminder list | |
| confirm | No | Must be true to create | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate this is not read-only and not destructive. The description adds the confirm requirement, but does not disclose behavior like duplicate handling or error states, beyond what annotations already imply.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences, with the purpose stated first and the requirement immediately following. No unnecessary text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple tool with output schema, the description adequately covers the core action and key requirement. It could mention persistence or error cases, but is sufficient for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the description adds minimal value beyond what the schema already provides. The confirm requirement is already documented in the schema description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool creates a new list in Apple Reminders, using a specific verb and resource. It distinguishes the tool from siblings like create_reminder (creates a reminder) and list_reminder_lists (lists existing lists).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies the requirement 'confirm=true', providing a clear usage condition. However, it does not explicitly exclude alternatives or provide when-not guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
daily_brief · Daily Brief · A · Read-only
Returns a single morning briefing combining today's calendar events, overdue and due-today reminders, and unread inbox email count + subjects. Perfect for starting each day: one call gives you everything on your plate.
| Name | Required | Description | Default |
|---|---|---|---|
| include_emails | No | Include unread email summary from Mail.app (default true, skipped gracefully if Mail is not running) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| date | Yes | Today's date (YYYY-MM-DD). |
| note | No | Onboarding enrichment shown when nothing is scheduled. |
| emails | Yes | Unread email summary (unread_count + recent_unread), or {skipped} / {error}, or null when not requested. |
| events | Yes | Today's calendar events. |
| reminders | Yes | Reminders due today or overdue. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark the tool as read-only. The note that the email summary is skipped gracefully when Mail is not running adds useful behavioral insight, although it appears in the include_emails schema description rather than in the tool description itself.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with front-loaded content: first describes output, second gives use case. No extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and annotations, the description adequately covers what the tool does and key behaviors. It could mention prerequisites or limitations but remains sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter, and the schema description is detailed. The tool description does not add additional parameter information beyond what the schema provides, meeting the baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it returns a morning briefing combining calendar events, reminders, and unread email summary. This distinguishes it from sibling tools that provide individual data sources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly suggests use for starting each day, implying context. However, it does not explicitly state when not to use or name alternatives, leaving some ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
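The daily_brief output schema allows several shapes for `emails`: a summary, a skipped marker, an error marker, or null. The sketch below shows one way a client might handle each case. It assumes that `events` and `reminders` are lists and that the skipped and error markers are objects keyed `skipped` and `error`; the listing only hints at those shapes.

```python
from typing import Any


def summarize_brief(brief: dict[str, Any]) -> str:
    """Condense a daily_brief result into one line, covering every emails shape."""
    parts = [
        f"{brief['date']}:",
        f"{len(brief['events'])} event(s),",
        f"{len(brief['reminders'])} reminder(s),",
    ]
    emails = brief.get("emails")
    if emails is None:
        parts.append("email not requested")      # include_emails was false
    elif "skipped" in emails:
        parts.append("email skipped")            # e.g. Mail is not running
    elif "error" in emails:
        parts.append("email unavailable")
    else:
        parts.append(f"{emails['unread_count']} unread email(s)")
    return " ".join(parts)


# Hypothetical result; the event and reminder objects are left empty on purpose.
sample = {
    "date": "2025-07-01",
    "events": [{}, {}],
    "reminders": [{}],
    "emails": {"unread_count": 3, "recent_unread": []},
}
print(summarize_brief(sample))
# prints: 2025-07-01: 2 event(s), 1 reminder(s), 3 unread email(s)
```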
delete_calendar_event · Delete Calendar Event · A · Destructive
Deletes an event from the Mac's Calendar app (Calendar.app) by ID. Requires confirm=true. For Microsoft 365 use m365_delete_event instead.
| Name | Required | Description | Default |
|---|---|---|---|
| confirm | No | Must be true to delete | |
| event_id | Yes | Event identifier from list_events | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark destructiveHint=true. The description adds that the deletion requires explicit confirmation (confirm=true), which is a key behavioral trait beyond the annotation. It does not contradict and provides useful safety context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences with no redundancy. The most critical information (action, target, requirement, alternative) is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete operation with annotations and likely output schema, the description covers the essential aspects: action, target, required parameter, and sibling tool. Minor omissions like return behavior are acceptable given the structured context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters described. The description reinforces that confirm must be true to execute, while the schema description supplies the source of event_id ('from list_events').
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the action (deletes), the target resource (event from Mac's Calendar.app), the method (by ID), and distinguishes from the M365 variant. This provides clear, specific purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes a requirement (confirm=true) and a sibling alternative (m365_delete_event), offering guidance on when to use this tool. It does not elaborate on prerequisites or other exclusion criteria, but the context is largely sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
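Since delete_calendar_event only executes with confirm=true, a client can tie that flag to an explicit user decision instead of setting it unconditionally. The following is a minimal sketch of that idea; the helper name and the `evt-123` identifier are hypothetical.

```python
from typing import Any


def confirmed_arguments(arguments: dict[str, Any], user_approved: bool) -> dict[str, Any]:
    """Return a copy of the arguments with confirm=True, but only after approval."""
    if not user_approved:
        raise PermissionError("user has not approved this destructive call")
    return {**arguments, "confirm": True}


# "evt-123" stands in for an identifier returned by list_events.
args = confirmed_arguments({"event_id": "evt-123"}, user_approved=True)
print(args)  # prints {'event_id': 'evt-123', 'confirm': True}
```

Keeping the approval check on the client side means the confirm flag reflects a real user decision rather than a default the agent always sends.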
delete_reminder · Delete Reminder · A · Destructive
Permanently deletes a reminder in Apple Reminders (Reminders.app) by ID. Get the reminder_id from list_reminders. Requires confirm=true.
| Name | Required | Description | Default |
|---|---|---|---|
| confirm | No | Must be true to delete | |
| reminder_id | Yes | Reminder identifier from list_reminders | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true, readOnlyHint=false. The description adds that deletion is permanent and requires confirmation, which goes beyond the structured annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three short, front-loaded sentences with no unnecessary words. The critical information is immediately available.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of annotations, output schema, and full schema coverage, the description covers all essential aspects: purpose, source of identifier, and required confirmation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds context for reminder_id (source from list_reminders) and reiterates the confirmation requirement, adding value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'Permanently deletes a reminder in Apple Reminders (Reminders.app) by ID', providing a specific verb, resource, and method. It clearly differentiates from siblings like 'delete_reminder_list' and 'complete_reminder'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description tells the agent to get the reminder_id from list_reminders and that confirm=true is required. It does not explicitly state when not to use it versus alternatives, but the constraints are clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_reminder_list · Delete Reminder List · A · Destructive
Deletes an Apple Reminders list AND all reminders inside it — cannot be undone. Pass the list name (or list_id from list_reminder_lists). Requires confirm=true.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | List name to delete (or pass list_id) | |
| confirm | No | Must be true to delete | |
| list_id | No | List identifier from list_reminder_lists (alternative to name) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description goes beyond annotations by specifying that all reminders inside the list are also deleted and the action cannot be undone. Annotations already set destructiveHint=true, but the description adds critical context about cascading destruction and irreversibility.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise, front-loaded sentences. The first states purpose and behavioral impact, the second explains how to identify the list, and the third states the confirmation requirement. Every sentence provides essential information with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all essential aspects: what the tool does, what it affects (list + reminders), how to identify the list, and confirmation requirement. References sibling tool list_reminder_lists for obtaining list_id. Output schema exists, so return value explanation is unnecessary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with descriptions. The description adds value by clarifying that name and list_id are alternatives and that confirm must be true, providing meaning beyond the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool deletes an Apple Reminders list and all its contents, with explicit verb and resource. It distinguishes from siblings like rename_reminder_list or delete_reminder by specifying cascading deletion and irreversibility.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance on how to specify the list (name or list_id from list_reminder_lists) and the requirement for confirm=true. It implicitly warns against accidental use by emphasizing irreversibility, though does not explicitly list when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
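The name-or-list_id alternative for delete_reminder_list can be sketched as a small argument builder. The listing does not say how the server behaves when both are passed, so this sketch sends only one and prefers `list_id` as the less ambiguous identifier; that preference and the helper name are assumptions.

```python
from typing import Any, Optional


def delete_list_arguments(name: Optional[str] = None,
                          list_id: Optional[str] = None) -> dict[str, Any]:
    """Build delete_reminder_list arguments from exactly one identifier."""
    if list_id:
        target = {"list_id": list_id}   # ASSUMED preference when both are given
    elif name:
        target = {"name": name}
    else:
        raise ValueError("pass either name or list_id")
    return {**target, "confirm": True}


print(delete_list_arguments(name="Groceries"))
# prints {'name': 'Groceries', 'confirm': True}
```

Because the deletion cascades to every reminder in the list and cannot be undone, a caller would normally pair this with an explicit user approval step.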
disconnect_m365_account · Disconnect Microsoft 365 Account · B
Disconnect your Microsoft 365 account and remove stored tokens.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| message | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description says 'remove stored tokens', which implies a destructive or state-changing action. However, annotations set destructiveHint=false, creating a contradiction. No additional behavioral context beyond the conflicting statement.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no wasted words. Clearly front-loads the action and purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool is simple with no parameters and has an output schema. The description covers the basic function, but could benefit from noting consequences (e.g., need to re-authenticate to use M365 tools again).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are no parameters, and the schema coverage is 100%. The description adds no parameter information, which is acceptable for zero parameters. Baseline 4 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action 'disconnect' and the resource 'Microsoft 365 account', and specifies what it does: removes stored tokens. This distinguishes it from the sibling connect_m365_account.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidelines on when to use this tool versus alternatives (e.g., connect_m365_account). No exclusions or context provided about when disconnection is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
disconnect_servicenow · Disconnect ServiceNow · A
Disconnect from ServiceNow and remove stored credentials.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | Yes | |
| message | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description states 'remove stored credentials,' which is a destructive action, but the annotations have destructiveHint: false. This is a direct contradiction. The description does not disclose other behaviors or consequences.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence with no wasted words. It is front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a zero-parameter tool with an output schema, the description covers the purpose but lacks behavioral context due to the contradiction with annotations. It does not explain when disconnection is appropriate or what side effects occur.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has no parameters (0 params), and the schema coverage is 100%. The description does not add any parameter information, but none is needed. Baseline 4 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action: 'Disconnect from ServiceNow and remove stored credentials.' It uses a specific verb ('Disconnect') and resource ('ServiceNow'), and distinguishes from sibling tools like connect_servicenow and servicenow_* operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use the tool (when disconnecting is needed), but it does not explicitly state when to use alternatives (e.g., connect_servicenow) or provide exclusions. Additional guidance would improve clarity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
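The contradiction flagged for disconnect_servicenow (a description that removes stored data while destructiveHint is false) is the kind of mismatch a simple lint can surface. The sketch below is a keyword heuristic, not the scoring method used on this page, and the word list is an assumption. It only flags an explicit `destructiveHint: false`, since the MCP specification treats an omitted hint as destructive by default.

```python
import re
from typing import Any

# ASSUMED keyword list; extend it to suit the tool set being checked.
DESTRUCTIVE_WORDS = re.compile(r"\b(delete|remove|erase|wipe)s?\b", re.IGNORECASE)


def hint_mismatch(description: str, annotations: dict[str, Any]) -> bool:
    """True when the description sounds destructive but destructiveHint is false."""
    says_destructive = bool(DESTRUCTIVE_WORDS.search(description))
    return says_destructive and annotations.get("destructiveHint") is False


print(hint_mismatch(
    "Disconnect from ServiceNow and remove stored credentials.",
    {"destructiveHint": False},
))  # prints True
```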
excel_create · Excel Create · A
Creates a new Excel (.xlsx) file with headers and optional data rows.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Output path for the .xlsx file | |
| rows | No | Array of row arrays with data (optional) | |
| confirm | No | Must be true to create | |
| headers | Yes | Column headers | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | Path of the created .xlsx file |
| rows | Yes | Number of data rows written |
| created | Yes | True when the file was created |
| headers | Yes | Column headers written |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate non-read-only and non-destructive behavior. The description adds that the file includes headers and optional data rows but does not disclose the required confirm parameter or behavior if the file already exists. With annotations present, the added context is adequate but incomplete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that is concise, front-loaded with the key action, and contains no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 4 parameters and annotations, the description is minimal. It does not mention the mandatory confirm parameter or any return value details (though an output schema exists). It adequately covers the basic purpose but lacks completeness for a mutating tool with a safety flag.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description adds minimal extra meaning beyond restating headers and optional data rows. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool creates a new Excel (.xlsx) file with headers and optional data rows. It uses a specific verb and resource, distinguishing it from siblings like excel_read and excel_write_cell.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when a new Excel file is needed but provides no explicit guidance on when to use this tool versus alternatives, such as reading or modifying existing files. No exclusions or context are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
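The parameter table for excel_create implies a call shape that the one-sentence description leaves out. Here is a minimal sketch in Python, assuming the server is called through MCP's JSON-RPC `tools/call` method. The helper name `build_excel_create_call`, the request id, and the file path are hypothetical. The reading that the file is only created when `confirm` is true comes from the schema text ("Must be true to create"), not from observed behavior.

```python
import json


def build_excel_create_call(path, headers, rows=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` payload for excel_create (hypothetical helper)."""
    arguments = {
        "path": path,        # required: output path for the .xlsx file
        "headers": headers,  # required: column headers
        "confirm": True,     # schema says "Must be true to create"
    }
    if rows is not None:
        arguments["rows"] = rows  # optional: array of row arrays
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "excel_create", "arguments": arguments},
    }


payload = build_excel_create_call(
    "/Users/example/report.xlsx",  # hypothetical path, for illustration only
    ["name", "qty"],
    rows=[["apple", 3], ["pear", 5]],
)
print(json.dumps(payload, indent=2))
```

Setting `confirm` inside the helper keeps the safety flag from being forgotten. An agent that should ask the user first would take it as an explicit argument instead.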
excel_read: Excel Read (A, Read-only)
Reads data from an Excel (.xlsx) file. Returns sheets and cell values.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the .xlsx file | |
| max_rows | No | Max rows to return (default 100) | |
| sheet_name | No | Sheet name to read (optional, reads first sheet) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| rows | Yes | Row data as arrays of cell-value strings |
| count | Yes | Number of rows returned |
| sheet | Yes | Name of the sheet that was read |
| sheets | Yes | All sheet names in the workbook |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, so the description confirms a safe read operation. It adds 'returns sheets and cell values' but does not disclose limitations such as the max_rows default or the behavior when sheet_name is omitted. The output schema covers the return structure, so the description adds only marginal behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short, front-loaded sentences with no redundancy. Every word is essential and efficiently conveys the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is complete enough for a simple read tool with a full output schema and 100% schema coverage, but it lacks mention of default max_rows (100) and the fact that sheet_name is optional. These are covered in the schema, but the description could be more explicit about behavioral defaults.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the description does not need to add parameter explanations. It does not provide additional meaning beyond the schema (e.g., path format or max_rows usage), earning a baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool reads data from Excel files and returns sheets and cell values, distinguishing it from sibling tools like excel_create (creates) and excel_write_cell (writes).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., for writing or editing). It only states what it does, without specifying context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
excel_write_cell: Excel Write Cell (C)
Writes a value to a specific cell in an Excel file.
| Name | Required | Description | Default |
|---|---|---|---|
| row | Yes | Row number (1-based) | |
| path | Yes | Path to the .xlsx file | |
| value | Yes | Value to write | |
| column | Yes | Column number (1-based) | |
| confirm | No | Must be true to modify | |
| sheet_name | No | Sheet name (default: first sheet) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| col | Yes | Column that was written (1-based) |
| row | Yes | Row that was written (1-based) |
| value | Yes | Value written to the cell |
| written | Yes | True when the cell was written |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=false, but the description does not clarify whether writing overwrites existing values or if the file must exist. With destructiveHint=false, the agent might underestimate the modification impact.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with no wasted words. While it is efficient, it could benefit from additional context without being verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 6 parameters and an output schema, the description is minimally adequate. It does not explain the return value, the requirement for the 'confirm' parameter, or the default sheet behavior, but these are partially covered by the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the parameters are fully documented in the schema. The description adds no extra meaning, meeting the baseline expectation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool writes a value to a specific cell in an Excel file, distinguishing it from sibling tools like excel_read (reads) and excel_create (creates files). However, it does not explicitly mention that it overwrites existing content.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as excel_read or excel_create. It lacks context for selecting the tool appropriately.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
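Because excel_write_cell takes numeric, 1-based `row` and `column` values rather than spreadsheet-style references, a caller holding an A1-style address has to convert it first. The sketch below shows one way to do that. The helpers `a1_to_row_column` and `build_write_cell_arguments` are hypothetical and not part of the server. Only the argument names come from the parameter table.

```python
import re


def a1_to_row_column(ref):
    """Convert an A1-style reference such as 'C7' to 1-based (row, column)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not an A1-style reference: {ref!r}")
    letters, digits = match.groups()
    column = 0
    for ch in letters.upper():
        # Column letters are a bijective base-26 number: A=1 ... Z=26, AA=27.
        column = column * 26 + (ord(ch) - ord("A") + 1)
    return int(digits), column


def build_write_cell_arguments(path, ref, value, sheet_name=None):
    """Assemble the arguments object for excel_write_cell (hypothetical helper)."""
    row, column = a1_to_row_column(ref)
    arguments = {
        "path": path,
        "row": row,
        "column": column,
        "value": value,
        "confirm": True,  # schema says "Must be true to modify"
    }
    if sheet_name is not None:
        arguments["sheet_name"] = sheet_name  # omitted: first sheet, per the schema
    return arguments


print(build_write_cell_arguments("/Users/example/report.xlsx", "C7", "done"))
```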
finder_list: Finder List (A, Read-only)
Lists files and folders at any absolute path (Spotlight-free directory listing, including outside the home folder). For sandboxed reads under the user's home, prefer fs_list.
| Name | Required | Description | Default |
|---|---|---|---|
| path | No | Absolute path to list (default: ~) | |
| limit | No | Max items (default 100) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| path | No | |
| count | No | Items returned in this response. |
| items | No | |
| total | No | Total items when the listing was truncated. |
| truncated | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the description adds context beyond this: 'Spotlight-free directory listing' and 'including outside the home folder'. This tells the agent the tool's scope and that it does not depend on Spotlight indexing, which is useful behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two concise sentences. The first sentence states the purpose and key features, while the second provides usage guidance. No unnecessary words, efficiently front-loading essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of annotations and an output schema, the description is complete. It covers the tool's purpose, key differentiators, and usage context. The agent has enough information to decide when and how to invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema fully documents both parameters (path and limit) with descriptions. The tool description adds no additional parameter-specific meaning beyond what the schema provides, so the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's function: 'Lists files and folders at any absolute path'. It distinguishes itself from the sibling fs_list by noting it is 'Spotlight-free' and can list 'including outside the home folder', which differentiates it effectively.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to prefer the alternative fs_list ('For sandboxed reads under the user's home, prefer fs_list'), providing a clear usage guideline. While it doesn't mention other siblings, the key alternative is addressed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
finder_search: Finder Search (A, Read-only)
Searches for files by name anywhere on the filesystem (uses mdfind/Spotlight).
| Name | Required | Description | Default |
|---|---|---|---|
| path | No | Limit search to this directory (optional) | |
| limit | No | Max results (default 50) | |
| query | Yes | Filename or content to search for | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | |
| query | Yes | |
| results | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds that it uses mdfind/Spotlight, implying reliance on system indexing, but does not disclose potential limitations (e.g., unindexed files, performance). Some added context but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single concise sentence that conveys the core purpose and implementation details without unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description adequately covers the tool's function (searches files by name system-wide). It could mention reliance on Spotlight indexing for completeness, but overall sufficient for a simple search tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with each parameter described in the schema. The description does not add additional meaning or usage tips beyond what the schema provides, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'searches', resource 'files by name anywhere on the filesystem', and the underlying mechanism 'uses mdfind/Spotlight'. This distinguishes it from sibling tools like fs_search or finder_list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies system-wide search via 'anywhere on the filesystem', but lacks explicit guidance on when to use this tool versus alternatives like fs_search or finder_list. No when-not-to-use conditions or alternative names provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fs_list: File List (A, Read-only)
Lists files and folders in a local directory. Defaults to the user's home directory. Returns name, path, type (file/directory), size, and modification date for each item. Sorted: directories first, then files, both alphabetically.
| Name | Required | Description | Default |
|---|---|---|---|
| path | No | Absolute path to the directory. Defaults to the home directory (~) if omitted. | |
| show_hidden | No | Include hidden files (starting with '.'). Default false. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| dirs | No | |
| path | No | |
| count | No | |
| files | No | |
| items | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds useful context beyond annotations: returns specific fields (name, path, type, size, modification date), sorting order (directories first, alphabetical), and default path. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences with no wasted words; each sentence adds essential information. Front-loaded with the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given output schema existence, description adequately covers default behavior, returned fields, and sorting. Could mention non-recursive nature, but not essential for a basic listing tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description does not add meaning beyond what the schema already provides. The default path hint is duplicated from the schema description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it lists files and folders in a local directory. It does not explicitly differentiate from siblings like finder_list or fs_search, but the purpose is specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage for listing local directory contents with default home directory. No explicit when-to-use or when-not-to-use compared to alternative tools like finder_list or cloud file listers.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
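The ordering that the fs_list description promises (directories first, then files, both alphabetical) can be sketched as a single sort key. This is an illustration, not the server's implementation. It assumes each item is a dict with the `name` and `type` fields named in the description, that `type` is either "file" or "directory", and that "alphabetically" means case-insensitive, which the description does not specify. The function name `sort_listing` is hypothetical.

```python
def sort_listing(items, show_hidden=False):
    """Order entries the way the fs_list description states: directories, then files."""
    if not show_hidden:
        # Hidden entries are those whose name starts with '.', per the schema text.
        items = [item for item in items if not item["name"].startswith(".")]
    # False sorts before True, so directories come first; names break ties.
    return sorted(
        items,
        key=lambda item: (item["type"] != "directory", item["name"].lower()),
    )


sample = [
    {"name": "b.txt", "type": "file"},
    {"name": "Zeta", "type": "directory"},
    {"name": ".git", "type": "directory"},
    {"name": "alpha", "type": "directory"},
    {"name": "A.md", "type": "file"},
]
for entry in sort_listing(sample):
    print(entry["type"], entry["name"])
```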
fs_read: File Read (A, Read-only)
Reads a text file from the local filesystem. Supports .txt, .md, .csv, .json, .xml, .log, .yaml, .toml and common code file types. For PDFs use pdf_read, for Word use word_read, for Excel use excel_read.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the file | |
| offset | No | Start reading at this byte offset (default 0) | |
| max_bytes | No | Maximum bytes to read (default 1 MB, max 10 MB) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | Resolved absolute path of the file |
| bytes | Yes | Total file size in bytes |
| offset | No | Byte offset the read started at |
| content | Yes | Decoded file text content |
| encoding | No | Encoding used to decode (utf8, cp1252, or latin1) |
| truncated | No | True if more content remains beyond what was returned |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and non-destructive. The description adds value by specifying supported file types; the default and maximum byte limits (1 MB, 10 MB) appear only in the schema's max_bytes description.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences: the first states the core action, the second lists supported types, and the third names alternative tools. Extremely concise with zero waste and well front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read tool with an output schema, the description covers purpose, usage guidelines, and supported types; size limits are documented in the schema. Complete and self-contained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema covers all 3 parameters with descriptions (100% coverage). Description adds value by listing supported file types not in schema, though it doesn't detail parameters further. Baseline 3, plus for file type info.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Reads a text file from the local filesystem' and lists supported file types, distinguishing it from sibling tools like pdf_read, word_read, and excel_read.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when not to use this tool: 'For PDFs use pdf_read, for Word use word_read, for Excel use excel_read,' providing clear guidance on alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
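The `offset`, `max_bytes`, and `truncated` fields of fs_read suggest a paging loop for files larger than one response. The sketch below shows that loop against an in-memory stand-in, so it runs without a server. `read_whole_file`, `fake_fs_read`, and the `call_tool(name, arguments)` signature are hypothetical. The loop also assumes that each response consumes exactly `max_bytes` bytes when `truncated` is true, which the listing does not state. How the real tool handles a multi-byte character split across a chunk boundary is likewise unknown.

```python
def read_whole_file(call_tool, path, chunk_bytes=1_000_000):
    """Read a file in chunks by advancing `offset` until `truncated` is false."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    parts = []
    offset = 0
    while True:
        result = call_tool(
            "fs_read",
            {"path": path, "offset": offset, "max_bytes": chunk_bytes},
        )
        parts.append(result["content"])
        if not result.get("truncated"):  # `truncated` is optional in the output schema
            break
        offset += chunk_bytes  # assumption: a truncated reply consumed a full chunk
    return "".join(parts)


def fake_fs_read(data):
    """In-memory stand-in for the tool (ASCII data only), used to exercise the loop."""
    def call_tool(name, arguments):
        start = arguments.get("offset", 0)
        end = start + arguments.get("max_bytes", 1_000_000)
        return {
            "path": arguments["path"],
            "bytes": len(data),
            "offset": start,
            "content": data[start:end].decode("ascii"),
            "truncated": end < len(data),
        }
    return call_tool


text = read_whole_file(fake_fs_read(b"abcdefghij"), "/tmp/example.txt", chunk_bytes=4)
print(text)
```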
fs_search: File Search (A, Read-only)
Searches for files and folders by name (case-insensitive, partial match) starting from a root directory. Defaults to the home directory. Returns matching items with path, type, and size.
| Name | Required | Description | Default |
|---|---|---|---|
| root | No | Root directory to search from. Defaults to home directory (~). | |
| query | Yes | Filename pattern to search for (partial, case-insensitive) | |
| file_type | No | Filter by extension, e.g. 'pdf', 'docx', 'xlsx'. Omit for all types. | |
| max_results | No | Maximum number of results to return. Default 50, max 200. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| root | No | |
| count | No | |
| query | No | |
| results | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the description adds value by specifying case-insensitive partial matching, root directory defaulting, and return fields. It does not contradict annotations. Some behavioral details (e.g., recursion depth, handling of symlinks) are omitted, but overall it is sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, front-loading key details (search behavior, root directory, default, return fields). There is no unnecessary information or repetition. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity, annotations, and presence of output schema, the description covers essential information. It could explicitly state that it searches only local filesystem (not cloud or virtual drives) to be more complete, but this is implicitly clear from the context of sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description restates the default for 'root' (home directory) and the matching rule for 'query' (partial, case-insensitive). The examples for 'file_type' (e.g., pdf, docx) and the limits for 'max_results' (50, max 200) come from the schema itself rather than the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it searches for files and folders by name with case-insensitive partial matching, starting from a root directory, defaulting to home. It distinguishes from sibling tools like 'finder_search' (macOS Finder) and 'gdrive_search_files' (Google Drive) by specifying local filesystem search and return fields (path, type, size).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description explains what the tool does, it does not provide explicit guidance on when to use this tool versus alternatives like 'fs_list' for directory listing or 'gdrive_search_files'. Usage context is implied from the tool name and description but lacks direct comparison or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
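The matching rule that fs_search advertises (case-insensitive, partial match on the name, optional extension filter, capped result count) can be illustrated with two small helpers. This sketch assumes that "partial match" means plain substring matching and that an out-of-range `max_results` would be clamped rather than rejected. Neither point is confirmed by the listing. The names `name_matches` and `clamp_max_results` are hypothetical.

```python
import os


def name_matches(name, query, file_type=None):
    """Case-insensitive substring match on the name, with an optional extension filter."""
    if query.lower() not in name.lower():
        return False
    if file_type is None:
        return True  # schema: omit file_type to match all types
    extension = os.path.splitext(name)[1].lower()
    return extension == "." + file_type.lower().lstrip(".")


def clamp_max_results(requested=None, default=50, ceiling=200):
    """Apply the documented default (50) and maximum (200); clamping is an assumption."""
    if requested is None:
        return default
    return max(1, min(requested, ceiling))


print(name_matches("Quarterly_Report.PDF", "report", "pdf"))
print(clamp_max_results(500))
```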
gdrive_file_info: Gdrive File Info (A, Read-only)
Metadata for a file/folder in the synced Google Drive: size, dates, type. Cheaper than listing the whole directory.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the file or folder | |
Output Schema
| Name | Required | Description |
|---|---|---|
| name | No | |
| path | No | |
| size | No | |
| type | No | |
| created | No | |
| modified | No | |
| size_human | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true and destructiveHint=false. The description adds that it is 'cheaper' and specifies return metadata fields, but does not disclose error handling or limitations. It adds some value beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence plus a short clarifying sentence. It is concise, front-loaded, and contains no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has an output schema, the description adequately covers the purpose and cost comparison. It is mostly complete for a simple metadata retrieval, though missing error behavior information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with one parameter 'path' clearly described. The description does not add additional meaning to the parameter beyond what the schema provides, so baseline score applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves metadata (size, dates, type) for a single file/folder. It explicitly distinguishes from sibling tools by noting it is cheaper than listing the whole directory, which differentiates from gdrive_list_files.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use (for single file metadata) and hints at an alternative (listing the whole directory). However, it does not explicitly state exclusions or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
gdrive_list_files: Gdrive List Files (A, Read-only)
Lists files and folders in a Google Drive path (the locally-synced folder). Use gdrive_root first for valid roots — 'My Drive' and 'Shared drives' live inside each mount. Returns up to limit entries (default 1000).
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the Google Drive folder | |
| limit | No | Max entries (default 1000, max 5000) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| count | No | |
| items | No | |
| total | No | |
| truncated | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true. Description adds value by explaining the limit behavior (default 1000, up to limit) and mentioning it operates on locally-synced folders, which aids understanding of scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences with no wasted words. Front-loaded with primary purpose, followed by necessary usage and behavior details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description doesn't need to detail return values. It covers prerequisite, parameter usage, and limit behavior adequately. Minor omission: no mention of error handling for invalid paths.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both parameters have descriptions). The description does not add new meaning beyond the schema's own parameter descriptions, merely restating the limit default.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Lists files and folders in a Google Drive path', with a specific verb and resource. It distinguishes itself from sibling tools by mentioning the gdrive_root prerequisite and the limit behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises to use gdrive_root first for valid roots, providing context for correct usage. Does not explicitly state when not to use or contrast with alternatives, but gives valuable prerequisite guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
gdrive_read_file · Gdrive Read File · A · Read-only
Reads a text file from the synced Google Drive folder (.txt, .md, .csv, .json, code files...). Note: native Google Docs/Sheets/Slides sync as .gdoc/.gsheet pointers, not real files — export them from Drive or read Office/PDF copies instead. Auto-detects UTF-8 with Latin-1/CP1252 fallback.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the file | |
| offset | No | Start byte offset (default 0) | |
| encoding | No | 'auto' (default), 'utf8', 'latin1', 'cp1252', 'ascii', 'utf16' | |
| max_bytes | No | Max bytes (default 1MB, cap 10MB) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | Absolute path of the file |
| bytes | Yes | Total file size in bytes |
| offset | No | Byte offset the read started at |
| content | Yes | Decoded file text content |
| encoding | No | Encoding used to decode (utf8, cp1252, latin1, ascii, or utf16) |
| truncated | No | True if more content remains beyond what was returned |
| bytes_read | No | Number of bytes read in this slice |
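The `offset`, `max_bytes`, `truncated`, and `bytes_read` fields suggest that large files can be read in slices. The following is a minimal sketch of such a loop, not the server's documented procedure: `read_whole_file` and `fake_call_tool` are hypothetical names, and the fake transport stands in for a real MCP client so the example runs on its own.

```python
from typing import Callable, Dict, List


def read_whole_file(call_tool: Callable[[str, Dict], Dict], path: str, chunk: int = 1_048_576) -> str:
    """Read a file slice by slice until the tool reports no more content."""
    parts: List[str] = []
    offset = 0
    while True:
        result = call_tool(
            "gdrive_read_file",
            {"path": path, "offset": offset, "max_bytes": chunk},
        )
        parts.append(result["content"])
        if not result.get("truncated"):
            break
        advanced = result.get("bytes_read", 0)
        if advanced <= 0:
            break  # guard against looping forever on an unexpected response
        offset += advanced
    return "".join(parts)


def fake_call_tool(name: str, arguments: Dict) -> Dict:
    """Stand-in transport that mimics the output schema with in-memory ASCII data."""
    data = b"id,name\n1,alpha\n2,beta\n"
    start = arguments["offset"]
    piece = data[start:start + arguments["max_bytes"]]
    return {
        "path": arguments["path"],
        "bytes": len(data),
        "offset": start,
        "content": piece.decode("utf-8"),
        "truncated": start + len(piece) < len(data),
        "bytes_read": len(piece),
    }


text = read_whole_file(fake_call_tool, "/tmp/example.csv", chunk=8)
print(text)
```

One caveat: the listing does not say how the server handles a multi-byte character that straddles a slice boundary, so a real client may want to use generous slice sizes for non-ASCII files.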
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the agent knows it's safe. The description adds value by detailing auto-detection of UTF-8 with Latin-1/CP1252 fallback, which is behavioral context beyond the annotations. However, it does not cover all possible behaviors (e.g., what happens if file is binary).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences long, front-loads the purpose, and includes an important caveat and encoding behavior without any fluff. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read tool with an output schema, the description covers file types, encoding, and the important limitation regarding native Google Docs. It is complete and informative.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with descriptions for all four parameters. The description adds no new parameter-specific details beyond what the schema provides, except for mentioning encoding auto-detection which relates to the encoding parameter. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool reads text files from the synced Google Drive folder, lists supported extensions, and explicitly distinguishes it from native Google Docs which are not real files. This provides a specific verb+resource and differentiates from potential confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly warns against using this tool for native Google Docs/Sheets/Slides and suggests alternatives: export them or read Office/PDF copies. This provides clear when-to-use and when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
gdrive_root · Gdrive Root · A · Read-only
Lists the Google Drive folders synced on this Mac (My Drive, Shared drives, per-account mounts). Start here to get valid paths for the other gdrive_* tools. Reads the folder Google Drive for Desktop already syncs — no Google API, no OAuth.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| roots | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds value by explaining that it reads from Google Drive for Desktop's local sync with no API or OAuth, which is consistent with and informative beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each serving a distinct purpose: purpose (list folders), usage guidance (starting point), and technical context (local sync, no API). No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters, existing annotations, and an output schema, the description fully covers necessary context: what it does, when to use, and how it works technically.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are no parameters in the input schema, so the description does not need to add parameter meaning. Baseline for 0 parameters is 4; the description adds no parameter info but this is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the tool's purpose: listing Google Drive folders synced on this Mac (My Drive, Shared drives, per-account mounts). The verb 'Lists' and explicit resource differentiation distinguish it from sibling gdrive_* tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool: 'Start here to get valid paths for the other gdrive_* tools.' This provides clear guidance on usage context and prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
gdrive_search_files · Gdrive Search Files · A · Read-only
Searches the synced Google Drive folder for files by name (recursive). Returns up to max_results matches (default 50).
| Name | Required | Description | Default |
|---|---|---|---|
| root | No | Restrict to this Drive path (optional - defaults to all mounts) | |
| query | Yes | Filename pattern to search for | |
| max_results | No | Maximum results (default 50) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| results | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, confirming it is a safe read operation. The description adds that the search is recursive, matches by name rather than by content, and returns up to max_results (default 50). It does not mention that the root parameter defaults to all mounts, though the schema already covers this. The behavioral additions are adequate but not extensive, earning a 3.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences with no wasted words. The first sentence states the core action and method; the second clarifies the result limit. It is front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple search tool with 100% schema coverage, clear annotations, and an output schema, the description is sufficiently complete. It covers the search method (by name, recursive), result limit, and context (synced folder). No critical information is missing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds value by mentioning recursion and the default max_results value (50), but these are already implied or explicit in the schema. No additional semantic depth beyond the schema is provided, so a 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'searches', the resource 'synced Google Drive folder', and the method 'by name (recursive)'. It is distinct from sibling tools like gdrive_list_files (lists all files) and gdrive_file_info (gets file details), making it easy for an agent to select this tool for name-based file search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies it should be used for searching files by name in the synced Drive folder, but does not explicitly state when to use it versus alternatives like finder_search or onedrive_search_files, nor does it mention when not to use it (e.g., for content search). The usage context is implied but not detailed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
gdrive_write_file · Gdrive Write File · A
Writes or overwrites a text file in the synced Google Drive folder — it uploads automatically via the official client. First call returns a preview; pass confirm=true to write.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path under a Google Drive mount | |
| confirm | No | Must be true to actually write | |
| content | Yes | Text content to write | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | |
| bytes | Yes | |
| written | Yes | |
| overwrote | Yes | |
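The preview-then-confirm flow described above can be sketched as two calls with the same `path` and `content`, the second adding `confirm=true`. In this illustration `write_with_confirmation`, `fake_call_tool`, and the example path are hypothetical, and the shape of the preview response (the same fields with `written` set to false) is an assumption, since the listing does not show it.

```python
from typing import Callable, Dict, List, Optional


def write_with_confirmation(
    call_tool: Callable[[str, Dict], Dict],
    path: str,
    content: str,
    approve: Callable[[Dict], bool],
) -> Optional[Dict]:
    """Request a preview first, then write only if `approve` accepts the preview."""
    args = {"path": path, "content": content}
    preview = call_tool("gdrive_write_file", args)  # no confirm flag: preview only
    if not approve(preview):
        return None
    return call_tool("gdrive_write_file", {**args, "confirm": True})


calls: List[Dict] = []


def fake_call_tool(name: str, arguments: Dict) -> Dict:
    """Stand-in transport; assumes previews reuse the output schema with written=False."""
    calls.append(arguments)
    confirmed = arguments.get("confirm") is True
    return {
        "path": arguments["path"],
        "bytes": len(arguments["content"].encode("utf-8")),
        "written": confirmed,
        "overwrote": False,
    }


result = write_with_confirmation(
    fake_call_tool,
    "/Users/me/GoogleDrive/My Drive/notes.md",  # hypothetical path
    "hello\n",
    approve=lambda preview: not preview["written"],
)
print(result)
```

Keeping the approval step as a callback lets an agent surface the preview to the user before the second call is made.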
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations (readOnlyHint=false, destructiveHint=false) already indicate a write operation that is not flagged as destructive. The description adds critical behavioral context: the two-step process with preview, the need for confirmation, and automatic upload. This clarifies the side effect beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences. The first states the primary function and mechanism. The second explains the crucial preview/confirmation workflow. No unnecessary words or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a write tool with 3 fully documented parameters and an output schema, the description covers the essential behavioral pattern. It might be slightly improved by noting that it only accepts text content (implied by string type), but overall it's complete enough for agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so parameters are fully described in the schema. The description adds value by explaining the confirm parameter's role ('first call returns a preview; pass confirm=true to write'), which is not evident from the schema alone. Path and content are clear from their descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the action (write/overwrite) and resource (text file in the synced Google Drive folder). Distinguishes itself from sibling tools like gdrive_read_file and onedrive_write_file through its specific focus and the preview-then-confirm workflow.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit usage pattern: first call returns a preview, then pass confirm=true to actually write. This guides the agent on the required sequence. However, it doesn't explicitly compare to alternatives or state conditions for not using it, though the purpose is clear enough given the sibling context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audit_log · Get Audit Log · A · Read-only
Returns recent LMCP tool call history from the local audit log. Each entry shows timestamp, tool name, call source (local/cloud), success status, and duration. Useful for GDPR Article 30 compliance reporting and debugging.
| Name | Required | Description | Default |
|---|---|---|---|
| ok | No | Filter to successes (true) or failures (false) only (optional) | |
| tool | No | Filter to entries for a specific tool name (optional) | |
| limit | No | Number of recent entries to return (default 50, max 200) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| entries | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, indicating no side effects. The description adds value by specifying that results are from the 'local audit log' and listing the exact fields returned, which is beyond what annotations convey. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences: the first states the core action, the second lists the fields in each entry, and the third adds use cases. Every sentence is purposeful with no redundant information, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present and full schema coverage, the description adequately covers the tool's purpose, output structure, and use cases. It does not need to explain return values further. The combination of annotations, schema, and description is sufficient for an AI agent to select and invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage for its three parameters (ok, tool, limit), so the description does not need to add parameter details. It adds no new semantic meaning beyond what the schema already provides, meeting the baseline of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns 'recent LMCP tool call history from the local audit log' and specifies the fields: timestamp, tool name, call source, success status, and duration. It is distinguished from sibling tools, none of which serve an auditing purpose, by its unique verb ('get') and resource ('audit log').
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly mentions two use cases: 'GDPR Article 30 compliance reporting and debugging,' providing clear context for when to use the tool. However, it does not mention when not to use it or list alternative tools, which would have improved the score.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_config · Get Config · A · Read-only
Returns the current LMCP configuration (api_key masked).
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and non-destructive behavior. The description adds a key behavioral detail: the api_key is masked in the output. This goes beyond what annotations provide, fully disclosing the important masking behavior. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that is concise and front-loaded. It communicates the essential information without any superfluous words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has no parameters, a strong set of annotations (readOnlyHint, destructiveHint), and an output schema (which lists no fields), the description is complete. It covers what the tool returns and the masking behavior, which is sufficient for an agent to understand its use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters, so schema coverage is effectively 100%. The description does not need to add parameter information. Baseline for 0 parameters is 4, which is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns the current LMCP configuration with the API key masked. It uses specific verb 'returns' and specific resource 'LMCP configuration', distinguishing it from sibling tools like lmcp_state (which likely returns state information).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. While the purpose is clear, the description does not mention when to prefer this over other read-only tools like lmcp_state or provide any context about prerequisites or typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_contact · Get Contact · A · Read-only
Gets a contact from the Mac's Contacts app (Contacts.app) by name or ID. Pass name to look up directly by name (no need to search_contacts first — if several people match it returns a compact list to choose from), or contact_id for an exact lookup. For Microsoft 365 use m365_get_contact instead.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | Full or partial contact name — the one-step path. Provide this OR contact_id. | |
| contact_id | No | Exact identifier from list_contacts/search_contacts. Provide this OR name. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
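Because the schema asks for `name` OR `contact_id`, a client may want to validate the arguments before calling the tool. The sketch below treats the two as mutually exclusive, which is an assumption: the listing does not say what the server does if both are supplied. The helper name `get_contact_arguments` and the example identifier are hypothetical.

```python
from typing import Dict, Optional


def get_contact_arguments(name: Optional[str] = None, contact_id: Optional[str] = None) -> Dict[str, str]:
    """Return the arguments for get_contact, requiring exactly one lookup key."""
    # Assumption: reject both-or-neither on the client rather than rely on server behavior.
    if bool(name) == bool(contact_id):
        raise ValueError("provide exactly one of name or contact_id")
    if name:
        return {"name": name}
    return {"contact_id": contact_id}


by_name = get_contact_arguments(name="Ada")
by_id = get_contact_arguments(contact_id="ABC-123")  # hypothetical identifier
print(by_name, by_id)
```

When a name lookup matches several people, the description says the tool returns a compact list to choose from, so a follow-up call with the chosen `contact_id` would be the natural next step.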
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds behavioral details beyond annotations: describes compact list returned on multiple name matches and exact lookup with contact_id.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three efficient sentences, front-loaded with purpose, then specific guidance. No waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given simple 2-param tool with output schema and annotations, description fully covers scope, behavior, and alternatives.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description enriches behavior meaning for each parameter, especially the compact list behavior for name, adding value beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States it gets a contact from Mac's Contacts.app by name or ID, distinguishing from sibling tools like search_contacts and m365_get_contact.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly explains when to use name vs contact_id, notes that search_contacts is unnecessary, and directs to m365_get_contact for Microsoft 365.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_datetime · Get Datetime · A · Read-only
Get the current date and time of the machine where LMCP runs — with timezone and UTC offset. Call this whenever you need the real 'now' on the user's computer: before creating calendar events or reminders, resolving relative dates like 'today'/'tomorrow'/'next Friday', or timestamping. Takes no arguments.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| human | Yes | Human-readable local date/time. |
| iso_utc | Yes | Current time in ISO 8601, UTC. |
| weekday | Yes | |
| timezone | Yes | IANA timezone identifier. |
| iso_local | Yes | Current time in ISO 8601 with the machine's local UTC offset. |
| utc_offset | Yes | UTC offset like +02:00. |
| epoch_seconds | Yes | Unix epoch seconds. |
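To show how an agent might use this output to resolve a relative date such as 'tomorrow', here is a small sketch. The sample payload is invented for illustration (only the field names come from the output schema above), and `resolve_relative` is a hypothetical helper. It assumes `iso_local` carries a numeric UTC offset, as the schema description suggests.

```python
from datetime import datetime, timedelta
from typing import Dict

# Invented sample values; only the keys are taken from the output schema.
sample = {
    "human": "Friday, 17 May 2024 14:30",
    "iso_utc": "2024-05-17T12:30:00+00:00",
    "weekday": "Friday",
    "timezone": "Europe/Berlin",
    "iso_local": "2024-05-17T14:30:00+02:00",
    "utc_offset": "+02:00",
    "epoch_seconds": 1715949000,
}


def resolve_relative(now_payload: Dict, days_ahead: int) -> str:
    """Return the local calendar date `days_ahead` days after the machine's 'now'."""
    now = datetime.fromisoformat(now_payload["iso_local"])
    return (now + timedelta(days=days_ahead)).date().isoformat()


now = datetime.fromisoformat(sample["iso_local"])
tomorrow = resolve_relative(sample, 1)
print(tomorrow)
```

Working from `iso_local` rather than `iso_utc` keeps 'today' and 'tomorrow' aligned with the user's calendar day rather than the UTC day.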
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds context that it returns timezone and UTC offset, and that it's the machine's local time. This is helpful but could detail the time format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences: the first states the function, the second provides usage scenarios, and the third confirms that there are no parameters. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple zero-parameter tool with an output schema, the description sufficiently covers what the tool returns (datetime with timezone and offset) and when to use it. Output schema likely details format, so no further info needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, and the description explicitly states 'Takes no arguments,' which adds value beyond the empty schema by confirming the tool requires no input.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns the current date and time with timezone and UTC offset, using a specific verb and resource. No sibling tool provides similar functionality, so it stands out.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises when to use it: before creating calendar events, resolving relative dates, or timestamping. It also notes it takes no arguments, providing complete usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_m365_person · Get Microsoft 365 Person · A · Read-only
Get detailed information about a specific person in your Microsoft 365 directory by their user ID or email address. Use 'me' to get the currently authenticated user's profile.
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | User ID (GUID), email address (UPN), or 'me' for the authenticated user, e.g. 'sarah@contoso.com', 'a1b2c3d4-...', or 'me' | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| upn | No | |
| name | No | |
| | No | |
| title | No | |
| mobile | No | |
| office | No | |
| phones | No | |
| department | No | |
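The `id` parameter accepts three forms: a GUID, an email address (UPN), or the literal 'me'. A client could check which form it is about to send, as in the sketch below. The function `classify_person_id` and its deliberately loose email pattern are assumptions for illustration, not part of the tool, and the GUID used is an arbitrary example value.

```python
import re
import uuid


def classify_person_id(value: str) -> str:
    """Classify an id as 'me', 'guid', or 'upn'; raise ValueError otherwise."""
    if value == "me":
        return "me"
    try:
        uuid.UUID(value)  # accepts the usual hyphenated GUID form
        return "guid"
    except ValueError:
        pass
    # Loose shape check only (something@domain.tld); not full address validation.
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value):
        return "upn"
    raise ValueError(f"unrecognized person id: {value!r}")


kinds = [
    classify_person_id("me"),
    classify_person_id("sarah@contoso.com"),
    classify_person_id("123e4567-e89b-12d3-a456-426614174000"),
]
print(kinds)
```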
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, openWorldHint=false, destructiveHint=false. The description adds minor context (e.g., using 'me' for authenticated user) but does not contradict annotations. It does not disclose additional behaviors like rate limits or schema of returned data, but annotations suffice.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences that front-load the core purpose and a helpful shorthand ('me'). No redundant or extraneous information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple, single-parameter tool with a complete schema and output schema, the description fully covers the tool's function. It is complete and self-contained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the schema already describes the parameter id with examples. The description does not add semantic value beyond what the schema provides, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves detailed information about a specific person in the Microsoft 365 directory using user ID, email, or 'me'. It distinguishes itself from sibling tools like m365_list_contacts and search_m365_directory by focusing on a single person by identifier.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when a specific identifier is known but provides no explicit guidance on when to use alternatives like search_m365_directory for searching or list_m365_people_insights for insights. No when-not-to-use advice is given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_accounts · List Accounts · Grade A · Read-only
Lists email accounts configured in Mail.app.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | |
| accounts | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, indicating safe read. The description adds the Mail.app context but does not elaborate on return format or behavior beyond that.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that conveys the essential purpose with no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters and an output schema, the description is largely sufficient but could mention what specific data is returned (e.g., account names).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are zero parameters, so the baseline is 4 per guidelines. The description adds no parameter information because none is needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the verb 'lists', the resource 'email accounts', and the context 'in Mail.app', distinguishing it from generic account listing tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like list_emails or list_email_folders, leaving the agent to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_calendar_events · List Calendar Events · Grade A · Read-only
Lists events from the Mac's Calendar app (Calendar.app, local/iCloud calendars) in a date range. Defaults to today + 7 days. For a Microsoft 365 calendar use m365_list_events instead.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max number of events to return (most recent first within the range). Optional; defaults to all in range. | |
| calendar | No | Filter by calendar name — partial, case-insensitive match (optional). Use list_calendar_names to see available names. | |
| end_date | No | ISO 8601 date (YYYY-MM-DD). Defaults to start_date + 7 days. | |
| start_date | No | ISO 8601 date (YYYY-MM-DD). Defaults to today. | |
| calendar_id | No | Filter by a single calendar UUID from list_calendar_names (optional). | |
| calendar_ids | No | Filter by multiple calendar UUIDs (optional). | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| events | No | |
| end_date | No | |
| start_date | No | |
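The date defaults documented above (start_date defaults to today, end_date defaults to start_date + 7 days) can be illustrated with a small client-side sketch. The helper name `resolve_event_range` and the `today` override are hypothetical and only exist for this example; the server may resolve its defaults differently, and rejecting an end date earlier than the start date is an assumption, not documented behavior.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_event_range(
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    today: Optional[date] = None,
) -> Tuple[str, str]:
    """Return (start, end) as YYYY-MM-DD strings using the documented defaults.

    Hypothetical helper: it mirrors the parameter table, not the server's code.
    """
    # start_date defaults to today (injectable so the example is deterministic).
    start = date.fromisoformat(start_date) if start_date else (today or date.today())
    # end_date defaults to start_date + 7 days.
    end = date.fromisoformat(end_date) if end_date else start + timedelta(days=7)
    # Assumption: an inverted range is treated as a caller error.
    if end < start:
        raise ValueError("end_date must not be earlier than start_date")
    return start.isoformat(), end.isoformat()


if __name__ == "__main__":
    # Arguments an agent might pass to list_calendar_events.
    start, end = resolve_event_range("2025-03-01")
    print({"start_date": start, "end_date": end})
```

Resolving the range on the client makes the effective window explicit in logs, even when the agent omits both parameters.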
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds that it accesses Mac's Calendar.app, local/iCloud, and defaults to today+7 days, which is useful beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three short sentences containing only essential information: purpose, scope, defaults, and alternative. No redundant words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers source, default range, and alternative tool. Output schema exists, so return format explanation not needed. Missing detail on what happens if no calendar filter is provided, but schema implies all calendars are included.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description does not detail individual parameters, but schema descriptions are comprehensive. Description adds no extra parameter meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Explicitly states it lists events from Mac's Calendar app, specifying local/iCloud calendars. Mentions default date range and directly distinguishes from m365_list_events for Microsoft 365 calendars.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides default range and explicitly tells when to use an alternative (Microsoft 365). Lacks guidance on when to apply specific parameters like calendar vs calendar_id, but overall clear context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_calendar_names · List Calendar Names · Grade A · Read-only
Lists the calendars in the Mac's Calendar app (Calendar.app, local/iCloud). For Microsoft 365 calendars use the m365 calendar tools instead.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| calendars | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds context about source (Mac Calendar app, local/iCloud), but no additional behavioral traits beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, no wasted words. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No parameters to document; output schema presumably covers return values. Description fully covers what the tool does and when to use it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters in input schema, so schema coverage is 100%. Baseline for zero parameters is 4; no need for additional description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Lists the calendars in the Mac's Calendar app' with specific verb and resource. Distinguishes from M365 calendars by mentioning alternative tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use this tool (for Mac native calendars) and when not (for M365, use m365 calendar tools). Provides clear alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_contacts · List Contacts · Grade A · Read-only
Lists contacts from the macOS Contacts app. Optionally filter by group.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max contacts to return (default 100) | |
| group_name | No | Filter by group name (optional) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | |
| contacts | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, indicating safe read operation. Description adds no further behavioral context such as pagination, rate limits, or response size.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences with no wasted words. Essential information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool is simple with only 2 optional params and output schema (unseen but present). Description adequately covers core functionality; missing mention of default limit or that it returns a list, but schema compensates. Minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with both parameters documented. Description only restates the group filter option, adding no new meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'Lists' and resource 'contacts from the macOS Contacts app', with optional group filtering. It distinguishes from sibling tools like m365_list_contacts by specifying it's macOS-specific.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description implies usage for listing macOS contacts but does not explicitly state when to use this tool versus alternatives like search_contacts or m365_list_contacts. No guidance on not using it for M365 contacts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_email_accounts · List Email Accounts · Grade A · Read-only
Lists all Mail.app accounts by name. Call this first to discover account names, then use list_emails(account=name) to fetch messages from a specific account. Much faster than querying all accounts at once.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| tip | No | |
| count | No | Number of accounts. |
| accounts | No | Mail.app accounts, by name. |
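The discover-then-fetch sequence described above can be sketched as follows. The `call_tool` callable is a hypothetical stand-in for whatever MCP client the agent uses, and the response shapes (`accounts` as a list of account-name strings, `messages` as a list of header objects) are assumptions based on the output schemas on this page, not confirmed payloads.

```python
from typing import Any, Callable, Dict, List

# Hypothetical client signature: (tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def headers_by_account(
    call_tool: ToolCaller, limit: int = 20
) -> Dict[str, List[Dict[str, Any]]]:
    """Call list_email_accounts once, then list_emails once per account."""
    # Step 1: discover account names (assumed to be a list of strings).
    accounts = call_tool("list_email_accounts", {}).get("accounts", [])
    results: Dict[str, List[Dict[str, Any]]] = {}
    for name in accounts:
        # Step 2: query one account at a time rather than scanning all at once.
        page = call_tool("list_emails", {"account": name, "limit": limit})
        results[name] = page.get("messages", [])
    return results
```

Passing the client in as a parameter keeps the sketch independent of any particular MCP SDK and makes it easy to exercise with a stub.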
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and non-destructive. Description adds performance advantage ('much faster than querying all accounts at once'), which is a useful behavioral trait beyond annotation scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences, front-loaded with the main purpose. No extraneous words; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Simple tool with no parameters and output schema present. Description fully covers its role, usage sequence, and performance note. Complete for its complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters in schema, so schema coverage is 100%. Per rubric, baseline for 0 parameters is 4. Description adds no param info because none are needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists all Mail.app accounts by name. It includes specific verb 'list' and resource 'accounts', and distinguishes itself from siblings by positioning as a discovery step before list_emails.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises to call this first to discover account names, then use list_emails(account=name) to fetch messages. Provides clear sequence and alternative usage, with the performance benefit stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_email_folders · List Email Folders · Grade A · Read-only
Lists the full folder (mailbox) tree for Apple Mail (Mail.app) accounts, including nested subfolders. Use this to discover the exact folder names that move_email(target_mailbox=...) and list_emails(mailbox=...) expect. Outlook.com, Exchange, Gmail, iCloud and IMAP accounts added to Mail.app are all included. For a Graph-only Microsoft 365 mailbox not added to Mail.app, use m365_list_emails instead.
Pass account= (from list_email_accounts) to enumerate one account fully; without it, every account is walked which can be slow on macOS 15+. Message counts are off by default (slow on IMAP) — pass include_counts=true to add unread/total per folder.
| Name | Required | Description | Default |
|---|---|---|---|
| account | No | | |
| include_counts | No | | false |
Output Schema
| Name | Required | Description |
|---|---|---|
| accounts | No | |
| truncated | No | |
| folder_count | No | |
| next_actions | No | |
| account_count | No | |
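As a rough illustration of how the two optional parameters combine, the sketch below builds an MCP `tools/call` request (a JSON-RPC 2.0 message) for list_email_folders. The helper name `build_list_email_folders_request` and the account name "iCloud" are hypothetical; omitting unset arguments so the server applies its own defaults is a design choice of this example, not a documented requirement.

```python
import json
from typing import Any, Dict, Optional


def build_list_email_folders_request(
    account: Optional[str] = None,
    include_counts: bool = False,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 tools/call payload for list_email_folders."""
    arguments: Dict[str, Any] = {}
    if account is not None:
        # Scope the walk to one account (name taken from list_email_accounts).
        arguments["account"] = account
    if include_counts:
        # Counts are off by default; only send the flag when explicitly enabled.
        arguments["include_counts"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "list_email_folders", "arguments": arguments},
    }


if __name__ == "__main__":
    # "iCloud" is a placeholder account name for illustration only.
    print(json.dumps(build_list_email_folders_request("iCloud", True), indent=2))
```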
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark it as read-only and non-destructive. Description adds behavioral context: it walks all accounts if no account specified (slow), default for include_counts is false, and performance trade-offs. No contradictions. Could mention response format or pagination but not needed given output schema exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two paragraphs with front-loaded purpose and clear structure. First paragraph states main function and alternatives; second details parameters. Slightly verbose but efficient. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given only 2 parameters, no required ones, and an output schema exists, the description covers all relevant aspects: purpose, usage, parameter details, performance notes, and alternatives. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema coverage, the description fully explains both parameters: account (from list_email_accounts, optional, speeds up), include_counts (boolean, default false, adds counts, slow on IMAP). This adds meaning beyond the bare schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool lists the full folder tree for Apple Mail accounts, including nested subfolders. Distinguishes its purpose from siblings like move_email and list_emails by explaining it discovers folder names those tools expect. Also mentions an alternative for Graph-only mailboxes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance on when to use (discover folder names for move_email and list_emails) and when not (for Graph-only mailbox, use m365_list_emails). Also covers performance considerations: passing account avoids slowness on macOS 15+, and warning that include_counts is slow on IMAP.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_emails · List Emails · Grade A · Read-only
Lists email headers (subject, sender, date, unread) from the Mac's Apple Mail (Mail.app, local accounts). For a Microsoft 365 / Outlook mailbox use m365_list_emails instead. Returns headers only — call read_email(message_id) to get the full body.
IMPORTANT: On machines with 3+ accounts, always pass account= (from list_email_accounts) to avoid timeouts. Without account, all accounts are scanned which can be slow on macOS 15+.
Supports pagination: use offset to page through results (e.g. offset=20 for page 2 with limit=20). The limit parameter is capped at 50 per call (default 20); to read more, page with offset rather than requesting a larger limit.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 20 |
| offset | No | | 0 |
| account | No | | |
| mailbox | No | | |
| unread_only | No | | false |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| offset | No | |
| messages | No | |
| next_actions | No | |
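The offset-based pagination described above (limit capped at 50 per call, page with offset for more) might look like the following on the client side. This is a minimal sketch: `call_tool` is a hypothetical client callable, `iter_email_headers` is an illustrative name, and treating a short or empty page as the end of the results is an assumption about server behavior.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical client signature: (tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

MAX_LIMIT = 50  # documented per-call cap for list_emails


def iter_email_headers(
    call_tool: ToolCaller,
    account: str,
    max_messages: int = 100,
    page_size: int = MAX_LIMIT,
) -> Iterator[Dict[str, Any]]:
    """Yield up to max_messages headers, paging with offset instead of a large limit."""
    page_size = min(page_size, MAX_LIMIT)  # never request more than the cap
    offset = 0
    fetched = 0
    while fetched < max_messages:
        limit = min(page_size, max_messages - fetched)
        page = call_tool(
            "list_emails", {"account": account, "limit": limit, "offset": offset}
        )
        messages = page.get("messages", [])
        if not messages:
            break  # assumption: an empty page means no more results
        yield from messages
        fetched += len(messages)
        offset += len(messages)
        if len(messages) < limit:
            break  # assumption: a short page is the last page
```

Always passing `account` follows the guidance above for machines with several accounts; the generator form lets a caller stop early without fetching further pages.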
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, but the description adds critical behavioral details: scanning all accounts if no account is passed can be slow and cause timeouts on macOS 15+, and the limit is capped at 50 with default 20, requiring pagination for more results. This goes beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise with a clear structure: purpose and alternative first, then important account note, then pagination. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters and an output schema, the description covers return type (headers), pagination, account filtering, and sibling tool distinction. Minor omission of mailbox and unread_only parameter details, but overall sufficient for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so the description must explain parameters. It explains limit (default 20, cap 50), offset (pagination), and account (use from list_email_accounts). However, mailbox and unread_only parameters are not described, leaving gaps in understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists email headers from Apple Mail, specifying the source (Mac's Apple Mail, local accounts) and explicitly distinguishes from the sibling m365_list_emails for Outlook/365. It also clarifies it returns only headers, referencing read_email for full body.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use and when-not-to-use guidance: use m365_list_emails for Outlook/365 mailboxes, and for machines with 3+ accounts, always pass the account parameter to avoid timeouts. It also details pagination usage with offset and limit, ensuring correct invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_m365_people_insights · List Microsoft 365 People Insights · Grade A · Read-only
List the people most relevant to you in Microsoft 365 — based on your communication patterns, collaboration history, and org chart. Useful for meeting prep and contact enrichment.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Number of people to return (default 20, max 50) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| people | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds behavioral context about the criteria used (communication patterns, collaboration history, org chart), which is helpful but not extensive. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences, front-loaded with the main purpose, no filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple tool (1 optional param, output schema exists), the description covers purpose, criteria, and use case adequately.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the description does not add any extra meaning for the single 'limit' parameter beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'list', the resource 'people most relevant to you in Microsoft 365', and the criteria 'based on your communication patterns, collaboration history, and org chart', distinguishing it from siblings like list_contacts or search_m365_directory.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context with 'Useful for meeting prep and contact enrichment', implying when to use it, but does not explicitly exclude other tools or mention when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_message_chats · List Message Chats · Grade B · Read-only
Lists recent iMessage/Messages.app conversations.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max conversations (default 30) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| chats | No | |
| count | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already declare the tool as read-only and non-destructive (readOnlyHint=true, destructiveHint=false). The description adds 'recent', implying ordering by recency, but does not disclose default limit or pagination behavior. This adds some value but is minimal.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence that efficiently conveys the tool's purpose. It is front-loaded and contains no extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (one optional parameter, no required fields, and an output schema exists), the description provides the core function. However, it omits details like the scope (user's iCloud account) and exact recency criteria. Relies on output schema for return format.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter (limit). The description does not add information beyond what the schema already provides (max conversations, default 30). Baseline score is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (lists) and resource (recent iMessage/Messages.app conversations). It distinguishes from sibling tools by specifying the platform, though it could be more explicit about 'iMessage' vs other messaging apps.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like search_messages or read_messages. The context is implied by the name, but the description doesn't clarify scenarios or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_notes · List Notes · Grade A · Read-only
Lists notes from Apple Notes app. Optionally filter by folder.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 50 |
| folder | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| notes | No | |
| total | No | |
| next_actions | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, making safety clear. The description adds that notes come from Apple Notes, but does not disclose other behaviors like sorting, pagination, or response format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences that convey the core functionality without any fluff or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has a limit parameter implying pagination, but no mention of how to handle pagination or what the output contains. Although an output schema exists (so return values need not be described), details on ordering and completeness (e.g., 50 max by default) are missing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description partially compensates by explaining the folder parameter ('Optionally filter by folder'). However, the limit parameter is not described, leaving its semantics and valid range unclear.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Lists' and the resource 'notes from Apple Notes app,' which distinguishes it from sibling tools like search_notes (text search) and read_note (single note).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is given on when to use this tool versus alternatives like search_notes. The description only mentions optional folder filtering, which implies a use case but provides no explicit when-to-use or when-not-to-use context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_omnifocus_folders · List OmniFocus Folders · Grade A · Read-only
Lists folders in OmniFocus. Folders group related projects (e.g. "Work", "Personal"). Use list_omnifocus_projects to see the projects inside them.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 100 |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| folders | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds that folders group projects, which is contextual but not behavioral. No contradiction between description and annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences: purpose, explanatory context, and cross-reference. No wasted words or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool, the description is adequate, explaining the concept of folders and guiding to the sibling tool. However, the missing parameter documentation reduces completeness slightly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description should explain the `limit` parameter. However, it does not mention it at all, leaving the agent without guidance on its meaning or usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Lists folders in OmniFocus', specifying the verb and resource. It also differentiates from the sibling tool list_omnifocus_projects by suggesting to use that for seeing projects inside folders.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context on when to use this tool (list folders) and explicitly mentions the alternative list_omnifocus_projects for projects. It lacks an explicit 'when not to use' but is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_omnifocus_projects: List OmniFocus Projects (Grade C, Read-only)
Lists projects in OmniFocus.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 100 |
| include_completed | No | | false |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| projects | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, and openWorldHint=false. The description adds no behavioral context beyond what annotations provide, such as pagination, output format, or side effects. With annotations present, a score of 3 is appropriate as the description does not contradict but adds no value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at one sentence, which is efficient for stating the core function. However, it could be more structured by including a brief note on parameters or behavior. The sentence earns its place but misses an opportunity to improve clarity. Score 4 for appropriate length but lack of structure.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is incomplete given the tool's complexity (2 parameters, 0% schema coverage, no output schema details). Even with an output schema, the description omits parameter context and usage guidance. The agent cannot determine how to customize the list (limit, include_completed) without external knowledge. Score 2 reflects a major completeness deficiency.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not explain the meaning or usage of the 'limit' or 'include_completed' parameters. Since the schema lacks descriptions, the description should compensate but does not. The agent receives no help in understanding how to set these parameters correctly. Score 1 indicates a significant gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Lists projects in OmniFocus' clearly states the verb and resource, but it is essentially a restatement of the tool name and title without adding nuance. It distinguishes from sibling tools like list_omnifocus_tasks or list_omnifocus_folders because it specifies 'projects', but does not explicitly contrast. Score 4 because it is clear but fails to differentiate beyond the name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There is no mention of parameters like limit or include_completed, no advice on when not to use it (e.g., for active vs. completed projects), and no reference to siblings like list_omnifocus_tasks. The agent is left to infer usage from the name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_omnifocus_tags: List OmniFocus Tags (Grade A, Read-only)
Lists all tags defined in OmniFocus.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| tags | No | |
| count | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the description does not need to repeat that. However, no additional behavioral traits (e.g., rate limits, authentication, or that it returns all tags without filtering) are disclosed beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no extraneous words. It is front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, clear read-only behavior, and an output schema presumably documenting return structure), the description is complete. It covers the essential information an agent needs to select and invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has zero parameters, so the description cannot add parameter semantics. The baseline for 0 parameters is 4, and the description does not repeat schema information. It adequately conveys what the tool does without needing parameter details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action 'Lists all tags' and the resource 'defined in OmniFocus'. It is specific and distinguishes from sibling tools like list_omnifocus_projects and list_omnifocus_tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool or when not to. There is no mention of alternatives or prerequisites, leaving the agent to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_omnifocus_tasks: List OmniFocus Tasks (Grade B, Read-only)
Lists tasks from OmniFocus. Filter by project, tag, inbox, due today, or flagged status.
| Name | Required | Description | Default |
|---|---|---|---|
| tag | No | | |
| inbox | No | | false |
| limit | No | | 50 |
| flagged | No | | false |
| project | No | | |
| due_today | No | | false |
| include_completed | No | | false |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| tasks | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnly/destructive hints, so description doesn't add much. It states it lists tasks, which aligns. No additional behavioral context (e.g., pagination, ordering) beyond filters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single concise sentence listing filters. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite output schema existing, description lacks details on default behavior, filter combination logic, and result limits. With 7 optional params, agent needs more context for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so description carries burden. Mentions 5 of 7 parameters (tag, inbox, flagged, project, due_today) but omits limit and include_completed. Adds meaning but incomplete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it lists tasks from OmniFocus and lists common filters. However, it doesn't distinguish from search_omnifocus_tasks for free-text search, so score is 4.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use vs alternatives like search_omnifocus_tasks. Only implies usage through filters, but no explicit exclusions or context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
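As a rough illustration of the list_omnifocus_tasks parameters reviewed above, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0) in Python. The tool name, parameter names, and defaults come from the parameter table; the helper `build_tasks_request` and the request id are hypothetical. The listing does not say how multiple filters combine, so the sketch only forwards them and makes no claim about the result.

```python
import json
from typing import Any, Optional


def build_tasks_request(
    request_id: int,
    project: Optional[str] = None,
    tag: Optional[str] = None,
    inbox: bool = False,
    flagged: bool = False,
    due_today: bool = False,
    include_completed: bool = False,
    limit: int = 50,
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` payload for list_omnifocus_tasks.

    Hypothetical helper: only the tool and parameter names are taken from
    the listing; everything else is an assumption for illustration.
    """
    arguments: dict[str, Any] = {"limit": limit}
    if project is not None:
        arguments["project"] = project
    if tag is not None:
        arguments["tag"] = tag
    # Send boolean filters only when they differ from the listed default (false).
    flags = {
        "inbox": inbox,
        "flagged": flagged,
        "due_today": due_today,
        "include_completed": include_completed,
    }
    for name, value in flags.items():
        if value:
            arguments[name] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "list_omnifocus_tasks", "arguments": arguments},
    }


request = build_tasks_request(1, project="Work", flagged=True, limit=10)
print(json.dumps(request, indent=2))
```

Leaving unset filters out of `arguments` keeps the request small and lets the server apply its own defaults, which seems the safer choice while the filter semantics remain undocumented.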
list_referral_candidates: List Referral Candidates (Grade A, Read-only)
Step 1 of recommending LMCP to a colleague. Returns the user's emailable contacts plus a playful invite template. Use this when the user wants to invite/recommend someone. Then: show a few candidates, let the user PICK (never email everyone), call create_referral_invites for the chosen ones to get each person's unique link, draft a short personalized message IN THE USER'S LANGUAGE (keep the warm-but-sarcastic tone, adapt per person), show it for EDITS, then send with send_email (one per recipient — send_email asks the user to confirm). Never send without the user picking + approving.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max contacts to return (default 60) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| lang | No | |
| count | No | |
| how_to | No | |
| candidates | No | |
| template_body | No | |
| template_subject | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark it as readOnlyHint=true and destructiveHint=false. The description adds context about returning emailable contacts and an invite template, and outlines the non-destructive workflow steps, enhancing transparency without contradicting annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is focused and front-loaded with the core purpose, then efficiently lays out the entire workflow in a few sentences. Every sentence adds value, and the structure guides the agent clearly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of the referral workflow, the description fully explains the tool's role as step 1, what to do next, and constraints like not emailing everyone. The presence of an output schema reduces the need to describe return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a description for the single 'limit' parameter. The tool description does not add further semantic details about the parameter beyond the schema, warranting a baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool as 'Step 1 of recommending LMCP to a colleague' and specifies it returns 'emailable contacts plus a playful invite template.' This provides a specific verb-resource pair and distinguishes it from sibling contact tools like list_contacts or search_contacts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'Use this when the user wants to invite/recommend someone' and provides a detailed step-by-step workflow, including when to stop ('never send without user picking + approving') and alternatives (call create_referral_invites for chosen ones).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_reminder_lists: List Reminder Lists (Grade A, Read-only)
Lists the lists (folders) in Apple Reminders (Reminders.app) on this Mac. For Microsoft To Do use todo_list_lists instead.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| lists | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, so the description adds context about operating on 'this Mac', implying local scope. No contradictions. It does not elaborate on return format, but that is covered by output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences, front-loaded with the verb 'Lists', and every word serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, read-only, with output schema and annotations), the description is complete. It explains what is listed and distinguishes alternatives, covering all necessary context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters, so baseline is 4. No parameter descriptions needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists lists/folders in Apple Reminders on this Mac, specifying the app and platform. It also distinguishes from a sibling tool (todo_list_lists) by directing users to it for Microsoft To Do.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use this tool (for Apple Reminders lists) and provides an alternative tool (todo_list_lists) for Microsoft To Do, offering clear guidance on tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_reminders: List Reminders (Grade A, Read-only)
Lists reminders from Apple Reminders (Reminders.app) on this Mac. Optionally filter by completion status or list name. For Microsoft To Do use todo_list_tasks instead.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max number of reminders to return (earliest due first). Optional; defaults to all. | |
| completed | No | true=completed, false=incomplete (default), omit=all | |
| list_name | No | Filter by reminder list name (optional) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | Number returned in this response. |
| total | No | Total matching before the limit. |
| reminders | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description does not add behavioral context beyond stating it lists reminders, which is consistent. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficient sentences: first states the main action and scope, second adds optional filters and sibling differentiation. No redundancy, front-loaded with key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with an output schema and well-documented parameters, the description provides complete context. It covers source (Apple Reminders), filters, and distinguishes from related tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so each parameter is already documented. The description briefly mentions optional filters but adds minimal extra meaning beyond the schema. Baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists reminders from Apple Reminders on this Mac with optional filters. It distinguishes from a sibling tool (todo_list_tasks) for Microsoft To Do, making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises when to use this tool (Apple Reminders) and when to use an alternative (todo_list_tasks for Microsoft To Do). Also mentions optional filters, providing clear usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
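The list_reminders output schema above distinguishes `count` (number returned) from `total` (matches before the limit), which lets a caller detect a truncated result. A minimal Python sketch of that check follows. The function name `summarize_reminders` and the sample payload, including the fields inside each reminder, are hypothetical; only `count`, `total`, and `reminders` appear in the listing.

```python
from typing import Any


def summarize_reminders(result: dict[str, Any]) -> str:
    """Report whether a list_reminders result was cut off by `limit`.

    Every output field is listed as optional, so missing keys are tolerated.
    """
    count = int(result.get("count", 0))
    # If `total` is absent, assume nothing was truncated.
    total = int(result.get("total", count))
    if total > count:
        return f"Showing {count} of {total} reminders; raise `limit` to see the rest."
    return f"Showing all {count} reminders."


# Hypothetical response payload, for illustration only.
sample = {
    "count": 2,
    "total": 5,
    "reminders": [{"name": "Pay rent"}, {"name": "Call Sam"}],
}
print(summarize_reminders(sample))
```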
list_safari_bookmarks: List Safari Bookmarks (Grade A, Read-only)
Lists Safari bookmarks (title + URL) from the Mac's Safari (reads ~/Library/Safari/Bookmarks.plist — needs Full Disk Access).
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max bookmarks to return (default 100) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | Present when bookmarks can't be read (e.g. Full Disk Access needed). |
| count | No | |
| bookmarks | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint and destructiveHint false. The description adds the critical behavioral context of needing Full Disk Access and reading from a specific plist file, which goes beyond the annotations and helps the agent understand permissions and system interaction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that efficiently conveys purpose, data returned, source, and permission requirement. It is front-loaded with the main action 'Lists Safari bookmarks' and has no superfluous words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read-only tool with one optional parameter and an output schema, the description covers the essentials: what it does, what data it returns, where it reads from, and a key prerequisite. It is complete enough for an agent to understand and invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema covers the single parameter 'limit' with a description, so schema_description_coverage is 100%. The tool description does not add any information about the parameter beyond the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists Safari bookmarks with title and URL, specifies the source file, and is distinct from sibling tools like safari_list_tabs which list open tabs. The verb 'Lists' is specific and the resource 'Safari bookmarks' is unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly mentions the requirement for Full Disk Access, which is a key usage guideline. While it doesn't explicitly state when not to use or list alternatives, the context makes it clear this is for reading local bookmarks, contrasting with other list tools that might access different data sources.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
lmcp_state: LMCP State (Grade A, Read-only)
Returns a structured snapshot of the LMCP environment: server/tray/teams-proxy versions, detected AI client, cloud relay state, TCC permission states (Calendar/Reminders/Contacts), and a compact summary of which services (Mail/Calendar/Contacts/Teams/OneDrive/Reminders/Notes) are reachable. Fast (<500ms), passive — never prompts the user, never opens app windows, never touches the network. Call this when you need to verify the environment is healthy before attempting a tool, or to understand what's installed and accessible. For reporting failures, use report_problem instead — it captures this same snapshot plus logs and submits to the team.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| tcc | No | TCC permission states, e.g. granted / denied / authorized. |
| arch | No | |
| update | No | |
| version | No | Serving (running) server version. |
| services | No | Per-domain reachability summary (mail, calendar, contacts, teams, onedrive, slack, …); shape varies by domain. |
| ai_client | No | |
| machine_id | No | |
| os_version | No | |
| tray_version | No | |
| last_activity | No | |
| license_status | No | trial / active / expired |
| cloud_token_set | No | |
| tunnel_connected | No | |
| cloud_data_enabled | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide readOnlyHint=true and destructiveHint=false, but description adds valuable behavioral traits: fast (<500ms), passive (never prompts user, never opens windows, never touches network). This exceeds what annotations convey.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences covering purpose, behavioral traits, and usage guidance. No redundant words, front-loaded with core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero parameters and an existing output schema, description fully covers what the tool returns, its performance, safety assurances, and usage context. No obvious gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so baseline is 4. Description appropriately adds no parameter info as none is needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it returns a structured snapshot of the LMCP environment, listing included components (versions, AI client, relay state, permissions, service reachability). Distinguishes from sibling report_problem by explicitly noting that tool captures same snapshot plus logs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to call (verify environment health before attempting a tool, understand installed/accessible) and when not to (use report_problem for failures), naming the alternative tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
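The lmcp_state description suggests calling it to verify the environment before attempting another tool. One way such a preflight check might look in Python is sketched below. It assumes, without confirmation from the listing, that `tcc` is a mapping from permission name to a state string and that "granted" and "authorized" both mean access is allowed; the helper name, the lowercase keys, and the sample values are hypothetical.

```python
from typing import Any

# Assumption: these two TCC states mean the permission is usable.
ALLOWED_TCC_STATES = {"granted", "authorized"}


def blocked_permissions(state: dict[str, Any]) -> list[str]:
    """Return the TCC permissions in an lmcp_state snapshot that are not usable."""
    tcc = state.get("tcc") or {}
    return sorted(
        name
        for name, value in tcc.items()
        if str(value).lower() not in ALLOWED_TCC_STATES
    )


# Hypothetical snapshot, for illustration only.
sample_state = {
    "license_status": "trial",
    "tcc": {"calendar": "granted", "reminders": "denied", "contacts": "authorized"},
}
print(blocked_permissions(sample_state))
```

An agent could run a check like this once per session and skip tools whose permission is blocked, rather than discovering the problem through a failed call.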
lmcp_welcome: LMCP Welcome (Grade A, Read-only)
Returns a live snapshot of what LMCP can currently see on this Mac: today's calendar events, due reminders, total contacts, and unread emails. Use this to show new users what LMCP has access to and suggest first steps. Only available before the first real tool call — call it immediately when the user first opens a chat.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| domains | No | Every domain LMCP reaches, with a short capability blurb. |
| snapshot | No | |
| automations | No | Example multi-step prompts the user can try immediately. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds the important constraint that it's only available before the first real tool call, which is critical behavioral context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. All information is front-loaded and every sentence adds value. Highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters and strong annotations, the description fully informs the agent: what is returned, why to use it, and when. It lists the specific data elements (calendar events, reminders, contacts, unread emails). Complete for the tool's simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters, so schema coverage is trivially 100%. The description doesn't need to add parameter info. It adequately describes the return contents, meeting expectations.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a live snapshot of calendar events, reminders, contacts, and unread emails. It also explains its use case: showing new users what LMCP has access to. This distinguishes it from sibling tools as a one-time initial call.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use this to show new users what LMCP has access to and suggest first steps' and 'Only available before the first real tool call — call it immediately when the user first opens a chat.' This provides clear guidance on when and why to use the tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_create_event · Microsoft 365 Create Event · C
Create a calendar event in your Microsoft 365 / Outlook calendar.
| Name | Required | Description | Default |
|---|---|---|---|
| end | Yes | End time in ISO 8601, e.g. '2026-05-20T11:00:00' | |
| body | No | Event description (optional) | |
| start | Yes | Start time in ISO 8601, e.g. '2026-05-20T10:00:00' | |
| subject | Yes | Event title | |
| calendar | No | Calendar name to create the event in — partial, case-insensitive match (optional). Omit to use the primary calendar. | |
| location | No | Location (optional) | |
| timezone | No | IANA timezone, e.g. 'America/New_York' (default: UTC) | |
| attendees | No | Comma-separated email addresses to invite (optional) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| ok | No | |
| message | No | |
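The parameter table above can be turned into a small argument builder. This is a sketch only: `build_create_event_args` is a hypothetical helper, and the check that `end` falls after `start` is an assumption about what a caller would want, not documented server behavior.

```python
from datetime import datetime


def build_create_event_args(subject, start, end, timezone="UTC",
                            attendees=None, location=None, body=None,
                            calendar=None):
    """Assemble arguments for m365_create_event from the documented fields."""
    # Parse the ISO 8601 strings so malformed values fail early.
    start_dt = datetime.fromisoformat(start)
    end_dt = datetime.fromisoformat(end)
    if end_dt <= start_dt:  # assumption: callers want a positive duration
        raise ValueError("end must be after start")

    args = {"subject": subject, "start": start, "end": end,
            "timezone": timezone}  # schema default is UTC
    if attendees:
        # The schema expects one comma-separated string, not a list.
        args["attendees"] = ",".join(a.strip() for a in attendees)
    for key, value in (("location", location), ("body", body),
                       ("calendar", calendar)):
        if value:  # omit optional fields that were not supplied
            args[key] = value
    return args


event_args = build_create_event_args(
    "Planning sync", "2026-05-20T10:00:00", "2026-05-20T11:00:00",
    timezone="America/New_York",
    attendees=["a@example.com", " b@example.com"],
)
print(event_args)
```

Omitting `calendar` leaves the event in the primary calendar, as the schema describes.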
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate non-read-only and non-destructive nature. The description adds no behavioral details such as authentication needs, calendar permissions, or side effects beyond the schema definitions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence with no wasted words. However, it could benefit from a structured breakdown of core functionality and important notes.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 8 parameters and an output schema, the description is too brief. It omits details like default calendar behavior, timezone handling, and return value expectations, leaving an agent with insufficient context for complex invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptive parameter names and examples. The tool description does not add any additional semantic context beyond what the schema provides, warranting a baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Create a calendar event') and the resource ('Microsoft 365 / Outlook calendar'). However, it does not differentiate from the sibling tool 'create_calendar_event', which serves a similar purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'create_calendar_event' or 'update_calendar_event'. The description lacks any context about prerequisites or preferred use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_delete_event · Microsoft 365 Delete Event · A · Destructive
Delete a calendar event from your Microsoft 365 / Outlook calendar by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | Event ID from m365_list_events | |
| confirm | Yes | Set to true to confirm deletion (required) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| message | No | |
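Since `confirm` is required and the tool is marked destructive, a caller may want to gate the call on explicit user approval. A minimal sketch under that assumption; `build_delete_event_args` and the `user_approved` flag are hypothetical names, not part of the server.

```python
def build_delete_event_args(event_id, user_approved):
    """Return arguments for m365_delete_event only after explicit approval."""
    if not event_id:
        raise ValueError("an event id from m365_list_events is required")
    if not user_approved:
        # Client-side gate (an assumption): never send confirm=true unprompted.
        raise PermissionError("deletion was not approved by the user")
    return {"id": event_id, "confirm": True}


delete_args = build_delete_event_args("evt-123", user_approved=True)
print(delete_args)
```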
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare destructiveHint=true, and the description confirms the destructive nature. It adds no additional behavioral context beyond what annotations provide, but does not contradict them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single 14-word sentence that immediately conveys the action and object. No unnecessary words; front-loaded with the verb 'Delete'.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema and annotations covering destructiveness, the description is sufficiently complete for a simple delete-with-confirmation tool. It could mention permanence, but that is not critical.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters. The description adds no extra meaning beyond 'by its ID', so it does not improve understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it deletes a calendar event by ID, using specific verb and resource. However, it does not differentiate from sibling tool 'delete_calendar_event', which likely performs the same action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for deleting events with an ID, referencing m365_list_events in the schema parameter, but provides no explicit guidance on when to use this tool versus alternatives like delete_calendar_event or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_get_contact · Microsoft 365 Get Contact · A · Read-only
Get full details of a specific Microsoft 365 contact by ID. Get the ID from m365_list_contacts or m365_search_contacts.
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | Contact ID from m365_list_contacts or m365_search_contacts | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| name | No | |
| notes | No | |
| title | No | |
| emails | No | |
| mobile | No | |
| phones | No | |
| company | No | |
| surname | No | |
| given_name | No | |
| home_address | No | |
| business_address | No | |
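The description implies a two-step flow: find an ID with a search, then fetch the full record. The sketch below illustrates that chain with a stand-in transport. `call_tool`, `fake_call_tool`, and the canned data are hypothetical, and the assumption that each search result carries an `id` field is not confirmed by the output schema shown here.

```python
def find_contact_details(call_tool, query):
    """Search for a contact, then fetch full details for the first match."""
    found = call_tool("m365_search_contacts", {"query": query})
    contacts = found.get("contacts") or []
    if not contacts:
        return None
    # Assumption: each search result carries an "id" field.
    return call_tool("m365_get_contact", {"id": contacts[0]["id"]})


def fake_call_tool(name, arguments):
    """Stand-in transport with canned data so the sketch runs offline."""
    if name == "m365_search_contacts":
        hit = "alice" in arguments["query"].lower()
        matches = [{"id": "c-1", "name": "Alice Example"}] if hit else []
        return {"count": len(matches), "query": arguments["query"],
                "contacts": matches}
    if name == "m365_get_contact":
        return {"id": arguments["id"], "name": "Alice Example",
                "emails": ["alice@example.com"]}
    raise ValueError(f"unexpected tool: {name}")


details = find_contact_details(fake_call_tool, "alice")
print(details)
```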
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds that it returns 'full details', which provides additional context about the response content. No contradictions or missing behavioral disclosures.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences, front-loading the purpose. Every sentence adds value: the first states the core function, the second provides parameter guidance. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple get-by-ID tool with an output schema, the description is complete. It covers the input requirement (ID) and its source, and the annotations cover safety. No additional information is necessary for an agent to correctly invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 100% description coverage, with the parameter description already explaining the source of the ID. The description merely repeats this information, adding no new semantic value beyond what the schema provides. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'full details of a specific Microsoft 365 contact by ID'. It distinguishes itself from sibling tools like m365_list_contacts and m365_search_contacts by focusing on retrieving details for a single contact using an ID.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly instructs the user to obtain the ID from m365_list_contacts or m365_search_contacts, providing clear context. It does not include explicit exclusions or when not to use, but the guidance is sufficient for this simple retrieval tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_list_contacts · Microsoft 365 List Contacts · A · Read-only
List contacts from your Microsoft 365 / Outlook address book.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max contacts to return (default 50, max 100) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| contacts | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the description's 'list' action aligns. However, it adds no extra behavioral context (e.g., pagination, permissions, or return format). Given annotations cover safety, a 3 is appropriate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with verb front-loaded. No wasted words, efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool is simple with one parameter and an output schema, so the description is largely complete. However, adding a hint about default limit behavior would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with one 'limit' parameter described. The description adds no additional meaning beyond the schema, so baseline 3 is correct.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states specific verb 'List' and resource 'contacts' from 'Microsoft 365 / Outlook address book', clearly distinguishing it from sibling tools like 'get_contact' (single) and 'search_contacts' (search).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives like 'm365_search_contacts'. The description simply states the action without context or exclusions, leaving the agent to infer usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_list_emails · Microsoft 365 List Emails · B · Read-only
List emails from your Microsoft 365 / Outlook inbox. Returns subject, sender, date, and preview.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Number of emails to return (default 20, max 50) | |
| folder | No | Folder name: inbox (default), sentitems, drafts, deleteditems | |
| unread_only | No | If true, return only unread emails | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| emails | No | |
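The defaults and limits in the parameter table can be checked on the client before the call. This is a sketch with assumptions: `build_list_emails_args` is a hypothetical helper, treating the four listed folder names as the only valid values is a guess, and whether the server clamps or rejects an out-of-range `limit` is not documented.

```python
ALLOWED_FOLDERS = ("inbox", "sentitems", "drafts", "deleteditems")
MAX_LIMIT = 50  # documented maximum for m365_list_emails


def build_list_emails_args(limit=20, folder="inbox", unread_only=False):
    """Assemble arguments for m365_list_emails within the documented bounds."""
    if folder not in ALLOWED_FOLDERS:
        # Assumption: only the four folder names from the schema are valid.
        raise ValueError(f"unknown folder: {folder!r}")
    # Clamp client-side rather than relying on undocumented server behavior.
    clamped = max(1, min(int(limit), MAX_LIMIT))
    return {"limit": clamped, "folder": folder, "unread_only": bool(unread_only)}


list_args = build_list_emails_args(limit=200, folder="drafts", unread_only=True)
print(list_args)
```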
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the description's lack of additional behavioral context is acceptable but minimal. It does not disclose potential side effects (none expected), authentication requirements, or rate limits. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise (two sentences) and front-loaded with the core purpose. No redundant or extraneous information. Every word adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a list tool with three optional parameters and a read operation, the description is adequate but incomplete. It does not mention ordering (e.g., chronological), how the limit is applied, or pagination. However, an output schema exists, so return field details are partly covered.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All three parameters have detailed descriptions in the input schema (100% coverage). The tool description adds no extra meaning beyond the schema (e.g., it does not explain the impact of combining filters or default values). Baseline score is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('list'), resource ('emails from Microsoft 365 / Outlook inbox'), and return fields ('subject, sender, date, and preview'). It is specific and unambiguous. However, it does not explicitly differentiate from siblings like search_emails or generic list_emails, which slightly reduces clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives such as search_emails or m365_search_emails. The description does not mention prerequisites, optimal use cases, or exclusion criteria. The agent must infer usage from the tool name and parameters alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_list_events · Microsoft 365 List Events · B · Read-only
List upcoming calendar events from your Microsoft 365 / Outlook calendar.
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | Number of days ahead to look (default 7, max 30) | |
| limit | No | Max events to return (default 20, max 50) | |
| calendar | No | Calendar name to filter by — partial, case-insensitive match (optional). Omit to use the primary calendar. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| days | No | |
| count | No | |
| events | No | |
| calendar | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the safety profile is clear. The description adds 'upcoming' but does not elaborate on ordering, default calendar behavior, or pagination. Adequate, but it adds little insight.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
One sentence, 10 words, front-loaded with the key verb and resource. It is concise but could be slightly expanded with context without losing efficiency.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with output schema and annotated as read-only, the description is minimal but sufficient. It does not explain defaults (7 days, 20 events, primary calendar) which are in schema but might be helpful to restate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage with descriptions for all three parameters. The description adds no additional meaning beyond the schema, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists upcoming calendar events from Microsoft 365/Outlook calendar. However, it does not differentiate from the sibling tool 'list_calendar_events', which likely has similar functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., list_calendar_events, create_calendar_event). Lacks when-to-use and when-not-to-use context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_read_email · Microsoft 365 Read Email · A · Read-only
Read the full content of a specific Microsoft 365 email by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | The email message ID from m365_list_emails or m365_search_emails | |
Output Schema
| Name | Required | Description |
|---|---|---|
| cc | No | |
| id | No | |
| to | No | |
| body | No | |
| date | No | |
| from | No | |
| is_read | No | |
| subject | No | |
| from_address | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, indicating no side effects. The description adds that it reads 'full content', but doesn't detail what that includes (e.g., attachments, headers). Since output schema exists, the return format is covered elsewhere. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single sentence with no extraneous words. It efficiently conveys the core purpose and the required input.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low parameter count, high schema coverage, and presence of an output schema, the description is sufficient. It could mention whether the full body includes formatting or attachments, but overall it provides a complete picture for a simple read operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'id' is well described in the schema, which notes that it comes from m365_list_emails or m365_search_emails. With schema description coverage at 100%, the tool description has little left to add.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'read' and the resource 'specific Microsoft 365 email', and specifies the method 'by its ID'. It distinguishes from sibling tools like m365_list_emails and m365_search_emails which return lists or search results.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies that the tool should be used after obtaining an email ID from m365_list_emails or m365_search_emails, as indicated in the parameter description. However, it does not explicitly state when to use this tool over alternatives, nor when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_reply_email · Microsoft 365 Reply Email · A · Read-only
Reply to a Microsoft 365 email by its message ID. Shows a preview first — set confirm=true to actually send.
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | Message ID to reply to (from m365_list_emails or m365_read_email) | |
| confirm | No | Set to true to actually send (default: shows preview only) | |
| message | Yes | Your reply text | |
| reply_all | No | If true, reply to all recipients (default: false) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| message | No | |
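The preview-then-send behavior suggests a two-call pattern: one call without `confirm` to get the preview, and a second with `confirm=true` once the user approves. The sketch below illustrates that pattern with a stand-in transport; `reply_with_preview`, `call_tool`, `approve`, and the canned responses are hypothetical, and the real preview payload may look different.

```python
def reply_with_preview(call_tool, message_id, text, approve, reply_all=False):
    """Request a preview first; send only if `approve(preview)` returns True."""
    args = {"id": message_id, "message": text, "reply_all": reply_all}
    preview = call_tool("m365_reply_email", args)  # confirm omitted: preview only
    if not approve(preview):
        return preview  # nothing was sent
    return call_tool("m365_reply_email", {**args, "confirm": True})


calls = []


def fake_call_tool(name, arguments):
    """Stand-in transport that records calls and returns canned responses."""
    calls.append((name, dict(arguments)))
    sent = arguments.get("confirm") is True
    return {"ok": True, "message": "sent" if sent else "preview"}


result = reply_with_preview(fake_call_tool, "msg-1", "Thanks!",
                            approve=lambda preview: True)
print(result, len(calls))
```

Keeping the approval step outside the tool call means the second request is the only one that changes anything, which matters given that the tool's read-only annotation conflicts with its ability to send.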
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description states it sends an email (when confirm=true), but the annotation declares readOnlyHint=true, which is a direct contradiction. According to scoring rules, this earns a score of 1. The description itself does add behavioral info (preview then send), but the contradiction overrides.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: one sentence with a dash linking the preview behavior and the confirm flag. Every word is meaningful, no fluff, and it front-loads the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has an output schema (not detailed but present) and annotations, so the description doesn't need to cover return values. It explains the preview/send workflow. However, the annotation contradiction undermines trust and completeness, and it doesn't address prerequisites (e.g., must be connected to M365) or error cases. Given complexity, it's adequate but not thorough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (4 params fully described in schema), so baseline is 3. The description mentions confirm and implies others via message ID and reply text, but adds little beyond what the schema already has in descriptions (e.g., 'Set to true to actually send'). No significant additional meaning provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it replies to a Microsoft 365 email, specifying the action (reply) and resource (email by message ID). It distinguishes from sibling tools like 'm365_send_email' (send new email) and 'reply_email' (generic) by naming Microsoft 365 and mentioning the preview/send mechanism unique to this tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells the agent that it shows a preview first and to set confirm=true to actually send, which is a clear usage guideline. It does not say when not to use the tool or name alternatives among its siblings, though the name and context implicitly guide selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_search_contacts · Microsoft 365 Search Contacts · B · Read-only
Search contacts in your Microsoft 365 address book by name, email, or company.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search term — name, email, or company | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| contacts | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds that the tool searches by name, email, or company, which is already in the parameter schema. It does not disclose additional behavioral traits like searching behavior (exact match, substring), pagination, or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence of 13 words that front-loads the purpose. Every word contributes meaning with no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 parameter, no nested objects, and an output schema defining return values), the description sufficiently covers functionality. However, it could mention search behavior specifics (e.g., does it support partial matches?) to be fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the parameter description clarifies the query term. The tool description reiterates the same fields (name, email, company) without adding new meaning or usage constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it searches contacts by name, email, or company in the Microsoft 365 address book. The verb 'Search' and resource 'contacts' are specific, but it does not explicitly distinguish from sibling tools like list_contacts or search_contacts, leaving some ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as list_contacts (to retrieve all contacts) or get_contact (to fetch a specific contact). There is no mention of when not to use it or any prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_search_emails · Microsoft 365 Search Emails · B · Read-only
Search your Microsoft 365 emails by keyword, sender, or subject.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 20, max 50) | |
| query | Yes | Search query, e.g. 'budget Q2', 'from:alice@contoso.com', 'subject:invoice' | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| emails | No | |
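The query examples in the schema ('budget Q2', 'from:alice@contoso.com', 'subject:invoice') can be assembled from separate inputs. A hedged sketch: `build_search_query` is a hypothetical helper, and combining several terms in one query with spaces is an assumption, since the schema only shows each form on its own.

```python
def build_search_query(keywords="", sender=None, subject=None):
    """Compose a query string using the forms shown in the schema examples."""
    parts = []
    if keywords and keywords.strip():
        parts.append(keywords.strip())
    if sender:
        parts.append(f"from:{sender}")
    if subject:
        parts.append(f"subject:{subject}")
    if not parts:
        raise ValueError("query is required: supply keywords, sender, or subject")
    # Assumption: terms can be combined by separating them with spaces.
    return " ".join(parts)


search_args = {"query": build_search_query(sender="alice@contoso.com"),
               "limit": 20}  # documented default; the documented max is 50
print(search_args)
```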
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already indicate readOnlyHint=true, and the description aligns by stating 'search' without implying mutation. However, the description adds no new behavioral context beyond the annotations (e.g., no mention of rate limits, auth needs, or search scope).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence that efficiently communicates the tool's purpose without unnecessary words. It is front-loaded and directly actionable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low parameter count, complete schema coverage, and presence of an output schema, the description is sufficiently complete. It could mention the account scope or supported query syntax, but is adequate for a straightforward search tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The tool's description adds no extra meaning beyond the schema; it merely restates the query purpose. Baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches Microsoft 365 emails by keyword, sender, or subject. It effectively distinguishes the tool from list-based siblings like list_emails, but does not differentiate from the similarly named sibling search_emails.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as list_emails, search_emails, or read_email. The description lacks context on prerequisites, scope, or limitations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
m365_send_email: Microsoft 365 Send Email (grade A)
Send an email from your Microsoft 365 account. Shows a preview first — set confirm=true to actually send.
| Name | Required | Description | Default |
|---|---|---|---|
| cc | No | CC recipients (optional, comma-separated) | |
| to | Yes | Recipient email address. For multiple, separate with commas. | |
| body | Yes | Email body (plain text) | |
| confirm | No | Set to true to actually send (default: shows preview only) | |
| subject | Yes | Email subject | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| message | No | |
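The preview-then-confirm flow described above can be sketched as two sets of arguments for the same tool: one without `confirm` (preview only) and one with `confirm` set to true (actually sends). The helper name `build_send_email_args` is hypothetical, and accepting recipients as Python lists is an illustrative choice; the tool itself takes comma-separated strings.

```python
from typing import List, Optional


def build_send_email_args(
    to: List[str],
    subject: str,
    body: str,
    cc: Optional[List[str]] = None,
    confirm: bool = False,
) -> dict:
    """Build the arguments object for m365_send_email (hypothetical helper)."""
    if not to:
        raise ValueError("at least one recipient is required")
    # The parameter table asks for comma-separated addresses.
    args = {"to": ",".join(to), "subject": subject, "body": body}
    if cc:
        args["cc"] = ",".join(cc)
    if confirm:
        # Only include confirm when the caller really wants to send.
        args["confirm"] = True
    return args


# Step 1: preview only (confirm omitted, so nothing is sent).
preview_args = build_send_email_args(["alice@contoso.com"], "Q2 budget", "Draft attached.")
# Step 2: identical arguments plus confirm=true, after the preview has been reviewed.
send_args = {**preview_args, "confirm": True}
print(preview_args)
print(send_args)
```

Reusing the previewed arguments for the confirmed call keeps the message that is sent identical to the one that was reviewed.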
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds value beyond annotations by explaining the two-step behavior (preview then confirm). While annotations indicate it is not read-only and not destructive, the description clarifies the confirmation mechanism, which is key for an agent to understand the tool's side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: two short sentences, the second with a dash clause. It front-loads the purpose and efficiently conveys the key behavioral detail. No extraneous words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of a full output schema and annotations, the description covers the essential behavioral nuance (preview + confirm). It does not mention sender defaults, error handling, or cc behavior, but the schema and context signals fill some gaps, making it nearly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Since the input schema already has descriptions for all parameters (100% coverage), the description does not add extra meaning. It reiterates the confirm parameter's purpose but without additional detail beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send an email') and the resource ('from your Microsoft 365 account'). The mention of 'Shows a preview first' distinguishes it from other email-sending tools like 'send_email' or 'm365_reply_email', which may behave differently.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use the tool (when you want a preview) but does not explicitly state when not to use it or compare it to alternatives. There is a sibling 'send_email' that may offer direct sending, but no guidance is provided on which to choose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
move_email: Move Email (grade A, destructive)
Moves an email to another mailbox (nested target folders are found by name). Pass account= (returned by list_emails/search_emails) so the message lookup targets one account instead of scanning all of them — without it, multi-account Macs are slow and can time out on bulk moves. If you know the folder the message is in, also pass mailbox= (the mailbox field from the listing) so the lookup searches it first.
| Name | Required | Description | Default |
|---|---|---|---|
| account | No | | |
| confirm | No | | false |
| mailbox | No | | |
| message_id | Yes | | |
| target_mailbox | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| to | No | |
| moved | No | |
| warning | No | |
| message_id | No | |
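The advice in the description (pass `account` and `mailbox` from the listing so the lookup is scoped) can be sketched as follows. This is an illustration under assumptions: the helper name `build_move_args` is hypothetical, and so is the idea that a listing entry exposes its identifier under the key `message_id`. The `account` and `mailbox` field names come from the description. The `confirm` parameter is left out because its behavior is not documented in this listing.

```python
def build_move_args(listing_entry: dict, target_mailbox: str) -> dict:
    """Build move_email arguments from one list_emails/search_emails entry (hypothetical helper)."""
    # Assumption: the listing entry carries its id under "message_id".
    args = {
        "message_id": listing_entry["message_id"],
        "target_mailbox": target_mailbox,
    }
    # Forward the optional hints only when the listing provided them,
    # so the server can scope its lookup instead of scanning every account.
    for key in ("account", "mailbox"):
        value = listing_entry.get(key)
        if value:
            args[key] = value
    return args


# Made-up entry for illustration only.
entry = {"message_id": "12345", "account": "Work", "mailbox": "INBOX"}
print(build_move_args(entry, "Archive"))
```

For bulk moves, building the arguments this way for every entry avoids the all-accounts scan that the description warns can time out.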
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true, so the description need not repeat it. It adds performance caveats about timeouts on multi-account Macs. However, it could mention that moving removes the email from the source mailbox.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each earning its place: purpose, optimization advice, and additional performance tip. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers the main use case and parameter rationale. With an output schema present, return value documentation is not needed. Could mention the destructive nature explicitly, but overall complete for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It explains 'account' and 'mailbox' parameter origins and benefits, but does not elaborate on 'confirm' or 'target_mailbox' beyond their names. Adds moderate value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it moves an email to another mailbox and mentions nested folders are found by name. It uniquely identifies the action and distinguishes from sibling tools like create_email_folder or search_emails.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance on when to pass 'account' and 'mailbox' parameters for performance, and cites functions that return these values. Does not explicitly exclude alternatives but gives clear context for optimal use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nordvpn_diagnose: NordVPN Diagnose (grade A, read-only)
Run a diagnostic check on NordVPN: installation, login state, connection status, kill switch, and supported protocols. Useful for troubleshooting.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| report | No | Full formatted text report |
| account | No | |
| running | No | True if the NordVPN app is running |
| version | No | |
| connected | No | True if the VPN is connected |
| installed | Yes | True if NordVPN is installed |
| logged_in | No | True if a NordVPN account is logged in |
| protocols | No | Supported VPN protocols |
| kill_switch | No | True if the kill switch is enabled |
| auto_connect | No | True if auto-connect / connect on demand is on |
| subscription | No | |
| last_location | No | |
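One way a client might use the boolean fields in this output for troubleshooting is to walk them in dependency order and report the first one that is false. This is a sketch, not the server's logic: the helper name `first_problem`, the ordering of checks, and the messages are assumptions. Only `installed` is marked required, so missing fields are treated as unknown and skipped.

```python
from typing import Optional

# Checks in an assumed dependency order: each one only matters if the previous passed.
CHECKS = [
    ("installed", "NordVPN is not installed"),
    ("running", "the NordVPN app is not running"),
    ("logged_in", "no NordVPN account is logged in"),
    ("connected", "the VPN is not connected"),
]


def first_problem(result: dict) -> Optional[str]:
    """Return a message for the first failing check, or None if nothing failed."""
    for field, message in CHECKS:
        # `is False` skips fields that are absent (None), which count as unknown.
        if result.get(field) is False:
            return message
    return None


# Made-up diagnose result for illustration.
sample = {"installed": True, "running": True, "logged_in": False, "connected": False}
print(first_problem(sample))
```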
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows it's safe. The description adds context about what aspects are checked (installation, login, etc.), which is somewhat additional but not beyond what annotations imply.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences, front-loaded with the action verb, no wasted words. Every part adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero parameters and annotations covering safety, the description is complete. It covers what the diagnostic checks and for what purpose. An output schema exists, so the description need not detail the return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters, so the description does not need to explain parameters. Baseline score of 4 applies as per instructions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Run a diagnostic check on NordVPN: installation, login state, connection status, kill switch, and supported protocols.' It specifies the verb and resource, and distinguishes from siblings like nordvpn_status and nordvpn_servers.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description says 'Useful for troubleshooting,' which implies context but does not explicitly state when to use versus alternatives like nordvpn_status. No when-not or exclusion criteria are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nordvpn_servers: NordVPN Servers (grade A, read-only)
Get recommended NordVPN servers by country or specialty. Uses NordVPN public API (no account needed). Returns server name, hostname, country, city, load %, and supported technologies.
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Server type filter: 'standard', 'p2p', 'double_vpn', 'onion', 'dedicated_ip'. Default: standard. | |
| limit | No | Number of servers to return (1-10). Default: 5. | |
| country | No | Country name or 2-letter code (e.g. 'US', 'United States', 'JP'). Omit for auto-recommendation. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| count | No | |
| servers | No | |
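The parameter constraints in the table above (five server types, a limit of 1 to 10, optional country) can be checked before calling the tool. The following is a minimal sketch with a hypothetical helper name, `build_servers_args`; whether the server itself rejects or clamps out-of-range values is not stated in this listing, so raising an error here is an assumption.

```python
from typing import Optional

# Server types exactly as listed in the parameter table.
SERVER_TYPES = {"standard", "p2p", "double_vpn", "onion", "dedicated_ip"}


def build_servers_args(
    country: Optional[str] = None,
    server_type: str = "standard",
    limit: int = 5,
) -> dict:
    """Validate and build the arguments object for nordvpn_servers (hypothetical helper)."""
    if server_type not in SERVER_TYPES:
        raise ValueError(f"unknown server type: {server_type!r}")
    if not 1 <= limit <= 10:
        raise ValueError("limit must be between 1 and 10")
    args = {"type": server_type, "limit": limit}
    if country:
        # Omitting country asks for an auto-recommendation, per the table.
        args["country"] = country
    return args


print(build_servers_args(country="JP", server_type="p2p", limit=3))
print(build_servers_args())
```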
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint and destructiveHint. The description adds value by disclosing the external API dependency and no authentication requirement, plus details on response fields.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with no waste. The first states the purpose; the second names the data source; the third lists the returned fields. Efficient and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description adequately covers the tool's purpose, source, and return fields. Could mention behavior when country is omitted (auto-recommendation), but schema already implies this.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so parameters are well-documented. The description adds minimal extra context (e.g., 'recommended' hint) but does not elaborate on parameter usage beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'recommended NordVPN servers', and distinguishes itself from sibling tools like nordvpn_diagnose and nordvpn_status by specifying server retrieval with filtering.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use the tool (when server information is needed), notes that no account is needed, and lists the returned data. It lacks explicit when-not-to-use guidance or alternatives, but sibling context provides differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nordvpn_status: NordVPN Status (grade A, read-only)
Check NordVPN connection status: connected/disconnected, auto-connect, snooze, and last known location. Does NOT open NordVPN.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| version | No | |
| connected | No | |
| installed | No | |
| app_running | No | |
| auto_connect | No | |
| last_location | No | |
| snoozed_until | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds value by specifying that the tool does not open NordVPN and lists exactly what status information is returned, providing context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, highly concise, front-loaded with the main action, and every word adds value. No unnecessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple status-check tool with no parameters and an output schema, the description thoroughly covers the key aspects: what it checks, and what it does not do. It is complete for the tool's scope.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has zero parameters, so the schema coverage is 100% and the description needs no parameter details. According to guidelines, zero parameters yields a baseline of 4, which is appropriate here.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool checks NordVPN connection status and lists the specific statuses (connected/disconnected, auto-connect, snooze, last known location). It distinguishes itself from sibling tools like nordvpn_diagnose and nordvpn_servers by focusing solely on status.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for usage: checking connection status. It includes a behavioral note that it does not open the app, but lacks explicit when-to-use vs alternatives guidance. However, given the distinct purpose among siblings, this is acceptable.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notion_list_databases: Notion List Databases (grade A, read-only)
Lists Notion databases cached on this Mac with their schema (column names and types). Use notion_read_database to get the rows.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| count | No | |
| databases | No | |
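The hand-off the description mentions (list the databases here, then use notion_read_database for the rows) could look like the sketch below. It rests on an assumption that is not confirmed by this listing: that each entry in `databases` carries an `id` field. The helper name `build_read_calls` is hypothetical. The `database` and `limit` parameters, and the maximum of 500, come from the notion_read_database parameter table.

```python
from typing import List


def build_read_calls(list_result: dict, limit: int = 50) -> List[dict]:
    """Turn a notion_list_databases result into notion_read_database calls (hypothetical helper)."""
    calls = []
    for db in list_result.get("databases", []):
        calls.append(
            {
                "name": "notion_read_database",
                "arguments": {
                    # Assumption: each database entry exposes its UUID under "id".
                    "database": db["id"],
                    # notion_read_database documents a maximum of 500 rows.
                    "limit": min(limit, 500),
                },
            }
        )
    return calls


# Made-up list result for illustration.
result = {"count": 1, "databases": [{"id": "0f8fad5b-d9cb-469f-a165-70867728950e", "name": "Tasks"}]}
for call in build_read_calls(result, limit=1000):
    print(call)
```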
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already mark the tool as read-only and non-destructive. The description adds key context: it lists only databases cached on this Mac and returns their schema, which discloses the tool's local scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no fluff, front-loaded with verb and resource, every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
An output schema exists, so the description need not document return values; it covers purpose, scope, and schema information. It could mention cache staleness, but it is complete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters (schema coverage is 100%), so no parameter description is needed; the zero-parameter baseline of 4 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists cached Notion databases with their schema, and points to a sibling tool (notion_read_database) for rows, distinguishing itself.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context that it lists cached databases and suggests notion_read_database for rows, but lacks explicit when-not or alternative conditions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notion_list_pages: Notion List Pages (grade A, read-only)
Lists Notion pages cached on this Mac (titles, last edited, hierarchy), newest first. Reads the Notion desktop app's local cache — no Notion API, no integration token. Note: only pages visited in Notion (or marked Available offline) are cached.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max pages (default 50, max 500) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| count | No | |
| pages | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark it as read-only and non-destructive. The description adds valuable context: it reads from the local cache without requiring API tokens, and clarifies the caching limitation. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, front-loaded with key information, and every sentence adds value. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has an output schema and a simple parameter, the description covers what the tool returns (titles, last edited, hierarchy), ordering, and source. It is complete for agent understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The sole parameter 'limit' has a full description in the schema. The description does not add further semantics about the parameter, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists Notion pages cached locally, including specific fields (titles, last edited, hierarchy), sorted newest first. It distinguishes from related tools like notion_search and notion_list_databases by specifying the local cache source.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains it uses the local cache, not the Notion API, and notes the limitation that only visited or offline-marked pages are cached. This provides clear context for when to use this tool versus alternatives like notion_search.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notion_list_workspaces: Notion List Workspaces (grade A, read-only)
Lists the Notion workspaces cached on this Mac with their members (names and emails).
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| workspaces | No | |
| known_users | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, covering safety. The description adds valuable context: the workspaces are 'cached on this Mac' and includes member details, which goes beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with 14 words. Every word is necessary, no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has no parameters and an output schema exists to define return values, the description sufficiently covers the tool's purpose and scope. It is complete for a simple list operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are no parameters (schema coverage 100%), so the description does not need to add parameter information. A baseline of 4 is appropriate for zero-parameter tools.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Lists' and the resource 'Notion workspaces cached on this Mac', and specifies the included data 'members (names and emails)'. It distinguishes from sibling tools like notion_list_databases and notion_list_pages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide explicit guidance on when to use this tool versus alternatives. While the purpose is clear, there are no when-not-to-use instructions or comparisons to other list tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notion_open_page: Notion Open Page (grade A)
Opens a Notion page in the desktop app (deep link). Accepts a page id or title. Useful to let the user view or edit a page, or to pull an uncached page into the local cache.
| Name | Required | Description | Default |
|---|---|---|---|
| page | Yes | Page id (UUID) or title (partial, case-insensitive) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| title | No | |
| opened | No | |
| page_id | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=false (potential modification) and destructiveHint=false. The description adds useful context: it opens a deep link to the desktop app, implying it may launch an external application. This goes beyond annotations by specifying the mechanism.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, front-loaded with the core action, and contains no redundant information. Every word serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple parameter set (one required param), an output schema exists, and the tool has limited complexity, the description provides sufficient context: purpose, acceptable input types, and use cases. It does not need to explain return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema coverage is 100% and the schema description already details that 'page' can be a UUID or partial title (case-insensitive). The description restates this in shorter form ('Accepts a page id or title'), adding no new semantic value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('opens'), resource ('Notion page'), and method ('desktop app deep link'). It distinguishes from sibling tools like notion_list_pages or notion_read_page by specifying opening in the app rather than listing or reading content.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives context for when to use (view/edit page, pull uncached page) but does not explicitly state when not to use or provide alternatives like notion_read_page for content retrieval. Sibling tools are available for different use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notion_read_database: Notion Read Database (grade A, read-only)
Reads the cached rows of a Notion database with their properties mapped through the schema. Accepts the database id or name (partial match). Only locally-cached rows are returned.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max rows (default 50, max 500) | |
| database | Yes | Database id (UUID) or name (partial, case-insensitive) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| rows | No | |
| count | No | |
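As a rough illustration of the parameters above, the sketch below builds an MCP `tools/call` request for `notion_read_database` and keeps `limit` inside the documented range. The helper name `build_read_database_call`, the request id, and the example database name are hypothetical. Only the tool name, the parameter names, and the 50/500 bounds come from the listing, and the lower bound of 1 is an assumption.

```python
import json

DEFAULT_LIMIT = 50  # documented default
MAX_LIMIT = 500     # documented maximum


def build_read_database_call(database: str, limit: int = DEFAULT_LIMIT,
                             request_id: int = 1) -> dict:
    """Build a JSON-RPC `tools/call` request (hypothetical helper)."""
    if not database:
        raise ValueError("database (id or partial name) is required")
    # Assumption: values below 1 are not meaningful, so clamp to [1, MAX_LIMIT].
    clamped = max(1, min(limit, MAX_LIMIT))
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "notion_read_database",
            "arguments": {"database": database, "limit": clamped},
        },
    }


if __name__ == "__main__":
    # "Projects" is a made-up database name used only for illustration.
    print(json.dumps(build_read_database_call("Projects", limit=9999), indent=2))
```

Clamping on the client side is a design choice, not documented server behavior; the server may well enforce the maximum on its own.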
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and non-destructive behavior. The description adds context that only cached rows are returned and properties are mapped through the schema, providing additional insight beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences long, front-loads the key action, and includes essential details without excess. Every sentence serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and complete parameter descriptions, the description adequately covers what the tool does, its inputs, and its limitations. No missing critical information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The description repeats that the database name supports partial match but says nothing about limit; its default (50) and maximum (500) appear only in the schema, so the description adds little beyond it.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it reads cached rows of a Notion database using the database id or name (partial match). It distinguishes from sibling tools like notion_read_page and notion_search, which operate on individual pages or search results.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies that only locally-cached rows are returned, implying it is for cached data. However, it does not explicitly state when not to use it or suggest alternatives for fresh data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notion_read_page · Notion Read Page · A · Read-only
Reads a Notion page from the local cache and returns its content as markdown (headings, lists, to-dos, code, files, subpage links). Accepts a page id or a title (partial match). If parts of the page aren't cached yet, says so — open the page in Notion or mark it Available offline for full content.
| Name | Required | Description | Default |
|---|---|---|---|
| page | Yes | Page id (UUID) or title (partial, case-insensitive) | |
| max_blocks | No | Max blocks to render (default 300, max 1000) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| note | No | |
| title | No | |
| last_edited | No | |
| blocks_rendered | No | |
| uncached_blocks | No | |
| content_markdown | No | |
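A minimal sketch of how a client might react to a partially cached page, assuming the tool result has already been parsed into a Python dict with the output fields listed above. The listing does not give field types, so the code only tests whether `uncached_blocks` is truthy (for example a non-zero count or a non-empty list). The function name `summarize_page_result` is hypothetical.

```python
def summarize_page_result(result: dict) -> str:
    """Return a one-line status for a notion_read_page result (hypothetical helper)."""
    title = result.get("title") or "(untitled)"
    rendered = result.get("blocks_rendered")
    # Assumption: uncached_blocks is falsy (0, None, empty) when the page is fully cached.
    if result.get("uncached_blocks"):
        return (f"{title}: partial content ({rendered} blocks rendered); "
                "open the page in Notion or mark it Available offline")
    return f"{title}: full content ({rendered} blocks rendered)"


if __name__ == "__main__":
    # Made-up result values, for illustration only.
    print(summarize_page_result(
        {"title": "Spec", "blocks_rendered": 10, "uncached_blocks": 3}))
```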
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds important behavioral context: reads from local cache, returns markdown, and handles uncached parts gracefully. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with purpose and output format, then input details and the fallback. Every sentence adds value, with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (2 params, output schema exists, annotations present), the description covers caching, input flexibility, missing content handling, and output format. Complete for a read operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description restates schema info for 'page' (accepts an id or a partial-match title) and does not mention 'max_blocks'. Since schema coverage is 100%, the description adds minimal new semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool reads a Notion page from local cache and returns markdown content, with specific resource (Notion page) and verb (reads). It distinguishes from sibling tools like notion_search or notion_list_pages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides guidance on cache behavior and fallback action when content is missing (open in Notion or mark offline). Does not explicitly compare to siblings, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
notion_search · Notion Search · A · Read-only
Searches cached Notion content (page titles and block text) for a phrase, case-insensitive. Returns matching blocks with the page they belong to. Only locally-cached content is searched — pages never opened in Notion won't match.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 20, max 100) | |
| query | Yes | Text to search for | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| count | No | |
| query | No | |
| results | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false. The description adds value by explaining the case-insensitive search, cached content scope, and return format (matching blocks with page ownership). This goes beyond annotations without contradicting them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each providing essential information: what it does, the return structure, and a key limitation. No redundant or unnecessary wording.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the core functionality, case-insensitivity, cache dependency, and output structure. With an output schema present, it does not need to detail return values. It lacks mention of pagination or default limits, but these are covered by parameter descriptions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters having descriptions. The description does not add additional meaning to the parameters beyond what the schema provides, so baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies the tool searches cached Notion content for a phrase, case-insensitive, returning matching blocks with page context. It clearly distinguishes from sibling tools like notion_list_pages or notion_read_page by focusing on search rather than listing or reading specific pages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states that only locally-cached content is searched, providing a clear limitation. However, it does not explicitly guide when to use this tool over alternatives like search_notes or other search tools, nor does it provide explicit when-to-use or when-not-to-use instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_delete_file · OneDrive Delete File · B · Destructive
Deletes a file or empty folder from OneDrive.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the file or folder | |
| confirm | No | Must be true to delete | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | |
| deleted | Yes | |
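Because `confirm` must be true for the deletion to proceed, a client may want to set it only after an explicit approval step. The sketch below shows one way to do that; `build_delete_call`, the `user_approved` flag, and the example path are hypothetical, and the returned dict is meant to be used as the `params` of an MCP `tools/call` request.

```python
def build_delete_call(path: str, user_approved: bool) -> dict:
    """Build tools/call params for onedrive_delete_file (hypothetical helper)."""
    if not path.startswith("/"):
        # The schema asks for an absolute path.
        raise ValueError("path must be absolute")
    if not user_approved:
        # Client-side guard: never send confirm=True without an explicit approval.
        raise PermissionError("delete not approved by the user")
    return {
        "name": "onedrive_delete_file",
        "arguments": {"path": path, "confirm": True},
    }


if __name__ == "__main__":
    # Made-up path, for illustration only.
    print(build_delete_call("/Users/me/OneDrive/old.txt", user_approved=True))
```

The listing does not say what the server returns when `confirm` is omitted or false, so this guard is a client-side precaution rather than a description of server behavior.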
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true. The description adds that only empty folders can be deleted, which is useful. However, it does not disclose that the operation is irreversible or that confirmation is required, leaving gaps beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence that directly states the purpose. It is efficiently structured but could include a bit more context without being verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive tool whose confirm parameter must be true for the deletion to proceed (although the schema marks it as not required), the description lacks important context such as whether the operation is irreversible and the need for confirmation. The mention of 'empty folder' is good, but overall completeness is moderate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds no additional meaning to the parameters beyond what the schema provides. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool deletes a file or empty folder from OneDrive. It uses a specific verb and resource, distinguishing it from siblings like onedrive_move_file or onedrive_read_file.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. Does not mention prerequisites (e.g., ensuring folder is empty) or that the confirm parameter must be true. The agent is left to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_file_info · OneDrive File Info · A · Read-only
Returns metadata for a file or folder: size, modification date, type, and extension. Faster than listing the parent directory when you only need info about one item.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the file or folder | |
Output Schema
| Name | Required | Description |
|---|---|---|
| name | No | |
| path | No | |
| size | No | |
| type | No | file or directory |
| created | No | |
| modified | No | |
| extension | No | |
| size_human | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare read-only and non-destructive nature. Description adds value by specifying the metadata fields returned and the performance advantage over listing. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first states purpose and specifics, second provides usage guidance. No unnecessary words, front-loaded with key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With output schema present, return values need no additional explanation. Description covers purpose, when to use, and performance benefit. Completely adequate for a simple metadata retrieval tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Single parameter 'path' is fully described in schema (absolute path). Description does not add further details about allowed formats or examples, so baseline score applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'returns metadata' and resource 'file or folder', listing specific attributes (size, modification date, type, extension). It distinguishes from sibling tools that list parent directories or perform other operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: when you only need info about one item. Mentions alternative (listing parent directory) and provides a rationale (faster). No ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_list_files · OneDrive List Files · A · Read-only
Lists files and folders in a OneDrive path. Use onedrive_root to find valid paths. Returns up to limit entries (default 1000, max 5000); large folders are truncated with a note — narrow the path for more specific results.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the OneDrive folder | |
| limit | No | Max entries to return (default 1000, max 5000). Folders with more entries are truncated; the response sets truncated=true and reports the total. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| count | No | Entries returned in this response. |
| items | No | |
| total | No | Total entries in the folder. |
| truncated | No | True when total exceeds the limit. |
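The truncation fields above can be turned into a short status message. This is a minimal sketch that assumes the response has been parsed into a dict and that `count` and `total` are integers, which the listing implies but does not state. The function name `describe_listing` is hypothetical.

```python
def describe_listing(result: dict) -> str:
    """Summarize an onedrive_list_files response (hypothetical helper)."""
    count = result.get("count", 0)
    # Fall back to count when the server did not report a total.
    total = result.get("total", count)
    if result.get("truncated"):
        return (f"showing {count} of {total} entries; "
                "narrow the path or raise limit (max 5000)")
    return f"showing all {count} entries"


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(describe_listing({"count": 1000, "total": 2340, "truncated": True}))
```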
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond the annotations (readOnlyHint, destructiveHint), the description discloses key behaviors: returns up to limit entries, truncation with a note and a truncated flag, and total count reporting. This adds significant value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences front-load the core purpose and immediately provide actionable guidance. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (assumed documented), the description covers the key behavioral aspects: listing scope, limit handling, and truncation. It also suggests using onedrive_root for path discovery. Could mention error cases, but overall sufficient for a simple list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema already covers both parameters (100% coverage). The description enhances them by specifying defaults (1000), maximum (5000), and what happens when a folder exceeds the limit (truncated=true, total field). This adds meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (lists) and resource (files and folders in a OneDrive path). It distinguishes itself from siblings like onedrive_root and onedrive_search_files by specifying the listing behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides guidance on using onedrive_root to find valid paths and explains the limit parameter and truncation behavior. While it doesn't explicitly contrast with alternatives like onedrive_search_files, it gives useful context for effective use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_move_file · OneDrive Move File · B · Destructive
Moves or renames a file/folder within OneDrive.
| Name | Required | Description | Default |
|---|---|---|---|
| source | Yes | Source path | |
| confirm | No | Must be true to move | |
| destination | Yes | Destination path | |
Output Schema
| Name | Required | Description |
|---|---|---|
| to | Yes | |
| from | Yes | |
| moved | Yes | |
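Since a rename is a move whose destination stays in the same folder, the arguments for a rename can be derived from the source path. The sketch below assumes `destination` is the full target path (the listing says only "Destination path") and that paths use POSIX separators, as on macOS. `build_rename_call` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def build_rename_call(source: str, new_name: str, confirm: bool = False) -> dict:
    """Build tools/call params that rename `source` in place (hypothetical helper)."""
    if "/" in new_name:
        raise ValueError("new_name must be a bare file or folder name")
    # Assumption: destination is the full target path, not the parent folder.
    destination = PurePosixPath(source).with_name(new_name)
    return {
        "name": "onedrive_move_file",
        "arguments": {
            "source": source,
            "destination": str(destination),
            "confirm": confirm,  # the schema says this must be true to move
        },
    }


if __name__ == "__main__":
    # Made-up paths, for illustration only.
    print(build_rename_call("/Users/me/OneDrive/draft.txt", "final.txt", confirm=True))
```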
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses that the tool moves or renames, which is destructive (consistent with annotations destructiveHint=true). However, it does not add behavioral details beyond what annotations provide, such as whether overwriting occurs, if the operation is reversible, or if permissions are required. The confirm parameter's requirement (must be true) is documented only in the schema, not in the description.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, compact sentence that clearly and concisely states the purpose. It is front-loaded and contains zero wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description does not need to cover return values. However, for a destructive tool, more context about safety (e.g., confirm requirement, side effects) would be beneficial. The description is minimally complete but lacks practical details that would aid agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage for all 3 parameters, so the description does not need to add extra meaning. The description itself does not mention any parameter details beyond the schema. Baseline score of 3 is appropriate as the schema already provides sufficient information.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action: moves or renames a file/folder within OneDrive. It is specific about the resource and operation, but does not differentiate between moving and renaming (which are essentially the same operation with different destination paths) or distinguish from sibling tools like onedrive_write_file (which may copy) or onedrive_delete_file.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
There is no guidance on when to use this tool versus alternatives. For example, no mention of prerequisites (e.g., source must exist), when to use onedrive_write_file instead for copying, or when to use confirm parameter. The description alone does not help an agent decide when to invoke this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_read_file · OneDrive Read File · A · Read-only
Reads a text file from OneDrive or the local filesystem. Supports .txt, .md, .csv, .json, .xml, .log and several code file types. Auto-detects UTF-8, falls back to Latin-1/Windows-1252 for legacy files (common in Latin American banking .TXT padrones).
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the file | |
| offset | No | Start reading at byte offset (default 0) | |
| encoding | No | Force a specific encoding: 'auto' (default), 'utf8', 'latin1', 'cp1252', 'ascii', 'utf16' | |
| max_bytes | No | Maximum bytes to read (default 1048576 = 1 MB, capped at 10485760 = 10 MB) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | Absolute path of the file |
| bytes | Yes | Total file size in bytes |
| offset | No | Byte offset the read started at |
| content | Yes | Decoded file text content |
| encoding | No | Encoding used to decode (utf8, cp1252, latin1, ascii, or utf16) |
| truncated | No | True if more content remains beyond what was returned |
| bytes_read | No | Number of bytes read in this slice |
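The `offset`, `bytes_read`, and `truncated` fields suggest a simple paging loop for files larger than one slice. The sketch below is one way to write it, assuming a caller-supplied `call_tool(name, arguments)` function that returns the parsed result as a dict. That callable and the name `read_whole_file` are hypothetical, and the listing does not say how the server handles a multi-byte character that straddles a slice boundary.

```python
from typing import Callable

MAX_BYTES_CAP = 10_485_760  # documented cap (10 MB)


def read_whole_file(call_tool: Callable[[str, dict], dict], path: str,
                    max_bytes: int = 1_048_576) -> str:
    """Read a text file slice by slice via onedrive_read_file (hypothetical helper)."""
    max_bytes = min(max_bytes, MAX_BYTES_CAP)
    parts = []
    offset = 0
    while True:
        result = call_tool("onedrive_read_file",
                           {"path": path, "offset": offset, "max_bytes": max_bytes})
        parts.append(result["content"])
        bytes_read = result.get("bytes_read", 0)
        # Stop when nothing remains, or when no progress was made (avoids looping forever).
        if not result.get("truncated") or bytes_read <= 0:
            break
        offset += bytes_read
    return "".join(parts)
```

Passing `call_tool` in as a parameter keeps the loop independent of any particular MCP client library and makes it easy to exercise with a stub.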
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true. The description adds encoding auto-detection with fallback to Latin-1/Windows-1252, which is valuable behavioral context not present in annotations. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences, front-loaded with main purpose, no redundant information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (not shown), return values are covered. Description covers file types, encoding behavior, and source. Does not mention binary file exclusion but 'text file' at start implies it. Very good completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds context about encoding fallback relating to the encoding parameter, but does not significantly extend parameter understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool reads text files from OneDrive or local filesystem, lists supported file types, and describes encoding behavior. It differentiates from sibling tools like onedrive_list_files or fs_read by specifying text file focus and encoding handling.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for reading text files with encoding fallback, but does not explicitly state when to use this vs alternatives like fs_read for binary files or onedrive_file_info for metadata. No exclusions or when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_root · OneDrive Root · A · Read-only
Lists all mounted OneDrive directories on this Mac.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| roots | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds behavioral context by specifying 'mounted' and 'on this Mac', indicating local filesystem state. No contradictions or additional disclosures needed beyond what annotations provide, but the description adds useful nuance.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
One sentence of 8 words, front-loaded verb, no filler. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a zero-parameter tool with an output schema, the description provides essential information: what is listed (mounted directories), scope (all), and location (this Mac). It doesn't detail return format, but the output schema presumably handles that. Minor gap: could clarify what qualifies as 'mounted' (e.g., typical paths).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so baseline is 4 per rubric. The description adds no parameter-specific meaning, and with no parameters the schema leaves nothing undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb 'Lists' and clearly identifies the resource as 'mounted OneDrive directories' with a scope of 'all' on 'this Mac'. It distinguishes from sibling tools like onedrive_list_files (which lists files within a directory) and gdrive_root (Google Drive equivalent).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implicitly indicates usage: use this tool to discover which OneDrive directories are locally mounted. It does not explicitly state when not to use it or provide alternatives, but for a simple listing tool with no parameters, the context is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_search_files · OneDrive Search Files · A · Read-only
Searches for files by name in a OneDrive directory (recursive). Returns up to max_results matches (default 50); raise max_results or narrow the root for more.
| Name | Required | Description | Default |
|---|---|---|---|
| root | No | Root OneDrive path to search in (optional) | |
| query | Yes | Filename pattern to search for | |
| max_results | No | Maximum results (default 50) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| results | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and non-destructive behavior. The description adds recursive search behavior and default limit beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no waste: first sentence states purpose, second sentence gives actionable usage hint.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists. The description covers search behavior, recursion, result limits, and parameter tuning, which is sufficient given the complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage for parameters. The description adds usage advice (raise max_results or narrow root) that enhances meaning beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'searches' and resource 'files by name', specifies recursion, and distinguishes from siblings like onedrive_list_files.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides guidance on when to adjust max_results or narrow root for more results, but does not explicitly compare to alternatives or state when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_set_scope: OneDrive Set Scope (grade A)
Restricts LMCP's OneDrive access to a specific folder. Once set, all OneDrive tools (read, write, list, search, delete, move) only work inside the allowed folder. Pass an empty folder to remove the restriction. Changes take effect immediately.
| Name | Required | Description | Default |
|---|---|---|---|
| folder | No | Allowed folder path relative to the root (e.g. '/000-Claude Personal Agent'). Empty string removes the scope. | |
| confirm | No | Must be true to apply | |
| root_name | Yes | OneDrive root name (from onedrive_root, e.g. 'OneDrive-WPPCloud') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| root | No | |
| access | No | |
| effect | No | |
| scope_set | No | |
| scope_removed | No | |
| allowed_folder | No | |
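As a rough sketch of the two cases the description mentions (setting a scope and removing it), the argument payloads might be built as follows. The helper name `set_scope_args` is hypothetical and not part of the server. The parameter names come from the table above, and the root name and folder are the example values given in the schema.

```python
def set_scope_args(root_name: str, folder: str) -> dict:
    """Build arguments for onedrive_set_scope (hypothetical helper)."""
    # confirm is documented as "Must be true to apply", so it is always set here.
    return {"root_name": root_name, "folder": folder, "confirm": True}


# Restrict access to one folder (example values taken from the schema text).
restrict = set_scope_args("OneDrive-WPPCloud", "/000-Claude Personal Agent")

# An empty folder string removes the restriction.
remove = set_scope_args("OneDrive-WPPCloud", "")
```

How these payloads are sent depends on the MCP client in use; only the argument shape is shown here.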
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses behavioral traits beyond annotations: it explains that changes take effect immediately and affect all OneDrive tools. No contradiction with annotations (readOnlyHint: false, destructiveHint: false). It could have mentioned auth requirements or reversibility, but it does cover how to remove scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with four sentences, each adding value. It front-loads the main purpose and progresses to effects and usage. No waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (3 parameters, straightforward purpose) and the presence of an output schema (indicated), the description is complete. It covers the purpose, parameter usage, effect on siblings, and immediate effect.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already has full coverage with descriptions for all three parameters, so the baseline is 3. The description restates the empty string behavior mentioned in the schema but adds no additional meaning or format details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: restricting LMCP's OneDrive access to a specific folder. It uses a specific verb (restricts) and resource (OneDrive), and it distinguishes itself from other OneDrive sibling tools by being a scope-setting operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context on when to use the tool (to restrict OneDrive access) and how to remove the restriction (empty folder). However, it does not explicitly mention when not to use it or provide alternative approaches.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onedrive_write_file: OneDrive Write File (grade B)
Writes text content to a file in OneDrive.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the file in OneDrive | |
| confirm | No | Must be true to write | |
| content | Yes | Text content to write | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | |
| bytes | Yes | |
| written | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate mutating action (readOnlyHint=false) but destructiveHint=false is ambiguous for a write operation. Description does not clarify overwrite behavior, permission requirements, or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no redundant information, efficiently communicates the core function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having an output schema, the description omits any return value details and does not mention the 'confirm' parameter, which the schema says must be true for the write to proceed. It also does not say whether the tool creates a new file or overwrites an existing one, leaving gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the description adds no additional meaning beyond what the schema already provides. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (writes), the resource (text content to a file in OneDrive), and distinguishes from sibling tools like onedrive_read_file or onedrive_delete_file.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives, no prerequisites, and no mention of whether the file should exist or if it creates/overwrites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
outlook_diagnose: Outlook Diagnose (grade A, read-only)
Checks which email accounts are configured in Microsoft Outlook and compares them with Mail.app. If Outlook has accounts not in Mail.app, guides the user to add them so all email tools work seamlessly.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | Plain-language guidance |
| report | No | Full formatted text report |
| installed | Yes | True if Microsoft Outlook is installed |
| outlook_accounts | No | |
| mail_app_accounts | No | |
| missing_from_mail_app | No | Outlook account emails not present in Mail.app |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, and the description adds value by explaining the comparison logic and subsequent guidance. It does not contradict annotations and provides sufficient behavioral context for a read-only diagnostic tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the core action, and contains zero extraneous information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Because an output schema exists, the description does not need to detail return values. It covers the tool's purpose, behavior, and post-condition guidance adequately for a simple diagnostic tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters and schema coverage is 100%. The description does not need to add parameter details, and the baseline for such cases is 4. It correctly focuses on the tool's action rather than empty parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's specific verb-resource combination: 'Checks which email accounts are configured in Microsoft Outlook and compares them with Mail.app.' It distinguishes itself from siblings like list_email_accounts or run_diagnostics by specifying the cross-application comparison and guidance action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when email tools are not working seamlessly, but it does not explicitly state when to use or avoid this tool compared to alternatives like list_email_accounts or run_diagnostics. No contraindications or prerequisites are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pdf_read: PDF Read (grade A, read-only)
Reads and extracts text from a PDF file.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the PDF file | |
| max_pages | No | Max pages to extract (default: all) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| text | Yes | Extracted text content |
| chars | Yes | Number of characters in the extracted text |
| pages | No | Total number of pages in the PDF |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds that it 'extracts text', which is behavioral detail, but does not disclose limitations like OCR or formatting preservation. With annotations covering safety, a score of 3 is appropriate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with no unnecessary words. It efficiently communicates the tool's core function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has an output schema, so return value documentation is handled. However, the description is minimal and could benefit from mentioning that it only extracts text (not images) or handling large files, though for a simple tool it is adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and both parameters ('path' and 'max_pages') are well-described in the schema. The description adds no additional meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Reads and extracts text from a PDF file,' specifying both the verb (reads, extracts) and the resource (PDF file). It is distinct from sibling tools like excel_read or word_read.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is given on when to use this tool versus alternatives such as gdrive_read_file or onedrive_read_file. The description names the PDF format but does not say when to prefer this tool over other reading tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ppt_create: PowerPoint Create (grade B)
Creates a PowerPoint (.pptx) file with title and bullet slides.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Output path for the .pptx file | |
| slides | Yes | Array of {title, bullets:[]} slide objects | |
| confirm | No | Must be true to create | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | Path of the created .pptx file |
| slides | Yes | Number of slides created |
| created | Yes | True when the file was created |
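To illustrate the `slides` parameter shape ("Array of {title, bullets:[]} slide objects"), a minimal sketch of a payload builder follows. The helper `ppt_create_args` is hypothetical, the output path is an example value, and the type checks (string title, list of bullets) are assumptions rather than documented server rules.

```python
def ppt_create_args(path: str, slides: list) -> dict:
    """Build arguments for ppt_create (hypothetical helper, not part of the server)."""
    for i, slide in enumerate(slides):
        # Assumed shape: each slide is {"title": str, "bullets": [str, ...]}.
        if not isinstance(slide.get("title"), str):
            raise ValueError(f"slide {i}: 'title' must be a string")
        if not isinstance(slide.get("bullets"), list):
            raise ValueError(f"slide {i}: 'bullets' must be a list")
    # confirm is documented as "Must be true to create".
    return {"path": path, "slides": slides, "confirm": True}


args = ppt_create_args(
    "/tmp/status-update.pptx",  # example output path, chosen for illustration
    [
        {"title": "Status", "bullets": ["On track", "Two open risks"]},
        {"title": "Next steps", "bullets": []},
    ],
)
```

Validating the slide objects before the call keeps malformed input from reaching the tool; whether the server performs similar checks is not stated in the listing.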
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=false (write operation) and destructiveHint=false (not destructive). The description's 'Creates' aligns but adds no extra context about file overwriting, permissions, or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with verb and resource, no fluff. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple creation tool with an output schema present. It lacks details about file overwrite behavior and about the confirm parameter, which must be true for the file to be created, but the schema covers the basics.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all parameters. The description adds no additional meaning beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Creates' and the resource 'PowerPoint (.pptx) file' with 'title and bullet slides'. It distinguishes from the sibling ppt_read implicitly by being a creation tool, but does not explicitly contrast.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives (e.g., inserting images). No prerequisites, conditions, or exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ppt_read: PowerPoint Read (grade A, read-only)
Reads slide text content from a PowerPoint (.pptx) file.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the .pptx file | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | Number of slides |
| slides | Yes | Per-slide extracted text |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds that it reads 'slide text content' (not all content), but doesn't disclose behavior like reading all slides or error handling. Adds some context but limited beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no wasted words, front-loaded with key action and resource.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With output schema present and simple one-parameter schema, description is adequate. Could specify scope (e.g., all slides) but not necessary given low complexity and output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so description adds no extra meaning beyond the schema's 'Absolute path' hint. The tool description reiterates file type, providing minimal additional context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'reads' and resource 'slide text content from a PowerPoint (.pptx) file', clearly distinguishing from sibling tools like ppt_create (creates) and other read tools for different formats.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use vs alternatives. Implies usage for reading .pptx text, but lacks exclusions or references to sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
read_email: Read Email (grade A, read-only)
Reads the full content of an email by its ID. Pass account= (and mailbox= if known, both from list_emails/search_emails) so the lookup targets one account instead of scanning all of them. Call sequentially, not in parallel — concurrent calls serialize behind Mail.app's JXA lock and later calls will time out.
Performance: body fetch is the primary latency source (avg 20s on slow IMAP). Pass include_body=false to skip it and get metadata-only (fast). Pass max_body_chars=N to cap the body at N chars after HTML stripping (default 30000; 0=unlimited). Response includes body_fetch_ms when fetch took >2s, body_omitted=true when skipped, body_truncated_at=N when cut.
| Name | Required | Description | Default |
|---|---|---|---|
| account | No | | |
| mailbox | No | | |
| message_id | Yes | | |
| include_body | No | | true |
| max_body_chars | No | | 30000 |
Output Schema
| Name | Required | Description |
|---|---|---|
| cc | No | |
| id | No | |
| to | No | |
| body | No | |
| date | No | |
| from | No | |
| unread | No | |
| account | No | |
| mailbox | No | |
| subject | No | |
| body_omitted | No | |
| body_fetch_ms | No | |
| body_truncated_at | No | |
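The usage advice in the description (target one account, call sequentially, optionally skip or cap the body) can be sketched as below. This is only an illustration under stated assumptions: `call_tool` stands in for whatever function your MCP client exposes to invoke a tool, and the helper names `read_email_args`, `read_sequentially`, and `body_status` are hypothetical. Parameter and response field names are taken from the tables above.

```python
from typing import Callable, Iterable, Optional

# Hypothetical client hook: (tool name, arguments) -> response dict.
ToolCaller = Callable[[str, dict], dict]


def read_email_args(message_id, account: Optional[str] = None,
                    mailbox: Optional[str] = None,
                    include_body: bool = True,
                    max_body_chars: int = 30000) -> dict:
    """Build read_email arguments, omitting account/mailbox when unknown."""
    args = {"message_id": message_id,
            "include_body": include_body,
            "max_body_chars": max_body_chars}
    if account is not None:
        args["account"] = account
    if mailbox is not None:
        args["mailbox"] = mailbox
    return args


def read_sequentially(call_tool: ToolCaller, refs: Iterable[dict], **options) -> list:
    """Issue one read_email call at a time, never in parallel."""
    results = []
    for ref in refs:
        results.append(call_tool("read_email", read_email_args(**ref, **options)))
    return results


def body_status(response: dict) -> str:
    """Summarize the body-related signals documented for the response."""
    if response.get("body_omitted"):
        return "omitted"
    if response.get("body_truncated_at") is not None:
        return f"truncated at {response['body_truncated_at']} chars"
    return "complete"
```

A plain loop is used on purpose: the description warns that concurrent calls serialize behind Mail.app's JXA lock and later calls time out, so parallelism gains nothing. Passing `include_body=False` first and fetching full bodies only for the messages that matter is one way to avoid the slow body fetch.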
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds critical behavioral details: concurrency locking (serial JXA lock causing timeouts on parallel calls), performance characteristics (20s average body fetch on slow IMAP), and response signals (body_fetch_ms, body_omitted, body_truncated_at). This goes beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured: first sentence states purpose, followed by usage advice and performance notes. Every sentence contributes essential information without redundancy. It is comprehensive yet concise for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (context signals indicate yes), the description need not explain return values, but it still mentions key response fields. It covers performance, concurrency, parameter behavior, and integration with sibling tools. For a read tool with 5 parameters, this is fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description carries full burden. It explains that account and mailbox come from list_emails/search_emails, include_body skips slow body fetch, and max_body_chars defaults to 30000 (0=unlimited). These details add meaning that the bare schema does not provide.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Reads the full content of an email by its ID', which is a specific verb-resource pair. It distinguishes itself from sibling tools like list_emails (metadata-only) and search_emails by focusing on full content retrieval, and provides context about using account and mailbox from those tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: call sequentially to avoid locking, use account and mailbox for targeted lookup, and pass include_body=false for fast metadata-only access. However, it does not differentiate between this tool and other email reading tools like m365_read_email, which may cause ambiguity in choosing the correct tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
read_messages: Read Messages (grade A, read-only)
Reads messages from an iMessage conversation by chat ID or contact name.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max messages (default 50) | |
| chat_id | No | Chat identifier from list_message_chats | |
| contact_name | No | Contact name substring (alternative to chat_id) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| count | No | |
| chat_id | No | |
| messages | No | |
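Since `chat_id` and `contact_name` are documented as alternatives, a small sketch of choosing between them may help. The helper `read_messages_args` is hypothetical. Treating `chat_id` as a string, requiring at least one identifier, and preferring `chat_id` when both are given are all assumptions, not documented server behavior.

```python
from typing import Optional


def read_messages_args(chat_id: Optional[str] = None,
                       contact_name: Optional[str] = None,
                       limit: int = 50) -> dict:
    """Build read_messages arguments from either identifier (hypothetical helper)."""
    if chat_id is None and contact_name is None:
        raise ValueError("pass chat_id (from list_message_chats) or contact_name")
    args = {"limit": limit}  # 50 matches the documented default
    if chat_id is not None:
        # Assumption: an exact chat_id is preferred over a name substring.
        args["chat_id"] = chat_id
    else:
        args["contact_name"] = contact_name
    return args
```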
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, which the description aligns with by stating 'Reads messages'. The description adds the identification methods but no additional behavioral context like pagination or marking as read. With annotations present, this is adequate but not exceptional.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of 12 words that conveys the core functionality without redundancy or fluff. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (reading messages), presence of annotations, and existence of an output schema, the description sufficiently informs the agent. It covers the primary use case but could mention ordering or output details, though not strictly necessary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents each parameter's meaning. The description does not add semantic value beyond what the schema provides, meeting the baseline of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'Reads' and resource 'messages from an iMessage conversation', and specifies two methods (chat ID or contact name), distinguishing it from siblings like 'list_message_chats' and 'search_messages'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description implies usage when you have a chat_id or contact_name, but does not explicitly mention when to use alternatives like 'list_message_chats' to get chat IDs or 'search_messages' for broader search. No exclusions provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
read_note: Read Note (grade A, read-only)
Reads the full content of a note by name or ID.
| Name | Required | Description | Default |
|---|---|---|---|
| note_id | No | | |
| note_name | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| body | No | |
| name | No | |
| folder | No | |
| modified | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, so the bar is lower. Description fully aligns with annotations (read operation). It adds the method of identification (by name or ID) but no additional behavioral context like rate limits or permissions. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of 11 words, front-loaded with the action. Every word is meaningful; no redundancy. Highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple nature and presence of annotations and output schema, the description covers the basic function. However, it lacks information on error handling, parameter combinations, or output format. It is adequate but not complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must compensate. It does add that the note can be identified 'by name or ID', but provides no details on parameter format, constraints, or which to use when both are provided. This is insufficient for full compensation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Reads the full content of a note by name or ID', specifying the verb 'reads', resource 'note', and scope 'full content'. It distinguishes from sibling tools like 'search_notes' (partial content) and 'list_notes' (titles).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when full note content is needed, but does not explicitly mention when to use this tool over alternatives like 'search_notes' or 'list_notes'. No when-not or alternative guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
rename_reminder_list: Rename Reminder List (grade A)
Renames an existing Apple Reminders list. Pass the current list name (or list_id from list_reminder_lists) and new_name. Requires confirm=true.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | Current list name (or pass list_id) | |
| confirm | No | Must be true to apply | |
| list_id | No | List identifier from list_reminder_lists (alternative to name) | |
| new_name | Yes | New name for the list | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
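The input rules in the description (current `name` or `list_id`, plus `new_name` and `confirm=true`) might be captured as in the sketch below. The helper `rename_list_args` is hypothetical, and preferring `list_id` over `name` when both are supplied is an assumption made for the example.

```python
from typing import Optional


def rename_list_args(new_name: str,
                     name: Optional[str] = None,
                     list_id: Optional[str] = None) -> dict:
    """Build rename_reminder_list arguments (hypothetical helper)."""
    if not new_name:
        raise ValueError("new_name is required")
    if name is None and list_id is None:
        raise ValueError("pass the current name or a list_id from list_reminder_lists")
    # The description states that confirm=true is required.
    args = {"new_name": new_name, "confirm": True}
    if list_id is not None:
        # Assumption: an identifier is less ambiguous than a name.
        args["list_id"] = list_id
    else:
        args["name"] = name
    return args
```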
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds behavioral context by stating 'Requires confirm=true,' which is a safety mechanism beyond the annotations. Annotations already indicate it is not read-only and not destructive, and the description confirms it is a write operation with a confirmation requirement, adding useful context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at three sentences. The first states the purpose, the second gives the essential parameters, and the third states the confirmation requirement. No extraneous words, and well front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (so return values are documented elsewhere), the description covers the key input details and the confirmation requirement. It lacks explicit preconditions (e.g., list must exist) but is sufficient for a simple mutation tool. Minor gap on error handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with descriptions for all 4 parameters. The description summarizes the parameters ('current list name or list_id', 'new_name', 'confirm=true') but does not add new meaning beyond what the schema already provides. Baseline for high coverage is 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Renames an existing Apple Reminders list,' specifying the action (rename) and resource (list). It distinguishes from sibling tools like 'create_reminder_list' and 'delete_reminder_list' by indicating it modifies an existing list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear usage instructions: 'Pass the current list name (or list_id from list_reminder_lists) and new_name. Requires confirm=true.' It hints at the prerequisite of listing lists but does not explicitly state when to use versus alternatives like 'create_reminder_list' or 'delete_reminder_list'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reply_email · Reply Email · A · Destructive
Replies to an email by message ID. Supports plain text or HTML body. Pass account (from list_emails/search_emails results) to skip scanning other accounts and avoid timeouts on multi-account Macs.
| Name | Required | Description | Default |
|---|---|---|---|
| body | No | | |
| account | No | | |
| confirm | No | | false |
| html_body | No | | |
| reply_all | No | | false |
| message_id | Yes | | |
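
The account tip in the description can be sketched as a small argument builder. This is only an illustration: `reply_email_arguments` is a hypothetical helper, the values are placeholders, and `message_id` is typed as a string here as an assumption because the schema gives no type. The `confirm` parameter is left out because the listing does not document its behavior.

```python
from typing import Optional


def reply_email_arguments(
    message_id: str,
    body: str,
    account: Optional[str] = None,
    reply_all: bool = False,
) -> dict:
    """Assemble arguments for reply_email, omitting account when it is unknown."""
    arguments = {"message_id": message_id, "body": body, "reply_all": reply_all}
    if account is not None:
        # Per the description, naming the account skips scanning other accounts.
        arguments["account"] = account
    return arguments


# Placeholder values; real IDs and account names come from list_emails/search_emails.
args = reply_email_arguments("MESSAGE-ID-PLACEHOLDER", "Thanks, received.", account="Work")
print(args)
```
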
Output Schema
| Name | Required | Description |
|---|---|---|
| replied | No | |
| message_id | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already indicate destructiveHint=true. The description confirms that the tool sends a reply, which is a mutation, but adds no behavioral context beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences, front-loaded with the key action and supported content types, with no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists so returns are covered. Description covers purpose and a key usage tip but omits details on half the parameters (confirm, reply_all, html_body) and their defaults.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% coverage. Description adds meaning for 'account' (source and benefit) and mentions body types, but does not explain other parameters like confirm, reply_all, or html_body vs body.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it replies to an email by message ID and supports plain text or HTML body. Distinguishes from siblings like send_email or create_draft.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides specific advice on passing the 'account' parameter to avoid timeouts, but does not explicitly state when not to use or list alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
report_problem · Report Problem · A
Report a problem, feature request, or integration request to the LMCP team. IMPORTANT: Do NOT call this tool automatically. ALWAYS ask the user first: "Would you like me to report this issue to the LMCP team?" Only call this tool if the user explicitly agrees. When called without confirm=true, returns a preview of the anonymous data that will be sent. Show this preview to the user and only set confirm=true after they approve. No personal data is included — only version, OS, and permission status. Use type='feature' when the user wants a new capability. Use type='integration' when the user wants to connect an unsupported app.
| Name | Required | Description | Default |
|---|---|---|---|
| confirm | No | Must be true to submit the report. Without it, shows a preview. | |
| symptom | No | Required for type=problem: what is broken, in your own words. | |
| expected | No | What you or the user expected to happen. | |
| description | No | Required for type=feature or integration: what the user wants. | |
| report_type | No | 'problem' (default), 'feature', or 'integration' | |
| user_request | No | What the user originally asked the AI to do. | |
| error_message | No | For type=problem: verbatim error string from the failed tool. | |
| tool_attempted | No | For type=problem: name of the LMCP tool that failed. | |
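
The consent flow in the description (ask first, preview, then submit) can be sketched as follows. This is a minimal sketch under stated assumptions: `call_tool` and `ask_user` are hypothetical callbacks supplied by the client, the wording of the second prompt is invented, and the sample symptom is made up. A fake `call_tool` stands in for the server so the example runs on its own.

```python
from typing import Callable, Optional

CONSENT_QUESTION = "Would you like me to report this issue to the LMCP team?"


def report_problem_with_consent(
    call_tool: Callable[[str, dict], dict],
    ask_user: Callable[[str], bool],
    report: dict,
) -> Optional[dict]:
    """Ask first, preview second, submit only after the user approves the preview."""
    if not ask_user(CONSENT_QUESTION):
        return None  # user declined, so the tool is never called
    preview = call_tool("report_problem", report)  # no confirm: preview only
    if not ask_user(f"This will be sent: {preview}. Send it?"):
        return None
    return call_tool("report_problem", {**report, "confirm": True})


# Stand-ins so the sketch runs without a server.
calls = []


def fake_call_tool(name: str, arguments: dict) -> dict:
    calls.append((name, arguments))
    return {"echo": arguments}


report = {"report_type": "problem", "symptom": "list_emails timed out"}
result = report_problem_with_consent(fake_call_tool, lambda question: True, report)
print(len(calls))
```

Keeping the preview call and the confirmed call separate means the user sees the data that would be sent before `confirm=true` is ever set.
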
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description reveals key behavioral traits: it requires user consent, returns a preview without confirm=true, and states no personal data is included (only version, OS, permission status). This adds significant context beyond annotations (which only indicate non-read-only, open-world, non-destructive).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured, starting with the most critical warning (do not call automatically). Every sentence is informative and no information is redundant. It is comprehensive yet concise for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 8 parameters, three report types, and a two-step submission process (preview then confirm), the description covers all necessary context for correct agent usage. It explains consent, preview, data sent, and type selection, making it complete even without needing to reference the output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3. The description enhances understanding by explaining conditional requirements (symptom/description based on type), the preview role of confirm, and how parameters like tool_attempted and error_message are used for problem reports. It does not repeat schema but adds usage context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool reports problems, feature requests, or integration requests. It distinguishes between these via the type value (report_type in the schema) and makes clear that report_problem handles all three categories, so the purpose is specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit instructions: never call automatically, always ask user first, use the specified phrasing, and only set confirm=true after user approval. It also explains when to use each report type with concrete examples ('Use type=feature when...').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_feature · Request Feature · A
Submit a feature request to the LMCP team — a new capability, a tool that doesn't exist yet, or an app/integration the user wishes LMCP supported. Ask the user first, then call with confirm=true. Without confirm, returns a preview. The request is sent with your machine ID and (if set) your account email so the team can follow up — not anonymous. Tip: requesting features raises the user's LMCP engagement rank.
| Name | Required | Description | Default |
|---|---|---|---|
| confirm | No | Must be true to submit. Without it, shows a preview. | |
| feature | Yes | What the user wants LMCP to do: a capability, tool, or integration. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds details beyond annotations: confirmation step, preview behavior, machine ID and email tracking, and engagement rank effect. Annotations (readOnlyHint=false) already indicate a write operation, but the description enriches the behavioral model significantly.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single focused paragraph, front-loaded with purpose and workflow. Every sentence is informative and concise, with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the existence of an output schema (not shown but indicated), the description covers all necessary aspects: purpose, usage, parameter semantics, and behavioral nuances. It is complete for an AI agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. The description adds value by explaining the confirm parameter's role (must be true to submit, otherwise preview) and clarifying the feature parameter's scope. This goes beyond the schema's generic descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Submit a feature request to the LMCP team' and specifies what types of requests (new capability, tool, etc.), making the tool's purpose unambiguous and distinct from all sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit workflow: 'Ask the user first, then call with confirm=true. Without confirm, returns a preview.' It also mentions non-anonymity and engagement rank tip. However, it does not explicitly state when not to use this tool, though no alternatives exist.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
run_diagnostics · Run Diagnostics · A · Read-only
Runs a fast health check of all LMCP integrations on this machine. Shows what works, what doesn't, and how to fix it. Optionally submits a report to the LMCP team.
| Name | Required | Description | Default |
|---|---|---|---|
| focus | No | Integration to focus on: calendar, mail, contacts, reminders, omnifocus, outlook, notes, finder, onedrive. Leave empty to check all. | |
| submit | No | Send the diagnostic report to the LMCP team for analysis (default: false) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| report | No | Full formatted text report |
| summary | Yes | Plain-language summary of overall health |
| ok_count | Yes | Number of integrations working |
| submitted | No | True when the report was sent to the LMCP team |
| warn_count | Yes | Number of integrations with warnings / not running |
| integrations | Yes | |
| problem_count | Yes | Number of integrations with errors or missing permissions |
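
One way a client might consume the required counters is sketched below. The field names follow the output schema above; the sample values are invented rather than real tool output, `needs_attention` is a hypothetical helper, and the shape of `integrations` (an empty list here) is an assumption because the schema does not describe it.

```python
def needs_attention(result: dict) -> bool:
    """Return True when any integration reported a warning or a problem."""
    return result["warn_count"] > 0 or result["problem_count"] > 0


# Invented sample shaped like the output schema; not real tool output.
sample = {
    "summary": "Most integrations are working.",
    "ok_count": 7,
    "warn_count": 1,
    "problem_count": 0,
    "integrations": [],
}
print(needs_attention(sample))
```
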
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. Beyond that, the description adds that it is 'fast' and optionally submits a report, giving context on its non-destructive nature and optional data submission.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three short sentences with no wasted words. It front-loads the main purpose and is easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and annotations, the description covers the essential purpose, optional submission, and diagnostic output. It is complete for a read-only diagnostic tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters. The description only hints at the submit parameter and does not add significant meaning beyond the schema, earning the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool runs a fast health check of all LMCP integrations on the machine, showing what works and how to fix it. This distinguishes it from sibling tools like nordvpn_diagnose or outlook_diagnose, which focus on specific integrations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains what the tool does but does not provide explicit guidance on when to use it versus alternatives like nordvpn_diagnose or outlook_diagnose. It implies a comprehensive check but lacks 'when not to use' instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_click · Safari Click · A
Clicks the first element matching a CSS selector in the current Safari tab. Returns the tag name and visible text of the clicked element so you can confirm the right thing was hit. Pass wait_for_navigation: true to wait up to 3 seconds for the page to load after the click (useful when clicking links or buttons that trigger navigation).
| Name | Required | Description | Default |
|---|---|---|---|
| nth | No | Which match to click if there are several (0-based, default 0) | |
| selector | Yes | CSS selector (e.g. 'button.primary', '#save', '[data-testid=login]') | |
| wait_for_navigation | No | Wait up to 3s for page load after click (default false) | |
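
A small sketch of assembling arguments for a click that triggers navigation follows. The parameter names come from the table above; `safari_click_arguments` is a hypothetical helper, the selector is invented, and the check on `nth` is a local guard added for the example, not documented server behavior.

```python
def safari_click_arguments(
    selector: str, nth: int = 0, wait_for_navigation: bool = False
) -> dict:
    """Assemble arguments for safari_click."""
    if nth < 0:
        # nth is described as 0-based, so a negative index is rejected locally.
        raise ValueError("nth is 0-based and cannot be negative")
    return {
        "selector": selector,
        "nth": nth,
        "wait_for_navigation": wait_for_navigation,
    }


# Click the second matching link and wait for the next page to load.
click_args = safari_click_arguments("a.next-page", nth=1, wait_for_navigation=True)
print(click_args)
```
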
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations, describes wait up to 3s for navigation and return values for verification. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences: purpose first, then return info, then parameter guidance. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists; description covers return values and the only notable behavioral nuance (wait for navigation). Sufficient for a click tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Adds context beyond schema: wait_for_navigation useful for navigation triggers, nth 0-based default 0. Schema coverage 100% already provides base.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb ('clicks') and resource ('first element matching a CSS selector in the current Safari tab') clearly differentiate from sibling tools like chrome_click and other Safari actions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Describes when to use wait_for_navigation (for links/buttons triggering navigation) and confirms click via return values, but does not explicitly state when not to use or mention alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_evaluate_js · Safari Evaluate JS · B · Read-only
Runs arbitrary JavaScript in the current Safari tab and returns its result. Requires 'Allow JavaScript from Apple Events' in Safari's Develop menu.
| Name | Required | Description | Default |
|---|---|---|---|
| script | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, openWorldHint=false, destructiveHint=false, so the safety profile is clear. The description adds the prerequisite about Safari settings, which is useful behavioral context. However, it does not mention potential limitations (e.g., script execution timeouts, result size limits) or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences with no wasted words. The first sentence states the primary purpose, and the second adds a critical prerequisite. Appropriately sized for a simple tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
An output schema exists (not shown), so return value details are presumably covered elsewhere. However, the description lacks information on error behavior (e.g., if JS throws an exception), script execution context, or how to handle multiple returns. For a 1-parameter tool, it is minimally complete but could be improved.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. The parameter 'script' is not explained beyond 'Runs arbitrary JavaScript', leaving the agent to infer format (string of JS code), allowed syntax, or how the result is returned. No details on parameter semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'runs' and the resource 'arbitrary JavaScript in the current Safari tab', with the result returned. This distinguishes it from other Safari tools like safari_read_tab (reads page content) and safari_click (clicks elements), and the name inherently differentiates from chrome_evaluate_js.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a prerequisite ('Requires Allow JavaScript from Apple Events in Safari's Develop menu'), but does not specify when to use this tool versus alternatives like safari_read_tab for reading page content or safari_click for interacting with elements. No when-not-to-use guidance is given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_fill_form · Safari Fill Form · A · Read-only
Fills multiple form fields in one shot. Pass fields as a JSON object mapping CSS selector to value.
| Name | Required | Description | Default |
|---|---|---|---|
| fields | Yes | | |
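
Because the review below notes that the schema types `fields` as a string, one plausible reading is that the selector-to-value mapping is serialized to JSON before it is passed. The sketch below assumes exactly that; the selectors and values are invented, and whether the server also accepts a native object is not stated in the listing.

```python
import json

# Invented selectors and values, mapping CSS selector -> value to type in.
fields = {
    "#email": "user@example.com",
    "input[name=city]": "Lisbon",
}

# Assumption: the mapping is sent as a JSON-encoded string.
fill_args = {"fields": json.dumps(fields)}
print(fill_args["fields"])
```
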
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| result | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, but the description indicates a write operation (filling forms), creating a contradiction. No other behavioral traits are disclosed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences front-load the main purpose and parameter semantics, with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, the description need not cover return values, but the annotation contradiction and lack of usage guidelines leave gaps in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema only defines 'fields' as a string; the description adds crucial meaning that it is a JSON object mapping CSS selectors to values, compensating for 0% schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it fills multiple form fields in one shot with a JSON object mapping CSS selectors to values, distinguishing it from single-field tools like safari_type and chrome_fill_form.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use for multiple fields but does not explicitly state when not to use it or mention alternatives like safari_type for single fields.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_go_back · Safari Go Back · B · Read-only
Navigates the current Safari tab back to the previous page.
| Name | Required | Description | Default |
|---|---|---|---|
| window_index | No | | 0 |
Output Schema
| Name | Required | Description |
|---|---|---|
| to | No | |
| from | No | |
| went_back | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description only repeats the name's meaning without adding behavioral context. Annotations indicate readOnlyHint=true, but the described navigation changes the tab's state, which is a potential contradiction. No details on required permissions, side effects, or edge cases.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no unnecessary words. It is front-loaded and direct.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple navigation tool, the description covers the basic action. However, it lacks parameter documentation and usage context. With an output schema present, return values are less critical, but the missing parameter guidance reduces completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The one parameter (window_index) has no description in the schema (0% coverage) and is not mentioned in the tool description. The agent receives no help understanding its purpose or default behavior.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (navigates back) and the resource (current Safari tab), distinguishing it from similar tools like safari_navigate or safari_list_tabs. The tool name and title reinforce the purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives (e.g., chrome_go_back, safari_navigate). No prerequisites mentioned, such as having a previous page in history.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_list_tabs · Safari List Tabs · A · Read-only
Lists every open tab across all Safari windows with title, URL, and whether it is active.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| tabs | No | |
| count | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the description simply confirms the read operation. It does not add further behavioral details such as window scope or tab filtering, but the tool's behavior is straightforward and consistent with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that efficiently communicates the tool's purpose, scope, and output. Every element is necessary and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters, presence of an output schema (implied), and annotations that cover safety, the description is complete. It describes what the tool returns and its scope, requiring no additional context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters and schema coverage is 100%. The description correctly implies no user input is needed. With zero parameters, the baseline is 4, and the description adequately conveys that.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists all open tabs across all Safari windows with specific attributes (title, URL, active status). This distinguishes it from siblings like safari_search_tabs (which filters) and chrome_list_tabs (different browser).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide explicit guidance on when to use this tool versus alternatives like safari_search_tabs or safari_read_tab. The usage is implied from the description, but no when-not or alternative context is given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_query_selector_all · Safari Query Selector All · Grade A · Read-only
Runs document.querySelectorAll in the current Safari tab and returns a compact summary of each match.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 50 |
| selector | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| result | No | |
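For illustration, a call to this tool can be sketched as an MCP `tools/call` request over JSON-RPC 2.0. This is a minimal sketch, not the server's documented client code: the helper name `build_tool_call`, the request id, and the example selector and limit values are assumptions chosen for the example.

```python
import json


def build_tool_call(request_id, name, arguments):
    """Build a JSON-RPC 2.0 request body for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical arguments: `selector` is required; `limit` defaults to 50 per the table.
request = build_tool_call(
    1,
    "safari_query_selector_all",
    {"selector": "a[href]", "limit": 10},
)
print(json.dumps(request, indent=2))
```

Because the schema documents neither parameter, passing `limit` explicitly avoids relying on undocumented default behavior.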
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint and destructiveHint, so the description's mention of 'compact summary' adds minimal but consistent context. No additional behavioral details are disclosed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single, concise sentence that efficiently conveys the tool's action and result without unnecessary detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple two-parameter tool with an output schema and annotations covering safety, the description is functionally complete but lacks usage context compared to similar tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not explain the 'selector' or 'limit' parameters. The mention of 'document.querySelectorAll' implies selector is a CSS selector, but 'limit' remains undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it runs document.querySelectorAll in the current Safari tab and returns a compact summary, differentiating it from sibling tools like chrome_query_selector_all and safari_evaluate_js.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like safari_evaluate_js or safari_read_tab. The description only states functionality, not context of use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_read_tab · Safari Read Tab · Grade A · Read-only
Reads the rendered text content of a Safari tab. Identify the tab either by url_match (substring match against URL; first hit wins) or by window_index + tab_index (from safari_list_tabs). Text is capped at max_bytes (default 100 KB). Pass include_html: true to also get the raw HTML source. Pass include_links: true to extract all links with their href and text (useful for following navigation in SPAs like dashboards).
| Name | Required | Description | Default |
|---|---|---|---|
| max_bytes | No | Max bytes of text (and html) to return (default 102400) | |
| tab_index | No | Tab index from safari_list_tabs (default current tab of that window) | |
| url_match | No | Substring to match against the tab URL. Takes precedence over indices. | |
| include_html | No | Also return the HTML source (default false) | |
| window_index | No | Window index from safari_list_tabs (default 0) | |
| include_links | No | Extract all links with href + visible text (default false). Great for navigating SPAs. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| url | No | |
| html | No | |
| text | No | |
| links | No | |
| title | No | |
| truncated | No | |
| html_bytes | No | |
| link_count | No | |
| text_bytes | No | |
| links_error | No | |
| html_truncated | No | |
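The tab-selection precedence and the byte cap described above can be sketched as follows. This is a hedged illustration, not the server's implementation: the tab record fields (`url`, `window_index`, `tab_index`, `active`) and the function names are hypothetical, and case-sensitive substring matching is an assumption.

```python
def resolve_tab(tabs, url_match=None, window_index=0, tab_index=None):
    """Pick a tab: url_match wins over indices; the first substring hit wins."""
    if url_match is not None:
        for tab in tabs:
            if url_match in tab["url"]:
                return tab
        return None
    for tab in tabs:
        if tab["window_index"] != window_index:
            continue
        # With no tab_index, fall back to the window's current (active) tab.
        if tab_index is None and tab["active"]:
            return tab
        if tab_index is not None and tab["tab_index"] == tab_index:
            return tab
    return None


def cap_text(text, max_bytes=102400):
    """Cap text at max_bytes of UTF-8 and report whether it was truncated."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text, False
    # Drop any partial multi-byte character left at the cut point.
    return raw[:max_bytes].decode("utf-8", errors="ignore"), True


# Hypothetical tab records, shaped for this example only.
tabs = [
    {"url": "https://a.example/x", "window_index": 0, "tab_index": 0, "active": False},
    {"url": "https://b.example/dash", "window_index": 0, "tab_index": 1, "active": True},
]
```

Here `resolve_tab(tabs, url_match="example")` returns the first tab even though both URLs match, which is the "first hit wins" behavior the description names.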
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true and destructiveHint=false; the description reinforces this by describing a read operation. Discloses behavioral details: text capped at max_bytes with default 100 KB, and optional returns of HTML and links. Adds value beyond annotations with specific capping information.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five sentences, each purposeful and front-loaded. No redundancy. Efficiently covers purpose, parameters, and usage in a compact, scannable format.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (6 parameters, none required, output schema exists), the description is complete. Covers identification, return options, and capping. With an output schema, return values need no explanation. Agent has all needed information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. Description enhances parameter meaning: explains url_match as substring matching with first-hit-wins, states default for max_bytes (100 KB), and clarifies include_links extracts href and text. These additions go beyond the terse schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description starts with 'Reads the rendered text content of a Safari tab,' clearly stating the verb (reads) and the resource (text content). It distinguishes from siblings by specifying identification methods (url_match vs indices) and optional return types (HTML, links), making it easy to differentiate from other Safari tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear guidance on when to use url_match vs window_index+tab_index, including precedence rules. Explains optional parameters and their purposes (e.g., include_links for SPAs). Lacks explicit when-not-to-use or alternative sibling mentions, but the context is sufficient for correct invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_search_tabs · Safari Search Tabs · Grade A · Read-only
Searches the rendered text of every open Safari tab for a substring. Returns each matching tab with the surrounding snippet. Useful for 'do I have a tab open with X?' questions across dozens of tabs.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Substring to search for (case-insensitive) | |
| context | No | Characters of context around each match (default 120) | |
| max_tabs | No | Max tabs to scan (default 30). Higher = slower. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| hits | No | |
| query | No | |
| scanned | No | |
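The search behavior described above (case-insensitive substring match, a context window around the match, and a cap on tabs scanned) can be sketched roughly as below. This is an assumption-laden illustration rather than the server's code: the tab record fields `url` and `text` are hypothetical, and the sketch reports only the first match per tab.

```python
def search_tabs(tabs, query, context=120, max_tabs=30):
    """Scan up to max_tabs tabs and return a snippet around the first match in each."""
    needle = query.lower()
    hits = []
    scanned = 0
    for tab in tabs[:max_tabs]:
        scanned += 1
        text = tab["text"]
        pos = text.lower().find(needle)  # case-insensitive substring search
        if pos == -1:
            continue
        start = max(0, pos - context)
        end = min(len(text), pos + len(query) + context)
        hits.append({"url": tab["url"], "snippet": text[start:end]})
    # Key names mirror the output schema fields listed above.
    return {"query": query, "scanned": scanned, "hits": hits}


# Hypothetical tab contents for the example.
sample_tabs = [
    {"url": "https://a.example", "text": "Hello World invoice 42"},
    {"url": "https://b.example", "text": "nothing relevant"},
]
result = search_tabs(sample_tabs, "INVOICE", context=3)
```

Since each additional tab means reading its full rendered text, the `max_tabs` cap is the main lever for keeping calls fast.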
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the description is not burdened with basic safety. It adds value by specifying that it searches 'rendered text' (not just titles/URLs) and returns snippets, and the max_tabs parameter note warns about performance. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three brief sentences cover purpose, outcome, and use case without any wasted words. Front-loaded with the action and result, and concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (search across tabs, return matches with snippets), the description covers the essential behavior. The presence of an output schema further reduces the need to describe return format. However, it does not address potential edge cases like unloaded tabs or dynamic content, leaving minor gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and each parameter has a clear description in the schema. The tool description does not add new details beyond what the schema provides, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('searches'), the resource ('every open Safari tab'), and the outcome ('returns each matching tab with the surrounding snippet'). It effectively distinguishes from siblings like 'safari_list_tabs' and 'safari_read_tab' by focusing on substring search across rendered text.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes an explicit use case: 'Useful for "do I have a tab open with X?" questions across dozens of tabs.' While it doesn't explicitly state when not to use or list alternatives, the context makes the tool's niche clear compared to sibling tools for reading or listing tabs.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_setup_check · Safari Setup Check · Grade A · Read-only
Reports whether Safari is ready for interactive tools (safari_click, safari_type, safari_evaluate_js). Returns setup instructions if JavaScript from Apple Events is not enabled.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| tabs_open | No | |
| instructions | No | |
| ready_for_js_tools | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint and destructiveHint; description adds that it returns setup instructions if JavaScript is disabled.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences, clearly front-loaded with purpose and outcome.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Fully explains purpose, behavior, and outcome for a simple check tool with output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters; description doesn't need to add param info.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool reports whether Safari is ready for interactive tools, and distinguishes from similar setup checks like chrome_setup_check.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage before interactive Safari tools, but lacks explicit when-not-to-use or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
safari_type · Safari Type · Grade C · Read-only
Sets the value of an input/textarea matching a CSS selector and fires input/change events.
| Name | Required | Description | Default |
|---|---|---|---|
| clear | No | | true |
| value | Yes | | |
| selector | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| ok | No | |
| result | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
There is a contradiction: annotations declare readOnlyHint=true (indicating no write), but the description states it sets a value and fires events (a write). This severely misleads the agent. No additional behavioral traits are disclosed beyond the firing of events.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise (one sentence, 14 words), but it omits critical details. It is too terse to be effective.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite an output schema existing (per context), the description fails to cover key aspects: behavior of 'clear' parameter, error handling for invalid selectors, or what the output contains. The description is grossly incomplete for a tool that interacts with web pages.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds no meaning to the three parameters. With schema description coverage at 0%, the description fails completely to explain parameters like 'clear' (default true) or how to format the selector. A score of 1 is warranted.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('sets'), the specific resource ('value of an input/textarea'), and the action ('fires input/change events'). It distinguishes from sibling tools like safari_fill_form by focusing on a single element via CSS selector.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., safari_fill_form for multiple fields, safari_click for buttons). The description lacks any scenarios or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
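The readOnlyHint contradiction flagged for safari_type could be caught mechanically with a simple lint over tool definitions. The sketch below is only a heuristic under stated assumptions: the flat tool record shape, the `WRITE_VERBS` list, and the function name are all hypothetical, and checking just the first word of a description will miss many cases.

```python
# Hypothetical list of leading verbs that suggest a state-changing tool.
WRITE_VERBS = ("sets", "saves", "writes", "deletes", "creates", "sends", "updates")


def find_hint_conflicts(tools):
    """Return names of tools marked read-only whose description opens with a write verb."""
    conflicts = []
    for tool in tools:
        words = tool["description"].lower().split()
        first_word = words[0] if words else ""
        if tool.get("readOnlyHint") and first_word in WRITE_VERBS:
            conflicts.append(tool["name"])
    return conflicts


# Descriptions copied from this listing; the record layout is an assumption.
sample_tools = [
    {
        "name": "safari_type",
        "readOnlyHint": True,
        "description": "Sets the value of an input/textarea matching a CSS selector and fires input/change events.",
    },
    {
        "name": "save_attachment",
        "readOnlyHint": True,
        "description": "Saves an attachment from an email to disk.",
    },
    {
        "name": "search_contacts",
        "readOnlyHint": True,
        "description": "Searches the Mac's Contacts app (Contacts.app, local/iCloud) by name, email, or phone number.",
    },
]
conflicts = find_hint_conflicts(sample_tools)
```

A check like this would flag both safari_type and save_attachment, the two tools in this section whose reviews report an annotation mismatch.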
safari_wait_for · Safari Wait For · Grade A · Read-only
Polls the current Safari tab until a CSS selector appears (or its text matches, if text_match is provided). Useful after safari_click to wait for the next page or a modal to render.
| Name | Required | Description | Default |
|---|---|---|---|
| selector | Yes | CSS selector to wait for | |
| text_match | No | Optional substring that must appear inside the matched element | |
| timeout_ms | No | Max time to wait (default 10000 = 10s, max 30000) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds behavioral context beyond annotations: it reveals that the tool polls (repeatedly checks) until the selector appears. Annotations already indicate readOnlyHint=true and destructiveHint=false, so the description's polling behavior is consistent and adds value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: two sentences. The first sentence defines the core behavior, the second provides a usage hint. No unnecessary words, front-loads the purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple tool and the presence of an output schema (not shown), the description covers the essential behavior (polling, CSS selector, optional text match, usage after click). It does not mention return behavior or timeout details, but the parameters handle those. It is adequate for a straightforward polling tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds no additional meaning beyond the schema for the parameters; it only mentions text_match in passing. The schema already provides descriptions for all three parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool polls the current Safari tab until a CSS selector appears, and optionally matches text. It specifies the verb (polls) and resource (CSS selector in Safari tab). It is distinct from siblings like safari_click and safari_navigate, though it does not explicitly contrast with alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description says 'Useful after safari_click', providing a specific usage context. However, it does not provide guidance on when not to use it or mention alternatives (e.g., other ways to wait). The guidance is minimal but clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
save_attachment · Save Attachment · Grade B · Read-only
Saves an attachment from an email to disk. Pass account= (and mailbox= if known, both from list_emails/search_emails) so the lookup targets one account instead of scanning all of them.
| Name | Required | Description | Default |
|---|---|---|---|
| account | No | | |
| confirm | No | | false |
| mailbox | No | | |
| message_id | Yes | | |
| destination | No | | |
| attachment_name | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| name | No | |
| saved | No | |
| attempts | No | |
| destination | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description claims the tool 'saves to disk' (a write operation), but annotations set readOnlyHint=true, indicating no state modification. This is a direct contradiction, violating transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no redundancy. The first sentence states the core action, the second adds a performance tip. Front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists, so return docs are covered, but the description fails to specify destination defaults, confirm behavior, or attachment_name format. For a tool with 6 params and zero schema descriptions, this is insufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only two of six parameters (account, mailbox) are explained in the description, and those explanations are brief. The required parameters 'message_id' and 'attachment_name' are left completely undocumented, and schema has 0% coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Explicitly states 'Saves an attachment from an email to disk', providing a specific verb and resource. No sibling tool overlaps in functionality, so distinction is clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Advises passing 'account' and 'mailbox' parameters to optimize lookup speed. While it doesn't exclude alternative uses, the guidance is actionable and contextually relevant.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_contacts · Search Contacts · Grade A · Read-only
Searches the Mac's Contacts app (Contacts.app, local/iCloud) by name, email, or phone number. For a Microsoft 365 directory use m365_search_contacts or search_m365_directory instead.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 50) | |
| query | Yes | Name, email, or phone to search for | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | |
| query | Yes | |
| contacts | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds specific context about the app scope (Mac's Contacts.app, local/iCloud), going beyond structured data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with action, no unnecessary words. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given schema covers all parameters, annotations handle safety, and output schema exists, description is fully complete for tool usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so parameters are fully documented in schema. Description reiterates searchable fields but adds no new semantic details beyond what schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it searches the Mac's Contacts app by name, email, or phone number. It distinguishes itself from M365 alternatives, making purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use this tool (searching local/iCloud contacts) and when to use alternatives (M365 directory), providing clear usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_emails · Search Emails · Grade A · Read-only
Searches emails in the Mac's Apple Mail (Mail.app, local accounts). For a Microsoft 365 / Outlook mailbox use m365_search_emails instead. On machines with 3+ accounts, pass account= (from list_email_accounts) to search a specific account and avoid timeouts.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 20 |
| query | Yes | | |
| account | No | | |
| mailbox | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| results | No | |
| next_actions | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds context that it searches local accounts, warns about potential timeouts with 3+ accounts, and suggests mitigation. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with no wasted words. Front-loaded with purpose, then usage guidance and parameters. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool has output schema and is relatively simple, but description lacks explanation for most parameters (3 out of 4 undocumented). The core purpose and alternative tool are covered, but parameter semantics are incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so description must compensate. Only the 'account' parameter is explained (with guidance). The 'query', 'limit', and 'mailbox' parameters are not described, leaving the agent to infer meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it searches emails in Apple Mail local accounts, uses specific verb 'searches' and resource, and distinguishes from sibling 'm365_search_emails' by specifying platform.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use this tool vs alternatives (for local Apple Mail, not M365), and provides guidance on using 'account' parameter with multiple accounts to avoid timeouts, referencing list_email_accounts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_m365_directory · Search Microsoft 365 Directory · Grade A · Read-only
Search your organization's Microsoft 365 directory for users by name or email. Returns matching users with their title, department, and contact info.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 10, max 25) | |
| query | Yes | Name or email to search for, e.g. 'Sarah' or 'sarah@contoso.com' | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| users | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds context: it returns matching users with specific fields, and implies safe read-only behavior. No contradictions exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no fluff, front-loaded with the tool's action and output. Every word serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple search tool with an output schema (present but not detailed), the description sufficiently covers purpose, input, and output. It mentions the key return fields, which aligns with typical directory search needs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both 'query' and 'limit' parameters fully described. The description only reinforces 'search by name or email', adding no new semantics beyond the schema. Baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'search', the resource 'Microsoft 365 directory', and the search criteria (name or email), along with the returned fields (title, department, contact info). This distinguishes it from sibling search tools like search_contacts or m365_search_contacts, which have overlapping but different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implicitly guides usage by specifying the search target and return fields, but lacks explicit guidance on when to use this tool versus alternatives (e.g., search_contacts, m365_search_contacts). There is no mention of exclusions or required prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_messages: Search Messages (A, Read-only)
Searches iMessage conversations by content, sender name, or date range.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 30) | |
| query | No | Text to search for in message content (optional if from_sender is set) | |
| since | No | ISO8601 date — only return messages on or after this date (optional, e.g. '2026-04-10' or '2026-04-10T00:00:00Z') | |
| from_sender | No | Substring of sender name/handle to filter by (optional) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| since | No | |
| results | No | |
| from_sender | No | |
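As a rough illustration of how the parameters above combine, the sketch below builds an MCP `tools/call` request for search_messages. The helper name `build_search_messages_call` is hypothetical, and rejecting a call with neither `query` nor `from_sender` is an assumption read off the "optional if from_sender is set" note, not confirmed server behavior.

```python
import json
from datetime import datetime


def build_search_messages_call(query=None, from_sender=None, since=None,
                               limit=30, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for search_messages."""
    # Assumption: at least one of query / from_sender must be present,
    # since the table marks query as optional only when from_sender is set.
    if not query and not from_sender:
        raise ValueError("provide query, from_sender, or both")

    arguments = {"limit": limit}
    if query:
        arguments["query"] = query
    if from_sender:
        arguments["from_sender"] = from_sender
    if since:
        # Accept a plain date ('2026-04-10') or a timestamp ending in 'Z'.
        # fromisoformat raises ValueError if the string is not ISO 8601.
        datetime.fromisoformat(since.replace("Z", "+00:00"))
        arguments["since"] = since

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_messages", "arguments": arguments},
    }


request = build_search_messages_call(from_sender="Alex",
                                     since="2026-04-10T00:00:00Z")
print(json.dumps(request, indent=2))
```

Validating `since` on the client side is a design choice here; it surfaces a malformed date before the call reaches the server.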
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the safety profile is clear. The description adds little behavioral context beyond them: it is consistent with a read-only, non-destructive search but discloses no further traits such as pagination or substring matching.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence of 10 words, front-loading the key action and criteria. It is extremely concise with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and full parameter coverage in the input schema, the description captures the essential functionality. It omits no critical context for a simple search tool, though it could mention return format or result ordering.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description only summarizes the parameter purposes (content, sender name, date range) without adding new meaning or syntax details. It does not compensate for any schema gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Searches' and the resource 'iMessage conversations', with explicit criteria ('by content, sender name, or date range'). This distinguishes it from sibling tools like list_message_chats (which only lists chats) and read_messages (which reads specific messages).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like list_message_chats or search_emails. It does not mention any conditions or exclusions, leaving the agent to infer usage context from the name and sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_notes: Search Notes (B, Read-only)
Searches Apple Notes by title or content.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 20 |
| query | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| results | No | |
| next_actions | No | |
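The reviews on this page refer to "schema description coverage". A minimal sketch of one way to compute such a figure follows, assuming it means the fraction of input parameters that carry a non-empty description. The function `description_coverage`, the parameter types, and the wording in `documented_schema` are all hypothetical and not taken from the server.

```python
def description_coverage(schema):
    """Return the fraction of properties that carry a non-empty description."""
    properties = schema.get("properties", {})
    if not properties:
        return 1.0
    described = sum(
        1 for spec in properties.values()
        if spec.get("description", "").strip()
    )
    return described / len(properties)


# Roughly what the input table implies: names and a default, no descriptions.
# The types are assumptions; the listing does not show them.
bare_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "default": 20},
    },
    "required": ["query"],
}

# Hypothetical wording showing what a documented version could look like.
documented_schema = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "Text to look for in note titles and bodies.",
        },
        "limit": {
            "type": "integer",
            "default": 20,
            "description": "Maximum number of notes to return.",
        },
    },
    "required": ["query"],
}

print(description_coverage(bare_schema), description_coverage(documented_schema))
```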
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, which is consistent with a search operation. The description adds that it searches by title or content, which is useful. It does not say whether matching is full-text or how snippets are returned, although the output schema may cover the return format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (one sentence) but lacks necessary detail for an agent to fully understand the tool's behavior. It could be longer to include parameter hints or usage notes while remaining efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool is a simple search (2 params, one required) and has an output schema, the description is somewhat complete. However, it fails to differentiate the tool from similar siblings (list_notes, read_note) or mention limitations, leaving gaps for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description should compensate. It only mentions 'by title or content' which partially explains the 'query' parameter, but does not clarify format, case sensitivity, or the effect of 'limit'. The description adds minimal value over the raw schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Searches' and the resource 'Apple Notes', and specifies the search criteria 'by title or content'. It distinguishes itself from siblings like list_notes which likely lists notes without search, but does not elaborate on the scope or method of search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use search_notes versus alternatives like list_notes or read_note. The description implies it is for searching, but does not specify scenarios or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_omnifocus_tasks: Search OmniFocus Tasks (B, Read-only)
Searches OmniFocus tasks by name or note content.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 30 |
| query | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| results | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, covering safety. The description adds behavioral context about searching by name or note content, but does not disclose further traits like search behavior (case sensitivity, partial matching), pagination, or rate limits. With annotations present, the description provides moderate added value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence with no wasted words. It is front-loaded with the verb and resource, and every word contributes to the purpose. Perfectly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a search tool with two parameters and an output schema, the description is minimal. It does not explain how the search algorithm works (e.g., substring match, fuzzy), expected behaviors for empty queries or limits, or any edge cases. The existence of an output schema partially compensates for return values, but the overall completeness is low.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so the description must compensate. It clarifies that the 'query' parameter is used to search by name or note content, but does not address the 'limit' parameter (integer, default=30) at all. The description adds some meaning for one parameter but fails to describe the other, leaving the agent to guess.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'searches' and the resource 'OmniFocus tasks', and specifies the search fields 'by name or note content'. It distinguishes from sibling tools like list_omnifocus_tasks (which returns all tasks) and search_notes (which searches notes by content). However, it could be more specific about match semantics (exact, partial, case-sensitive).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when needing to find tasks by name or note content, but does not explicitly state when to use this tool versus alternatives like list_omnifocus_tasks or search_notes. No exclusions or when-not-to-use guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_email: Send Email (A, Destructive)
Composes and sends an email via Mail.app. Supports plain text or HTML body. Pass 'from' to send from a specific configured Mail.app account instead of the default sender. Pass 'attachments' as a comma-separated list of absolute file paths to attach files.
| Name | Required | Description | Default |
|---|---|---|---|
| cc | No | | |
| to | Yes | | |
| bcc | No | | |
| body | No | | |
| from | No | | |
| confirm | No | | false |
| subject | Yes | | |
| html_body | No | | |
| attachments | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| to | No | |
| from | No | |
| sent | No | |
| subject | No | |
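The description says attachments are passed as a comma-separated list of absolute file paths. A minimal sketch of assembling send_email arguments under that rule is shown below. The helper `build_send_email_args` is hypothetical, and rejecting paths that contain a comma is a conservative assumption, since the listing does not say whether the server supports any escaping.

```python
import posixpath


def build_send_email_args(to, subject, body=None, html_body=None,
                          sender=None, attachments=()):
    """Assemble the arguments object for a send_email call."""
    args = {"to": to, "subject": subject}
    if body is not None:
        args["body"] = body
    if html_body is not None:
        args["html_body"] = html_body
    if sender is not None:
        # 'from' is a Python keyword, so the helper takes it as `sender`.
        args["from"] = sender
    if attachments:
        for path in attachments:
            # macOS paths are POSIX-style, so posixpath is used explicitly.
            if not posixpath.isabs(path):
                raise ValueError(f"attachment path must be absolute: {path}")
            # Assumption: a comma inside a path would be misread as a separator.
            if "," in path:
                raise ValueError(f"attachment path contains a comma: {path}")
        args["attachments"] = ",".join(attachments)
    return args


args = build_send_email_args(
    to="alex@example.com",
    subject="Quarterly report",
    body="Report attached.",
    attachments=["/Users/me/report.pdf", "/Users/me/summary.csv"],
)
print(args["attachments"])
```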
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true, which the description supports by stating it sends an email. The description adds context about attachments and html_body but does not disclose potential failure modes or prerequisites. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences long, front-loaded with the core purpose, and adds detail about body type and optional parameters without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the destructive hint and presence of an output schema, the description covers essential usage for a simple email tool. It could mention error cases or prerequisites like Mail.app configuration, but it is largely complete for basic operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates by explaining 'from' (a specific account) and 'attachments' (comma-separated absolute paths). However, it gives no details for 'cc', 'bcc', 'body', 'html_body', or 'confirm' (whose only documentation is its default value), leaving most parameters named but unexplained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool composes and sends an email via Mail.app, specifying the verb, resource, and platform. It distinguishes from sibling email tools like reply_email and m365_send_email.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for usage (Mail.app specific) and explains how to use optional parameters like 'from' and 'attachments'. However, it does not explicitly state when not to use this tool or mention alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message: Send Message (A)
Sends an iMessage via the Mac's Messages.app to a recipient handle (phone number with country code, e.g. +14155551234, or an Apple ID email). This is a write operation: the first call (without confirm) returns a preview; call again with confirm=true to actually send. Direct (1:1) iMessage only — sending into an existing group chat isn't supported yet. Requires Messages.app signed in to iMessage + Automation permission.
| Name | Required | Description | Default |
|---|---|---|---|
| to | Yes | Recipient handle: phone number with country code (+14155551234) or Apple ID email. | |
| text | Yes | Message body to send. | |
| confirm | No | Set true to actually send. Without it, returns a preview only. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| to | No | |
| note | No | |
| sent | No | |
| text | No | |
| error | No | |
| preview | No | |
| service | No | |
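The preview-then-confirm pattern in the description can be sketched as a small wrapper. Everything below is illustrative: `call_tool` stands in for whatever MCP client function is available, `fake_call_tool` is a stub used only to exercise the flow, and the exact shape of the preview result is an assumption based on the output field names.

```python
def send_imessage(call_tool, to, text, approve):
    """Preview first; send only if approve(preview_result) returns True."""
    # First call omits confirm, so per the description nothing is sent yet.
    preview = call_tool("send_message", {"to": to, "text": text})
    if preview.get("error") or not approve(preview):
        return preview
    # Second call repeats the same arguments with confirm=true.
    return call_tool("send_message", {"to": to, "text": text, "confirm": True})


def fake_call_tool(name, arguments):
    """Stub client: hypothetical results, not the server's real payloads."""
    base = {"to": arguments["to"], "text": arguments["text"]}
    if arguments.get("confirm"):
        return {**base, "sent": True}
    return {**base, "preview": True, "sent": False}


result = send_imessage(fake_call_tool, "+14155551234", "Running late",
                       approve=lambda preview: True)
print(result["sent"])
```

Passing the approval step in as a callback keeps the human (or policy) check between the two calls, which is the point of the two-step design.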
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that it's a write operation (not read-only), the two-step process to prevent accidental sends, and prerequisites (Messages.app signed in, Automation permission). The annotations already mark the tool as neither read-only nor destructive, and the description adds valuable context beyond them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, each serving a specific purpose: action and recipient format, usage pattern, limitation, and requirements. No filler, well front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With output schema present, the description covers all necessary context: prerequisites, usage pattern, limitations, and parameter meaning. It is fully self-contained for an AI agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description adds value by giving an example phone number format and by spelling out the two-step role of the 'confirm' parameter, although the schema already describes each parameter adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states it sends an iMessage via Messages.app, specifying the recipient handle format (phone with country code or Apple ID). It clearly distinguishes from sibling messaging tools like teams_send_message, whatsapp_send_message, etc., by naming iMessage.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit two-step usage: first call without confirm returns preview, second with confirm=true sends. Also notes limitation: only 1:1, no group chats. This guides when to use and how.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
servicenow_add_comment: ServiceNow Add Comment (A)
Add a comment or work note to a ServiceNow incident. Comments are visible to the caller; work notes are internal only.
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Comment or work note text | |
| type | No | 'comment' (visible to caller, default) or 'work_note' (internal only) | |
| sys_id | Yes | Incident sys_id from servicenow_get_incident | |
Output Schema
| Name | Required | Description |
|---|---|---|
| type | Yes | 'comment' or 'work_note' |
| added | Yes | |
| sys_id | Yes | |
| message | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate write operation (readOnlyHint=false) and non-destructive (destructiveHint=false). The description adds the visibility difference between comment and work note, but does not detail other traits like whether it appends or replaces existing comments. The additional context is moderate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no redundancy. Front-loaded with the action and resource. Every word contributes to understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (3 parameters, output schema exists), the description covers the core purpose and the key distinction between comment types. It is complete for an AI agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all parameters. The description adds value by explaining the 'type' parameter's options and their visibility implications, which enriches the schema. However, 'text' and 'sys_id' are straightforward and need no further clarification.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it adds a comment or work note to a ServiceNow incident, specifying the difference between comment (visible to caller) and work note (internal). It distinguishes from sibling tools like servicenow_update_incident or servicenow_create_incident.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for adding feedback to incidents but does not explicitly state when to use this tool versus alternatives like servicenow_update_incident or when not to use it. Lacks explicit guidance on prerequisites or context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
servicenow_create_incident: ServiceNow Create Incident (B)
Create a new incident in ServiceNow.
| Name | Required | Description | Default |
|---|---|---|---|
| urgency | No | 1=Critical, 2=High, 3=Medium (default), 4=Low | |
| category | No | Incident category, e.g. 'software', 'hardware', 'network' | |
| description | No | Full description of the issue | |
| short_description | Yes | Brief summary of the issue (required) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| url | No | |
| number | Yes | |
| sys_id | Yes | |
| short_description | No | |
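The urgency codes in the parameter table lend themselves to a named enumeration. The sketch below is one hypothetical way to build servicenow_create_incident arguments. It assumes the server accepts urgency as an integer, which the listing does not state, and `Urgency` and `build_create_incident_args` are illustrative names.

```python
from enum import IntEnum


class Urgency(IntEnum):
    """Urgency codes as listed in the parameter table."""
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3  # documented default
    LOW = 4


def build_create_incident_args(short_description, description=None,
                               category=None, urgency=Urgency.MEDIUM):
    """Assemble arguments for servicenow_create_incident."""
    if not short_description or not short_description.strip():
        raise ValueError("short_description is required")
    # Urgency(...) raises ValueError for codes outside 1..4.
    args = {
        "short_description": short_description.strip(),
        "urgency": int(Urgency(urgency)),
    }
    if description:
        args["description"] = description
    if category:
        args["category"] = category
    return args


print(build_create_incident_args("Printer not working",
                                 category="hardware",
                                 urgency=Urgency.HIGH))
```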
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate this is a mutation (readOnlyHint=false). The description adds no extra behavioral context such as required permissions, reversibility, or return value details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no wasted words. While arguably too brief, it is concise and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists, so return values need not be described. However, the description lacks context about required prior connection (connect_servicenow) and does not specify that it creates under the current user.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema covers all 4 parameters with descriptions (100% coverage). The tool description adds no additional meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Create') and the resource ('a new incident in ServiceNow'). It distinguishes from sibling tools like servicenow_get_incident, servicenow_update_incident, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., servicenow_add_comment for comments, servicenow_update_incident for updates). Does not mention prerequisites like connecting ServiceNow first.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
servicenow_get_incident: ServiceNow Get Incident (A, Read-only)
Get full details of a specific ServiceNow incident by number or sys_id.
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | Incident number (e.g. 'INC0012345') or sys_id | |
Output Schema
| Name | Required | Description |
|---|---|---|
| url | No | |
| state | No | |
| impact | No | |
| number | Yes | |
| sys_id | Yes | |
| urgency | No | |
| category | No | |
| priority | No | |
| caller_id | No | |
| opened_at | No | |
| updated_at | No | |
| work_notes | No | |
| assigned_to | No | |
| close_notes | No | |
| description | No | |
| resolved_at | No | |
| short_description | No | |
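Because the 'id' parameter accepts either an incident number or a sys_id, a caller may want to check which kind of identifier it holds before calling. The sketch below assumes incident numbers look like the documented example ('INC' followed by digits) and that a sys_id is a 32-character hexadecimal string, as is conventional in ServiceNow. Neither pattern is confirmed by this listing, and `classify_incident_id` is a hypothetical helper.

```python
import re

# Assumed patterns, see the caveats in the text above.
INCIDENT_NUMBER = re.compile(r"INC\d+", re.IGNORECASE)
SYS_ID = re.compile(r"[0-9a-f]{32}", re.IGNORECASE)


def classify_incident_id(value):
    """Return 'number', 'sys_id', or 'unknown' for a candidate identifier."""
    value = value.strip()
    if INCIDENT_NUMBER.fullmatch(value):
        return "number"
    if SYS_ID.fullmatch(value):
        return "sys_id"
    return "unknown"


for candidate in ("INC0012345", "9d385017c611228701d22104cc95c371", "printer"):
    print(candidate, "->", classify_incident_id(candidate))
```

An 'unknown' result suggests the text is a search phrase rather than an identifier, in which case servicenow_search_incidents is likely the better fit.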
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows this is a safe read. The description adds no extra behavioral context beyond confirming it retrieves full details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with verb and resource, no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has an output schema and a single parameter, the description is complete: it specifies what the tool does and what identifier to use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the schema already documents the 'id' parameter with examples. The description's mention of 'by number or sys_id' adds little beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get', the resource 'full details of a specific ServiceNow incident', and the identifier types (number or sys_id). It distinguishes from sibling tools like servicenow_create_incident or servicenow_search_incidents.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when you have an incident number or sys_id, but does not explicitly state when to use this tool over alternatives, nor provide any exclusions or context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
servicenow_list_my_incidents: ServiceNow List My Incidents (A, Read-only)
List incidents assigned to you or opened by you in ServiceNow.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 20, max 50) | |
| state | No | Filter by state: 'open' (default), 'resolved', 'all' | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | |
| incidents | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds behavioral context by specifying the scope (assigned to or opened by user). This goes beyond what annotations provide, though it doesn't disclose response size limits or pagination details beyond the limit parameter.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with no extraneous information. Every word adds value for an AI agent to understand the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has an output schema, so return values need not be explained. The description covers the core functionality and filtering scope. It implicitly assumes a connected ServiceNow account (with sibling connect_servicenow), which is acceptable for this tool's context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters (limit and state) already described in the input schema. The tool description adds no additional parameter meaning beyond what is in the schema, so baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'list', the resource 'incidents', and the scope 'assigned to you or opened by you'. This distinguishes it from sibling tools like servicenow_search_incidents (broader search) and servicenow_get_incident (single incident retrieval).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies when to use: for listing personal incidents (assigned or opened by user). It implies the context of incidents relevant to the current user, but does not explicitly state when not to use it or mention alternatives like servicenow_search_incidents for broader queries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
servicenow_search_incidents: ServiceNow Search Incidents (A, Read-only)
Search incidents in ServiceNow by keyword, number, or caller.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 10, max 50) | |
| query | Yes | Free-text search or incident number (e.g. 'INC0012345' or 'printer not working') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | |
| incidents | Yes | |
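The three read-only ServiceNow tools on this page overlap, so a caller has to choose among them. The routing rule below is one hypothetical policy, not guidance from the server: an exact incident number goes to servicenow_get_incident, a request for the user's own incidents goes to servicenow_list_my_incidents, and everything else becomes a free-text search. The function name and the `mine` flag are illustrative.

```python
import re

# Assumed number format, based on the documented example 'INC0012345'.
INCIDENT_NUMBER = re.compile(r"INC\d+", re.IGNORECASE)


def pick_incident_tool(request="", mine=False):
    """Return (tool_name, arguments) for a read-only incident lookup."""
    text = request.strip()
    if INCIDENT_NUMBER.fullmatch(text):
        # A full incident number identifies exactly one record.
        return "servicenow_get_incident", {"id": text.upper()}
    if mine:
        # Documented defaults: state 'open', limit 20.
        return "servicenow_list_my_incidents", {"state": "open", "limit": 20}
    # Documented default limit for search is 10.
    return "servicenow_search_incidents", {"query": text, "limit": 10}


print(pick_incident_tool("inc0012345"))
print(pick_incident_tool(mine=True))
print(pick_incident_tool("printer not working"))
```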
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds minimal behavioral context (searchable fields) but does not disclose pagination, result format, or whether full incident details are returned. For a read-only tool with annotations, the description adds little beyond what is already known.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with no wasted words. Every word adds value: verb, resource, and searchable fields.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity, annotations, and existing output schema, the description covers the basic purpose and search scope. However, it lacks guidance on sibling differentiation (e.g., when to use get_incident or list_my_incidents) and behavioral details like pagination or default ordering. The description is adequate but could be enhanced for completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, but the tool description explicitly mentions 'caller' as a searchable field, which is not in the schema's query description. This adds meaning beyond the schema, compensating for the schema's omission.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'search', the resource 'incidents', and specifies searchable fields (keyword, number, caller). It distinguishes from siblings like 'servicenow_get_incident' and 'servicenow_list_my_incidents' which have different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for searching incidents but does not explicitly state when to use this tool versus alternatives (e.g., get_incident for a specific number, list_my_incidents for assigned incidents). No when-not-to-use guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
servicenow_search_kb · ServiceNow Search KB · B · Read-only
Search the ServiceNow Knowledge Base for articles.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 5, max 20) | |
| query | Yes | Search terms, e.g. 'reset password' or 'VPN setup' | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | |
| articles | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
While annotations already indicate read-only and non-destructive behavior, the description adds no further behavioral context such as scope (connected account), result format, or pagination details. It offers minimal value beyond the structured fields.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, which is concise but lacks structure. It could be expanded slightly to include key context without becoming verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and annotations, the description is minimally adequate. However, it does not mention that the search is within the connected ServiceNow account or clarify the scope of articles returned.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already provides descriptions for both parameters, achieving 100% coverage. The description adds no additional semantic meaning or usage tips about the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('search') and resource ('ServiceNow Knowledge Base'). It effectively distinguishes from sibling tool servicenow_search_incidents, which searches a different resource.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for searching knowledge base articles but does not explicitly specify when to use versus alternatives like servicenow_search_incidents or other search tools. It lacks guidance on prerequisites or context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
servicenow_update_incident · ServiceNow Update Incident · A
Update fields on an existing ServiceNow incident — state, priority, assignment. Use sys_id from servicenow_get_incident.
| Name | Required | Description | Default |
|---|---|---|---|
| state | No | 1=New, 2=In Progress, 3=On Hold, 6=Resolved, 7=Closed | |
| sys_id | Yes | Incident sys_id from servicenow_get_incident | |
| urgency | No | 1=Critical, 2=High, 3=Medium, 4=Low | |
| priority | No | 1=Critical, 2=High, 3=Moderate, 4=Low, 5=Planning | |
| assigned_to | No | Username or email to assign to | |
| close_notes | No | Resolution notes (required when state=6 or 7) | |
| short_description | No | Updated summary | |
Output Schema
| Name | Required | Description |
|---|---|---|
| url | No | |
| state | No | |
| number | No | |
| sys_id | Yes | |
| updated | Yes | |
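The parameter table implies one cross-field rule: `close_notes` is required when `state` is 6 or 7. A minimal client-side check might look like the sketch below. The helper name `build_update_arguments` is hypothetical, and treating the codes as integers is an assumption, since the listing does not state the parameter types.

```python
# State codes copied from the parameter table above.
STATES = {1: "New", 2: "In Progress", 3: "On Hold", 6: "Resolved", 7: "Closed"}


def build_update_arguments(sys_id, state=None, priority=None, urgency=None,
                           assigned_to=None, close_notes=None, short_description=None):
    """Validate and assemble arguments for servicenow_update_incident (hypothetical helper)."""
    if not sys_id:
        raise ValueError("sys_id is required (take it from servicenow_get_incident)")
    if state is not None and state not in STATES:
        raise ValueError(f"state must be one of {sorted(STATES)}")
    if state in (6, 7) and not close_notes:
        raise ValueError("close_notes is required when state is 6 or 7")
    if priority is not None and priority not in range(1, 6):
        raise ValueError("priority must be 1-5")
    if urgency is not None and urgency not in range(1, 5):
        raise ValueError("urgency must be 1-4")
    optional = {
        "state": state,
        "priority": priority,
        "urgency": urgency,
        "assigned_to": assigned_to,
        "close_notes": close_notes,
        "short_description": short_description,
    }
    # Send only the fields the caller actually wants to change.
    return {"sys_id": sys_id, **{k: v for k, v in optional.items() if v is not None}}


# "example-sys-id" is a placeholder, not a real ServiceNow sys_id.
arguments = build_update_arguments("example-sys-id", state=6, close_notes="Replaced toner cartridge")
print(arguments)
```

Checking the rule before the call avoids a round trip that would presumably be rejected; the exact server-side error behavior is not documented in this listing.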
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate not read-only, not destructive. Description adds only that it updates fields; no additional behavioral traits like permissions or side effects are disclosed. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences; each word earns its place. Front-loaded with verb and resource, then examples, then usage tip. No redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 7 parameters, 1 required, annotations present, and an output schema, the description is sufficient. It covers prerequisite and key fields. Could mention idempotency, but not required.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed descriptions. The description reinforces key parameters (state, priority, assignment) and sys_id usage but adds no new semantic meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Update', the resource 'existing ServiceNow incident', and lists key fields (state, priority, assignment). It distinguishes from sibling tools like create and get by specifying 'existing' and referencing sys_id from get_incident.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance to use sys_id from servicenow_get_incident, indicating the prerequisite. Does not explicitly state when not to use, but the verb and context imply it's for updates only.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
setup_install · Install LMCP · A · Read-only
Install LMCP — free Mac app giving the user access to Mail, Calendar, Contacts, Teams, OneDrive, Notes, Reminders, and 100+ tools on their Mac (data stays local). Call this NOW to show their personalized install link (~30 sec). Pass os="macos" unless they said otherwise (windows/linux/ios/android). Optional: email, step, issue.
| Name | Required | Description | Default |
|---|---|---|---|
| os | Yes | macos | windows | linux | ios | android. Cloud connectors must pass os (or server asks). Desktop terminal clients may omit → macOS. Windows/Linux/mobile → waitlist (macOS-only today). | |
| step | No | If stuck: connector | install | email | connecting_stuck | server_down. | |
| email | No | Optional. Helps Cloud Relay auto-connect after install. | |
| issue | No | Optional tag: gatekeeper_error, dot_not_green, install_failed, etc. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| os | No | Target operating system the instructions are for, when known. |
| instructions | Yes | Full human-readable, step-by-step install/setup text. |
| install_command | No | One-line terminal command to install LMCP, when applicable to this OS/step. |
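The `os` handling described above (default to macOS unless the user said otherwise, with every other platform going to a waitlist) can be sketched as follows. Both helper names are hypothetical, and the waitlist check simply restates the table text rather than any confirmed server logic.

```python
# Values listed for the os parameter in the table above.
SUPPORTED_OS = {"macos", "windows", "linux", "ios", "android"}


def build_setup_install_arguments(stated_os=None, email=None, step=None, issue=None):
    """Assemble setup_install arguments, defaulting os to macOS (hypothetical helper)."""
    os_name = (stated_os or "macos").lower()
    if os_name not in SUPPORTED_OS:
        raise ValueError(f"os must be one of {sorted(SUPPORTED_OS)}")
    arguments = {"os": os_name}
    for key, value in (("email", email), ("step", step), ("issue", issue)):
        if value:
            arguments[key] = value
    return arguments


def expects_waitlist(os_name: str) -> bool:
    """Per the table, anything other than macOS leads to the waitlist today."""
    return os_name != "macos"


print(build_setup_install_arguments())
print(expects_waitlist("windows"))
```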
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds useful context beyond annotations: it clarifies the action is quick (~30 sec), data stays local, and non-macOS leads to a waitlist. It could mention more about state implications, but the readOnlyHint is consistent with generating a link.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four short sentences, front-loaded with key information. Every sentence adds value without unnecessary detail. Highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 4 parameters and an output schema, the description covers purpose, timing, parameter guidance, and edge cases (non-macOS). It does not describe the output, but the output schema exists. Minor gap: could mention that the link is for macOS only and what happens after.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds meaning beyond the schema: it explains the default os value, when to use optional parameters (step, email, issue) for troubleshooting, and the waitlist consequence for non-macOS. Schema coverage is 100%, so the description supplements rather than repeats.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool shows a personalized install link for the LMCP app, distinguishing it from siblings like lmcp_state or lmcp_welcome. The verb 'show' and the specific resource 'install link' make the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context on when to call (immediately) and how to set the os parameter (default macOS, otherwise waitlist). However, it does not explicitly mention when not to use this tool or suggest alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
signal_connect · Signal Connect · A · Read-only
Connect Signal to Local MCP. Reports whether Signal Desktop is installed and signed in, and tells you exactly what to do next — install Signal, or open it and link your phone. (Signal links inside its own desktop app, so the QR is shown there, not here.) Once you're signed in, signal_list_chats / signal_read_messages work. If Signal is already connected, it just reports that.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description aligns with annotations (readOnlyHint=true, destructiveHint=false) and adds valuable context: it only checks status, does not show QR code (shown in app), and after connection other tools become available. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is slightly lengthy but each sentence contributes meaning: purpose, behavior, common concern (QR location), and post-connection usage. It is well-structured and front-loaded with the main purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, diagnostic role), the description is complete: it covers what the tool does, what to expect, and prerequisites. It fully informs the agent for invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters, and schema coverage is 100%. The description explains what the tool does without needing parameter details, adding semantic context beyond the empty schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: to connect Signal to Local MCP by checking installation and sign-in status, providing next-step instructions. It distinguishes itself from sibling tools like signal_list_chats and signal_read_messages by noting that they work only after sign-in.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates when to use the tool (to connect Signal) and implies that if already connected, it simply reports that. It sets expectations about prerequisites (Signal Desktop installed, sign-in) but lacks explicit exclusion of alternatives or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
signal_friction · Signal Friction · A · Read-only
Send an ANONYMOUS, content-free signal when an LMCP tool fails, returns nothing useful, the user seems frustrated, or you could not accomplish what they asked. Helps the LMCP team find and fix the roughest spots. Send ONLY the category + the tool name — NEVER the user's request, message/email content, account names, or any personal data. No confirmation needed: this is anonymous (categories only) and respects the user's opt-out.
| Name | Required | Description | Default |
|---|---|---|---|
| attempt_count | No | How many times this was attempted (optional). | |
| error_category | No | Category of what went wrong (optional). | |
| tool_attempted | No | Name of the LMCP tool involved (e.g. list_emails). Optional. | |
| friction_signal | Yes | What kind of friction you observed. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
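One way to honor the "category + tool name only" rule is to build the payload through a function that accepts only the four documented fields. This is a sketch under assumptions: the helper name is hypothetical, the values `"tool_failed"` and `"timeout"` are placeholders because the listing does not enumerate the allowed categories, and the snake_case check on tool names is a guess based on names like `list_emails`.

```python
import re


def build_friction_arguments(friction_signal, tool_attempted=None,
                             error_category=None, attempt_count=None):
    """Assemble a content-free signal_friction payload (hypothetical helper)."""
    if not friction_signal:
        raise ValueError("friction_signal is required")
    # Assumption: tool names are snake_case identifiers, so anything containing
    # spaces or punctuation is probably user content and must not be sent.
    if tool_attempted is not None and not re.fullmatch(r"[a-z0-9_]+", tool_attempted):
        raise ValueError("tool_attempted must be a bare tool name")
    arguments = {"friction_signal": friction_signal}
    if tool_attempted is not None:
        arguments["tool_attempted"] = tool_attempted
    if error_category is not None:
        arguments["error_category"] = error_category
    if attempt_count is not None:
        arguments["attempt_count"] = int(attempt_count)
    return arguments


# Placeholder category values; the real enum is not shown in this listing.
payload = build_friction_arguments("tool_failed", tool_attempted="list_emails",
                                   error_category="timeout", attempt_count=2)
print(payload)
```

Because the function has no parameter for free text, the user's request or message content cannot end up in the payload by accident.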
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true and destructiveHint=false. The description adds crucial context: anonymity, content-free signal, no confirmation, and opt-out respect. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient paragraph with four sentences, each adding essential information. Front-loaded with purpose, followed by constraints. No redundant content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single required param, enums, no nested objects) and the presence of an output schema, the description fully covers usage, constraints, and privacy aspects, leaving no gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. Description adds value by emphasizing that only category and tool name should be sent, and that the signal is anonymous and content-free, which goes beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Send', the resource 'signal', and the specific triggering conditions (tool fails, user frustrated, etc.). It distinguishes itself from all sibling tools by its unique anonymous feedback purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly defines when to use (failures, frustration, incomplete tasks) and what not to include (personal data). It provides clear context but does not mention alternative tools, though none are needed given the tool's singularity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
signal_list_chats · Signal List Chats · A · Read-only
Lists Signal conversations (chats) with last-active timestamps. Reads from the local Signal Desktop database — no network access required. Returns chat IDs, contact names, and type (direct or group). Use the chat_id in subsequent signal_read_messages calls.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max chats to return (default 50) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| chats | Yes | Signal conversations |
| count | No | Number of chats returned |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint=true, destructiveHint=false), description adds that it reads from the local Signal Desktop database without network access, providing context about its behavior and safety.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences that efficiently convey purpose, behavior, and usage. No wasted words; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and the tool's simplicity, the description is complete. It mentions key return fields and provides a usage tip, making it fully informative for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a well-described 'limit' parameter. Description does not add any additional meaning or examples beyond what the schema provides, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it lists Signal conversations with last-active timestamps, reads from local database, and returns chat IDs, contact names, and type. It distinguishes from siblings like signal_read_messages by specifying the purpose and output.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides a clear usage hint: 'Use the chat_id in subsequent signal_read_messages calls.' It also notes that no network access is required. However, it does not explicitly contrast with alternative tools or state when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
signal_read_messages · Signal Read Messages · A · Read-only
Reads messages from a specific Signal chat. The chat_id must come from a previous signal_list_chats call. Returns messages in chronological order with sender phone numbers and body text. Only messages cached locally by Signal Desktop are available.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max messages to return (default 50) | |
| chat_id | Yes | Chat ID from signal_list_chats | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | Number of messages returned |
| messages | Yes | Messages from the chat, chronological |
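The prerequisite chain (list chats first, then read with a returned `chat_id`) can be sketched as below. The `call_tool` callable stands in for whatever MCP client is in use, and the stub plus its sample data are invented purely for illustration. The keys inside each chat and message (`chat_id`, `name`, `type`, `sender`, `body`) are assumptions; only the top-level `chats`, `messages`, and `count` fields appear in the output schemas.

```python
from typing import Callable

ToolCaller = Callable[[str, dict], dict]


def read_first_chat(call_tool: ToolCaller, limit: int = 50) -> list:
    """List chats, then read messages from the first one using its returned ID."""
    listing = call_tool("signal_list_chats", {"limit": limit})
    chats = listing["chats"]
    if not chats:
        return []
    # Never fabricate the ID: always take it from the listing result.
    chat_id = chats[0]["chat_id"]  # assumed key name
    result = call_tool("signal_read_messages", {"chat_id": chat_id, "limit": limit})
    return result["messages"]


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in for a real MCP client; returns made-up sample data."""
    if name == "signal_list_chats":
        return {"chats": [{"chat_id": "chat-1", "name": "Example", "type": "direct"}], "count": 1}
    if name == "signal_read_messages":
        return {"messages": [{"sender": "+10000000000", "body": "hello"}], "count": 1}
    raise KeyError(name)


messages = read_first_chat(fake_call_tool)
print(messages)
```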
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond the readOnlyHint annotation, the description adds important behavioral context: messages are only from local cache, returned in chronological order with sender and body. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, front-loaded with the key action, no unnecessary words. Each sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers what the tool does, its prerequisites, output format, and limitations. With an output schema present, no further details are needed. Complete for agent decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds meaningful context beyond the schema: it explains that chat_id must be obtained from signal_list_chats. Since schema coverage is 100%, the baseline is 3, but the prerequisite clarification elevates it to 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (reads messages), the resource (a specific Signal chat), and provides specifics like chronological order, sender info, and body text. It distinguishes from sibling tools like signal_send_message and signal_search_messages by focusing on reading cached messages from a chat.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies a prerequisite (chat_id must come from signal_list_chats) and notes that only cached messages are available. While it doesn't explicitly compare to alternatives, the context is sufficient for deciding when to use this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
signal_search_messages · Signal Search Messages · A · Read-only
Full-text search across locally-cached Signal messages. Only messages Signal Desktop has stored on disk are searched — no network access required. Optionally restrict search to a specific chat_id.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results to return (default 50) | |
| query | Yes | Search text (case-insensitive substring match) | |
| chat_id | No | Optional chat ID to restrict search | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | Number of results returned |
| results | Yes | Matching messages |
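To make the documented matching rule concrete (case-insensitive substring match, optional `chat_id` filter, `limit` cap), here is a small in-memory sketch. It is not the server's implementation, which reads Signal Desktop's local database; the message keys `chat_id` and `body` and the sample messages are assumptions for illustration.

```python
def search_messages(messages, query, chat_id=None, limit=50):
    """Case-insensitive substring search over in-memory messages (illustrative only)."""
    needle = query.casefold()
    hits = []
    for message in messages:
        if chat_id is not None and message["chat_id"] != chat_id:
            continue  # outside the requested chat
        if needle in message["body"].casefold():
            hits.append(message)
            if len(hits) >= limit:
                break
    return {"results": hits, "count": len(hits)}


# Made-up sample messages.
sample = [
    {"chat_id": "chat-1", "body": "Lunch at NOON?"},
    {"chat_id": "chat-2", "body": "See you at noon"},
    {"chat_id": "chat-1", "body": "Running late"},
]

print(search_messages(sample, "noon")["count"])
print(search_messages(sample, "noon", chat_id="chat-2")["results"])
```

Substring matching means a query like "noon" also matches inside longer words such as "afternoon", which is worth keeping in mind when choosing search terms.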
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds that no network access is needed, which is extra behavioral context beyond annotations. There is no contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences, front-loaded with the core purpose. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (handling return values), the description fully covers behavior: local-only search, optional chat restriction, case-insensitive substring match. No gaps remain for a search tool of this complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed parameter descriptions (e.g., case-insensitive substring match for query). The description adds minor reinforcement ('optionally restrict search to a specific chat_id') but does not significantly enhance meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs 'full-text search' on 'locally-cached Signal messages', specifying both action and resource. It distinguishes from siblings like signal_read_messages (read) and signal_list_chats (list) by focusing on search and caching.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explains that only locally cached messages are searched (no network access) and optionally allows restricting to a chat_id. While it does not explicitly name alternatives or exclusions, the local-only nature provides clear context on when to use this tool versus online search or message reading.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
signal_send_message · Signal Send Message · A
Preview (and get send guidance for) a message to a Signal chat. NOTE: Signal Desktop exposes no local send API — the Signal integration reads the local database read-only — so LMCP cannot transmit Signal messages directly. The first call (confirm=false or omitted) returns a preview. Pass confirm=true to get step-by-step guidance for completing the send. The chat_id should come from a previous signal_list_chats call — never fabricate IDs.
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Plain-text message body | |
| chat_id | Yes | Chat ID from signal_list_chats | |
| confirm | No | Set true for send guidance. Default: preview only. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
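The two-phase flow described above (preview first, then `confirm=true` for send guidance) might be driven by a helper like the one below. The function name and the `known_chat_ids` guard are hypothetical; the guard is one possible way to enforce the "never fabricate IDs" instruction on the client side.

```python
def build_send_arguments(text, chat_id, known_chat_ids, confirm=False):
    """Assemble signal_send_message arguments for the preview or guidance phase."""
    if not text:
        raise ValueError("text is required")
    # Only accept IDs that came back from an earlier signal_list_chats call.
    if chat_id not in known_chat_ids:
        raise ValueError("chat_id must come from signal_list_chats")
    arguments = {"text": text, "chat_id": chat_id}
    if confirm:
        arguments["confirm"] = True  # omitted or false means preview only
    return arguments


# "chat-1" is a placeholder ID, as if returned by signal_list_chats.
known = {"chat-1"}
preview_args = build_send_arguments("On my way", "chat-1", known)
guidance_args = build_send_arguments("On my way", "chat-1", known, confirm=True)
print(preview_args)
print(guidance_args)
```

Neither phase transmits anything: per the description, the second call returns step-by-step guidance for completing the send in Signal itself.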
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses the critical behavioral trait that the tool cannot actually send messages due to Signal Desktop's lack of a local send API, though annotations show readOnlyHint=false. It explains the two-phase process (preview then guidance) and adds context about read-only database access, going beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short paragraphs with a bolded NOTE. The first sentence states the purpose, followed by necessary caveats and usage flow. Every sentence adds information; there is no fluff. It is efficiently structured and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the three-parameter input schema, presence of output schema, and annotations, the description fully covers the unusual behavior (no direct send), explains the two-phase invocation, and provides critical external dependency (chat_id source). It is complete for an AI agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the parameters are already documented. The description adds value by explicitly stating that chat_id must come from signal_list_chats (never fabricated) and explains that confirm=true changes the output from preview to guidance, which is not clear from the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's function: 'Preview (and get send guidance for) a message to a Signal chat.' It distinguishes from siblings like 'send_message' and 'signal_list_chats' by noting that Signal Desktop has no local send API, so this tool only previews and guides rather than sending directly.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly describes when to use confirm=false (preview) vs confirm=true (send guidance), instructs that chat_id must come from a prior signal_list_chats call, and explains why the tool behaves this way (no direct send API). This provides clear when-to-use and when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
slack_list_channels: Slack List Channels (A, Read-only)
Lists channels in a Slack workspace, including public channels, private channels, and direct messages (DMs). Reads from the local IndexedDB cache — only channels that Slack Desktop has synced to disk are returned. Pass workspace_id from slack_list_workspaces to filter to a specific workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max channels to return (default 200) | |
| workspace_id | No | Workspace ID from slack_list_workspaces (optional; omit to list channels across all workspaces) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | Number of channels returned |
| channels | Yes | Channels and DMs synced to the local cache |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, but the description adds critical insight: 'Reads from the local IndexedDB cache — only channels that Slack Desktop has synced to disk are returned.' This explains data source and limitation, which annotations do not cover. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each with a clear purpose: what the tool does, data source limitation, and parameter guidance. No fluff or repetition. Front-loaded with primary action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists (not shown but indicated in signals), the description does not need to explain return values. It covers the key behavioral aspect (cache-based data source) and parameter guidance. It is complete for the tool's purpose.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters. The description adds value by explaining the source of workspace_id ('from slack_list_workspaces'). It does not mention limit; the default of 200 appears only in the schema. The additional context for workspace_id is helpful.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Lists channels in a Slack workspace, including public channels, private channels, and direct messages (DMs).' It uses a specific verb ('Lists') and specifies the resource ('channels in a Slack workspace'). It distinguishes from sibling tools like slack_read_channel_messages and slack_search_messages by focusing solely on listing channels.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly instructs to 'Pass workspace_id from slack_list_workspaces to filter to a specific workspace,' guiding parameter usage. While it doesn't explicitly state when not to use this tool, the context from sibling tools implies alternatives for reading messages or searching. The guidance is clear but could be more explicit about exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
slack_list_workspaces: Slack List Workspaces (A, Read-only)
Lists the Slack workspaces (teams) the user has connected in Slack Desktop. Reads from the local IndexedDB cache — no token needed. Only workspaces that have been synced to disk are returned.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | Number of workspaces returned |
| workspaces | Yes | Connected Slack workspaces synced to the local cache |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=true. The description adds beyond this by disclosing the data source (local IndexedDB cache), the lack of token requirement, and the limitation that only synced workspaces are returned. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each adding essential information: purpose, data source and token requirement, and the sync limitation. No redundant words, efficiently front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters and an output schema, the description covers core aspects: what it does, data source, requirement, and constraint. It could potentially mention return format but output schema handles that.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has no parameters and schema description coverage is 100%, so baseline is 3. The description adds no parameter info, which is acceptable for a parameterless tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists Slack workspaces (teams) connected in Slack Desktop, using the verb 'lists' and specifying the resource. It distinguishes from sibling tools like slack_list_channels by noting it reads from IndexedDB cache and requires no token.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for retrieving connected workspaces but does not explicitly state when to use vs alternatives or provide exclusions. No direct sibling for listing workspaces exists, so guidance is adequate but not explicit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
slack_read_channel_messages: Slack Read Channel Messages (A, Read-only)
Reads recent messages from a Slack channel or DM. Reads from the local IndexedDB cache — only messages that Slack Desktop has synced to disk are available (typically the last few hundred messages for active channels). channel_id must come from slack_list_channels.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max messages to return (default 50) | |
| channel_id | Yes | Channel ID from slack_list_channels | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | Number of messages returned |
| messages | Yes | Recent messages from the channel, oldest first |
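Because `channel_id` must come from `slack_list_channels`, the two calls are naturally chained. The sketch below shows one way to do that. `call_tool` is a hypothetical stand-in for an MCP client, and the per-channel key name (`id_key`, defaulting to `"id"`) is an assumption: the listing documents only the top-level `count`, `channels`, and `messages` fields.

```python
from typing import Any, Callable, Optional

# Hypothetical stand-in for an MCP client call: (tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def read_first_channel(
    call_tool: ToolCaller,
    workspace_id: Optional[str] = None,
    limit: int = 50,
    id_key: str = "id",  # assumed key name inside each channel entry
) -> list[Any]:
    """List cached channels, then read recent messages from the first one."""
    # Omitting workspace_id lists channels across all workspaces.
    args = {"workspace_id": workspace_id} if workspace_id else {}
    listing = call_tool("slack_list_channels", args)
    channels = listing.get("channels", [])
    if not channels:
        # Nothing has been synced to the local cache yet.
        return []
    channel_id = channels[0][id_key]
    result = call_tool(
        "slack_read_channel_messages", {"channel_id": channel_id, "limit": limit}
    )
    # Per the output schema, messages are returned oldest first.
    return result.get("messages", [])
```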
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true and destructiveHint=false, which the description reinforces. It goes beyond by disclosing the caching mechanism and typical message availability (last few hundred messages). This adds significant behavioral context beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with no filler. The first defines the purpose immediately; the second adds the crucial cache limitation; the third names the source of channel_id. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 params, no nested objects, output schema present), the description covers purpose, data source, parameter source, and caching limitation. No gaps for agent understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds value by stating that channel_id must come from slack_list_channels, linking it to another tool's output. This extra context lifts the score to 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Reads recent messages') and resource ('Slack channel or DM'). It specifies the source (local IndexedDB cache) and distinguishes the tool from search or listing tools. This meets the bar for a score of 5: a specific verb and resource, plus a clear distinction from siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives clear context: use for reading recent messages, channel_id must come from slack_list_channels. It implies that for older messages or search, another tool (slack_search_messages) should be used. Lacks explicit 'when not to use' but still provides practical guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
slack_search_messages: Slack Search Messages (A, Read-only)
Searches Slack messages across locally-cached channels using full-text substring matching. Only messages that Slack Desktop has synced to disk are searched — this is not the Slack cloud search API. Optionally restrict search to a specific channel_id.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results to return (default 50) | |
| query | Yes | Search text (case-insensitive substring match) | |
| channel_id | No | Optional channel ID to restrict search | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | Number of results returned |
| results | Yes | Matching messages, most recent first |
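To make the matching semantics concrete (case-insensitive substring match, optional channel filter, most recent first, capped by `limit`), here is a small local model of the behavior. It is an illustration of the documented semantics only, not the server's implementation, and the message fields `text`, `ts`, and `channel_id` are assumed names.

```python
from typing import Any, Optional


def search_cached(
    messages: list[dict[str, Any]],
    query: str,
    channel_id: Optional[str] = None,
    limit: int = 50,
) -> list[dict[str, Any]]:
    """Case-insensitive substring search over already-cached messages."""
    needle = query.casefold()
    hits = [
        m
        for m in messages
        if needle in m["text"].casefold()
        and (channel_id is None or m["channel_id"] == channel_id)
    ]
    # Most recent first, matching the documented result order.
    hits.sort(key=lambda m: m["ts"], reverse=True)
    return hits[:limit]
```

Substring matching means a query such as `deploy` also matches `redeployed`; there is no tokenization or ranking as in a cloud search API.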
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond the annotations (readOnlyHint, no destructive hint), the description reveals key behaviors: search is limited to locally-cached/synced data, uses substring matching, and is not the cloud API. This provides essential transparency for an AI agent to set expectations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, minimal yet complete, with the most critical information front-loaded. Every sentence adds value: the first defines scope and method, the second clarifies the local-cache limitation, and the third notes the optional channel_id filter. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description need not explain return values. It sufficiently covers input constraints and behavior. Minor missing detail: whether search spans all workspaces or just current; but overall complete for a read-only search tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description restates the channel_id parameter as optional but adds no new meaning beyond what the schema already provides. The mention of 'full-text substring matching' aligns with the query parameter but doesn't enhance parameter-level understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb 'searches' and resource 'Slack messages' with clear qualifiers: 'locally-cached channels', 'full-text substring matching'. It explicitly distinguishes from the Slack cloud search API, which sets it apart from any generic search messages sibling.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states when to use (searching locally cached Slack messages) and what it is not (cloud search API). It mentions optional channel_id restriction, giving context. However, it does not explicitly direct users to alternative tools like 'search_messages' for cloud searches, which would strengthen guidelines.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
stocks_get_chart: Stocks Get Chart (B, Read-only)
Gets historical price data for a stock symbol. Range: 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, max.
| Name | Required | Description | Default |
|---|---|---|---|
| range | No | Time range (default: 1mo) | |
| symbol | Yes | Ticker symbol, e.g. AAPL | |
| interval | No | Data interval (default: 1d). Intraday (1m–90m) needs a short range. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| name | No | |
| range | No | |
| symbol | No | |
| candles | No | |
| currency | No | |
| interval | No | |
| change_pct | No | |
| data_points | No | |
| current_price | No | |
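A caller can check `range` against the documented values before invoking the tool. The sketch below builds the argument dictionary with the documented defaults and flags intraday intervals. The helper names are hypothetical, and the intraday test only encodes the "1m–90m" wording from the schema; the listing does not say which ranges count as "short", so that is not enforced here.

```python
import re

# Range values listed in the tool description.
VALID_RANGES = ("1d", "5d", "1mo", "3mo", "6mo", "1y", "2y", "5y", "10y", "ytd", "max")


def is_intraday(interval: str) -> bool:
    """True for minute intervals from 1m to 90m (so '1mo' and '1d' are not intraday)."""
    match = re.fullmatch(r"(\d+)m", interval)
    return bool(match) and 1 <= int(match.group(1)) <= 90


def chart_args(symbol: str, range_: str = "1mo", interval: str = "1d") -> dict[str, str]:
    """Build arguments for stocks_get_chart, rejecting undocumented ranges."""
    if range_ not in VALID_RANGES:
        raise ValueError(f"unsupported range {range_!r}; expected one of {VALID_RANGES}")
    return {"symbol": symbol, "range": range_, "interval": interval}
```

When `is_intraday(interval)` is true, pairing it with a short range such as `1d` or `5d` is a reasonable starting point, given the schema's note.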
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false. The description adds no behavioral context beyond listing ranges, which is already in the schema. It does not disclose rate limits, auth needs, or any other behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, concise and front-loaded with the purpose. The second sentence listing ranges is somewhat redundant with the schema, but not overly verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, the description does not need to explain return values. However, it omits useful context like default range/interval or constraints (e.g., intraday needs short range), which are only in the schema. Adequate but missing some practical guidance.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description lists range options but adds no new meaning beyond what the schema provides, which is sufficient for a baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it gets historical price data for a stock symbol, which is a specific verb-resource pair. It also lists valid ranges, differentiating from sibling tools like stocks_get_quote (current quote) and stocks_search_symbol (search).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not explicitly state when to use this tool versus alternatives. It only describes what it does, leaving usage context implied. No exclusions or when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
stocks_get_quote: Stocks Get Quote (A, Read-only)
Gets current stock price and market data for one or more symbols (e.g. AAPL, MSFT, BTC-USD). Uses Yahoo Finance — no API key required.
| Name | Required | Description | Default |
|---|---|---|---|
| symbols | Yes | Ticker symbols, comma-separated ('AAPL,MSFT,GOOGL') or a JSON array (['AAPL','MSFT']) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
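Since `symbols` accepts either a comma-separated string or an array, a client may want to normalize its input to one form before calling the tool. The helper below is a hypothetical convenience, not part of the server; it produces the comma-separated form shown in the schema.

```python
from typing import Iterable, Union


def normalize_symbols(symbols: Union[str, Iterable[str]]) -> str:
    """Return a clean comma-separated ticker string, e.g. 'AAPL,MSFT'."""
    parts = symbols.split(",") if isinstance(symbols, str) else list(symbols)
    # Drop surrounding whitespace and empty entries left by stray commas.
    cleaned = [p.strip() for p in parts if p.strip()]
    if not cleaned:
        raise ValueError("at least one ticker symbol is required")
    return ",".join(cleaned)
```

For example, `normalize_symbols(["AAPL", "BTC-USD"])` and `normalize_symbols(" AAPL, BTC-USD ,")` both yield `"AAPL,BTC-USD"`.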
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds useful context: uses Yahoo Finance and requires no API key. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with verb. Every word is necessary. No fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (not shown in input), the description adequately covers the tool's purpose, usage, and authentication. For a simple quote tool, this is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% for the single parameter 'symbols'. The description adds that it gets 'current stock price and market data' but does not add new semantic details beyond the schema's description of acceptable formats.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb (gets) and resource (current stock price and market data for symbols). The examples (AAPL, MSFT, BTC-USD) and mention of Yahoo Finance distinguish it from sibling tools stocks_get_chart and stocks_search_symbol.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use (to get quote data) and notes no API key required. It does not explicitly state when not to use or compare to alternatives, but the purpose is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
stocks_search_symbol: Stocks Search Symbol (A, Read-only)
Searches for a stock ticker symbol by company name. Returns matching symbols and exchanges.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results (default 10) | |
| query | Yes | Company name or partial ticker to search for | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| query | No | |
| results | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare it as read-only, deterministic, and non-destructive. The description adds that it returns symbols and exchanges, but lacks details on matching behavior, pagination, or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long with no wasted words. It is front-loaded with the core action and efficiently states the output.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 params, output schema exists), the description covers purpose and output adequately. However, it could briefly mention matching behavior (e.g., partial match, case sensitivity) for completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Both parameters are fully described in the input schema, so the description adds little beyond connecting the query parameter to the search. The limit parameter is not mentioned in the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it searches for a stock ticker symbol by company name and returns matching symbols and exchanges. This distinguishes it from sibling tools like stocks_get_quote and stocks_get_chart.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use it (when you have a company name and need a ticker), but it does not explicitly contrast with other stock tools or provide guidance on when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
teams_list_channels: Teams List Channels (A, Read-only)
Lists channels in a Microsoft Teams workspace. Returns channels that are cached in the local Teams client. If the result is empty, the channels have not been loaded into the local cache yet — ask the user to open Microsoft Teams and browse to the team's channels, then try again.
| Name | Required | Description | Default |
|---|---|---|---|
| team_id | Yes | Team ID from list_teams | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| error | No | |
| channels | No | |
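The empty-result guidance in the description can be turned into explicit handling on the caller's side. In this sketch, `call_tool` is a hypothetical stand-in for an MCP client, and treating `error` as a human-readable string is an assumption; the listing gives only the field names `count`, `error`, and `channels`.

```python
from typing import Any, Callable, Optional

# Hypothetical stand-in for an MCP client call: (tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]

EMPTY_CACHE_HINT = (
    "No channels are cached yet. Open Microsoft Teams, browse to the team's "
    "channels, then try again."
)


def list_team_channels(
    call_tool: ToolCaller, team_id: str
) -> tuple[list[Any], Optional[str]]:
    """Return (channels, message); message is set when the user must act."""
    result = call_tool("teams_list_channels", {"team_id": team_id})
    if result.get("error"):
        # Surface the tool's own error text (assumed to be a string).
        return [], str(result["error"])
    channels = result.get("channels") or []
    if not channels:
        # An empty list means the local cache has not been populated.
        return [], EMPTY_CACHE_HINT
    return channels, None
```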
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds critical behavioral trait beyond readOnlyHint annotation: results depend on local cache and may be empty if not loaded.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with a front-loaded purpose and no waste: what the tool lists, where the data comes from, and what to do when the result is empty.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple listing tool with output schema, description provides necessary caching caveat; complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema already fully describes the single parameter (team_id) with 100% coverage; description adds no extra semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it lists channels in a Microsoft Teams workspace, distinguishing it from sibling tools like teams_list_teams and teams_list_chats.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit context about cache dependency and guidance when result is empty, but does not compare with alternative channels-list tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
teams_list_chats: Teams List Chats (A, Read-only)
Lists Microsoft Teams chats (direct messages and group chats).
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max chats to return (default 50) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| chats | No | |
| count | No | |
| error | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, so the safety profile is clear. The description adds that it lists both DMs and group chats but does not disclose other behavioral traits like pagination, ordering, or filtering. This is adequate but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with no unnecessary words. It efficiently communicates the tool's purpose without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with one optional parameter and a provided output schema, the description is complete enough. It accurately conveys what the tool returns, and no additional context is necessary given the annotations and schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage for the single parameter (limit). The description does not add semantic meaning beyond the schema's own description, which already documents the default of 50. Therefore, baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists Microsoft Teams chats, specifying both direct messages and group chats. This provides a specific verb and resource, distinguishing it from sibling tools like teams_list_channels or teams_list_teams.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide any guidance on when to use this tool versus alternatives. Although siblings include many chat-related tools, no conditions or exclusions are mentioned, leaving the agent to infer usage without explicit direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
teams_list_teams: Teams List Teams (A, Read-only)
Lists all Microsoft Teams workspaces the user belongs to.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| error | No | |
| teams | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds no further behavioral context, such as pagination, rate limits, or what happens if the user has no teams.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single, clear sentence that conveys the tool's purpose without any unnecessary words. Perfectly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple no-parameter tool with an output schema, the description is nearly complete. It could be slightly enhanced by noting that it returns all teams without filtering, but it is adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With zero parameters and 100% schema coverage, the description is not required to explain parameters. The minimal description is acceptable.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('lists') and resource ('Microsoft Teams workspaces'), clearly distinguishing it from sibling tools like teams_list_channels and teams_list_chats.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for listing teams the user belongs to but provides no explicit guidance on when to use this tool versus alternatives, nor any when-not scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
teams_read_channel_messages · Teams Read Channel Messages · A · Read-only
Reads messages from a Microsoft Teams channel.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max messages (default 50) | |
| team_id | Yes | Team ID | |
| channel_id | Yes | Channel ID from list_teams_channels | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| error | No | |
| messages | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds no behavioral context beyond them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, front-loaded sentence with no wasted words; efficiently conveys the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While output schema exists, the description lacks details on ordering, pagination, or message content; it is adequate but minimal.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description does not add further meaning to the parameters beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the action (reads messages) and resource (Microsoft Teams channel). Distinguishes from siblings like teams_read_chat_messages and slack_read_channel_messages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives, such as teams_read_chat_messages or slack_read_channel_messages, nor when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
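The parameter table for teams_read_channel_messages implies an ordering: both IDs come from list tools before any read. The sketch below shows that chaining under stated assumptions. `call_tool` is a hypothetical client hook, and the `team_id` argument of teams_list_channels, its `channels` result key, and the `id` field on each item are assumptions; the listing only names `count`, `error`, `teams`, and `messages`.

```python
from typing import Any, Callable

# Hypothetical client hook: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def read_first_channel(call_tool: ToolCaller, limit: int = 50) -> list[Any]:
    """Resolve team_id and channel_id via the list tools, then read messages."""
    teams = call_tool("teams_list_teams", {}).get("teams") or []
    if not teams:
        return []
    team_id = teams[0]["id"]  # assumed item shape: {"id": ...}

    # Assumption: teams_list_channels takes team_id and returns a "channels" list.
    listing = call_tool("teams_list_channels", {"team_id": team_id})
    channels = listing.get("channels") or []
    if not channels:
        return []

    result = call_tool(
        "teams_read_channel_messages",
        {"team_id": team_id, "channel_id": channels[0]["id"], "limit": limit},
    )
    if result.get("error"):
        raise RuntimeError(result["error"])
    return result.get("messages") or []
```

Returning an empty list when no team or channel is found keeps the caller from passing a fabricated ID to the read tool.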
teams_read_chat_messages · Teams Read Chat Messages · A · Read-only
Reads messages from a Teams chat or direct message thread.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max messages (default 50) | |
| chat_id | Yes | Chat ID from list_teams_chats | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| error | No | |
| messages | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, making the tool's safety profile clear. The description adds no extra behavioral detail such as rate limits, authentication needs, or whether messages are returned in chronological order. It aligns with annotations, so no contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence that front-loads the core purpose. It avoids unnecessary details while still being informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity, the description covers the core functionality. An output schema exists, so return values need not be detailed. The description is complete enough for typical use, though additional context about message ordering or formatting could be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage for both parameters (chat_id and limit). The tool description does not add any additional meaning beyond what the schema already provides, so baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb and resource: 'Reads messages from a Teams chat or direct message thread.' It is clearly distinguished from the sibling tool 'teams_read_channel_messages', which reads channel messages instead.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like teams_read_channel_messages, teams_send_message, or search_messages. The agent must infer context from the name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
teams_send_channel_message · Teams Send Channel Message · A
Sends a text message to a Microsoft Teams channel via Graph API. Requires connect_m365_account with Chat.ReadWrite / ChannelMessage.Send permissions. team_id and channel_id must come from teams_list_teams / teams_list_channels. First call returns a preview; set confirm=true to send.
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Plain-text message body | |
| confirm | No | Set true to send; false returns preview | |
| team_id | Yes | Team ID from teams_list_teams | |
| channel_id | Yes | Channel ID from teams_list_channels | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint=false, destructiveHint=false), the description discloses permissions needed, that the first call returns a preview, and that setting confirm=true is required to send. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences packed with essential information and no fluff. Every sentence serves a purpose: what it does, prerequisites, where the IDs come from, and the usage pattern.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers how to obtain IDs, the permission requirements, and the confirm workflow. The output schema declares no output parameters, so the shape of the preview and send responses is the only thing left undocumented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds critical context: team_id and channel_id must come from specific list tools, text is plain-text, and confirm controls preview vs. send. This adds significant value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states that the tool sends a text message to a Microsoft Teams channel via Graph API, distinguishing it from similar tools like teams_send_message (which targets chats). It clearly identifies the resource (channel) and action (send).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Prerequisites are listed: requires connect_m365_account with specific permissions, and team_id/channel_id must come from teams_list_teams and teams_list_channels. The description also explains the two-step confirm mechanism and how to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
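The preview-then-confirm flow described for teams_send_channel_message can be sketched as two calls with identical arguments that differ only in `confirm`. This is a minimal illustration, not the server's implementation: `call_tool` and `approve` are hypothetical callables, and the shape of the preview result is unknown because the output schema lists no parameters.

```python
from typing import Any, Callable, Optional

# Hypothetical client hook: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def send_channel_message(
    call_tool: ToolCaller,
    team_id: str,
    channel_id: str,
    text: str,
    approve: Callable[[dict[str, Any]], bool],
) -> Optional[dict[str, Any]]:
    """Request a preview first; send only if the preview is approved."""
    args = {"team_id": team_id, "channel_id": channel_id, "text": text}

    # First call: confirm is false, so the tool returns a preview.
    preview = call_tool("teams_send_channel_message", {**args, "confirm": False})
    if not approve(preview):
        return None  # nothing was sent

    # Second call: same arguments with confirm=true actually sends.
    return call_tool("teams_send_channel_message", {**args, "confirm": True})
```

Passing the approval step in as a callable lets the caller decide whether a human or a policy check reviews the preview.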
teams_send_message · Teams Send Message · A
Sends a text message to a Microsoft Teams chat or channel. Requires Microsoft Teams to be running and signed in (token is read fresh from Teams' local cookies on each call). The chat_id MUST come from a previous teams_list_chats call — never fabricate ids. This is a write operation: the first call returns a preview, the second call (with confirm=true) actually sends.
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Plain-text message body. Max 28000 chars. No formatting / mentions / attachments in v1. | |
| chat_id | Yes | Thread id from teams_list_chats (e.g. '19:<uuid>_<uuid>@unq.gbl.spaces' for 1:1, '19:<uuid>@thread.tacv2' for group) | |
| confirm | No | Must be true to actually send. Without it, returns a preview without making any network call. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds beyond annotations: two-step commit (preview then actual send), no network call without confirm, token read from cookies. Consistent with openWorldHint. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, front-loaded with purpose, concise. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers key workflow (two-step commit), prerequisites (Teams running, chat_id source), constraints (never fabricate ids). Output schema exists, so return values need not be in description.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage 100% provides baseline 3. Description adds workflow context: confirm must be true to send, chat_id dependency on list call, enhancing parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb and resource: 'sends a text message to a Microsoft Teams chat or channel'. This distinguishes it from other messaging tools, but because the description covers both chats and channels, it potentially overlaps with the sibling 'teams_send_channel_message'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit context: Teams must be running, chat_id must come from teams_list_chats, and two-step commit with confirm parameter. No when-not-to-use, but helpful workflow instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
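The chat_id examples and the 28000-character limit in the teams_send_message parameter table lend themselves to a client-side pre-check. The sketch below assumes `<uuid>` means the hyphenated 8-4-4-4-12 hex form; real thread IDs may use other shapes, so an "unknown" result is a prompt to re-run teams_list_chats, not proof that the ID is invalid. All function names are hypothetical.

```python
import re

# Assumption: <uuid> in the schema examples is the hyphenated 8-4-4-4-12 form.
UUID = r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
ONE_TO_ONE = re.compile(rf"19:{UUID}_{UUID}@unq\.gbl\.spaces")
GROUP = re.compile(rf"19:{UUID}@thread\.tacv2")
MAX_TEXT_CHARS = 28000  # limit stated in the parameter table


def classify_chat_id(chat_id: str) -> str:
    """Return 'one_to_one', 'group', or 'unknown' for a chat_id string."""
    if ONE_TO_ONE.fullmatch(chat_id):
        return "one_to_one"
    if GROUP.fullmatch(chat_id):
        return "group"
    return "unknown"


def check_text(text: str) -> str:
    """Reject empty or over-long message bodies before calling the tool."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    return text
```

A check like this only catches malformed input early; the ID itself should still come from a previous teams_list_chats call.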
todo_complete_task · To Do Complete Task · B
Marks a Microsoft To Do task as complete (via Reminders sync).
| Name | Required | Description | Default |
|---|---|---|---|
| list | No | List name to narrow search by title (optional) | |
| title | No | Task title (partial match, alternative to task_id) | |
| confirm | No | Must be true to complete | |
| task_id | No | Task ID from todo_list_tasks | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint=false, destructiveHint=false), the description adds no additional behavioral context such as prerequisites, side effects, or permissions needed. It merely restates the action already implied by the name.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single, concise sentence with no wasted words. It effectively conveys the tool's purpose without unnecessary detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
All parameters are documented in the input schema, and the description covers the essential purpose. It could mention the need for confirm=true or describe error handling, and the output schema declares no output parameters, but completeness is adequate for a simple mutation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description adds no extra meaning beyond what is in the schema. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (Marks as complete) and target resource (Microsoft To Do task). It also distinguishes from siblings like 'complete_omnifocus_task' and 'complete_reminder' by mentioning the sync mechanism.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. The description mentions 'via Reminders sync' but does not explain when it should be preferred over other complete-task tools or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
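The todo_complete_task table describes `title` as an alternative to `task_id`, with `list` narrowing a title search. A caller-side argument builder might look like the sketch below. The rule that at least one of `task_id` or `title` must be supplied is an assumption (the schema marks all four parameters optional), and the function name is hypothetical.

```python
from typing import Any, Optional


def complete_task_args(
    task_id: Optional[str] = None,
    title: Optional[str] = None,
    list_name: Optional[str] = None,
    confirm: bool = False,
) -> dict[str, Any]:
    """Build arguments for todo_complete_task from an ID or a title."""
    # Assumption: the tool needs some way to identify the task.
    if not task_id and not title:
        raise ValueError("provide task_id or title")

    args: dict[str, Any] = {"confirm": confirm}
    if task_id:
        # Prefer the exact ID from todo_list_tasks over a partial title match.
        args["task_id"] = task_id
    else:
        args["title"] = title
        if list_name:
            args["list"] = list_name  # narrows the title search
    return args
```

Preferring `task_id` avoids completing the wrong task when several titles share a substring.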
todo_create_task · To Do Create Task · A
Creates a task in Microsoft To Do (via Reminders sync). Task appears in To Do automatically once synced.
| Name | Required | Description | Default |
|---|---|---|---|
| list | No | List name (from todo_list_lists). Defaults to first available list. | |
| notes | No | Task notes (optional) | |
| title | Yes | Task title | |
| confirm | No | Must be true to create | |
| due_date | No | Due date (YYYY-MM-DD, optional) | |
| priority | No | Priority: 1=high, 5=medium, 9=low (optional) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate it's not read-only and not destructive, which aligns with the description. The description adds the sync delay context but does not disclose potential side effects or authentication requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences that front-load the main action and include relevant sync context without wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description doesn't need to explain return values. It includes the sync note and covers the main function, though it could mention the confirm parameter's requirement explicitly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All parameters have descriptions in the schema (100% coverage), so the description adds no extra meaning beyond what's already provided. The baseline of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool creates a task in Microsoft To Do and mentions the sync mechanism, which distinguishes it from other task creation tools like create_reminder or create_omnifocus_task.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide explicit guidance on when to use this tool versus alternatives like create_reminder or create_omnifocus_task. It mentions the sync behavior but lacks context on prerequisites or selection criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
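The todo_create_task table encodes two conventions that are easy to get wrong: priority as 1, 5, or 9, and due dates as YYYY-MM-DD. The sketch below maps friendlier inputs onto those values. The function name and the "high"/"medium"/"low" labels as inputs are hypothetical; only the numeric codes and the date format come from the table.

```python
from datetime import date
from typing import Any, Optional

# Numeric codes from the parameter table: 1=high, 5=medium, 9=low.
PRIORITY = {"high": 1, "medium": 5, "low": 9}


def create_task_args(
    title: str,
    due: Optional[date] = None,
    priority: Optional[str] = None,
    notes: Optional[str] = None,
    list_name: Optional[str] = None,
    confirm: bool = False,
) -> dict[str, Any]:
    """Build arguments for todo_create_task, omitting unset optional fields."""
    if not title:
        raise ValueError("title is required")
    args: dict[str, Any] = {"title": title, "confirm": confirm}
    if due is not None:
        args["due_date"] = due.isoformat()  # date.isoformat() yields YYYY-MM-DD
    if priority is not None:
        args["priority"] = PRIORITY[priority]  # KeyError on an unknown label
    if notes:
        args["notes"] = notes
    if list_name:
        args["list"] = list_name  # otherwise the first available list is used
    return args
```

For example, `create_task_args("Pay rent", due=date(2025, 1, 31), priority="high")` yields a `due_date` of `"2025-01-31"` and a `priority` of `1`.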
todo_list_lists · To Do List Lists · A · Read-only
Lists Microsoft To Do task lists. Requires Microsoft account in Reminders sync (System Settings → Internet Accounts → Microsoft Exchange → enable Reminders).
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| note | No | |
| count | No | |
| lists | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, and the description's mention of the sync requirement adds context beyond them without contradicting them. It could also note any limits or pagination, but it is adequate given the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no wasted words. The core action is front-loaded, and the prerequisite is a natural follow-up. Perfectly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no params, output schema present), the description covers purpose and setup requirement. Nothing essential is missing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, and schema coverage is 100%. The description adds no param info, which is fine as per guidelines: baseline 4 for zero-param tools.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that it lists Microsoft To Do task lists. The verb and resource are explicit, and it is distinct from sibling tools like 'list_reminder_lists', which likely corresponds to the native Reminders app.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides a clear prerequisite: a Microsoft account with Reminders sync enabled. This gives context for when the tool can be used. It does not say when not to use it or suggest alternatives, although the extensive sibling list and the context imply it is meant for To Do.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
todo_list_tasks · To Do List Tasks · A · Read-only
Lists tasks from a Microsoft To Do list (or any Reminders list). Syncs via macOS Reminders.
| Name | Required | Description | Default |
|---|---|---|---|
| list | No | List name (from todo_list_lists). Leave empty to show all. | |
| limit | No | Max tasks to return (default 50) | |
| include_completed | No | Include completed tasks (default false) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| tasks | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the description's behavioral disclosure is minimal. It adds no extra details on auth, rate limits, or side effects beyond the sync mechanism, which is consistent with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences, front-loaded with the core functionality. Every word is necessary, and there is no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of a complete output schema and annotations, the description is fairly complete. It could add a note about default behavior (e.g., returns tasks from all lists if list name omitted), but overall it provides sufficient context for a well-annotated tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds no parameter-specific information beyond what the schema already provides. The baseline of 3 is appropriate as the schema handles parameter semantics adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Lists tasks from a Microsoft To Do list (or any Reminders list)', specifying the operation and target resources. It distinguishes from siblings like todo_complete_task and todo_create_task by focusing on listing, and mentions the broader Reminders compatibility.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for listing tasks from To Do or Reminders, but does not explicitly state when to use this tool versus alternatives like list_reminders. No when-not-to-use guidance is provided, making it adequate but incomplete.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_calendar_event · Update Calendar Event · A
Updates an existing event in the Mac's Calendar app (Calendar.app) by ID. Pass only the fields you want to change — unspecified fields are left as-is. Get the event_id from list_calendar_events. For Microsoft 365 use the m365 calendar tools instead.
| Name | Required | Description | Default |
|---|---|---|---|
| span | No | For recurring events: 'this' (default) or 'future' | |
| notes | No | New notes — pass empty string to clear (optional) | |
| title | No | New title (optional) | |
| confirm | No | Must be true to apply changes | |
| end_date | No | New end datetime ISO 8601 (optional) | |
| event_id | Yes | Event identifier from list_calendar_events | |
| location | No | New location — pass empty string to clear (optional) | |
| start_date | No | New start datetime ISO 8601 (optional) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| end | No | |
| notes | No | |
| start | No | |
| title | No | |
| all_day | No | |
| updated | No | |
| calendar | No | |
| location | No | |
| attendees | No | |
| calendar_id | No | |
| attendees_total | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate a write operation (readOnlyHint=false) that is non-destructive (destructiveHint=false). The description adds that unspecified fields are left as-is. The confirm safeguard (must be true to apply changes) is documented only in the schema, not in the description. Overall, this provides useful behavioral context beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences, each earning its place: purpose, partial-update behavior, where to get the event_id, and the Microsoft 365 alternative. It is front-loaded and contains no redundant or vague language.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of 8 parameters and the presence of an output schema, the description covers the core operation, the important behavior of partial updates (unspecified fields are left unchanged), and the distinction from sibling tools. It could mention the span parameter for recurring events explicitly, but that is detailed in the schema. Overall, it is complete enough for effective tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds the general instruction to pass only fields to change and that unspecified fields are left as-is, but does not add new parameter-level meaning beyond what the schema provides. The schema descriptions are already comprehensive for each parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it updates an existing event in Mac's Calendar app by ID. It distinguishes from sibling tools like create_calendar_event and m365 tools by specifying the target app and constraints. The verb 'updates' and resource 'event' are specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says to get event_id from list_calendar_events and to use m365 tools for Microsoft 365 events, providing clear when-to-use and when-not-to-use guidance. It does not mention alternatives like create_calendar_event for creating events, but the purpose is clear enough from the name and context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_local_mcp: Update Local MCP (grade A)
Checks for and installs LMCP updates. Installing downloads the update and RESTARTS LMCP (the AI client briefly reconnects), so it requires confirm=true. Pass check_only=true to only report whether an update is available, with no download or restart.
| Name | Required | Description | Default |
|---|---|---|---|
| confirm | No | Must be true to actually install (which restarts LMCP). Without it, returns availability + a preview. | |
| check_only | No | If true, only report availability: no download, no install, no restart. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
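
The two modes described above can be sketched as MCP `tools/call` requests. This is a minimal illustration, assuming the standard JSON-RPC 2.0 envelope that MCP uses for tool calls; the `tool_call` helper is hypothetical, and the server's response shape is not documented on this page, so it is not shown.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator for this sketch


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: only report whether an update exists (no download, no restart).
check = tool_call("update_local_mcp", {"check_only": True})

# Step 2: install only after the user agrees, because this restarts LMCP.
install = tool_call("update_local_mcp", {"confirm": True})

print(json.dumps(check, indent=2))
print(json.dumps(install, indent=2))
```

Sending the check first keeps the restart an explicit, separate decision.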
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description discloses that installing causes a restart of LMCP, which is key behavioral information beyond what annotations provide (readOnlyHint=false, destructiveHint=false). It also clarifies that confirm=true is required for installation. However, it doesn't discuss potential side effects like loss of unsaved state or prerequisites like internet connectivity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, highly efficient. Front-loaded with the main purpose, then parameter details. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with two boolean parameters and an output schema, the description is complete. It explains both parameters and their effects. Could be slightly improved by noting that check_only is useful for avoiding restarts, but overall it provides sufficient context for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and description adds significant context beyond the schema descriptions. For 'confirm', it explains the effect of setting it true vs false. For 'check_only', it clearly states the outcome. This helps an agent understand the exact consequences of each parameter value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool's function: checking for and installing LMCP updates. It uses specific verbs ('checks', 'installs') and identifies the specific resource (LMCP updates). It distinguishes itself from related tools like 'lmcp_state' and 'lmcp_welcome' by focusing on updates.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description explains when to use 'confirm' vs 'check_only' parameters, providing clear context for each mode. It doesn't explicitly state when not to use this tool or list alternative tools, but the update-specific domain makes usage clear. Could be improved by mentioning that this is the sole tool for updating LMCP.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_note: Update Note (grade A)
Updates an existing note in Apple Notes. Change the title and/or body. Find note_id with list_notes or search_notes. Requires confirm=true.
| Name | Required | Description | Default |
|---|---|---|---|
| body | No | | |
| title | No | | |
| confirm | No | | |
| note_id | No | | |
| note_name | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| id | No | |
| name | No | |
| updated | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already show it's a write operation (readOnlyHint=false) and non-destructive. Description adds the confirm=true requirement, which is key behavioral info. Could mention error cases like missing note.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four concise sentences, front-loaded with purpose, no wasted words. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple update tool with output schema, description covers core inputs and a critical requirement. Missing note_name parameter explanation, but otherwise adequate for agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so description must compensate. It explains note_id, title, body, and confirm, but note_name is not mentioned. Adds meaning beyond types but incomplete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
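
The coverage figure can be read as the share of input parameters that carry a non-empty schema description. The sketch below assumes that definition, since the page does not spell out the formula; `schema_coverage` and the property dictionaries are illustrative, reconstructed from the parameter tables for update_note and update_reminder.

```python
def schema_coverage(properties: dict) -> float:
    """Return the fraction of parameters that have a non-empty description."""
    if not properties:
        # Assumption: with no parameters, nothing is left undocumented.
        return 1.0
    described = sum(
        1 for spec in properties.values() if spec.get("description", "").strip()
    )
    return described / len(properties)


# update_note: five parameters, none described in the schema.
update_note = {
    name: {} for name in ("body", "title", "confirm", "note_id", "note_name")
}

# update_reminder: seven parameters, all described.
update_reminder = {
    "notes": {"description": "New notes text (optional)"},
    "title": {"description": "New title (optional)"},
    "confirm": {"description": "Must be true to apply changes"},
    "due_date": {"description": "New ISO 8601 due date. Pass empty string to clear (optional)"},
    "priority": {"description": "Priority: none | low | medium | high (optional)"},
    "list_name": {"description": "Move the reminder to this list (optional)"},
    "reminder_id": {"description": "Reminder identifier from list_reminders"},
}

print(f"update_note: {schema_coverage(update_note):.0%}")
print(f"update_reminder: {schema_coverage(update_reminder):.0%}")
```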
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Updates' and the resource 'existing note in Apple Notes', and distinguishes from siblings like create_note and list_notes by mentioning how to find note_id.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It says when to use (to update a note), how to get note_id, and the confirm requirement. No explicit exclusions or alternatives, but sufficient for basic usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_reminder: Update Reminder (grade A)
Updates an existing reminder in Reminders.app. Change the title, due date, notes, priority, or move it to another list (list_name). Get reminder_id from list_reminders. Requires confirm=true.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | New notes text (optional) | |
| title | No | New title (optional) | |
| confirm | No | Must be true to apply changes | |
| due_date | No | New ISO 8601 due date. Pass empty string to clear (optional) | |
| priority | No | Priority: none | low | medium | high (optional) | |
| list_name | No | Move the reminder to this list (a name from list_reminder_lists) (optional) | |
| reminder_id | Yes | Reminder identifier from list_reminders | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
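
As a rough illustration of the parameter rules in the table above, the following sketch builds an argument object for update_reminder and checks it client-side. The helper name `update_reminder_args` is hypothetical, and the validation is an assumption about what a careful caller might do, not a description of what the server enforces.

```python
from datetime import datetime

PRIORITIES = {"none", "low", "medium", "high"}


def update_reminder_args(reminder_id, *, title=None, notes=None, due_date=None,
                         priority=None, list_name=None, confirm=False):
    """Build update_reminder arguments, sending only the fields to change."""
    if not reminder_id:
        raise ValueError("reminder_id is required (take it from list_reminders)")
    if priority is not None and priority not in PRIORITIES:
        raise ValueError(f"priority must be one of {sorted(PRIORITIES)}")
    if due_date:
        # An empty string clears the due date, so only non-empty values are
        # parsed. fromisoformat raises ValueError on malformed input; older
        # Python versions accept only a subset of ISO 8601.
        datetime.fromisoformat(due_date)

    args = {"reminder_id": reminder_id, "confirm": confirm}
    optional = {
        "title": title,
        "notes": notes,
        "due_date": due_date,
        "priority": priority,
        "list_name": list_name,
    }
    # Omit untouched fields entirely instead of sending nulls.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


# Clear the due date and raise the priority of a reminder (hypothetical id).
print(update_reminder_args("reminder-123", due_date="", priority="high", confirm=True))
```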
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate write operation (readOnlyHint=false) and non-destructive (destructiveHint=false); description adds the confirm flag requirement, which is key behavioral info beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences: purpose, modifiable fields, and two essential usage notes. No fluff, front-loaded, efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists, so return value details are covered. Description covers parameters, usage, and behavioral requirement. Minor gap: no mention of any side effects or prerequisites beyond confirm, but still sufficient for selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds value by grouping parameters into actions ('Change the title, due date, notes, priority, or move it to another list') and explaining reminder_id and confirm.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Updates an existing reminder in Reminders.app' and lists specific parameters (title, due date, notes, priority, list_name), distinguishing it from create/delete/complete reminders.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit retrieval method ('Get reminder_id from list_reminders') and a required behavior ('Requires confirm=true'), but lacks explicit when-to-use vs alternatives, though implied by 'update'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_self_diagnosis: Update Self Diagnosis (grade C)
Returns the self-update health state: current version, last N update attempts (with errors), writability of the update cache, and any stale LMCP binaries found at alternate paths. Call this when auto-update seems stuck or when you need to explain to a user why they're on an old version.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max recent attempts to return (default 10) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| cache_dir | Yes | |
| running_from | Yes | Real path of the currently running binary. |
| binaries_found | Yes | LMCP binaries found at known alternate paths. |
| cache_writable | Yes | Whether the update cache dir is writable (#1 silent-failure cause). |
| current_version | Yes | |
| last_success_at | Yes | ISO 8601 timestamp of last successful update, empty if none. |
| recent_attempts | Yes | Recent update attempts, newest first. |
| consecutive_failures | Yes | |
| recent_attempts_count | Yes | |
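
To show how the output fields above might be turned into an explanation for a user, here is a small sketch. It assumes `binaries_found` is a list of path strings and `consecutive_failures` is an integer; the page lists the field names but not their types. The sample values, including the paths and version, are made up for illustration.

```python
def explain_update_health(diag: dict) -> list:
    """Summarize likely reasons an installation is stuck on an old version."""
    findings = []
    if not diag["cache_writable"]:
        findings.append(f"Update cache is not writable: {diag['cache_dir']}")
    if diag["consecutive_failures"] > 0:
        findings.append(
            f"{diag['consecutive_failures']} consecutive failed update attempts"
        )
    if not diag["last_success_at"]:
        findings.append("No successful update on record")
    # Any binary other than the running one may be a stale copy.
    stale = [path for path in diag["binaries_found"] if path != diag["running_from"]]
    if stale:
        findings.append("Other LMCP binaries found: " + ", ".join(stale))
    return findings


# Hypothetical tool result, shaped after the output schema table.
sample = {
    "cache_dir": "/tmp/lmcp-cache",
    "running_from": "/opt/lmcp/bin/lmcp",
    "binaries_found": ["/opt/lmcp/bin/lmcp", "/usr/local/bin/lmcp"],
    "cache_writable": False,
    "current_version": "0.0.0",
    "last_success_at": "",
    "recent_attempts": [],
    "consecutive_failures": 3,
    "recent_attempts_count": 0,
}

for finding in explain_update_health(sample):
    print("-", finding)
```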
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description says 'Returns the self-update health state,' implying a read-only operation, but the annotations set readOnlyHint=false, which signals potential side effects. This is a clear contradiction. The name also suggests mutation, which the description does not reflect. Beyond this contradiction, the description discloses no additional behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, concise and front-loaded with the main purpose. No waste. Minor deduction for not addressing the name mismatch.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (one optional param, output schema exists), the description provides sufficient context about what is returned. However, the contradiction between name, description, and annotations leaves the tool's actual behavior unclear. The output schema is present, so return values need not be detailed, but the behavioral ambiguity is a critical gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema already covers the 'limit' parameter (100% coverage), including its purpose and default: 'Max recent attempts to return (default 10)'. The description's 'last N update attempts' hints at what limit controls but adds no syntax or constraints beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The name 'update_self_diagnosis' suggests a mutation (update), but the description states it returns state. This misalignment causes confusion about the tool's actual purpose. The description itself is clear about returning health state, but the contradiction with the name reduces clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to call this tool: 'when auto-update seems stuck or when you need to explain to a user why they're on an old version.' This provides good context for use. However, it does not mention when not to use it or alternative tools among siblings (though siblings include many read/diagnose tools).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whatsapp_connect: WhatsApp Connect (grade A)
Link WhatsApp to Local MCP by showing a QR code right here in the chat — no Terminal needed. Call this, then on your phone open WhatsApp → Settings → Linked Devices → Link a Device, and scan the QR shown. After you scan, WhatsApp tools (whatsapp_list_chats, whatsapp_read_messages, …) start working. If WhatsApp is already linked, it just reports that.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses beyond annotations: describes that a QR code is shown in chat, requires external user action on phone, and that WhatsApp tools become operational after scanning. No contradiction with annotations (readOnlyHint=false, openWorldHint=true, destructiveHint=false).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four concise sentences front-load the main purpose and include all necessary steps and outcomes. Every sentence adds value with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no parameters and a straightforward connection task, the description fully covers the user's experience: what happens (QR shown), what the user must do (scan on phone), post-link effect (WhatsApp tools work), and edge case (already linked). Output schema exists but its presence does not detract from completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters, so no parameter descriptions are needed. Baseline for 0 params is 4, and the description appropriately does not add any unnecessary param info.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Link WhatsApp') and resource ('Local MCP') and explains the mechanism (showing QR code). It also distinguishes from sibling WhatsApp tools by specifying that after linking, WhatsApp tools (whatsapp_list_chats, etc.) start working.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit step-by-step instructions for use: call this, then on phone navigate to Settings → Linked Devices → Link a Device and scan the QR. Also covers the alternative scenario ('If WhatsApp is already linked, it just reports that'), making when-to-use very clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whatsapp_list_chats: WhatsApp List Chats (grade A, read-only)
Lists WhatsApp conversations with last message preview. Returns chat IDs, contact names, and recent message snippets. Some contacts may appear with @lid identifiers (e.g. 123456@lid) instead of phone numbers — this is a WhatsApp privacy feature for certain account types; use the Name field for display and the JID/chat_id for subsequent calls. ⚠️ Uses Wacli (unofficial WhatsApp client). Accounts may be restricted for ToS violations.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max chats to return (default 50) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| chats | Yes | WhatsApp conversations with last message preview |
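
The advice about @lid identifiers (show the name, keep the ID for later calls) can be sketched as follows. The entry keys `name` and `chat_id` are assumptions: the description mentions a Name field and a JID/chat_id, but the exact shape of each item in `chats` is not documented on this page.

```python
def display_label(chat: dict) -> str:
    """Pick a human-readable label; never use it as an identifier."""
    name = (chat.get("name") or "").strip()
    if name:
        return name
    chat_id = chat["chat_id"]
    if chat_id.endswith("@lid"):
        # Privacy identifier: the numeric part is not a phone number.
        return "Unknown contact"
    # Otherwise fall back to the part before the '@'.
    return chat_id.split("@", 1)[0]


# Hypothetical entries shaped like items of the `chats` output field.
chats = [
    {"name": "Alice", "chat_id": "123456@lid"},
    {"name": "", "chat_id": "123456@lid"},
]

for chat in chats:
    # The label is for display; chat_id is what later tool calls receive.
    print(display_label(chat), "->", chat["chat_id"])
```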
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint and destructiveHint, but the description adds critical context: unofficial client Wacli, potential account restrictions for ToS violations, and privacy features regarding @lid identifiers. This goes beyond annotation requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, front-loaded with the main action and returned fields, followed by important details. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with one optional parameter and an output schema, the description covers return values, special identifier formats, and risks. The agent has sufficient information to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'limit' is fully described in the input schema (schema coverage 100%). The description adds no additional parameter information, meeting the baseline expectation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists WhatsApp conversations with last message preview, identifying specific resources (chat IDs, contact names, snippets). It differentiates from sibling tools by referencing WhatsApp-specific identifiers (@lid) and the unofficial client Wacli.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for listing WhatsApp chats but provides no explicit guidance on when to use this tool versus alternatives like signal_list_chats or teams_list_chats. The warning about ToS violations offers some context for appropriate use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whatsapp_read_messages: WhatsApp Read Messages (grade A, read-only)
Reads messages from a specific WhatsApp chat. The chat_id must come from a previous whatsapp_list_chats call. ⚠️ Uses Wacli (unofficial WhatsApp client). Accounts may be restricted for ToS violations.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max messages to return (default 50) | |
| chat_id | Yes | Chat ID from whatsapp_list_chats | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | Number of messages returned |
| messages | Yes | Messages from the chat, chronological |
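
The prerequisite above (take chat_id from whatsapp_list_chats, then read) amounts to a two-call chain. The sketch below assumes a generic `call_tool(name, arguments)` function supplied by whatever MCP client is in use; that function, the chat entry keys `name` and `chat_id`, and the stub data are all hypothetical.

```python
from typing import Callable


def read_chat_by_name(call_tool: Callable[[str, dict], dict],
                      name: str, limit: int = 50) -> list:
    """Resolve a chat by display name, then read its messages."""
    chats = call_tool("whatsapp_list_chats", {})["chats"]
    match = next((chat for chat in chats if chat.get("name") == name), None)
    if match is None:
        return []  # never guess an ID when the chat is not listed
    result = call_tool(
        "whatsapp_read_messages",
        {"chat_id": match["chat_id"], "limit": limit},
    )
    return result["messages"]


def fake_call_tool(tool: str, arguments: dict) -> dict:
    """Stand-in for a real MCP client, returning canned data."""
    if tool == "whatsapp_list_chats":
        return {"chats": [{"name": "Alice", "chat_id": "123456@lid"}]}
    if tool == "whatsapp_read_messages" and arguments["chat_id"] == "123456@lid":
        return {"count": 1, "messages": [{"text": "hello"}]}
    raise ValueError(f"unexpected call: {tool}")


print(read_chat_by_name(fake_call_tool, "Alice"))
```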
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=true. The description adds critical behavioral context: 'Uses Wacli (unofficial WhatsApp client). Accounts may be restricted for ToS violations.' This warns of potential account risk beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two core sentences plus a short warning, front-loaded with key information, no redundancy. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers purpose, prerequisite, and risk. The tool has an output schema, so return values need not be described. For a simple read operation, this is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%; both parameters have descriptions in the schema. The description adds value by linking chat_id to the prerequisite call (whatsapp_list_chats), providing extra context not in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Reads messages from a specific WhatsApp chat', specifying the action and resource. It distinguishes from sibling tools like whatsapp_list_chats (which lists chats) and whatsapp_send_message (which sends).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies a clear prerequisite: 'chat_id must come from a previous whatsapp_list_chats call'. It also provides a warning about Wacli and ToS risks. However, it does not explicitly compare to alternative read message tools among siblings, such as signal_read_messages.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whatsapp_search_messages: WhatsApp Search Messages (grade A, read-only)
Offline full-text search across all WhatsApp chats. Only locally-cached messages are searched — no network access required. Optionally restrict search to a specific chat_id. ⚠️ Uses Wacli (unofficial WhatsApp client). Accounts may be restricted for ToS violations.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results to return (default 50) | |
| query | Yes | Search text (case-insensitive substring match) | |
| chat_id | No | Optional chat ID to restrict search | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | Number of results returned |
| results | Yes | Matching messages |
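
The matching rule in the table ("case-insensitive substring match", optionally limited to one chat) can be modeled locally as below. This is only a model of the documented semantics, not the tool's implementation; the message keys `chat_id` and `text` and the sample cache are assumptions.

```python
def search_cached(messages: list, query: str, chat_id=None, limit: int = 50) -> dict:
    """Case-insensitive substring search over locally cached messages."""
    needle = query.casefold()
    hits = [
        message for message in messages
        if needle in message["text"].casefold()
        and (chat_id is None or message["chat_id"] == chat_id)
    ]
    results = hits[:limit]
    return {"count": len(results), "results": results}


# Hypothetical local cache.
cache = [
    {"chat_id": "a@lid", "text": "Invoice attached"},
    {"chat_id": "b@lid", "text": "did you get the INVOICE?"},
    {"chat_id": "b@lid", "text": "see you tomorrow"},
]

print(search_cached(cache, "invoice"))
print(search_cached(cache, "invoice", chat_id="b@lid"))
```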
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant disclosure beyond the readOnlyHint=true annotation by warning about the unofficial Wacli client and potential account restrictions for ToS violations. This directly informs the agent of risks and behavioral constraints, fulfilling the burden fully.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: three core sentences and one warning. The first sentence immediately conveys the primary purpose, the second states the offline scope, the third adds the optional chat_id filter, and the warning is clearly flagged. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers purpose, behavior (offline, unofficial client, risk), and optional restriction. Given that an output schema exists (so return format is handled) and the annotations are present (read-only, non-destructive), the description is fairly complete. Minor gaps: no mention of result ordering or handling of empty results, but these are compensable by output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already describes all three parameters with 100% coverage (e.g., 'case-insensitive substring match' for query). The description adds only that chat_id is optional, which is already indicated by not being in the required list. Thus no substantial additional meaning is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Offline full-text search across all WhatsApp chats', which clearly defines the action (search) and the resource (WhatsApp messages) with specific scope (offline, all chats). It distinguishes from sibling tools like 'whatsapp_read_messages' or 'signal_search_messages' by specifying the offline nature and WhatsApp context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description notes that only locally-cached messages are searched and no network access is required, which implies when to use (offline/fast search) and implicitly when not (if server-side data needed). However, it does not explicitly list alternative tools or contraindications, though the offline emphasis provides meaningful context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whatsapp_send_file: WhatsApp Send File (grade A)
Sends a file attachment to a WhatsApp chat. The chat_id MUST come from a previous whatsapp_list_chats call — never fabricate IDs. file_path must be an absolute path to a local file. This is a write operation: the first call (confirm=false) returns a preview without sending; set confirm=true to actually send. ⚠️ Uses Wacli (unofficial WhatsApp client). Accounts may be restricted for ToS violations.
| Name | Required | Description | Default |
|---|---|---|---|
| caption | No | Optional caption text to accompany the file | |
| chat_id | Yes | Chat ID from whatsapp_list_chats | |
| confirm | No | Must be true to actually send. Without it, returns a preview. | |
| file_path | Yes | Absolute path to the local file to send | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
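
The preview-then-send flow and the absolute-path rule can be sketched as a pair of argument objects. The helper `send_file_calls` is hypothetical, the path check is a client-side precaution assumed for illustration, and the chat ID and file path are placeholders.

```python
from pathlib import PurePosixPath


def send_file_calls(chat_id: str, file_path: str, caption=None):
    """Return (preview, send) tool calls for whatsapp_send_file."""
    # macOS paths are POSIX-style; reject relative paths before calling.
    if not PurePosixPath(file_path).is_absolute():
        raise ValueError(f"file_path must be absolute, got {file_path!r}")
    base = {"chat_id": chat_id, "file_path": file_path}
    if caption is not None:
        base["caption"] = caption
    # First call: confirm=False returns a preview and sends nothing.
    preview = {"name": "whatsapp_send_file", "arguments": {**base, "confirm": False}}
    # Second call: identical arguments with confirm=True actually sends.
    send = {"name": "whatsapp_send_file", "arguments": {**base, "confirm": True}}
    return preview, send


preview, send = send_file_calls("123456@lid", "/Users/me/report.pdf", caption="Q3 report")
print(preview)
print(send)
```

Building both calls from the same base object keeps the preview and the real send from drifting apart.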
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses it is a write operation (`confirm=false` vs `confirm=true`), warns about using an unofficial client (Wacli) and potential account restrictions. Adds context beyond annotations, which only state `readOnlyHint=false` and `destructiveHint=false`.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, front-loaded with main purpose. Warning is appended. Could be slightly more concise, but no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers key aspects: source of ID, file path requirement, two-step sending, and usage risks. With existing output schema and sibling context, it is complete enough for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All 4 parameters have schema descriptions (100% coverage), and the description adds crucial context: `chat_id` must be from a prior call, `file_path` must be absolute, `confirm` flag semantics, and `caption` optionality. This significantly aids correct invocation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it sends a file attachment to a WhatsApp chat, distinguishing it from `whatsapp_send_message` and other chat tools. The two-step confirm process is explicitly mentioned, adding clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear instructions: `chat_id` must come from `whatsapp_list_chats`, `file_path` must be absolute, and explains the `confirm` parameter. Does not explicitly state when not to use, but the guidance is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whatsapp_send_message: WhatsApp Send Message (A)
Sends a text message to a WhatsApp chat. The chat_id MUST come from a previous whatsapp_list_chats call — never fabricate IDs. This is a write operation: the first call (confirm=false) returns a preview without sending; set confirm=true to actually send. ⚠️ Uses Wacli (unofficial WhatsApp client). Accounts may be restricted for ToS violations.
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Plain-text message body | |
| chat_id | Yes | Chat ID from whatsapp_list_chats | |
| confirm | No | Must be true to actually send. Without it, returns a preview. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
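The preview-then-send flow described for whatsapp_send_message can be sketched as two requests that differ only in the confirm flag. This is a minimal sketch that only builds the request bodies, assuming the server is reached through the standard MCP `tools/call` JSON-RPC method; the helper names `tool_call` and `send_message_requests` and the chat id are hypothetical, and nothing is sent over the network.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, starting at 1


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC request body (nothing is sent here)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def send_message_requests(chat_id: str, text: str) -> tuple:
    """Return (preview_request, send_request) for whatsapp_send_message."""
    base = {"chat_id": chat_id, "text": text}
    # First call: confirm=false returns a preview without sending.
    preview = tool_call("whatsapp_send_message", {**base, "confirm": False})
    # Second call: confirm=true actually sends the message.
    send = tool_call("whatsapp_send_message", {**base, "confirm": True})
    return preview, send


# "EXAMPLE_CHAT_ID" is a placeholder; a real id must come from whatsapp_list_chats.
preview_req, send_req = send_message_requests("EXAMPLE_CHAT_ID", "Running late.")
print(json.dumps(preview_req, indent=2))
```

An agent would typically show the preview result to the user before issuing the second request.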
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark readOnlyHint=false, but the description adds crucial behavioral details: it's a write operation, uses the unofficial Wacli client, and warns of potential ToS restrictions. The preview-before-send pattern is also disclosed, exceeding what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, each adding unique value: purpose, critical constraint, confirm flag workflow, important warning. No redundant or fluff content. Every sentence is justified.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (write operation, preview, unofficial client), the description covers the core guidance. The return value (preview) is not explicitly described, but since an output schema exists, it partially compensates. Still, a mention of what the preview contains would slightly improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema already documents each parameter. The description adds operational meaning for chat_id (must be from list_chats) and confirm (preview without it), which is useful beyond the schema's technical descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a clear verb+resource statement: 'Sends a text message to a WhatsApp chat.' It names both the action and the target, which sets the tool apart from siblings such as whatsapp_send_file and the generic send_message.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly states that chat_id must come from whatsapp_list_chats, preventing ID fabrication. It also explains the confirm flag workflow: first call without confirm returns a preview, set confirm=true to actually send. No ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
word_append: Word Append (A)
Appends text to an existing Word (.docx) document.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Path to the existing .docx file | |
| confirm | No | Must be true to modify | |
| content | Yes | Text to append | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=false and destructiveHint=false. The description adds minimal behavioral context by stating 'appends', implying mutation without destruction. However, it does not disclose the confirm requirement or any side effects. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that clearly conveys the operation and resource type. No unnecessary words or redundancy. Front-loaded with the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists and parameters are fully described in the schema, the description is minimally complete. However, it omits critical context about the confirm parameter needing to be true, which could lead to failures. The description doesn't compensate for this gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema descriptions cover all three parameters (path, confirm, content) with 100% coverage. The tool description does not add extra meaning beyond the schema. It implies the file must exist, which aligns with the path description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's function: 'Appends text to an existing Word (.docx) document.' It uses a specific verb ('appends') and resource ('existing Word (.docx) document'), which distinguishes it from sibling tools like word_create (creates new documents) and word_read (reads content).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives no guidance on when to use this tool versus alternatives. It only implies, through the word 'existing', that the file must already exist; it does not mention that confirm must be true to modify, and it does not point to word_create for new documents. No usage context or exclusions are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
word_create: Word Create (A)
Creates a new Word (.docx) document with the given content.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Output path for the .docx file | |
| title | No | Document title (optional) | |
| confirm | No | Must be true to create | |
| content | Yes | Document text content | |
Output Schema
| Name | Required | Description |
|---|---|---|
| path | Yes | Path of the created .docx file |
| created | Yes | True when the document was created |
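Since word_append targets an existing file and word_create produces a new one, a caller has to choose between them. The sketch below shows one way to do that from the parameter tables above. The helper `word_write_call` is hypothetical and not part of the server, and it assumes that checking for the file on disk is an acceptable way to choose; the listing does not say whether word_create overwrites an existing file.

```python
import os
import tempfile


def word_write_call(path: str, content: str, confirm: bool = False) -> dict:
    """Pick word_append for an existing .docx and word_create otherwise."""
    if not path.lower().endswith(".docx"):
        raise ValueError("both tools operate on .docx files")
    name = "word_append" if os.path.exists(path) else "word_create"
    # Both tools take path, content, and confirm; confirm must be true to write.
    return {"name": name,
            "arguments": {"path": path, "content": content, "confirm": confirm}}


with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "notes.docx")
    open(existing, "wb").close()  # empty stand-in file, not a valid .docx
    append_call = word_write_call(existing, "More text", confirm=True)
    create_call = word_write_call(os.path.join(tmp, "new.docx"), "First text",
                                  confirm=True)

print(append_call["name"], create_call["name"])
```

Leaving confirm at its default of False keeps the call from modifying anything until the caller opts in.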
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description does not disclose that the optional 'confirm' parameter must be true for the document to be created, nor whether an existing file at the path is overwritten. Annotations provide basic read/write status, but the description adds little beyond the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
One sentence, front-loaded, no extraneous words. Every part earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description omits the 'confirm' parameter, which must be true for creation to happen. An output schema exists, so return values are covered, but the behavioral gaps reduce completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds minimal value beyond mentioning 'content' but ignores other parameters like 'title' and 'confirm'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Creates a new Word (.docx) document' with specific verb and resource. Distinguishes from sibling tools like word_append and word_read.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use vs. alternatives (e.g., word_append). Usage is implied but lacks context on prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
word_read: Word Read (A, Read-only)
Reads text content from a Word (.docx) file.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the .docx file | |
Output Schema
| Name | Required | Description |
|---|---|---|
| text | Yes | Extracted text content |
| chars | Yes | Number of characters in the extracted text |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, indicating a safe read operation. The description adds value by specifying that only text content is read, not formatting or other elements. This provides additional transparency beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, front-loaded with the essential information. It is concise with no extraneous words, effectively conveying the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity, the description is complete enough. An output schema exists, so return values are not needed in the description. The description covers what the tool does and the required parameter.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the 'path' parameter well-described as 'Absolute path to the .docx file'. The description does not add any further meaning to the parameter beyond what is in the schema. Baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it reads text content from a Word (.docx) file, specifying the action (reads), resource (text content), and file type (.docx). This distinguishes it from sibling tools like word_append and word_create, and from other read tools for different formats (pdf_read, gdrive_read_file).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not explicitly state when to use this tool vs alternatives. However, the name and context imply it's the correct tool for reading .docx files. No guidance on exclusions or alternative tools is given, but the usage is implied by the specific file type.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
zoom_list_recordings: Zoom List Recordings (A, Read-only)
Lists Zoom meeting recordings saved locally on this Mac (~/Documents/Zoom), newest first: meeting name, date, and which artifacts exist (transcript, captions, saved chat, audio, video). Local recordings only — no Zoom API, no admin approval. Use zoom_read_transcript to read the text of a meeting.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max recordings to return (default 20) | |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | No | |
| total | No | |
| recordings | No | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false, confirming no side effects. The description adds valuable context: recordings are from ~/Documents/Zoom, ordered newest first, and includes which artifacts exist. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, concise, and front-loaded with the main purpose. Every word serves a purpose, with no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has one optional parameter and an output schema (not shown but noted), the description covers key aspects: data source, order, and contained information. It is complete for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage for the only parameter 'limit' is 100% (description present in schema). The description does not add extra meaning beyond what the schema provides, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'lists' and the resource 'Zoom meeting recordings saved locally', and specifies the details returned (name, date, artifacts). It separates itself from zoom_read_transcript by pointing to that tool for reading a meeting's text, and it notes that only local recordings are covered, without the Zoom API.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Local recordings only — no Zoom API, no admin approval' and suggests using zoom_read_transcript to read the text. This provides clear context on when to use this tool and when to use an alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
zoom_read_transcript: Zoom Read Transcript (A, Read-only)
Reads the text artifacts of a local Zoom recording: the transcript/captions (.vtt or closed_caption.txt, cleaned to readable 'Speaker: text' lines) and the saved in-meeting chat. Pass the recording name or path from zoom_list_recordings. Perfect for 'summarize my last meeting' or 'what did we agree on in the kickoff call'.
| Name | Required | Description | Default |
|---|---|---|---|
| include | No | 'all' (default), 'transcript' or 'chat' | |
| recording | Yes | Recording folder name (or full path) from zoom_list_recordings. Partial name match works. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| chat | No | |
| note | No | |
| path | No | |
| recording | No | |
| transcript | No | |
| chat_source | No | |
| transcript_source | No | |
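The two Zoom tools are meant to be chained: list recordings first, then read one by name. A minimal sketch of building the arguments for both calls follows, with the allowed include values and the default limit of 20 taken from the parameter tables. The helper names are hypothetical, and "Kickoff" is a placeholder for a name that a real zoom_list_recordings result would supply.

```python
VALID_INCLUDE = ("all", "transcript", "chat")  # values listed in the schema


def list_recordings_args(limit: int = 20) -> dict:
    """Arguments for zoom_list_recordings; 20 is the documented default."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return {"limit": limit}


def read_transcript_args(recording: str, include: str = "all") -> dict:
    """Arguments for zoom_read_transcript, rejecting unknown include values."""
    if not recording.strip():
        raise ValueError("recording must be a folder name or full path")
    if include not in VALID_INCLUDE:
        raise ValueError(f"include must be one of {VALID_INCLUDE}")
    return {"recording": recording, "include": include}


# Step 1: ask for the newest recording only (results are newest first).
list_args = list_recordings_args(limit=1)
# Step 2: read just the transcript; partial name matches are accepted.
read_args = read_transcript_args("Kickoff", include="transcript")
print(list_args, read_args)
```

Validating include on the client side avoids a round trip for a value the schema would reject anyway.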
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and non-destructive behavior. The description adds value by specifying cleaning to 'Speaker: text' lines and inclusion of in-meeting chat, providing behavioral context beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences with no wasted words. The first states the core functionality and processing, the second says where the recording argument comes from, and the third gives typical usage examples. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given a simple tool with two parameters and an output schema (not shown but present), the description covers what is read, how it is processed, and typical use cases. It does not explain the output format, but that is handled by the output schema. Minor missing details like file formats or limitations do not significantly detract from completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description does not add significant new meaning for parameters; it reiterates the schema's information about recording name/path and include options. The baseline of 3 is appropriate as the schema already does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool reads text artifacts (transcript/captions and chat) from a local Zoom recording, using specific verbs and resources. It differentiates from sibling zoom_list_recordings by specifying what it reads from a given recording, making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage context with examples like 'summarize my last meeting' or 'what did we agree on in the kickoff call'. It implies the tool is for extracting text content, but does not explicitly state when not to use it or list alternatives, though none exist among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
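For servers that generate static files at build time, the claim file can be produced from a short script. The sketch below reproduces the structure given above; the function name `glama_claim` is hypothetical, and accepting more than one maintainer is an assumption based only on the field being an array.

```python
import json


def glama_claim(emails: list) -> str:
    """Serialize the claim document to be served at /.well-known/glama.json."""
    doc = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email} for email in emails],
    }
    return json.dumps(doc, indent=2)


# Replace the placeholder with the email tied to the Glama account.
claim_json = glama_claim(["your-email@example.com"])
print(claim_json)
```

The output would then be written to the path the web server maps to /.well-known/glama.json.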
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.