XMemo
Server Details
User-owned memory for AI agents, Copilot, Claude, IDEs, CLIs, and chat apps over remote MCP.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.1/5 across 49 of 49 tools scored. Lowest: 3.3/5.
Multiple tools have overlapping purposes, such as create_memory_todo/create_reminder, complete_memory_todo/complete_reminder, list_memory_todos/list_reminders, forget/forget_memory/delete_current_memory, and recall/recall_context/recall_plan/search_memory. Even with their descriptions, an agent will struggle to select the correct tool.
Naming is inconsistent: mix of verb_noun (add_expense, create_memory_todo), noun_verb (memory_activity, memory_overview), get_ (get_mcp_identity), recall_ (recall, recall_context), and others like reflect, forget, store_memory. No consistent pattern.
With 49 tools, the server is far too large. This number suggests excessive granularity and many aliases, making it difficult for agents to navigate. A well-scoped server typically has 3-15 tools.
The tool set comprehensively covers memory management (CRUD, restore, redact, versions, conflicts), ledger/expenses, TODO/reminders, decisions, audits, and lifecycle. Minor gaps exist (e.g., no dedicated single-memory get tool, but recall/search cover it), but overall the domain is well-covered.
Available Tools
49 tools
add_expense
Add an expense to XMemo
Quality grade: A
Record one expense in the XMemo Ledger. Call this whenever the user states a concrete expense, purchase, income, or refund — for example, 'I spent 25 on lunch' or 'log this expense' — not only on an explicit 'record this'. Capture date, amount, currency, item, and merchant/category when present.
| Name | Required | Description | Default |
|---|---|---|---|
| item | Yes | The purchased item or service, for example 2 watermelons or Nintendo Switch 2. | |
| note | No | Optional completion note. | |
| path | No | A simple user-facing category or path for this memory, such as preferences, projects/xmemo, or personal/travel. | finance/ledger/expenses |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| amount | Yes | Positive transaction amount as a number. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | private |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| agent_id | No | Optional client-supplied agent label for memory attribution. | |
| category | No | Optional spending category, such as food, transport, or electronics. | |
| currency | No | Currency code or label, such as CNY, RMB, JPY, yen, or USD. For ledger search/list tools, leave blank unless the user explicitly requested one currency; labels are normalized to codes. | CNY |
| merchant | No | Optional merchant or store name. | |
| device_id | No | Optional client-supplied device identifier for attribution. | |
| device_label | No | Optional human-readable device label for attribution. | |
| payment_method | No | Optional payment method, such as card, cash, Alipay, or WeChat Pay. | |
| transaction_date | No | Transaction date in YYYY-MM-DD. Leave blank to use today's UTC date. | |
| agent_instance_id | No | Optional stable, non-secret agent instance ID for per-client attribution. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
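As a rough illustration of the parameter table above, the following sketch builds an MCP `tools/call` request for add_expense from the example utterance "I spent 25 on lunch". The helper name `build_add_expense_call`, the merchant "Noodle House", and the client-side checks are assumptions for illustration only; the server's own validation may differ. Currency and date are left out so that the documented defaults apply.

```python
import json
from datetime import datetime


def build_add_expense_call(item, amount, transaction_date=None, request_id=1, **optional):
    """Build a JSON-RPC tools/call request for add_expense (hypothetical helper)."""
    # The table documents amount as a positive number.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount <= 0:
        raise ValueError("amount must be a positive number")
    arguments = {"item": item, "amount": amount}
    if transaction_date is not None:
        # Raises ValueError when the value does not parse as year-month-day.
        datetime.strptime(transaction_date, "%Y-%m-%d")
        arguments["transaction_date"] = transaction_date
    # Pass through other optional fields from the table (merchant, category,
    # currency, ...), dropping any that were left as None.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "add_expense", "arguments": arguments},
    }


request = build_add_expense_call("lunch", 25, merchant="Noodle House", category="food")
print(json.dumps(request, indent=2))
```

Omitting unset optional fields keeps the request small and lets the server apply its defaults, such as the finance/ledger/expenses path.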
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations have readOnlyHint=false and destructiveHint=false, indicating mutation. The description adds 'Record one expense' but does not disclose side effects, authorization needs, error handling, or constraints beyond basic usage. More behavioral context is needed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, front-loaded with the core purpose, and uses only a few sentences without unnecessary words. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (return values presumably covered), the description adequately addresses the tool's purpose and key parameters. It does not explain error states or edge cases, but for a parameter-rich tool with many optionals, this is sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by listing key fields to capture (date, amount, currency, item, merchant/category) but does not explain their meaning beyond the schema's descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool records one expense in the XMemo Ledger, provides examples, and distinguishes from generic recording by specifying it's for concrete expenses, purchases, incomes, or refunds.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to call the tool (user states concrete expense/purchase/income/refund) and provides examples. It makes clear that an explicit 'record this' is not required, but it does not mention alternatives or specific exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_memory_text
Analyze XMemo memory text
Quality grade: A (read-only, idempotent)
Count a word or phrase across XMemo memories, optionally grouped by month, agent, location, or type. Call this when the user asks how often a word/topic appears.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| since | No | Only include memories or activity on or after this ISO 8601 timestamp. | |
| until | No | Only include memories or activity on or before this ISO 8601 timestamp. | |
| phrase | Yes | Exact word or phrase to count across visible XMemo memories. | |
| group_by | No | Optional grouping: none, month, agent, location, memory_type, type, source, path, bucket, day, or comma-separated advanced fields. | none |
| match_mode | No | How to match phrase: exact, case_insensitive, or regex_lite. | exact |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | % |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| path_filter | No | Memory path filter; % matches all paths. | % |
| max_examples | No | Maximum number of short snippets to include. | |
| max_memories | No | Maximum number of memories to inspect while analyzing text. | |
| include_examples | No | Whether to include a few short matching memory snippets. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
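To make the phrase, match_mode, and group_by parameters more concrete, here is a local sketch of what counting a phrase by month could look like. It is not the server's implementation: the record shape (`text`, `created_at`), the substring-counting rule, and the function name `count_phrase` are all assumptions, and only two match modes and two groupings are covered.

```python
from collections import Counter
from datetime import datetime


def count_phrase(memories, phrase, match_mode="exact", group_by="none"):
    """Count occurrences of phrase in a list of {"text", "created_at"} dicts."""
    if not phrase:
        raise ValueError("phrase is required")
    if match_mode not in ("exact", "case_insensitive"):
        raise ValueError("this sketch only covers exact and case_insensitive")
    if group_by not in ("none", "month"):
        raise ValueError("this sketch only covers none and month")
    fold = match_mode == "case_insensitive"
    needle = phrase.lower() if fold else phrase
    counts = Counter()
    for memory in memories:
        haystack = memory["text"].lower() if fold else memory["text"]
        hits = haystack.count(needle)  # non-overlapping substring matches
        if hits == 0:
            continue
        if group_by == "month":
            key = datetime.fromisoformat(memory["created_at"]).strftime("%Y-%m")
        else:
            key = "all"
        counts[key] += hits
    return dict(counts)


sample = [
    {"text": "Coffee with Ana. coffee again later.", "created_at": "2024-03-02T09:00:00"},
    {"text": "Skipped coffee today.", "created_at": "2024-04-10T08:30:00"},
]
print(count_phrase(sample, "coffee"))
print(count_phrase(sample, "coffee", match_mode="case_insensitive", group_by="month"))
```

In this sketch the exact mode misses the capitalized "Coffee", which is the kind of difference an agent would want to know about before choosing a match mode.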
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and non-destructive behavior. The description adds that the tool counts and groups, but it lists only a subset of grouping options (month, agent, location, type) while the schema includes many more (day, source, path, etc.), which is incomplete and slightly misleading.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is just two sentences, front-loading the main action. No filler or redundancy; every sentence serves a purpose.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite an existing output schema and good annotations, the description fails to guide an agent on how to use the 12 parameters effectively. It omits details on filtering, match modes, and advanced grouping, leaving the agent under-informed for complex queries.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, the description adds limited value; it mentions the phrase parameter and grouping capability but does not explain other key parameters like match_mode or scope. The baseline of 3 is appropriate as it reinforces core semantics without enriching the full parameter set.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool counts a word or phrase across XMemo memories with optional grouping, using a specific verb and resource. It distinguishes the 'count' action from nearby siblings like search_memory or recall, but does not explicitly differentiate from all siblings.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use the tool: 'Call this when the user asks how often a word/topic appears.' This gives clear context, though it does not mention when not to use it or suggest alternative tools.
complete_memory_todo
Complete XMemo memory TODO
Quality grade: A
Mark a TODO/action item completed. Call this when the user says a saved task is done, resolved, or no longer needed; use the exact reference from list_memory_todos.
| Name | Required | Description | Default |
|---|---|---|---|
| note | No | Optional completion note. | |
| todo_id | Yes | The memory TODO/action-item ID to complete. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate non-read-only, non-destructive, and non-idempotent behavior. The description adds the 'mark completed' semantic but does not disclose additional behavioral traits such as reversibility or side effects beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, directly conveying purpose and usage guidance. Every sentence is meaningful with no extraneous information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and the tool's simplicity (status update), the description adequately covers purpose, usage, and parameter hints. It provides sufficient context for an AI agent to select and invoke the tool correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and both parameters are described in the schema. The description adds minimal value, only noting to use the exact reference from list_memory_todos for todo_id. This meets the baseline for high coverage.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool marks a TODO/action item as completed, using specific verbs and resource. It distinguishes from siblings by referencing list_memory_todos for the exact reference, aiding differentiation.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to call the tool (user says task is done, resolved, or no longer needed) and advises using the exact reference from list_memory_todos. However, it does not explicitly exclude alternative tools like complete_reminder.
complete_reminder
Advanced: complete XMemo legacy reminder
Quality grade: B
Complete a TODO item. Alias for complete_memory_todo.
| Name | Required | Description | Default |
|---|---|---|---|
| note | No | Optional completion note. | |
| reminder_id | Yes | The memory TODO/action-item ID to complete. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
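Because complete_reminder is described as an alias whose only visible difference is the ID parameter name (reminder_id rather than todo_id), a client could normalize legacy calls before sending them. The sketch below assumes this mapping from the tables on this page; the `LEGACY_ALIASES` table and `canonicalize` function are hypothetical client-side helpers, not part of XMemo.

```python
# Legacy alias -> (canonical tool, {legacy argument name: canonical argument name}).
# The entries reflect the alias notes on this page; treat them as assumptions.
LEGACY_ALIASES = {
    "complete_reminder": ("complete_memory_todo", {"reminder_id": "todo_id"}),
    "create_reminder": ("create_memory_todo", {}),
}


def canonicalize(name, arguments):
    """Rewrite a legacy alias call to its canonical tool name and argument names."""
    if name not in LEGACY_ALIASES:
        return name, dict(arguments)
    canonical, renames = LEGACY_ALIASES[name]
    return canonical, {renames.get(key, key): value for key, value in arguments.items()}


print(canonicalize("complete_reminder", {"reminder_id": "todo-123", "note": "done"}))
```

Routing everything through the canonical names is one way to sidestep the alias ambiguity the review points out, at the cost of maintaining the mapping by hand.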
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate mutation (readOnlyHint=false) and non-destructive behavior (destructiveHint=false). The description adds the alias relationship, clarifying the tool's equivalence to complete_memory_todo, but provides no further behavioral details like side effects or completion semantics.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no fluff. The alias information is valuable and front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, return values are handled externally. However, the description omits important context like legacy status, prerequisites, and completion effects, making it minimally sufficient for a tool with 2 parameters.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema covers 100% of parameters with descriptions. The tool description adds no additional meaning beyond the schema for either parameter, so baseline 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Complete a TODO item,' which is a specific verb and resource. It also identifies itself as an alias for complete_memory_todo, but does not explicitly mention 'legacy' or the XMemo context from the title, which would further differentiate it.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description only notes it is an alias for complete_memory_todo, providing no guidance on when to use this tool versus the sibling. It lacks context about legacy status or scenarios where this alias is preferred.
create_memory_todo
Create XMemo memory TODO
Quality grade: A
Create a TODO/action item in XMemo with an optional due time. Call this when the user asks to save a task, follow-up, action item, or reminder-style item to revisit later.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | work |
| due_at | No | Optional due time for a memory TODO/action item, preferably ISO 8601. | |
| content | Yes | Text body to save, such as a memory, TODO/action item, event note, or state summary. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
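Two parameters in the table above have format expectations: due_at is preferably ISO 8601, and metadata_json is a JSON object passed as a string. The sketch below shows one way a client might prepare them. The helper name `build_todo_arguments`, the requirement for a timezone-aware datetime, and the sample metadata are assumptions for illustration.

```python
import json
from datetime import datetime, timezone


def build_todo_arguments(content, due_at=None, bucket=None, metadata=None):
    """Assemble arguments for create_memory_todo (hypothetical helper)."""
    if not content.strip():
        raise ValueError("content is required")
    arguments = {"content": content}
    if due_at is not None:
        # Assumption: insist on an explicit timezone so the due time is unambiguous.
        if due_at.tzinfo is None:
            raise ValueError("due_at should be timezone-aware")
        arguments["due_at"] = due_at.isoformat()
    if bucket is not None:
        arguments["bucket"] = bucket
    if metadata is not None:
        # metadata_json is documented as a JSON object string, not a nested object.
        arguments["metadata_json"] = json.dumps(metadata, sort_keys=True)
    return arguments


args = build_todo_arguments(
    "Send the quarterly report",
    due_at=datetime(2025, 1, 15, 9, 0, tzinfo=timezone.utc),
    metadata={"source": "chat"},
)
print(args)
```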
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=false and destructiveHint=false, so the description's mention of 'Create' is consistent. However, it adds no further behavioral details (e.g., overwriting behavior, permission requirements) beyond what annotations provide.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: one stating purpose and one giving usage guidance. No unnecessary words or details.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 5 parameters (1 required), output schema existence, and annotations, the description covers the main purpose and usage. It does not explain return values, but the output schema likely covers that.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the description does not need to add much. It mentions 'optional due time' corresponding to due_at, but does not add new meaning beyond the schema for other parameters.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Create' and the resource 'TODO/action item in XMemo' with an optional due time. This distinguishes it from siblings like 'create_reminder' and 'complete_memory_todo'.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Call this when the user asks to save a task, follow-up, action item, or reminder-style item to revisit later.' This provides clear usage context, though it does not mention when not to use it or compare to alternatives.
create_pending_decision
Create Pending Decision
Quality grade: A
Record an unresolved decision with its options to revisit later. Use for open choices, not ordinary TODOs.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | work |
| due_at | No | Optional due time for a memory TODO/action item, preferably ISO 8601. | |
| context | Yes | Short context explaining a memory usage event or pending decision. | |
| options_json | No | JSON array of candidate options for a pending decision. | [] |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
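The options_json parameter is documented as a JSON array, so a client has to serialize its candidate options before the call. A minimal sketch follows, assuming the options are plain strings (the page does not say what element type is expected); `build_pending_decision_arguments` and the sample decision are hypothetical.

```python
import json


def build_pending_decision_arguments(context, options=(), bucket=None):
    """Assemble arguments for create_pending_decision (hypothetical helper)."""
    if not context.strip():
        raise ValueError("context is required")
    arguments = {
        "context": context,
        # Serialize the candidate options into a JSON array string.
        "options_json": json.dumps(list(options)),
    }
    if bucket is not None:
        arguments["bucket"] = bucket
    return arguments


args = build_pending_decision_arguments(
    "Which database should the prototype use?",
    options=["SQLite", "PostgreSQL"],
)
print(args["options_json"])
```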
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are all false, providing no safety hints. The description adds no behavioral detail beyond the implied mutation of 'record'. It does not disclose side effects, duplication behavior, or required permissions, leaving the agent to infer from context.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences that front-load the purpose and include a critical usage instruction. No wasted words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the description covers the basic purpose and usage boundary, it lacks explanation of expected output (despite an output schema existing) and does not fully clarify the distinction between pending decisions and memory TODOs in the sibling set. It is minimally adequate.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already explains all parameters. The description adds no extra parameter-specific meaning or usage guidance, meeting the baseline of 3.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'record' and resource 'unresolved decision with its options', and distinguishes from ordinary TODOs with the explicit instruction 'Use for open choices, not ordinary TODOs.' This differentiates it from sibling tools like create_memory_todo.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context on when to use the tool ('for open choices') and what not to use it for ('not ordinary TODOs'). However, it does not explicitly mention alternative tools for resolution or listing, though the context signals provide sibling names.
create_reminder
Advanced: create XMemo legacy reminder
Quality grade: B
Create a TODO item. Alias for create_memory_todo.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | work |
| due_at | No | Optional due time for a memory TODO/action item, preferably ISO 8601. | |
| content | Yes | Text body to save, such as a memory, TODO/action item, event note, or state summary. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide minimal behavioral hints (readOnlyHint=false, destructiveHint=false). Description adds no additional behavioral context beyond 'Create a TODO item', leaving side effects and preconditions undisclosed.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is extremely concise: two short sentences that state the purpose and the alias relationship. No extraneous content, though it lacks structural elements like sections.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters and an output schema, the description is too minimal. It does not explain return values, prerequisites, or how parameters interact, leaving significant gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents parameters. Description adds no further meaning or examples beyond what the schema provides, resulting in baseline score.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'Create' and resource 'TODO item', and explicitly identifies it as an alias for 'create_memory_todo', distinguishing it from siblings.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description mentions it is an alias for create_memory_todo, implying equivalent usage, but provides no explicit guidance on when to use this tool vs alternatives.
create_restart_snapshot
Create Restart Snapshot
Quality grade: A
Save active state, recent events, TODOs, and pending decisions for restart. Call this before handoff or when context may be lost.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | work |
| state_key | No | Working-state key to save, snapshot, or restore, such as active_task. | active_task |
| session_id | No | Optional session identifier associated with the event, timeline query, or snapshot. | |
| ttl_seconds | No | Time-to-live in seconds for working state or restart snapshot records. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| decision_limit | No | Maximum number of pending decisions to include in a restart snapshot. | |
| reminder_limit | No | Maximum number of TODO/reminder items to include in a restart snapshot. | |
| timeline_limit | No | Maximum number of recent timeline events to include in a restart snapshot. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
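Several of the snapshot parameters are numeric limits. The sketch below shows one way a client might assemble the arguments before a handoff, sending only the limits it actually wants to override. The helper name `build_snapshot_arguments` and the non-negative-integer check are assumptions; the page does not state the accepted ranges.

```python
def build_snapshot_arguments(state_key="active_task", session_id=None, ttl_seconds=None,
                             decision_limit=None, reminder_limit=None, timeline_limit=None):
    """Assemble arguments for create_restart_snapshot (hypothetical helper)."""
    numeric = {
        "ttl_seconds": ttl_seconds,
        "decision_limit": decision_limit,
        "reminder_limit": reminder_limit,
        "timeline_limit": timeline_limit,
    }
    for name, value in numeric.items():
        if value is None:
            continue
        # Assumption: limits and TTL are non-negative integers.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    arguments = {"state_key": state_key}
    if session_id is not None:
        arguments["session_id"] = session_id
    # Leave unset values out so the server applies its own defaults.
    arguments.update({k: v for k, v in numeric.items() if v is not None})
    return arguments


print(build_snapshot_arguments(ttl_seconds=3600, timeline_limit=20))
```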
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate a write operation (readOnlyHint=false) and non-destructive (destructiveHint=false). Description adds that it saves state for restart, but doesn't detail side effects (e.g., overwriting previous snapshots) or permissions. Still adds useful context beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, no redundant text. Every sentence earns its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (9 parameters) and presence of output schema, the description captures the core functionality. Could mention that the snapshot can be restored later, but that is implied by the sibling tool name. Overall adequate.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so each parameter has a description in the schema. The tool description adds no additional parameter information, but the schema is sufficient. Baseline of 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states a specific verb ('Save') and resource ('active state, recent events, TODOs, and pending decisions for restart'), and clearly distinguishes from siblings like restore_restart_snapshot (complementary) and create_pending_decision (part of what's saved).
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use: 'Call this before handoff or when context may be lost.' Provides clear context and implies alternatives (restore_restart_snapshot for restoring). No misleading guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_current_memory: Delete current XMemo memory (A; Destructive)
Permanently delete the single active memory (no ID needed). Call this only when exactly one memory exists and the user explicitly asks to delete it; refused if zero or multiple match.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional user-facing reason for the deletion. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
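As a rough illustration of the call shape, the sketch below builds an MCP `tools/call` JSON-RPC request for `delete_current_memory`. Only the tool name and the optional `reason` parameter come from the listing above; the helper name and the `user_confirmed` guard are assumptions added for illustration, not part of XMemo.

```python
def build_delete_current_memory_call(request_id, user_confirmed, reason=None):
    """Build a JSON-RPC 2.0 `tools/call` request for delete_current_memory.

    `user_confirmed` is a hypothetical client-side guard: the tool description
    says to call it only when the user explicitly asks for the deletion.
    """
    if not user_confirmed:
        raise ValueError("refusing to build a destructive call without an explicit user request")
    arguments = {}
    if reason:
        # `reason` is optional, so it is only sent when provided.
        arguments["reason"] = reason
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "delete_current_memory", "arguments": arguments},
    }
```

The server may still refuse the call if zero or multiple memories match, so a client would need to handle that outcome separately.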
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark this as destructive, but the description adds that deletion is permanent and explains the single-memory requirement. This supplements the annotation effectively.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no fluff. Action and condition are front-loaded. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and the presence of an output schema, the description covers all necessary aspects: purpose, usage condition, and behavior. Complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description does not add extra meaning beyond the schema's description of the 'reason' parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it deletes the single active memory without needing an ID, distinguishing it from sibling tools like forget_memory that require an ID. The verb 'delete' is specific and the resource 'current memory' is well-defined.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to call: 'only when exactly one memory exists and the user explicitly asks to delete it'. Also specifies refusal conditions: 'refused if zero or multiple match'. This provides clear usage boundaries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
explain_memory: Explain XMemo memory (A; Read-only, Idempotent)
Explain why a memory exists or matched a query. Call this when the user asks why something was saved or why it was retrieved.
| Name | Required | Description | Default |
|---|---|---|---|
| query | No | Natural-language question or search text. | |
| memory_id | Yes | Exact XMemo memory reference shown by search or recall. | |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| include_embedding | No | Whether to include embedding/vector metadata in the returned explanation. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, idempotentHint, and destructiveHint, covering safety and side effects. The description adds the purpose of explaining existence/matching but no additional behavioral traits (e.g., permissions, rate limits). It complements annotations without contradicting them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no extraneous information. Each sentence serves a distinct purpose: stating functionality and providing usage guidance. Highly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description, combined with comprehensive schema descriptions and annotations, covers the essential context for this relatively simple tool. The output schema exists, so return values are documented. Sibling tools are numerous, but the description helps distinguish the tool's purpose. Could be slightly improved by specifying that it requires a memory_id, but that is already in the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for each parameter. The tool description does not add extra meaning beyond the schema, so the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'explain' and the resource 'memory', specifying two distinct scenarios: existence and retrieval matching. It effectively distinguishes from siblings like 'recall' or 'search_memory' by focusing on explanation rather than retrieval or search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Call this when the user asks why something was saved or why it was retrieved.' This provides clear when-to-use guidance but does not mention when not to use it or suggest alternative tools for other scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forget: Forget XMemo memory (A; Destructive)
Permanently delete one memory by natural target ('latest', 'current', or an exact ID). Call this only when the user explicitly asks to forget/delete a specific memory; search or check activity first if the target is ambiguous.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional user-facing reason for the deletion. | |
| target | No | The memory to forget: current or an exact memory ID shown by search. | current |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description repeats the destructive nature already indicated by the annotation (destructiveHint=true) but adds the nuance of 'permanently delete' and the range of acceptable targets. It does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no wasted words. The key action and usage guideline are front-loaded, making it easy to scan.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of annotations, output schema, and sibling differentiation, the description covers the core functionality and usage. It lacks details about return values (though output schema may cover that) and post-deletion effects, but overall it's adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema coverage is 100%, so parameters are already documented. The description adds only minor context (e.g., 'natural target') beyond the schema, meeting the baseline but not significantly enhancing understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'delete' and the resource 'memory', and specifies the natural targets ('latest', 'current', or an exact ID). It distinguishes the tool from siblings such as 'delete_current_memory' and 'forget_memory' by its target flexibility.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises calling only when the user asks to forget a specific memory, and instructs to search or check activity first if the target is ambiguous. This provides clear when-to-use and when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forget_memory: Soft-delete XMemo memory (A; Destructive)
Delete a memory (recoverable) by exact reference, when the user explicitly asks to remove or hide it. Use forget or delete_current_memory for permanent deletion.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Deletion mode for forget_memory, such as soft_delete, hard_delete, or redact when supported. | soft_delete |
| reason | No | Optional user-facing reason for the deletion. | |
| memory_id | Yes | Exact XMemo memory reference shown by search or recall. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| replacement_content | No | Replacement text to store when redacting or replacing deleted memory content. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
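Because `forget`, `forget_memory`, and `delete_current_memory` overlap, a client may want an explicit routing rule. The sketch below is one possible reading of the three descriptions; the function name, its flags, and the placeholder ID format are hypothetical, and the routing is an interpretation rather than documented XMemo behavior.

```python
def choose_deletion_tool(memory_id=None, permanent=False):
    """Return (tool_name, arguments) for a user-requested deletion.

    Hypothetical routing, based only on the tool descriptions:
    - recoverable delete needs an exact reference -> forget_memory
    - permanent delete by exact ID or 'current'   -> forget
    """
    if not permanent:
        if memory_id is None:
            raise ValueError("forget_memory requires an exact memory_id")
        # soft_delete is the listed default; it is sent explicitly for clarity.
        return "forget_memory", {"memory_id": memory_id, "mode": "soft_delete"}
    # Permanent deletion: `forget` accepts an exact ID and defaults to 'current'.
    return "forget", {"target": memory_id or "current"}
```

`delete_current_memory` is left out of the routing because `forget` with the target `current` appears to cover the same case; whether the two behave identically is not stated in the listing.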
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true, but the description adds that deletion is recoverable (soft-delete), providing important behavioral context beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (two sentences), front-loaded with the purpose, and contains no unnecessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is complete for the tool's purpose, but lacks details about the output or return values. However, an output schema exists (not shown) which likely covers this.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description does not add additional meaning to the parameters beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool deletes a memory by exact reference and is recoverable, distinguishing it from siblings like 'forget' and 'delete_current_memory' which are for permanent deletion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly says when to use (when user asks to remove or hide) and when not to (for permanent deletion, use alternatives), providing clear usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_mcp_identity: Check XMemo connection (A; Read-only, Idempotent)
Check XMemo connection status and the connected account/agent. Call this when the user asks whether XMemo is connected.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds minor context about checking connected account/agent, but does not reveal additional behavioral traits beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no wasted words. Front-loaded with purpose in first sentence.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read-only check with no inputs, the description is adequate. An output schema exists to detail return values, so no need to describe them in text.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so schema coverage is trivially 100%. The baseline for zero-parameter tools is 4; the description adds no parameter information, and none is required.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Check' and the resource 'XMemo connection status and the connected account/agent'. It distinguishes itself from sibling tools (which deal with memory, expenses, etc.) by being the only connection check.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Call this when the user asks whether XMemo is connected.' No exclusions needed as no alternative tools exist for this purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_monthly_ledger_summary: Show XMemo monthly spending summary (A; Read-only, Idempotent)
Summarize Ledger totals by month and currency. Call this when the user asks for monthly spending, income, refunds, net balance, budget review, or spending trends.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| months | No | Number of recent ledger months to summarize. | |
| currency | No | Currency code or label, such as CNY, RMB, JPY, yen, or USD. For ledger search/list tools, leave blank unless the user explicitly requested one currency; labels are normalized to codes. | |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| transaction_type | No | Optional ledger type filter: expense, income, refund, or transfer. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
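The parameter table above can be turned into a small argument builder. This is a minimal sketch: the four `transaction_type` values come from the schema description, while the helper name and the rule that `months` must be positive are assumptions made for illustration.

```python
ALLOWED_TRANSACTION_TYPES = {"expense", "income", "refund", "transfer"}

def monthly_summary_arguments(months=None, currency=None, transaction_type=None):
    """Build the `arguments` object for get_monthly_ledger_summary.

    Omitted values are left out entirely so the server applies its own
    defaults; what those defaults are is not documented in the listing.
    """
    if months is not None and months < 1:
        # Assumption: a non-positive month count is never meaningful.
        raise ValueError("months must be a positive integer")
    if transaction_type is not None and transaction_type not in ALLOWED_TRANSACTION_TYPES:
        raise ValueError(f"unknown transaction_type: {transaction_type!r}")
    arguments = {}
    if months is not None:
        arguments["months"] = months
    if currency:
        # The schema says labels are normalized to codes, so the value is passed through unchanged.
        arguments["currency"] = currency
    if transaction_type is not None:
        arguments["transaction_type"] = transaction_type
    return arguments
```

Leaving `currency` unset by default follows the schema's advice to leave it blank unless the user explicitly requested one currency.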
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, providing a strong safety profile. The description adds that it 'summarizes totals by month and currency,' which is accurate but adds minimal new behavioral context beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences that front-load the core function and then provide clear usage triggers. Every sentence adds value with zero waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists and the tool is a straightforward summary, the description covers the essential behavior and usage triggers. It does not detail the effect of scope/bucket parameters, but those are standard memory tool parameters explained in the schema. A minor omission is the behavior when currency is blank (assumes all currencies), but it's implied.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description does not add extra meaning beyond what each parameter's schema description already provides (e.g., months: number of recent months, currency: currency code). No additional parameter guidance is given.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb, 'Summarize', with a clear resource, 'Ledger totals by month and currency.' It distinguishes the tool from siblings such as list_ledger_transactions (which lists individual transactions) by focusing on summaries. The example queries that trigger it further clarify its purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to call the tool: 'when the user asks for monthly spending, income, refunds, net balance, budget review, or spending trends.' It does not explicitly exclude alternatives or specify when not to use it, but the guidance is clear and direct.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_recall_policy: Get Recall Policy (A; Read-only, Idempotent)
Inspect XMemo retrieval policy (debug/admin). For actual recall use recall_context/recall/search_memory.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds behavioral context that it is for debug/admin, which reinforces the non-destructive nature. No contradictions. Slight value added beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first states purpose, second provides alternatives. Every word is necessary and front-loaded. No redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters, presence of output schema, rich annotations, and clear differentiation from sibling tools, the description is fully complete for an agent to decide when and how to use this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so schema coverage is 100%. Baseline for zero-parameter tools is 4. Description does not need to add parameter info, and it doesn't attempt to.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: inspect XMemo retrieval policy for debug/admin. It uses specific verb 'inspect' and resource 'retrieval policy', and distinguishes it from recall tools by naming alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides when to use this tool (debug/admin inspection of policy) and when not to (for actual recall, use recall_context/recall/search_memory). This is a model example of usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_safety_policy: Get Safety Policy (A; Read-only, Idempotent)
Show the account safety policy. Useful before custom memory-writing that may include sensitive content; normal writes are already sanitized server-side.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, destructiveHint, idempotentHint. Description adds behavioral context about sanitization and why reading the policy is useful, going beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficient sentences, front-loaded with purpose, no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero parameters and presence of output schema, description fully covers what the tool does and when to use it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters, so the baseline of 4 applies. The description adds no parameter information, and schema coverage is trivially 100%.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Show the account safety policy' and distinguishes the tool from siblings by specifying its usage context: custom memory-writing rather than normal, already sanitized writes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises using before custom memory-writing with sensitive content, and notes that normal writes are already sanitized, providing clear when-to-use and when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_system_stats: Advanced: query XMemo memory stats (A; Read-only, Idempotent)
Detailed memory stats with filters and grouping. For a quick summary use memory_overview or memory_stats.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| since | No | Only include memories or activity on or after this ISO 8601 timestamp. | |
| top_n | No | Maximum number of aggregate groups to return. | |
| until | No | Only include memories or activity on or before this ISO 8601 timestamp. | |
| source | No | Optional source filter for system statistics. | |
| status | No | Memory status filter or update value, such as active, archived, deleted, or %. | % |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| group_by | No | Optional grouping: none, month, agent, location, memory_type, type, source, path, bucket, day, or comma-separated advanced fields. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | % |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| path_filter | No | Memory path filter; % matches all paths. | % |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
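The `since` and `until` filters expect ISO 8601 timestamps, which can be derived from a time window. The sketch below shows one way to do that with the standard library; the helper name, the `days_back` parameter, and the choice to join several `group_by` fields with commas (based on the "comma-separated advanced fields" wording) are assumptions.

```python
from datetime import datetime, timedelta, timezone

def system_stats_arguments(now, days_back, group_by=None, top_n=None):
    """Build the `arguments` object for get_system_stats over a recent window.

    `now` should be a timezone-aware datetime so the ISO 8601 strings
    carry an explicit UTC offset.
    """
    since = now - timedelta(days=days_back)
    arguments = {"since": since.isoformat(), "until": now.isoformat()}
    if group_by:
        # Assumption: multiple grouping fields are sent as one comma-separated string.
        arguments["group_by"] = ",".join(group_by)
    if top_n is not None:
        arguments["top_n"] = top_n
    return arguments

# Example window: the 30 days up to noon UTC on 31 May 2024.
example = system_stats_arguments(
    datetime(2024, 5, 31, 12, 0, tzinfo=timezone.utc), 30, group_by=["month", "memory_type"], top_n=5
)
```

Filters that are not set, such as `status` or `path_filter`, are omitted so their listed `%` defaults stay in effect.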
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, idempotentHint=true, destructiveHint=false, which the description does not contradict. The description adds 'Advanced' context, implying more complex operation, but does not elaborate on performance or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, each adding unique value: first states purpose, second provides usage guidance. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With full schema coverage, output schema present, and clear annotations, the description adequately covers purpose and usage. The presence of an output schema means return values need no explanation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the description adds minimal parameter-level value. It hints at filtering and grouping but does not detail individual parameters beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides detailed memory stats with filters and grouping, distinguishing it from memory_overview and memory_stats which provide quick summaries.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use this tool (for detailed stats) and when to use alternatives (memory_overview or memory_stats for quick summaries).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_timeline: Get Timeline (A; Read-only, Idempotent)
Show recent events. Call this when the user asks what happened recently or when resuming work that may depend on recent events.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| event_type | No | Timeline event type, such as event, milestone, decision, or handoff. | |
| session_id | No | Optional session identifier associated with the event, timeline query, or snapshot. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
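For the "resuming work" case, a client might fetch a handful of recent events for one session. The sketch below serializes such a request as an MCP `tools/call` JSON-RPC message; the helper name is hypothetical, and the parameter names are taken from the table above.

```python
import json

def build_get_timeline_request(request_id, limit=None, event_type=None, session_id=None):
    """Serialize a JSON-RPC 2.0 `tools/call` request for get_timeline.

    Only parameters that were actually provided are sent, so server-side
    defaults (such as bucket='%') stay in effect.
    """
    arguments = {}
    if limit is not None:
        arguments["limit"] = limit
    if event_type:
        arguments["event_type"] = event_type
    if session_id:
        arguments["session_id"] = session_id
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_timeline", "arguments": arguments},
    })
```

`event_type` is not validated here because the schema lists its values with "such as", which suggests the set is open-ended.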
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, so the tool's safety profile is well covered. The description adds no further behavioral traits (e.g., chronological ordering, pagination), but it does not contradict the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: the first states the action, the second gives usage context. Every word earns its place; there is no redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description need not detail return values. However, it does not explain how the optional parameters (scope, bucket, event_type, session_id) affect the results. The schema fills this gap, but the overall description is minimal for a tool with 5 parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 5 parameters have descriptions). The tool description itself adds no meaning beyond what the schema provides, so the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as showing recent events and ties it to user requests about what happened recently or resuming work. This specific verb+resource combination distinguishes it from siblings like record_event (write) and memory activity tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly explains when to call the tool: 'when the user asks what happened recently or when resuming work that may depend on recent events.' It does not mention when not to use it or provide alternatives, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_ledger_transactions - Show XMemo expenses (Grade A, Read-only, Idempotent)
Show XMemo Ledger records. Call this when the user asks about past expenses, purchases, or spending history, or before answering a finance question that depends on recorded transactions. For item searches use query and leave currency blank unless a currency was requested.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| query | No | Natural-language question or search text. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| offset | No | Number of matching records to skip for pagination. | |
| date_to | No | Only include ledger transactions on or before this YYYY-MM-DD date. | |
| category | No | Optional spending category, such as food, transport, or electronics. | |
| currency | No | Currency code or label, such as CNY, RMB, JPY, yen, or USD. For ledger search/list tools, leave blank unless the user explicitly requested one currency; labels are normalized to codes. | |
| date_from | No | Only include ledger transactions on or after this YYYY-MM-DD date. | |
| max_amount | No | Only include ledger transactions with amount less than or equal to this value. | |
| min_amount | No | Only include ledger transactions with amount greater than or equal to this value. | |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| transaction_type | No | Optional ledger type filter: expense, income, refund, or transfer. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
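As a rough illustration of the parameter guidance above, the sketch below builds an MCP `tools/call` JSON-RPC request for `list_ledger_transactions`. The argument names come from the parameter table; the helper name `build_ledger_request`, the example values, and the request `id` are hypothetical. The value types (an integer `limit`, a boolean `output_json`) are assumptions, since the schema types are not shown here.

```python
import json
from datetime import date


def build_ledger_request(query, date_from=None, date_to=None, currency=None,
                         limit=20, offset=0, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` payload for list_ledger_transactions.

    Hypothetical helper: only the argument names come from the parameter table.
    """
    arguments = {
        "query": query,
        "limit": limit,          # assumed to be an integer
        "offset": offset,        # number of matching records to skip
        "output_json": True,     # assumed to be a boolean flag
    }
    # Dates are sent as YYYY-MM-DD strings, as the parameter table describes.
    if date_from is not None:
        arguments["date_from"] = date_from.isoformat()
    if date_to is not None:
        arguments["date_to"] = date_to.isoformat()
    # Leave currency out unless the user explicitly asked for one.
    if currency:
        arguments["currency"] = currency
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "list_ledger_transactions", "arguments": arguments},
    }


request = build_ledger_request(
    "coffee", date_from=date(2024, 1, 1), date_to=date(2024, 1, 31)
)
print(json.dumps(request, indent=2))
```

Omitting `currency` rather than sending an empty string is one way to follow the "leave blank" advice; whether the server treats the two identically is not stated in the listing.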
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds no new behavioral traits beyond stating it 'shows' records, which is consistent. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with purpose; each adds necessary information. No fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 13 parameters, 100% schema coverage, and output schema present, the description covers usage scenarios and provides parameter guidance. It is complete enough for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds value by guiding use of query and currency parameters (e.g., leave currency blank unless explicit request), which goes beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool shows XMemo Ledger records and provides specific use cases such as past expenses and spending history. It implicitly distinguishes itself from siblings by mentioning ledger records, but does not explicitly differentiate itself from tools like 'get_monthly_ledger_summary'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to call (past expenses, spending history, before finance questions) and gives parameter guidance (use query for item searches, leave currency blank unless requested). Lacks explicit when-not-to-use or alternatives beyond the hint.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_memory_conflicts - List Memory Conflicts (Grade A, Read-only, Idempotent)
List conflicting/inconsistent memories. Call this when the user/admin asks to review unresolved conflicts.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| unresolved_only | No | Whether to show only unresolved memory conflicts. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, so the description need not repeat them. It adds context about the tool's focus on conflicts without contradicting annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short, front-loaded sentences with no wasted words. The purpose and usage are conveyed efficiently.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given annotations and well-documented parameters, the description provides sufficient context for a listing tool. It could mention default filtering but is otherwise complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for all parameters. The description adds no additional meaning beyond the schema, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List conflicting/inconsistent memories') and the resource ('memory conflicts'), distinguishing it from siblings like resolve_memory_conflict.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to call the tool ('when the user/admin asks to review unresolved conflicts'), providing clear context for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_memory_todos - List XMemo memory TODOs (Grade A, Read-only, Idempotent)
List open or completed TODO/action items. Call this when the user asks what tasks, follow-ups, or saved action items are pending or done.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| due_before | No | Only show memory TODO/action items due before this time. | |
| item_status | No | Optional memory TODO/action-item status filter. | open |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
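Because `bucket` and `item_status` already have defaults (`%` and `open`), a client can send only the arguments it actually wants to override. The sketch below shows one way to do that; the helper name `todo_arguments` is hypothetical, and the status value `"completed"` is an assumption inferred from the description's "open or completed" wording.

```python
def todo_arguments(item_status=None, due_before=None, scope=None,
                   bucket=None, limit=None):
    """Return only the list_memory_todos arguments the caller actually set.

    Hypothetical helper: anything left as None or "" is omitted so the
    defaults from the parameter table apply.
    """
    candidate = {
        "item_status": item_status,
        "due_before": due_before,
        "scope": scope,
        "bucket": bucket,
        "limit": limit,
    }
    return {name: value for name, value in candidate.items()
            if value not in (None, "")}


pending = todo_arguments()  # no overrides: defaults select open items
finished = todo_arguments(item_status="completed", limit=5)  # assumed status value
print(pending)
print(finished)
```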
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and idempotentHint=true. The description adds the 'open or completed' clarification and usage context, but no additional behavioral traits like pagination or default scope beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no fluff. The purpose and usage trigger are front-loaded and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists, so return values are covered. The description adequately explains what the tool does and when to use it. It does not mention parameter caveats like bucket wildcards, but the schema covers those. Overall sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so all parameters are documented. The description adds no extra meaning beyond mentioning 'open or completed', which aligns with the item_status parameter. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'List' and the resource 'TODO/action items'. It distinguishes itself from sibling tools like create_memory_todo and complete_memory_todo by specifying that it lists items.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance: 'Call this when the user asks what tasks, follow-ups, or saved action items are pending or done.' It does not explicitly mention exclusions or alternatives, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_memory_versions - List Memory Versions (Grade A, Read-only, Idempotent)
Show version history for a memory. Call this when the user asks for previous content or changes over time.
| Name | Required | Description | Default |
|---|---|---|---|
| memory_id | Yes | Exact XMemo memory reference shown by search or recall. | |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, so the primary behavioral traits are captured. The description adds minimal extra context (e.g., 'changes over time') but does not reveal rate limits, authentication needs, or other nuances beyond what annotations imply.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—two sentences with no filler. The first sentence delivers the core purpose, and the second adds usage context. Every word contributes meaning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (so return structure is documented elsewhere), and the tool is a simple version-listing operation, the description covers the essential context—purpose and when to use. Some might want more detail on what 'version history' entails, but it is sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and both parameters have clear descriptions in the input schema (e.g., 'Exact XMemo memory reference'). The description does not add any additional semantic value beyond what the schema already provides, warranting a baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Show') and the resource ('version history for a memory'), making the tool's purpose unambiguous. It effectively distinguishes itself from sibling tools like 'list_memory_todos' or 'get_timeline' which serve different purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to call the tool ('when the user asks for previous content or changes over time'), providing clear usage context. It lacks explicit when-not-to-use guidance but the positive indicator is strong.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_pending_decisions - List Pending Decisions (Grade A, Read-only, Idempotent)
List open/unresolved decisions. Call this when the user asks what decisions are pending, or before planning work that depends on unresolved choices.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| due_before | No | Only show memory TODO/action items due before this time. | |
| item_status | No | Optional memory TODO/action-item status filter. | open |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds that it lists 'open/unresolved' decisions, consistent with these hints. No contradiction; additional behavioral context is minimal but sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences, no wasted words. The description is front-loaded with the purpose and usage, making it easy for an agent to quickly understand the tool's role.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 optional parameters and an output schema, the description effectively covers the core purpose and usage. It could mention parameter-based filtering (e.g., due date or status), but the schema covers that, so the description is complete enough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% across all 5 parameters, so the schema already explains each parameter's meaning. The description does not add any extra parameter details, maintaining the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists 'open/unresolved decisions' with a specific verb ('List') and resource ('pending decisions'). It distinguishes itself from sibling tools like 'create_pending_decision' and 'resolve_decision' by focusing on listing pending items.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to call: 'when the user asks what decisions are pending, or before planning work that depends on unresolved choices.' This provides clear context and a specific use case.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_reminders - Advanced: list XMemo legacy reminders (Grade B, Read-only, Idempotent)
List TODO items. Alias for list_memory_todos.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| due_before | No | Only show memory TODO/action items due before this time. | |
| item_status | No | Optional memory TODO/action-item status filter. | open |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
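Since the description states that `list_reminders` is an alias for `list_memory_todos`, a client that wants to avoid the ambiguity could normalize the name before dispatching. This is a minimal sketch of that idea, not part of XMemo itself; the mapping contains only the alias the description confirms, and the function name is hypothetical.

```python
# Only the alias stated in the tool description is listed here.
TOOL_ALIASES = {"list_reminders": "list_memory_todos"}


def canonical_tool_name(name):
    """Map a known alias to its canonical tool name; pass other names through."""
    return TOOL_ALIASES.get(name, name)


print(canonical_tool_name("list_reminders"))
print(canonical_tool_name("list_memory_versions"))
```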
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds no behavioral details beyond what is captured in annotations; it only restates the listing action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long, front-loading the purpose and then clarifying the alias. No wasted words; highly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and comprehensive parameter descriptions, the description covers the essential purpose and alias. However, it lacks context on what 'TODO items' specifically refers to in the legacy reminders system, which may require the agent to infer from sibling tool names.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All 5 parameters have descriptions in the input schema (100% coverage). The tool description does not provide any additional parameter semantics, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'List TODO items' and identifies it as an alias for list_memory_todos, giving a clear purpose. However, the title mentions 'legacy reminders' while description says 'TODO items', causing slight ambiguity about the exact resource.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates it is an alias for list_memory_todos, implying interchangeability. It provides no explicit guidance on when to use this tool over others or when not to use it, leaving the agent to infer from the alias relationship.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
mark_memory_used - Mark Memory Used (Grade A)
Record that a recalled memory was used in the answer. Call this after a retrieved memory actually influenced your response (not for ignored results).
| Name | Required | Description | Default |
|---|---|---|---|
| action | No | Usage action or audit-event action filter, depending on the tool. | used |
| context | No | Short context explaining a memory usage event or pending decision. | |
| memory_id | Yes | Exact XMemo memory reference shown by search or recall. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| usage_tracking_id | No | Optional tracking ID that links a recalled memory to later usage feedback. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
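The `metadata_json` parameter is described as a JSON object string rather than a nested object, which is easy to get wrong. The sketch below shows one way to assemble the arguments under that reading; the helper name `mark_used_arguments`, the placeholder ID `mem-123`, and the metadata keys are hypothetical.

```python
import json


def mark_used_arguments(memory_id, context="", metadata=None,
                        usage_tracking_id=None):
    """Build mark_memory_used arguments (hypothetical helper).

    metadata is serialized to a string because the parameter table describes
    metadata_json as a "JSON object string".
    """
    if not memory_id:
        raise ValueError("memory_id is required")
    arguments = {
        "memory_id": memory_id,
        "action": "used",  # the documented default, sent explicitly
        "metadata_json": json.dumps(metadata or {}),
    }
    if context:
        arguments["context"] = context
    if usage_tracking_id:
        arguments["usage_tracking_id"] = usage_tracking_id
    return arguments


# "mem-123" stands in for a reference returned by search or recall.
args = mark_used_arguments(
    "mem-123",
    context="Used the saved travel preference in the answer",
    metadata={"turn": 4},
)
print(args)
```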
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=false (writes) and destructiveHint=false (not destructive). The description adds the important condition that it should only be called when the memory influenced the answer, which goes beyond the annotations and gives behavioral context on appropriate use.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the tool's purpose, and contains no unnecessary information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has an output schema, so return values are covered. The description explains when to use it, which is the main context. It could mention the effect on future recalls or the audit trail, but overall it is adequate for a simple audit action.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with descriptions for all 5 parameters. The description does not add additional meaning beyond the schema; it only explains the tool's overall purpose. Therefore baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool records that a recalled memory was used, with a specific verb 'Record' and resource 'memory used'. It explicitly excludes ignored results, making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says to call this after a memory influenced the response, and not for ignored results. While it doesn't name alternative tools, the condition is clear. Sibling tools like 'record_recall_feedback' exist but are not mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
memory_activity - Show XMemo memory activity (Grade A, Read-only, Idempotent)
Show recent memory activity — writes, deletions, TODO changes, ledger changes, and timeline events. Call this when the user asks what changed recently, or before deleting 'latest/recent' memory so the user can confirm the preview.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| since | No | Only include memories or activity on or after this ISO 8601 timestamp. | |
| until | No | Only include memories or activity on or before this ISO 8601 timestamp. | |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| activity_type | No | Activity filter: all, writes, deletions, reads, reminders, ledger, or timeline. | all |
| include_timeline | No | Whether to include timeline events in recent memory activity. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
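The `since` and `until` parameters expect ISO 8601 timestamps, and `activity_type` accepts a fixed list of values. The sketch below builds a "last N hours" window under those constraints; the helper name `activity_window` and the choice of timezone-aware UTC timestamps are assumptions, since the listing does not say how the server handles timestamps without an offset.

```python
from datetime import datetime, timedelta, timezone

# Values listed in the activity_type parameter description.
ACTIVITY_TYPES = {"all", "writes", "deletions", "reads",
                  "reminders", "ledger", "timeline"}


def activity_window(now, hours=24, activity_type="all"):
    """Build memory_activity arguments covering the last `hours` hours."""
    if activity_type not in ACTIVITY_TYPES:
        raise ValueError(f"unknown activity_type: {activity_type!r}")
    if now.tzinfo is None:
        # Assumption: always send an explicit offset to avoid ambiguity.
        raise ValueError("now must be timezone-aware")
    since = now - timedelta(hours=hours)
    return {
        "since": since.isoformat(),
        "until": now.isoformat(),
        "activity_type": activity_type,
    }


now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
window = activity_window(now, hours=24, activity_type="deletions")
print(window)
```

A window like this could support the "preview before deleting recent memory" use case the description mentions, with the agent showing the result to the user before any delete call.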
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark it as read-only and idempotent. The description adds context about the types of activity shown and the preview use case, going beyond annotation information without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences: first explains what the tool does, second gives usage guidelines. No wasted words, information is front-loaded and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich annotations, full schema coverage, and presence of output schema, the description sufficiently covers the tool's behavior and usage. No gaps identified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% and each parameter is well-documented. The description adds no extra meaning to parameter values beyond what the schema provides, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Show' and the resource 'recent memory activity', enumerating specific types (writes, deletions, TODO changes, ledger changes, timeline events). It distinguishes itself from siblings like 'recall' or 'search_memory' by focusing on the activity stream.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit guidance: 'Call this when the user asks what changed recently, or before deleting "latest/recent" memory...' This gives concrete use cases but does not explicitly list when not to use or compare to all alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
memory_overview - Show XMemo memory overview (Grade A, Read-only, Idempotent)
Show a summary of XMemo memories and recent activity. Call this when the user asks what XMemo remembers without a specific query.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false, so the safety profile is well-established. The description adds that it shows a 'summary' and 'recent activity,' which is useful context but not critical beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences, front-loaded with the key action and purpose. Every word adds value with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (zero parameters, clear purpose, and an output schema), the description is fully complete. It tells the agent exactly what the tool does and when to use it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has no parameters, so there is no need for parameter descriptions. Baseline of 4 applies because the schema coverage is effectively 100%.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the tool's function: 'Show a summary of XMemo memories and recent activity.' It also notes that it is for general requests without a specific query, which differentiates it from siblings like recall or search_memory.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool: 'Call this when the user asks what XMemo remembers without a specific query.' While it doesn't mention when not to use or alternatives, the guidance is clear and actionable.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
memory_stats: Show XMemo memory statistics (A, Read-only, Idempotent)
Show aggregate statistics for XMemo memories — counts by type, source, location, and optional time range. Call this when the user asks for counts, breakdowns, or distributions.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| since | No | Only include memories or activity on or after this ISO 8601 timestamp. | |
| top_n | No | Maximum number of aggregate groups to return. | |
| until | No | Only include memories or activity on or before this ISO 8601 timestamp. | |
| group_by | No | Optional grouping: none, month, agent, location, memory_type, type, source, path, bucket, day, or comma-separated advanced fields. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | % |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| path_filter | No | Memory path filter; % matches all paths. | % |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
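As a rough illustration of how the parameters above combine, the sketch below builds an MCP `tools/call` request for `memory_stats` covering a 30-day window grouped by month. The helper name `build_memory_stats_request`, the request id, and the argument types (integer `top_n`, boolean `output_json`) are assumptions for illustration; only the tool name and parameter names come from the table above. It builds the payload and does not send it.

```python
import json
from datetime import datetime, timedelta, timezone


def build_memory_stats_request(now, days=30, group_by="month", top_n=10, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` payload for memory_stats (hypothetical helper)."""
    since = now - timedelta(days=days)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "memory_stats",
            "arguments": {
                # The table asks for ISO 8601 timestamps; isoformat() produces them.
                "since": since.isoformat(),
                "until": now.isoformat(),
                "group_by": group_by,
                # Assumed types: the listing does not state them.
                "top_n": top_n,
                "output_json": True,
            },
        },
    }


if __name__ == "__main__":
    fixed_now = datetime(2025, 1, 31, 12, 0, tzinfo=timezone.utc)
    print(json.dumps(build_memory_stats_request(fixed_now), indent=2))
```

Passing `now` explicitly keeps the helper deterministic, which makes the time window easy to check before a request is sent.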
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and idempotentHint=true, matching the description's stateless read nature. The description adds useful behavioral context (aggregate stats, optional time range) beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise: one sentence stating purpose and one sentence for usage guidance. No wasted words; front-loaded with core function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has 8 optional parameters and an output schema, and the description covers purpose and usage. It could elaborate on grouping behavior, but it is adequate for an aggregate stats tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with good parameter descriptions. The description adds some context (e.g., 'counts by type, source, location') but does not significantly enhance understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies exactly what the tool does: 'Show aggregate statistics for XMemo memories — counts by type, source, location, and optional time range.' It clearly distinguishes from sibling tools like search_memory or recall by focusing on aggregations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear when-to-use guidance: 'Call this when the user asks for counts, breakdowns, or distributions.' Lacks explicit when-not-to-use or alternatives, but the context makes it sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_audit_events: Query Audit Events (A, Read-only, Idempotent)
Advanced audit query by action/time/target. For normal users prefer memory_activity.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| since | No | Only include memories or activity on or after this ISO 8601 timestamp. | |
| until | No | Only include memories or activity on or before this ISO 8601 timestamp. | |
| action | No | Usage action or audit-event action filter, depending on the tool. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | |
| offset | No | Number of matching records to skip for pagination. | |
| target_id | No | Optional audit target ID to filter audit events. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
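The `limit` and `offset` parameters suggest offset-based pagination. The sketch below shows one way a client might walk through pages. `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and the assumption that each call returns a plain list of events is illustrative only, since the listing does not show the shape of `result`.

```python
def iter_audit_events(call_tool, action=None, since=None, page_size=50, max_pages=20):
    """Yield audit events page by page using limit/offset (sketch, not the server's API)."""
    offset = 0
    for _ in range(max_pages):  # hard cap so a misbehaving server cannot loop forever
        arguments = {"limit": page_size, "offset": offset}
        if action is not None:
            arguments["action"] = action
        if since is not None:
            arguments["since"] = since
        page = call_tool("query_audit_events", arguments)
        yield from page
        if len(page) < page_size:  # a short page means there is nothing left
            break
        offset += page_size


def fake_call_tool(name, arguments):
    """Stand-in for a real MCP client: serves slices of seven fake events."""
    events = [{"id": i} for i in range(7)]
    start = arguments["offset"]
    return events[start:start + arguments["limit"]]


if __name__ == "__main__":
    for event in iter_audit_events(fake_call_tool, page_size=3):
        print(event)
```

The `max_pages` cap is a defensive choice for agent use, where an unbounded loop over audit records could be costly.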
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false, so safety and side effects are covered. Description adds no extra behavioral context (e.g., pagination behavior, rate limits, or result structure). It merely restates query capability.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first states purpose, second gives usage guidance. Efficient and front-loaded with essential information. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 8 parameters fully documented in the schema and an output schema present, the description is adequate. It references the sibling tool for normal users, but could briefly explain what 'advanced' means regarding filter combinations. Otherwise complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with all 8 parameters clearly documented. The description adds no additional meaning beyond what the schema provides, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it's an advanced audit query by action/time/target, distinguishing it from 'memory_activity' for normal users. The verb 'query' and resource 'audit events' are specific.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'For normal users prefer memory_activity', providing clear when-not-to-use guidance and naming an alternative. The word 'advanced' implies the tool is for users who need finer-grained filtering.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_consolidation_audit: Query Consolidation Audit (A, Read-only, Idempotent)
Inspect consolidation/lifecycle/reflection audit records (admin). Not for normal recall/save.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| since | No | Only include memories or activity on or after this ISO 8601 timestamp. | |
| until | No | Only include memories or activity on or before this ISO 8601 timestamp. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | |
| offset | No | Number of matching records to skip for pagination. | |
| action_type | No | Consolidation, lifecycle, or reflection audit action type to filter by. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, destructiveHint. Description adds context: admin-only, specific audit types. No contradiction. Adequate beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with key purpose and usage. No redundant information. Every word earned its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With full schema coverage, annotations, and an output schema, the description is complete for an admin inspection tool. It covers purpose, scope, and usage context without gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the description does not need to explain parameters. It adds no extra parameter meaning beyond the schema, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it inspects consolidation/lifecycle/reflection audit records and is admin-only, distinguishing it from normal recall/save. Sibling tools like query_audit_events may be broader, but this narrow focus makes purpose explicit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description implies admin use and excludes normal recall/save, but does not explicitly say when to avoid this tool or mention alternatives like query_audit_events. More explicit when-not guidance would help.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recall: Recall XMemo memory (A, Read-only, Idempotent)
Recall the few most relevant saved memories before answering, when prior preferences, known facts, past decisions, or project context may directly affect the response. Use recall_context when many memories are needed.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| query | Yes | Natural-language question or search text. | |
| explain | No | Whether to include retrieval explanation details with search or recall results. | |
| agent_id | No | Optional client-supplied agent label for memory attribution. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | % |
| path_filter | No | Memory path filter; % matches all paths. | % |
| prefer_working | No | Whether to prioritize working/session-state memories in retrieval. | |
| agent_instance_id | No | Optional stable, non-secret agent instance ID for per-client attribution. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds context about retrieving the few most relevant memories and that it is used before answering, which is useful beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no wasted words. First sentence states purpose and usage, second provides alternative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has an output schema (not shown), the description is complete enough. It covers what it does and when to use it, with schema handling parameter details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with good descriptions for all parameters. The description adds minimal extra meaning beyond the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Recall' and the resource 'saved memories', and distinguishes the tool from its sibling 'recall_context', which it recommends when many memories are needed.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use the tool (when prior preferences, known facts, etc., affect response) and provides a clear alternative (recall_context for many memories).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recall_context: Build XMemo context (A, Read-only, Idempotent)
Build a context pack from XMemo memories. Call this before answering or planning when the task could benefit from multiple saved memories — preferences, prior conversations, projects, decisions, TODOs, or long-running work.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| query | Yes | Natural-language question or search text. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| agent_id | No | Optional client-supplied agent label for memory attribution. | |
| max_items | No | Maximum number of memory items to include. | |
| max_tokens | No | Approximate maximum response size. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | auto |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| path_filter | No | Memory path filter; % matches all paths. | % |
| prefer_working | No | Whether to prioritize working/session-state memories in retrieval. | |
| agent_instance_id | No | Optional stable, non-secret agent instance ID for per-client attribution. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive. The description adds behavioral context: it builds a 'context pack' from multiple memories, meant for holistic preparation. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with the primary purpose and usage guidance. No filler. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 13 parameters, output schema present, and rich annotations, the description adequately covers the 'why and when'. It does not explain the return format (but output schema does) or parameter details (schema covers them). Could mention that it returns a structured context pack, but overall sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All 13 parameters have descriptions in the schema (100% coverage). The tool description does not elaborate on any parameter beyond the schema. The overall purpose is clear, but parameter-specific semantics are fully delegated to the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a clear verb ('Build') and resource ('context pack from XMemo memories'), and provides concrete examples of when to use it (preferences, prior conversations, projects, etc.). It implicitly distinguishes from siblings like 'recall' or 'search_memory' by focusing on multiple memories.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to call: 'before answering or planning when the task could benefit from multiple saved memories'. Provides examples of memory types. It lacks explicit 'when not to use' guidance and does not name alternative tools, but given the many sibling tools listed, the intent is reasonably clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recall_plan: Plan with XMemo (A, Read-only, Idempotent)
Plan which memory searches to run before retrieval. Use only for complex tasks; for ordinary answers call recall_context or recall directly.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Natural-language question or search text. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | % |
| output_json | No | Return a machine-readable JSON response instead of a human-readable summary. | |
| path_filter | No | Memory path filter; % matches all paths. | % |
| prefer_working | No | Whether to prioritize working/session-state memories in retrieval. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
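Taken together, the descriptions of `recall`, `recall_context`, and `recall_plan` amount to a small decision rule. The sketch below encodes that rule as the descriptions state it. The function name and the two boolean inputs are assumptions for illustration, and deciding what counts as a "complex task" is left to the caller.

```python
def choose_recall_tool(needs_many_memories, is_complex_task):
    """Pick a recall tool following the guidance in the three tool descriptions."""
    if is_complex_task:
        # recall_plan: "Use only for complex tasks"
        return "recall_plan"
    if needs_many_memories:
        # recall: "Use recall_context when many memories are needed"
        return "recall_context"
    # Default: the few most relevant saved memories
    return "recall"


def build_recall_arguments(query):
    """All three tools require `query`; every other parameter is optional."""
    return {"query": query}


if __name__ == "__main__":
    tool = choose_recall_tool(needs_many_memories=True, is_complex_task=False)
    print(tool, build_recall_arguments("What did we decide about the release schedule?"))
```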
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the safety profile is clear. The description adds no extra behavioral context beyond stating it is for planning, which is adequate given the annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, no wasted words. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, the description need not explain return values. It provides a clear high-level purpose but could offer slightly more detail on what 'planning' entails (e.g., whether it returns search suggestions). Still adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so each parameter already has a description. The description adds no additional meaning beyond the schema, meeting the baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb-resource pair: 'Plan which memory searches to run before retrieval.' It clearly distinguishes from siblings by naming recall_context and recall as alternatives for ordinary answers.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('complex tasks') and when not to use ('for ordinary answers call recall_context or recall directly'), providing clear context and alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
record_event: Record Event (A)
Record a significant session event, milestone, decision, or handoff note useful for reconstructing recent work. Not for durable facts/preferences (use remember).
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | work |
| content | Yes | Text body to save, such as a memory, TODO/action item, event note, or state summary. | |
| event_type | No | Timeline event type, such as event, milestone, decision, or handoff. | event |
| session_id | No | Optional session identifier associated with the event, timeline query, or snapshot. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
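Note that `metadata_json` is described as a JSON object string rather than a nested object, so a client would need to serialize the metadata before sending it. A minimal sketch, assuming a hypothetical helper `build_record_event_arguments` and example metadata keys that do not come from the listing:

```python
import json


def build_record_event_arguments(content, event_type="event", metadata=None, session_id=None):
    """Assemble record_event arguments, serializing metadata to a JSON object string."""
    arguments = {
        "content": content,
        # The table lists event, milestone, decision, and handoff as examples.
        "event_type": event_type,
        # Default of "{}" matches the Default column in the table.
        "metadata_json": json.dumps(metadata or {}, sort_keys=True),
    }
    if session_id is not None:
        arguments["session_id"] = session_id
    return arguments


if __name__ == "__main__":
    args = build_record_event_arguments(
        "Finished migrating the billing service",
        event_type="milestone",
        metadata={"ticket": "ABC-1"},  # hypothetical metadata key
    )
    print(args)
```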
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare non-readOnly and non-destructive traits. Description adds context that events are for transient reconstruction, but could clarify whether events are appended or overwritten. Nonetheless, adds value beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first states purpose and examples, second states exclusion. Front-loaded and no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, return values are covered. The tool is a simple record operation, and the description covers what, when, and alternatives. Complete for this complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage, so baseline is 3. Description does not add parameter-specific meaning beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool records session events, milestones, decisions, or handoff notes, and explicitly distinguishes from 'remember' for durable facts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use (significant session events for reconstructing recent work) and when-not-to-use (durable facts/preferences, with alternative tool named 'remember').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
record_recall_eval_judgment: Record Recall Eval Judgment (A)
Evaluation only — record whether a returned memory was relevant for a query. Do not use in normal conversation.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Natural-language question or search text. | |
| reason | No | Optional user-facing reason for the deletion. | |
| position | No | One-based rank position of the memory in the recall results, or 0 if not returned. | |
| relevant | Yes | Whether the memory was relevant to the recall-evaluation query. | |
| memory_id | Yes | Exact XMemo memory reference shown by search or recall. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| expected_relevant_total | No | Expected number of relevant memories for this recall-evaluation query. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
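The `position` parameter is one-based, with 0 meaning the memory was not returned, which is easy to get wrong when working from a zero-based list. The sketch below derives it from a list of returned memory IDs. The helper names and the example IDs are hypothetical, and the assumption that recall results can be reduced to an ordered list of IDs is illustrative only.

```python
def rank_position(returned_ids, memory_id):
    """Return the one-based rank of memory_id in returned_ids, or 0 if it is absent."""
    try:
        return returned_ids.index(memory_id) + 1
    except ValueError:
        return 0


def build_judgment_arguments(query, memory_id, relevant, returned_ids):
    """Assemble record_recall_eval_judgment arguments (required: query, relevant, memory_id)."""
    return {
        "query": query,
        "memory_id": memory_id,
        "relevant": relevant,
        "position": rank_position(returned_ids, memory_id),
    }


if __name__ == "__main__":
    returned = ["mem-a", "mem-b", "mem-c"]  # hypothetical IDs
    print(build_judgment_arguments("deploy checklist", "mem-b", True, returned))
    print(build_judgment_arguments("deploy checklist", "mem-z", True, returned))
```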
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide no behavioral hints (all false), so the description carries the full burden. It states the tool records evaluation data, a write operation, and does not contradict annotations. However, it could detail side effects like data persistence or impact on recall models.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences front-loaded with purpose and usage guidance. No redundant information; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 7 parameters and an output schema, the description is minimal. It covers the core purpose but lacks detail on return values, evaluation context, or interaction with recalled data. The sibling tool 'run_recall_eval' may provide context, but the description could be more self-contained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. The description only outlines the core parameters (query, memory_id, relevant) and does not add meaning to optional ones like reason, position, metadata_json, or expected_relevant_total. Schema descriptions already explain these sufficiently.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'record whether a returned memory was relevant for a query', specifying the verb (record), resource (relevance judgment), and scope (recall evaluation only). It distinguishes from siblings like 'record_recall_feedback' and warns against use in normal conversation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Evaluation only' and 'Do not use in normal conversation', providing clear when-to-use and when-not-to-use guidance. The context of recall evaluation is evident, and alternatives (like 'record_recall_feedback') are implied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
record_recall_feedback: Record Recall Feedback (A)
Record recall corrections. Call this when the user says relevant context was missing or wrong, or supplies memories that should have been found.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | Freeform notes explaining recall feedback or corrections. | |
| query | Yes | Natural-language question or search text. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| missing_ids_json | No | JSON array of relevant memory IDs that were missing from recall results. | [] |
| returned_ids_json | No | JSON array of memory IDs returned by the recall query. | [] |
| corrected_ids_json | No | JSON array of memory IDs that should have been returned for the recall query. | [] |
| missing_memories_json | No | JSON array of memory snippets or objects that should exist but were missing. | [] |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
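Several parameters here take JSON array strings rather than native lists. The sketch below shows one way a client might fill them in. Deriving `missing_ids_json` as "corrected IDs that were not returned" is an assumption about how the fields relate, not something the listing states, and the helper name and example IDs are hypothetical.

```python
import json


def build_recall_feedback_arguments(query, returned_ids, corrected_ids, notes=None):
    """Assemble record_recall_feedback arguments with JSON-array-string fields."""
    returned = set(returned_ids)
    # Assumption: "missing" means expected by the user but absent from the results.
    missing_ids = [memory_id for memory_id in corrected_ids if memory_id not in returned]
    arguments = {
        "query": query,
        "returned_ids_json": json.dumps(list(returned_ids)),
        "corrected_ids_json": json.dumps(list(corrected_ids)),
        "missing_ids_json": json.dumps(missing_ids),
    }
    if notes:
        arguments["notes"] = notes
    return arguments


if __name__ == "__main__":
    print(
        build_recall_feedback_arguments(
            "preferred code style",
            returned_ids=["mem-1", "mem-2"],
            corrected_ids=["mem-2", "mem-3"],
            notes="User pointed out a missing style guide memory.",
        )
    )
```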
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate this is a non-destructive write operation (readOnlyHint=false, destructiveHint=false). The description adds that it 'records' feedback, which is consistent. No further behavioral traits (e.g., side effects, permissions) are disclosed, but the bar is lowered due to annotations covering the safety profile.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences that immediately convey the tool's purpose and usage. It is front-loaded with the core action, followed by context. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 7 parameters (6 optional) and an output schema, the description is minimal. It explains when to call but not how to fill in parameters like 'missing_ids_json' or 'corrected_ids_json'. The high schema coverage partly compensates, but more guidance would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All 7 parameters have descriptions in the input schema (100% coverage). Therefore, the baseline is 3. The tool description does not add any parameter-specific meaning beyond what the schema already provides, so it does not exceed the baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Record recall corrections.' It specifies the verb ('record') and resource ('recall corrections'), and differentiates from siblings like 'recall' or 'store_memory' by focusing on feedback about missing or wrong context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to use the tool: 'when the user says relevant context was missing or wrong, or supplies memories that should have been found.' This gives clear context. It does not explicitly name alternatives or when-not-to-use, but the provided usage scenarios are sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
redact_memory: Redact Memory (A, Destructive)
Redact sensitive content from a memory while keeping an audit trail. Call this only when the user explicitly asks to redact.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional user-facing reason for the deletion. | |
| memory_id | Yes | Exact XMemo memory reference shown by search or recall. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| replacement_content | No | Replacement text to store when redacting or replacing deleted memory content. | [REDACTED] |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
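As an illustration of the defaults in the table above, the sketch below builds an arguments object for `redact_memory` and leaves out any optional value the caller did not set. It assumes that the server then applies the listed defaults (for example `[REDACTED]`), which this listing does not confirm. The helper `redact_arguments` and the memory reference are hypothetical.

```python
def redact_arguments(memory_id, reason=None, replacement_content=None):
    """Return an arguments dict for redact_memory (hypothetical helper).

    Optional values left as None are omitted, on the assumption that the
    server then falls back to the defaults listed in the parameter table.
    """
    if not memory_id:
        raise ValueError("memory_id is required")
    candidate = {
        "memory_id": memory_id,
        "reason": reason,
        "replacement_content": replacement_content,
    }
    return {key: value for key, value in candidate.items() if value is not None}


# Made-up memory reference; real ones come from search or recall results.
args = redact_arguments("mem_123", reason="User asked to remove an API key")
print(args)
```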
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already show destructiveHint=true. The description adds that it keeps an audit trail and replaces content, which is useful behavioral context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences with no unnecessary words. It is front-loaded, and each sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and the presence of an output schema, the description covers purpose and usage adequately. It could mention irreversibility, but overall it is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the description does not add significant meaning beyond the parameter descriptions. It names no parameter directly; the requirement that memory_id be an exact reference is stated only in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool redacts sensitive content from a memory while keeping an audit trail. It distinguishes from siblings like 'delete_current_memory' or 'forget' by focusing on redaction and audit trail.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states to call this only when the user asks to redact, providing clear context. It does not mention when not to use or list alternatives, but the instruction is sufficient for most cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reflect: Reflect (A, Destructive)
Maintenance tool. Use only when the user/admin asks to summarize, promote, expire, decay, or archive via reflection. Prefer dry_run unless changes are explicitly requested.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| since | No | Only include memories or activity on or after this ISO 8601 timestamp. | |
| until | No | Only include memories or activity on or before this ISO 8601 timestamp. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| dry_run | No | Preview the maintenance action without applying changes. | |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| actions_csv | No | Comma-separated maintenance actions to run, such as summarize, promote, expire, decay, or archive. | summarize,promote,expire,decay,archive |
| path_filter | No | Memory path filter; % matches all paths. | % |
| owner_filter | No | Optional owner/user filter for administrative maintenance tools. | |
| time_window_hours | No | Number of recent hours to inspect during reflection maintenance. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
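The advice to prefer `dry_run` can be encoded on the client side. The sketch below builds `reflect` arguments with `dry_run` switched on unless the caller explicitly asks to apply changes. The helper `reflect_arguments` is hypothetical, and treating the five actions named in the table as the complete set is an assumption, since the schema only lists them as examples.

```python
# Actions named in the actions_csv description; the real set may be larger.
ALLOWED_ACTIONS = ("summarize", "promote", "expire", "decay", "archive")


def reflect_arguments(actions, apply_changes=False, time_window_hours=None):
    """Build arguments for the reflect tool (hypothetical helper).

    dry_run defaults to True here, mirroring the description's advice to
    prefer a preview unless changes are explicitly requested.
    """
    unknown = [action for action in actions if action not in ALLOWED_ACTIONS]
    if unknown:
        raise ValueError(f"unsupported actions: {unknown}")
    arguments = {
        "actions_csv": ",".join(actions),  # comma-separated, no spaces
        "dry_run": not apply_changes,
    }
    if time_window_hours is not None:
        arguments["time_window_hours"] = time_window_hours
    return arguments


preview = reflect_arguments(["summarize", "archive"], time_window_hours=24)
print(preview)
```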
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark destructiveHint=true, so the description's 'Maintenance tool' and listing of mutating actions add context. The mention of dry_run as a preview option is valuable beyond annotations. No contradiction found.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three short sentences, no fluff. The first names the tool's role, the second gives usage guidance, and the third recommends dry_run. Highly concise and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the high parameter count and full schema descriptions, the description is sufficient. It could be slightly enriched by mentioning that output varies by dry_run status, but overall it's complete enough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all 11 parameters. The description does not add any parameter-specific information beyond the schema, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it's a maintenance tool for summarizing, promoting, expiring, decaying, or archiving via reflection. It uses specific verbs and distinguishes itself from sibling tools which are mostly memory CRUD and recall operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use only when the user/admin asks to...' and advises to 'Prefer dry_run unless changes are explicitly requested.' This provides clear context for when and how to use the tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
remember: Remember in XMemo (A)
Save a memory so it can be recalled in future conversations. Call this whenever the user states a durable fact, preference, instruction, identity or profile detail, project detail, or recurring workflow likely useful later — you do not need an explicit 'remember this'. Skip transient chit-chat. For spending, income, refunds, or bookkeeping requests, use add_expense.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | A simple user-facing category or path for this memory, such as preferences, projects/xmemo, or personal/travel. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | public |
| dedupe | No | Whether to check for and merge likely duplicate memories before saving. | |
| content | Yes | Text body to save, such as a memory, TODO/action item, event note, or state summary. | |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| agent_id | No | Optional client-supplied agent label for memory attribution. | |
| device_id | No | Optional client-supplied device identifier for attribution. | |
| memory_id | No | Exact XMemo memory reference shown by search or recall. | |
| confidence | No | Confidence score for the saved or updated memory, from 0.0 to 1.0. | |
| expires_at | No | Optional ISO 8601 expiration timestamp for the memory. | |
| importance | No | Importance score for the saved or updated memory, from 0.0 to 1.0. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | auto |
| device_label | No | Optional human-readable device label for attribution. | |
| semantic_key | No | Optional stable semantic key used to identify or upsert a memory concept. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| embedding_json | No | Optional JSON array or object containing a precomputed embedding; leave blank for normal use. | |
| provenance_json | No | Optional JSON object string describing memory source, evidence, or attribution. | {} |
| agent_instance_id | No | Optional stable, non-secret agent instance ID for per-client attribution. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
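Of the 19 parameters, only `path` and `content` are required, and the score and timestamp fields carry documented constraints. The sketch below shows one way a client might check those constraints before calling `remember`. The helper `remember_arguments` and its `ttl_days` convenience argument are hypothetical, and using UTC for `expires_at` is an assumption.

```python
from datetime import datetime, timedelta, timezone


def remember_arguments(path, content, importance=None, confidence=None, ttl_days=None):
    """Build arguments for the remember tool (hypothetical helper)."""
    if not path or not content:
        raise ValueError("path and content are required")
    arguments = {"path": path, "content": content}
    for name, score in (("importance", importance), ("confidence", confidence)):
        if score is None:
            continue
        # The table documents both scores as ranging from 0.0 to 1.0.
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
        arguments[name] = score
    if ttl_days is not None:
        # expires_at is documented as an ISO 8601 timestamp; UTC is an assumption.
        expires = datetime.now(timezone.utc) + timedelta(days=ttl_days)
        arguments["expires_at"] = expires.isoformat()
    return arguments


args = remember_arguments(
    path="preferences",
    content="Prefers concise answers with code examples.",
    importance=0.8,
)
print(args)
```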
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are all false, so description carries the burden. It explains basic write behavior but does not disclose whether saves always create new memories or update existing ones (despite dedupe parameter), what happens on duplicate, or any response details. Acceptable but leaves some ambiguity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, front-loaded with purpose, followed by usage guidelines, an exclusion, and an alternative. No filler or repetition; every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 19 parameters and an output schema, the description provides core purpose and usage context but could elaborate on behavioral nuances like deduplication or update semantics. Still, it is largely complete for an agent to decide when to invoke.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so parameters are already documented. The description adds no additional meaning beyond what the schema provides, meeting the baseline for high coverage but not exceeding it.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool saves a memory for future recall, specifies what constitutes a durable fact (preferences, instructions, etc.), and distinguishes from add_expense. It answers the 'what' and 'when' unambiguously.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use (user states durable fact, preference, etc.) and when not to (transient chit-chat). Provides alternative tool (add_expense) for bookkeeping requests, fulfilling the when/when-not/alternatives criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
resolve_decision: Resolve Decision (A)
Record the chosen option for a previously saved pending decision. Call this when the user states their decision.
| Name | Required | Description | Default |
|---|---|---|---|
| resolution | Yes | Chosen resolution text for a pending decision. | |
| decision_id | Yes | Exact pending-decision ID returned by list_pending_decisions. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=false and destructiveHint=false, indicating mutation but no destruction. Description confirms write behavior ('Record'). No additional behavioral details (e.g., whether the decision becomes non-pending) are provided, but annotations cover the safety profile adequately.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two-sentence description is very concise and front-loaded with the primary purpose. No wasted words; every sentence provides value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool has low complexity (2 required params, output schema exists). Description covers the main action and usage trigger. Could mention that the decision must be pending, but this is implied by 'previously saved pending decision.' Overall sufficient for correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters: decision_id is 'Exact pending-decision ID returned by list_pending_decisions' and resolution is 'Chosen resolution text.' The tool description does not add additional meaning beyond the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action: 'Record the chosen option for a previously saved pending decision.' It specifies the verb 'record' and the resource 'pending decision', distinguishing it from siblings like create_pending_decision and list_pending_decisions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description explicitly says 'Call this when the user states their decision,' providing clear context for when to use the tool. It does not mention when not to use it or alternatives, but the context is sufficient given the sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
resolve_memory_conflict: Resolve Memory Conflict (A, Destructive)
Resolve a listed memory conflict. Call this only when the user/admin explicitly chooses a resolution strategy.
| Name | Required | Description | Default |
|---|---|---|---|
| conflict_id | Yes | Exact memory-conflict ID returned by list_memory_conflicts. | |
| merge_content | No | Merged memory content to apply when resolving a conflict with a merge strategy. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| winning_memory_id | No | Memory ID that should win when resolving a conflict. | |
| resolution_strategy | No | Conflict resolution strategy, such as mark_conflict, keep_existing, replace, or merge. | mark_conflict |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
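The parameter descriptions suggest that `merge_content` belongs with the `merge` strategy and `winning_memory_id` with strategies that pick a winner. The sketch below encodes that reading as a client-side check. The pairing rule is inferred from the table rather than documented, the strategy list may not be exhaustive, and `conflict_resolution_arguments` and the IDs are hypothetical.

```python
# Strategies named in the resolution_strategy description; possibly incomplete.
STRATEGIES = ("mark_conflict", "keep_existing", "replace", "merge")


def conflict_resolution_arguments(conflict_id, strategy="mark_conflict",
                                  winning_memory_id=None, merge_content=None):
    """Build arguments for resolve_memory_conflict (hypothetical helper).

    The per-strategy requirement below is inferred from the parameter
    descriptions; the server may enforce different rules.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    if strategy == "merge" and not merge_content:
        raise ValueError("merge strategy expects merge_content")
    arguments = {"conflict_id": conflict_id, "resolution_strategy": strategy}
    if winning_memory_id is not None:
        arguments["winning_memory_id"] = winning_memory_id
    if merge_content is not None:
        arguments["merge_content"] = merge_content
    return arguments


# Made-up conflict ID; real ones come from list_memory_conflicts.
merged = conflict_resolution_arguments(
    "conflict_7",
    strategy="merge",
    merge_content="Deploys go to eu-west-1 on weekdays and us-east-1 on weekends.",
)
print(merged)
```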
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide destructiveHint=true and readOnlyHint=false, establishing it as a write operation. Description adds minimal behavioral context (resolving conflict) but does not elaborate on side effects, irreversibility, or required permissions. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is two sentences, directly stating purpose and usage condition. No redundant information, front-loaded with key action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Description is adequate given presence of output schema and annotations, but could improve by explicitly linking to prerequisite tools (e.g., 'conflict_id must come from list_memory_conflicts'). The parameter schema partially compensates but description itself does not mention prerequisites.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with all parameters described. Description does not add additional meaning beyond the schema's parameter descriptions, so no extra value beyond baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Resolve a listed memory conflict' with a specific verb and resource. Distinguishes from sibling 'list_memory_conflicts' by implying resolution vs listing. Additional sentence about explicit user choice adds clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description explicitly says 'Call this only when the user/admin explicitly chooses a resolution strategy,' which tells the agent when to invoke it. It implies a prerequisite of listing conflicts first via sibling tool, but does not explicitly exclude alternative tools or scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
restore_memory: Restore XMemo memory (A, Destructive)
Restore a previously deleted memory. Call this only when the user explicitly asks to restore or undo a deletion.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional user-facing reason for the deletion. | |
| memory_id | Yes | Exact XMemo memory reference shown by search or recall. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds that the tool restores a previously deleted memory, and the annotations already indicate destructiveHint=true. There is no contradiction, but the description could elaborate on side effects, permissions, or irreversible effects. The combination is adequate but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with purpose, followed by usage guideline. No wasted words, every sentence serves a clear purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and the presence of an output schema, the description covers the essential: what it does and when to use it. It could mention error handling for invalid memory_id, but overall it is mostly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description does not add new meaning beyond the schema for parameters; it merely restates what's already in the schema. No additional guidance is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Restore a previously deleted memory' with a specific verb and resource, and it distinguishes from sibling tools like forget and delete_current_memory by focusing on restoration of deleted memories and specifying the user request context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description clearly states when to use the tool: 'Call this only when the user explicitly asks to restore or undo a deletion.' This provides a clear condition but does not explicitly mention alternatives or when not to use it, leaving a minor gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
restore_restart_snapshot: Restore Restart Snapshot (A, Destructive)
Resume previous work from a saved snapshot. Call this when the user asks to resume/restart or after context loss.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | work |
| state_key | No | Working-state key to save, snapshot, or restore, such as active_task. | |
| snapshot_id | No | Exact restart snapshot memory ID to restore. | |
| ttl_seconds | No | Time-to-live in seconds for working state or restart snapshot records. | |
| restore_state | No | Whether restore_restart_snapshot should restore the saved working-state record. | |
| source_session_id | No | Session ID whose saved restart snapshot should be restored. | |
| target_session_id | No | Session ID to associate with restored state and restore events. | |
| record_restore_event | No | Whether to write a timeline event after restoring a restart snapshot. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide destructiveHint=true and readOnlyHint=false. Description adds minimal behavioral context beyond stating it resumes work. Does not disclose what gets destroyed or other traits, but annotations already suffice.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, no wasted words. Highly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With output schema present, return values are covered. However, for a complex tool with 9 optional parameters, the description is terse and does not elaborate on default behavior or edge cases.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so schema documents all parameters. Description adds no extra meaning beyond 'Resume previous work'. Baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'resume', the resource 'saved snapshot', and the context 'when user asks to resume/restart or after context loss'. It effectively distinguishes from sibling tools like 'create_restart_snapshot' and 'restore_memory'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'when the user asks to resume/restart or after context loss'. Provides clear context but does not mention when not to use or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
run_lifecycle: Run Lifecycle (A, Destructive)
Maintenance tool. Use only when the user/admin explicitly asks to expire, archive, decay, or dry-run lifecycle processing.
| Name | Required | Description | Default |
|---|---|---|---|
| decay | No | Whether lifecycle maintenance should decay low-value memories. | |
| limit | No | Maximum number of results to return. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | % |
| expire | No | Whether lifecycle maintenance should expire memories that reached their expiration policy. | |
| archive | No | Whether lifecycle maintenance should archive stale memories. | |
| dry_run | No | Preview the maintenance action without applying changes. | |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
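Because `run_lifecycle` is destructive and offers `dry_run`, a cautious client could preview first and apply only after confirmation. The sketch below shows that two-phase pattern. `call_tool` stands in for whatever MCP client call is in use, `confirm` is a caller-supplied check, and the fake client exists only so the example runs without a server; all three are hypothetical, as is the shape of the preview result.

```python
def run_lifecycle_safely(call_tool, confirm, **flags):
    """Preview lifecycle maintenance, then apply only if confirmed.

    call_tool(name, arguments) is a stand-in for a real MCP client call;
    confirm(preview) decides whether to proceed. Both are hypothetical.
    """
    preview = call_tool("run_lifecycle", {**flags, "dry_run": True})
    if not confirm(preview):
        return preview
    return call_tool("run_lifecycle", {**flags, "dry_run": False})


# Fake client that records calls, so the sketch runs without a server.
calls = []


def fake_call_tool(name, arguments):
    calls.append((name, arguments))
    return {"dry_run": arguments["dry_run"]}


result = run_lifecycle_safely(
    fake_call_tool,
    confirm=lambda preview: False,  # decline, so only the preview runs
    expire=True,
    archive=True,
)
print(result, len(calls))
```

With the confirmation declined, only the preview call is made and nothing is applied.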
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare destructiveHint=true, and the description reinforces this by naming destructive actions (expire, archive, decay) and mentioning a dry-run preview option. It adds context beyond annotations by detailing the specific memory lifecycle operations performed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long, front-loads the purpose, and contains no redundant information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (8 parameters, destructive hint, output schema exists), the description covers the core purpose and usage constraints. It could elaborate on the interaction of parameters or the dry-run effect, but the presence of an output schema and detailed parameter descriptions compensates. It is adequate for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with each parameter having a description. The description does not add additional meaning beyond the schema for parameters like limit, scope, bucket, or team_id, but it does map the boolean parameters to the listed operations. Baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly identifies it as a 'Maintenance tool' and lists the specific operations: expire, archive, decay, and dry-run. This clearly distinguishes it from sibling tools which focus on memory creation, retrieval, or modification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states 'Use only when the user/admin explicitly asks to expire, archive, decay, or dry-run lifecycle processing.' This provides clear guidance on when to invoke the tool, though it does not mention alternatives or when not to use it beyond the explicit request condition.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
run_recall_eval - Run Recall Eval (A)
Admin/evaluation only — aggregate stored recall judgments and optionally tune retrieval. Not for normal conversation.
| Name | Required | Description | Default |
|---|---|---|---|
| since | No | Only include memories or activity on or after this ISO 8601 timestamp. | |
| until | No | Only include memories or activity on or before this ISO 8601 timestamp. | |
| auto_tune | No | Whether recall evaluation may tune retrieval settings from recorded judgments. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
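As a rough illustration of the `since` and `until` parameters, the sketch below builds an arguments object covering a fixed look-back window. The helper name `recall_eval_arguments` is hypothetical, and keeping `auto_tune` off unless requested is an assumption made here (the listing shows no default), chosen because tuning may change retrieval settings.

```python
from datetime import datetime, timedelta, timezone


def recall_eval_arguments(days_back: int, now: datetime, auto_tune: bool = False) -> dict:
    """Build run_recall_eval arguments covering the last `days_back` days.

    Hypothetical helper: only the field names come from the parameter table.
    """
    since = now - timedelta(days=days_back)
    return {
        "since": since.isoformat(),  # ISO 8601 lower bound
        "until": now.isoformat(),    # ISO 8601 upper bound
        "auto_tune": auto_tune,      # assumption: opt in explicitly
    }


now = datetime(2025, 1, 8, 12, 0, tzinfo=timezone.utc)
args = recall_eval_arguments(7, now)
print(args["since"], args["until"], args["auto_tune"])
```

Passing a timezone-aware `datetime` keeps the UTC offset in the serialized timestamps, which avoids ambiguity about which day boundary is meant.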
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds context beyond annotations: indicates tuning may modify retrieval settings and restricts usage to admin. Annotations already show readOnlyHint=false and destructiveHint=false, so no contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two succinct sentences, front-loaded with critical scope qualifier 'Admin/evaluation only'. No wasted words; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, description adequately covers core behavior and constraints. Could mention admin permission requirement but the 'Admin/evaluation only' label implies it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in schema. Description does not add additional explanatory information for individual parameters, meeting baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it aggregates recall judgments and optionally tunes retrieval, with explicit 'Admin/evaluation only' scope. Distinct from siblings like record_recall_eval_judgment (recording judgments) and recall (normal retrieval).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context: 'Admin/evaluation only' and 'Not for normal conversation'. Implies when not to use but does not explicitly name alternative tools for recording judgments or normal recall.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_memory - Search XMemo memories (A, Read-only, Idempotent)
Search XMemo memories by natural-language query. Call this when the user asks about saved or past information, AND proactively before answering any question where prior preferences, facts, projects, decisions, or history could change the answer. To delete a memory, use forget.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. | |
| query | Yes | Natural-language question or search text. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| explain | No | Whether to include retrieval explanation details with search or recall results. | |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| agent_id | No | Optional client-supplied agent label for memory attribution. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | % |
| path_filter | No | Memory path filter; % matches all paths. | % |
| prefer_working | No | Whether to prioritize working/session-state memories in retrieval. | |
| agent_instance_id | No | Optional stable, non-secret agent instance ID for per-client attribution. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
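For orientation, here is a minimal sketch of how a client might wrap these parameters in an MCP `tools/call` JSON-RPC request. The `tools_call` helper and the example query are illustrative assumptions; only `query` is required, and the `%` values simply restate the documented defaults.

```python
import json


def tools_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP JSON-RPC `tools/call` request (hypothetical helper)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


search_args = {
    "query": "Which editor does the user prefer?",  # the only required field
    "limit": 5,
    "memory_type": "%",  # documented default: all memory types
    "path_filter": "%",  # documented default: all paths
}
request = tools_call("search_memory", search_args)
print(request)
```

In practice an MCP client library would normally build this envelope; the sketch only shows where the table's parameter names end up.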
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds value beyond them by advising proactive use before answering questions where prior context could matter. No contradictions; the description supplements rather than repeats annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences succinctly convey purpose, usage, and the alternative for deletion. Front-loaded with key information; every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (10 parameters, output schema exists), the description covers essential aspects: what it does and when to use it. Could mention scope or filtering options briefly, but usage guidelines and schema coverage make it complete enough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented. The description does not add significant meaning beyond the schema, as it only mentions 'natural-language query'. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Search XMemo memories by natural-language query', providing a specific verb and resource. It distinguishes from siblings like recall, forget, and remember, making its purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to call this tool (when user asks about saved/past information, proactively before answering questions with prior context) and when not ('To delete a memory, use forget'). Provides clear usage context and alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
store_memory - Advanced: store XMemo memory (A)
Low-level memory write with full field control. Use only when a caller needs explicit field-level control; for normal conversational capture use remember.
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | A simple user-facing category or path for this memory, such as preferences, projects/xmemo, or personal/travel. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | public |
| content | Yes | Text body to save, such as a memory, TODO/action item, event note, or state summary. | |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| agent_id | No | Optional client-supplied agent label for memory attribution. | |
| device_id | No | Optional client-supplied device identifier for attribution. | |
| confidence | No | Confidence score for the saved or updated memory, from 0.0 to 1.0. | |
| expires_at | No | Optional ISO 8601 expiration timestamp for the memory. | |
| importance | No | Importance score for the saved or updated memory, from 0.0 to 1.0. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | semantic |
| device_label | No | Optional human-readable device label for attribution. | |
| semantic_key | No | Optional stable semantic key used to identify or upsert a memory concept. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| embedding_json | No | Optional JSON array or object containing a precomputed embedding; leave blank for normal use. | |
| provenance_json | No | Optional JSON object string describing memory source, evidence, or attribution. | {} |
| agent_instance_id | No | Optional stable, non-secret agent instance ID for per-client attribution. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
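Two details in the table are easy to miss: `confidence` and `importance` are bounded to 0.0 through 1.0, and `metadata_json` is a JSON object passed as a string rather than a nested object. The sketch below shows one way a caller might enforce both before sending; `store_memory_arguments` is a hypothetical helper, and the client-side range check is an assumption (the server may validate differently).

```python
import json


def store_memory_arguments(path, content, *, confidence=None, importance=None, metadata=None):
    """Build store_memory arguments, validating scores and encoding metadata."""
    for label, score in (("confidence", confidence), ("importance", importance)):
        if score is not None and not 0.0 <= score <= 1.0:
            raise ValueError(f"{label} must be between 0.0 and 1.0, got {score}")

    args = {"path": path, "content": content}  # the two required fields
    if confidence is not None:
        args["confidence"] = confidence
    if importance is not None:
        args["importance"] = importance
    if metadata is not None:
        # The schema describes a JSON object *string*, so serialize the dict.
        args["metadata_json"] = json.dumps(metadata)
    return args


args = store_memory_arguments(
    "preferences",
    "Prefers dark mode in all editors.",
    confidence=0.9,
    metadata={"source": "chat"},
)
print(args)
```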
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate it is not read-only, not open world, not idempotent, and not destructive. The description adds context that this is a low-level write with full control, which is consistent and adds value beyond annotations. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (two sentences), front-loads the purpose, and includes usage guidance. Every sentence adds value with no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the description clearly states when to use the tool, it does not elaborate on the behavioral effects (e.g., whether it overwrites or appends, persistence semantics) or the purpose of the many parameters beyond what the schema provides. Given the complexity (17 parameters) and available output schema, the description is minimally adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 17 parameters thoroughly. The description does not add any parameter-specific meaning beyond what the schema provides. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is a 'low-level memory write with full field control,' specifying the verb (write), resource (memory), and scope (full field control). It also explicitly distinguishes itself from the sibling 'remember' tool, which is for normal conversational capture.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance: 'Use only when a caller needs explicit field-level control; for normal conversational capture use remember.' This tells the agent when to use this tool and when to use an alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_memory - Update XMemo memory (A, Destructive)
Update the content, location, or type of an existing memory. Call this only when the user explicitly asks to correct, revise, move, or reclassify a specific saved memory.
| Name | Required | Description | Default |
|---|---|---|---|
| path | No | A simple user-facing category or path for this memory, such as preferences, projects/xmemo, or personal/travel. | |
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | |
| status | No | Memory status filter or update value, such as active, archived, deleted, or %. | |
| content | No | Text body to save, such as a memory, TODO/action item, event note, or state summary. | |
| team_id | No | Optional team/workspace identifier for team-scoped memory access. | |
| memory_id | Yes | Exact XMemo memory reference shown by search or recall. | |
| confidence | No | Confidence score for the saved or updated memory, from 0.0 to 1.0. | |
| expires_at | No | Optional ISO 8601 expiration timestamp for the memory. | |
| importance | No | Importance score for the saved or updated memory, from 0.0 to 1.0. | |
| memory_type | No | Memory type/category filter or value, such as episodic, identity, procedural, semantic, working, auto, or %. | |
| semantic_key | No | Optional stable semantic key used to identify or upsert a memory concept. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| merge_metadata | No | Whether to merge new metadata with existing metadata instead of replacing it. | |
| provenance_json | No | Optional JSON object string describing memory source, evidence, or attribution. | {} |
| detect_conflicts | No | Whether to detect possible conflicts with existing memories during an update. | |
| merge_provenance | No | Whether to merge new provenance with existing provenance instead of replacing it. | |
| supersession_reason | No | Reason the updated memory supersedes older content. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
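Because only `memory_id` is required, a caller can send just the fields being changed. The sketch below illustrates that pattern, including `merge_metadata` to merge rather than replace existing metadata. The helper name and the `mem_123` reference are hypothetical, and the idea that omitted fields are left unchanged is an assumption not confirmed by the listing.

```python
import json


def update_memory_arguments(memory_id: str, metadata=None, **changes) -> dict:
    """Build update_memory arguments containing only the fields being changed."""
    args = {"memory_id": memory_id}  # exact reference shown by search or recall
    if metadata is not None:
        args["metadata_json"] = json.dumps(metadata)  # JSON object string
        args["merge_metadata"] = True  # merge with existing metadata, do not replace
    # Drop unset fields so the request carries only intended changes.
    args.update({key: value for key, value in changes.items() if value is not None})
    return args


args = update_memory_arguments(
    "mem_123",  # hypothetical placeholder, not a real memory reference
    metadata={"reviewed": True},
    content="Prefers light mode during the day.",
    supersession_reason="User corrected an earlier preference.",
    path=None,  # unset, so it is omitted from the request
)
print(args)
```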
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true, so mutation is expected. The description adds context by specifying what can be updated (content, location, type), which aligns with the destructive hint. It does not disclose side effects such as conflict detection or provenance merging, although those are documented in the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: the first states the action, the second provides usage context. It is front-loaded, concise, and every sentence contributes value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the high parameter count (18) and the presence of both annotations and an output schema, the description focuses on the core purpose and usage boundary. It does not explain all parameter interactions, but the schema descriptions fill that gap. The description is adequate for an agent to correctly select and invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 18 parameters. The description briefly names three key updatable aspects (content, location, type), mirroring some parameters, but adds no extra meaning beyond what is already in the schema. Baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Update the content, location, or type of an existing memory') and specifies the resource ('memory'), distinguishing it from sibling tools like store_memory or create_memory_todo by narrowing the use case to explicit revisions, moves, or reclassifications.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use guidance: 'Call this only when the user explicitly asks to correct, revise, move, or reclassify a specific saved memory.' This clearly differentiates it from creation or deletion tools among the many siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_state - Update State (A)
Save the current working state during long-running work — active task, next action, or blocker — so future turns/sessions can resume. For durable user facts use remember.
| Name | Required | Description | Default |
|---|---|---|---|
| scope | No | Optional scope that narrows memory access; leave blank for the token default. | |
| bucket | No | Memory bucket or namespace to read from or write to; use % only for tools that support wildcard reads. | work |
| content | No | Text body to save, such as a memory, TODO/action item, event note, or state summary. | |
| state_key | No | Working-state key to save, snapshot, or restore, such as active_task. | active_task |
| next_action | No | Next action to persist so future turns can resume the work. | |
| ttl_seconds | No | Time-to-live in seconds for working state or restart snapshot records. | |
| current_task | No | Current task or work item to persist for later resume. | |
| metadata_json | No | Optional JSON object string with extra metadata for the operation. | {} |
| blocked_reason | No | Short explanation of why the current work is blocked. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
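A working-state save might look like the sketch below, which restates the documented defaults for `bucket` and `state_key` and leaves out any optional field that is not set. The helper name, the example task text, and the one-day TTL are illustrative assumptions.

```python
def update_state_arguments(current_task, next_action=None, blocked_reason=None, ttl_seconds=None):
    """Build update_state arguments for resuming long-running work later."""
    args = {
        "bucket": "work",            # documented default
        "state_key": "active_task",  # documented default
        "current_task": current_task,
    }
    optional = {
        "next_action": next_action,
        "blocked_reason": blocked_reason,
        "ttl_seconds": ttl_seconds,
    }
    # Send only the optional fields that were actually provided.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


state = update_state_arguments(
    "Migrate billing tables",
    next_action="Run the dry-run migration",
    ttl_seconds=24 * 60 * 60,  # assumption: keep the state for one day
)
print(state)
```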
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate it is a write but not destructive. The description adds context about working state and resuming across turns and sessions, but does not disclose behavioral traits such as whether a save overwrites or appends to existing state.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no fluff: first states purpose, second provides usage guidance. Well-structured and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema and full parameter documentation, the description adequately explains the use case and session resumption utility. Could be slightly more detailed on behavior like overwrite semantics.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3 is appropriate. The description mentions 'active task, next action, or blocker' which maps to some params, but adds no additional meaning beyond the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'save' and the resource 'current working state', and explicitly distinguishes from the sibling tool 'remember' for durable facts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly says when to use ('during long-running work') and when not to ('For durable user facts use remember'), providing clear context signals.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
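One way to produce that file is sketched below: it writes the structure shown above to `.well-known/glama.json` under a web root directory. The `write_glama_manifest` helper and the use of a temporary directory as the web root are assumptions for illustration; how the file is actually served depends on your hosting setup.

```python
import json
import tempfile
from pathlib import Path


def write_glama_manifest(web_root: Path, email: str) -> Path:
    """Write .well-known/glama.json under `web_root` and return its path."""
    manifest = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target = web_root / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return target


# Demo against a throwaway directory standing in for the real web root.
with tempfile.TemporaryDirectory() as tmp:
    path = write_glama_manifest(Path(tmp), "your-email@example.com")
    relative = path.relative_to(tmp).as_posix()
    written = json.loads(path.read_text(encoding="utf-8"))

print(relative, written)
```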
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.