Glama

Server Quality Checklist

Profile completion: 83%

A complete profile improves this server's visibility in search results.
  • Disambiguation 4/5

    Most tools have distinct purposes aligned with specific memory operations (store, retrieve, consolidate, export). Minor ambiguity exists between brief_memory (creates/persists asset) and distill_memory (read-only briefing), and between pin_memory (importance boost) and promote_memory (authority upgrade), but side effects differentiate them.
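Where prose alone leaves tool pairs ambiguous, MCP tool annotations can make the side-effect distinction machine-readable. A sketch in Python: the `readOnlyHint`/`idempotentHint` field names follow the MCP ToolAnnotations spec, but the per-tool values below are assumptions about this server's behavior, not its published metadata.

```python
# Illustrative annotations for the ambiguous tool pairs named above.
# Values are assumptions; only the field names come from the MCP spec.
TOOL_ANNOTATIONS = {
    "brief_memory":   {"readOnlyHint": False},  # creates/persists an asset
    "distill_memory": {"readOnlyHint": True},   # read-only briefing
    "pin_memory":     {"readOnlyHint": False, "idempotentHint": True},  # importance boost
    "promote_memory": {"readOnlyHint": False, "idempotentHint": True},  # authority upgrade
}

def safe_to_retry(tool: str) -> bool:
    """Read-only or idempotent tools can be retried without new side effects."""
    hints = TOOL_ANNOTATIONS.get(tool, {})
    return hints.get("readOnlyHint", False) or hints.get("idempotentHint", False)
```

An agent comparing brief_memory and distill_memory could then branch on the hint instead of parsing prose for side effects.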

    Naming Consistency 3/5

    Mixed conventions despite functional groupings. Strong consistency within families (store_*, list_*, memory_*), but deviations exist: batch_store breaks the store_* verb-first pattern, auto_capture uses a prefix, dream is an abstract single word, and latest_checkpoint uses an adjective prefix. At minimum, all names use snake_case.
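One low-cost remedy is an alias pass that folds the outliers into the verb-first families. The renames below are illustrative suggestions only, not changes the server has made:

```python
# Hypothetical renames folding the outliers into the existing verb-first
# families (store_*, list_*, ...). Suggestions for illustration only.
PROPOSED_RENAMES = {
    "batch_store": "store_batch",                  # verb-first, like store_memory
    "auto_capture": "capture_memories",            # explicit verb + object
    "dream": "consolidate_memory",                 # concrete verb over metaphor
    "latest_checkpoint": "get_latest_checkpoint",  # verb prefix
}

def verb_first(name: str) -> bool:
    """True when a tool name starts with a recognized action verb."""
    VERBS = {"store", "capture", "consolidate", "get", "list", "search", "retrieve"}
    return name.split("_")[0] in VERBS
```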

    Tool Count 3/5

    With 25 tools, the surface is at the upper limit of 'heavy' for this domain. While the memory management domain is complex (requiring storage, retrieval, maintenance, checkpointing, and skills), the count approaches the threshold where agent selection becomes burdensome. Some consolidation (e.g., merging brief/distill or the various store types) could reduce cognitive load.

    Completeness 4/5

    Covers the memory lifecycle comprehensively: capture (auto_capture, store_*), retrieval (search, drill_down), maintenance (dream, data_checkup), session continuity (checkpoint/resume), and export. Minor gaps in explicit CRUD operations (no delete/forget tool, no unpin, no explicit memory update) but archival/consolidation via dream provides partial coverage.
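The delete/unpin gap could be closed with one small, explicitly destructive tool. A minimal sketch, assuming a hypothetical `forget_memory`; the name, parameters, and annotation values are invented for illustration:

```python
# Hypothetical tool definition closing the delete/unpin gap noted above.
# Everything here is an illustrative assumption, not part of the server.
forget_memory = {
    "name": "forget_memory",
    "description": (
        "Permanently delete a memory by id, or unpin it without deleting. "
        "Destructive when mode='delete'; prefer dream for bulk archival. "
        "Use when a single memory is wrong or obsolete."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "id": {"type": "string", "description": "Memory id to act on."},
            "mode": {
                "type": "string",
                "enum": ["delete", "unpin"],
                "description": "delete removes permanently; unpin only clears the importance boost.",
            },
        },
        "required": ["id", "mode"],
    },
    "annotations": {"readOnlyHint": False, "destructiveHint": True, "idempotentHint": True},
}
```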

  • Average 4.2/5 across 25 of 25 tools scored.

    See the tool scores section below for per-tool breakdowns.

  • This repository includes a README.md file.

  • This repository includes a LICENSE file.

  • Latest release: v1.0.3

  • No tool usage detected in the last 30 days. Usage tracking helps demonstrate server value.

    Tip: use the "Try in Browser" feature on the server page to seed initial usage.

  • This repository includes a glama.json configuration file.

  • This server provides 25 tools.
  • No known security issues or vulnerabilities reported.

  • This server has been verified by its author.

  • Add related servers to improve discoverability.

Tool Scores

  • Behavior 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden of behavioral disclosure. It earns credit for explicitly stating 'Side effect: writes an export artifact file,' which is critical for an agent to understand the file-system impact. However, it omits other important behavioral details like idempotency, overwrite behavior, or what the function returns (file path vs. content).

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
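The missing details are cheap to add in the description itself. A hedged before/after sketch: the 'before' line quotes the one disclosure the review credits, while the 'after' text invents overwrite and return-value semantics purely to illustrate the pattern.

```python
# "Before" quotes the disclosure the review credits; "after" adds the
# idempotency, overwrite, and return-value details it flags as missing.
# The added semantics are assumptions used only to show the pattern.
BEFORE = "Side effect: writes an export artifact file."
AFTER = (
    "Side effect: writes an export artifact file. An existing file at the "
    "same path is overwritten, so repeated calls are idempotent. "
    "Returns the path of the written file, not its content."
)

def discloses(description: str, *traits: str) -> bool:
    """Cheap lexical check that a description mentions each behavioral trait."""
    return all(t in description.lower() for t in traits)
```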

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description consists of three efficiently structured sentences: the action, the side effect, and the usage guideline. Every sentence earns its place with no redundant or tautological content, and it is appropriately front-loaded with the core verb.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (7 parameters, no annotations, no output schema), the description provides the minimum necessary context by mentioning the file write side effect. However, it lacks details about the return value (e.g., file path confirmation) and does not explain how the 'distilled' content is selected from the memory store, leaving gaps for a multi-parameter export operation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The input schema has 100% description coverage, establishing a baseline of 3. The description reinforces the 'format' parameter by mentioning 'markdown or JSON' and implies the 'query' parameter with 'knowledge on a topic,' but does not add semantic context beyond what the schema already provides for 'scope', 'profile', or 'limit'.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
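The recurring "baseline of 3" rests on schema description coverage, plausibly the fraction of input-schema properties that carry a description. A sketch of that reading; the metric definition and the schema slice are assumptions, with only the property names taken from the review, and the undescribed property shows how coverage would drop below 100%.

```python
# A plausible reading of the "schema description coverage" metric behind
# the baseline: the fraction of properties carrying a description.
def description_coverage(input_schema: dict) -> float:
    props = input_schema.get("properties", {})
    if not props:
        return 1.0  # vacuously complete
    described = sum(1 for p in props.values() if p.get("description"))
    return described / len(props)

# Hypothetical slice of the export tool's schema. Property names come from
# the review text; the types and description strings are assumptions.
schema = {
    "properties": {
        "query":  {"type": "string", "description": "Topic to distill."},
        "format": {"type": "string", "description": "markdown or json."},
        "limit":  {"type": "integer"},  # undescribed property lowers coverage
    }
}
```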

    Purpose 4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the action (Export), the resource (distilled memory briefing), and the output format (markdown or JSON file on disk). It effectively distinguishes this from sibling tools by emphasizing the file-on-disk aspect, though it could clarify its relationship to 'distill_memory' explicitly.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The phrase 'Use when you need an offline-readable snapshot of knowledge on a topic' provides clear context for when to invoke this tool. However, it lacks explicit guidance on when NOT to use it (e.g., for in-chat retrieval) or named alternatives like 'brief_memory'.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
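The fix this rubric keeps pointing at is a one-line routing sentence appended to the description. A sketch of the pattern; the sibling name comes from the review, while the condition text is an invented example:

```python
# "Use X instead of Y when Z" as an appendable routing sentence.
ROUTING = "Use {x} instead of {y} when {z}."

guideline = ROUTING.format(
    x="this tool",
    y="brief_memory",
    z="you need a file on disk rather than an in-chat briefing",
)
```

One such sentence per confusable sibling costs only a handful of tokens and addresses the main cause of tool misselection.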

  • Behavior 3/5

    With no annotations provided, the description carries the full burden. It successfully discloses the read-only nature and the aggregation behavior (combining multiple sources), but lacks critical behavioral details such as the return format, error handling when sessionId is invalid, or behavior when no checkpoint exists.

    Conciseness 5/5

    Three tight statements: the action (compose), the safety property (read-only), and the usage trigger (when entering a session). Every word earns its place; no redundancy or boilerplate.

    Completeness 3/5

    Given 7 optional parameters and no output schema, the description adequately explains the conceptual output (aggregated context) but omits the structure or format of what is returned. For a tool of this complexity without annotations, it meets minimum viability but leaves gaps in the behavioral contract.

    Parameters 3/5

    Schema description coverage is 100%, establishing a baseline of 3. The description mentions conceptual mappings (e.g., 'latest checkpoint' relates to includeLatestCheckpoint/sessionId) but adds no usage guidance, syntax details, or examples beyond what the schema already provides.

    Purpose 4/5

    The description clearly states that the tool composes 'startup context' by combining specific sources (durable memory, patterns, cases, checkpoint). The 'Read-only' declaration distinguishes it from sibling write tools like store_memory or checkpoint_session, though it doesn't explicitly contrast with search_memory or latest_checkpoint.

    Usage Guidelines 4/5

    Provides explicit temporal guidance: 'Use when entering a new session and you need to recover prior decisions, open loops, and next actions.' This clearly signals the session-initiation use case but does not name alternatives (e.g., 'use search_memory instead for targeted queries').

  • Behavior 3/5

    With no annotations provided, the description carries the full disclosure burden. It adds valuable implementation context ('zero LLM calls', 'salience filtering', 'durable memories') but omits safety-critical traits such as whether the operation is idempotent, what happens when no items pass filtering, or potential rate limits.

    Conciseness 5/5

    Optimally concise: four sentences that progress logically from mechanism to detection targets to storage behavior to usage trigger. No redundancy or filler.

    Completeness 4/5

    For a tool with 3 simple parameters and no output schema, the description adequately covers functional purpose and operational context. It could note the return behavior (e.g., what is returned when memories are captured vs. when none are found), but this is not critical given the tool's straightforward nature.

    Parameters 3/5

    Schema description coverage is 100%, establishing a baseline of 3. The description mentions 'analyze a block of conversation text', which aligns with the text parameter, but adds no semantic nuance beyond the schema (e.g., no format guidance for scope or usage patterns for source).

    Purpose 4/5

    The description clearly states that the tool extracts 'memory-worthy items' using 'lightweight heuristics' and lists specific detection targets (preferences, identity facts, etc.). However, it does not explicitly differentiate itself from siblings like store_memory or batch_store, leaving the agent to infer when auto-extraction is preferred over manual storage.

    Usage Guidelines 4/5

    Provides explicit usage guidance ('Use this when you want to analyze a block of conversation text...') that clarifies the trigger condition, but lacks when-not-to-use guidance and named alternatives (e.g., 'use store_memory for manual capture instead').

  • Behavior 3/5

    With no annotations provided, the description carries the full burden. It adds valuable behavioral context with 'durable memory' and 'across fresh windows', indicating persistence across sessions, but omits critical details like update/merge semantics (only hinted at in the schema's canonicalKey description), failure modes, and durability guarantees.

    Conciseness 5/5

    Exactly two high-value sentences: the first defines the action and target, the second provides usage context with examples. No redundancy or extraneous text; the structure is purpose-driven and front-loaded.

    Completeness 3/5

    Given the tool's complexity (10 parameters, a structured workflow concept) and the absence of annotations or an output schema, the description provides adequate but minimal coverage. It conveys the core concept and usage trigger but could explain the workflow lifecycle, canonicalKey behavior, or how stored patterns relate to retrieval (retrieve_skill is a sibling).

    Parameters 3/5

    With 100% schema description coverage, the baseline is appropriately 3. The description implies the conceptual structure (title, trigger, steps) through the 'workflow pattern' concept and examples, but does not add parameter semantics, syntax guidance, or scope-format details beyond what the schema documents.

    Purpose 5/5

    The description explicitly states 'Store a reusable workflow pattern as durable memory' with a specific verb (store) and resource (workflow pattern). It distinguishes itself from siblings like store_memory, store_skill, and store_case by specifying 'workflow pattern' and providing concrete examples (startup continuity, debugging routines) that imply structured, step-based processes rather than facts or capabilities.

    Usage Guidelines 4/5

    Provides clear positive guidance ('Use this when you identify a repeatable process worth reusing across fresh windows') followed by specific examples, but lacks explicit negative constraints and named alternatives, though the examples implicitly differentiate it from the other storage tools.

  • Behavior 4/5

    No annotations are provided, so the description carries the full burden. It successfully discloses critical behavioral traits: the four-phase process, the destructive/mutative side effect of archiving entries, and the generation of insight memories. The automatic threshold concept (linked to the 'force' parameter) is mentioned implicitly.

    Conciseness 5/5

    An extremely concise two-sentence structure with zero waste: the first sentence defines the action and phases, the second states side effects and usage conditions. Information is perfectly front-loaded.

    Completeness 4/5

    Given the tool's complexity (a destructive maintenance operation) and lack of output schema, the description adequately covers the essential behavioral contract and side effects. It would benefit from mentioning the return value or success indicators, but the core functionality is sufficiently documented.

    Parameters 3/5

    With 100% schema description coverage, the baseline is 3. The description mentions the consolidation scope and threshold concepts but adds no syntax details, format constraints, or semantic context beyond what the schema provides for the 'scope' and 'force' parameters.

    Purpose 4/5

    The description clearly defines the tool as running a 'full memory consolidation cycle' with specific phases (Orient, Gather, Consolidate, Prune) and mentions side effects (archiving, insight generation). It distinguishes itself from simple storage/retrieval siblings, though an explicit contrast with 'distill_memory' or 'checkpoint_session' would make it perfect.

    Usage Guidelines 4/5

    Provides clear when-to-use guidance ('Use when memory count is high and you need periodic maintenance') and implies when not to use via the side-effect warning (avoid it if you don't want archiving). Lacks explicit sibling alternatives, but the context is sufficiently clear.

  • Behavior 3/5

    Since no annotations exist, the description carries the full burden. It successfully discloses the 'Read-only' status and output format ('one-line descriptions'), but omits details about default tier behavior, pagination, and empty-result handling.

    Conciseness 5/5

    Two sentences with zero waste: the first delivers purpose and safety (read-only), the second delivers usage guidance. Perfectly front-loaded and appropriately sized for a single-parameter discovery tool.

    Completeness 4/5

    Given the tool's simplicity (1 optional parameter, no output schema), the description adequately covers intent, behavior, and usage, and hints at the return format ('one-line descriptions') despite the missing output schema. Minor gap: it doesn't mention the 'advanced' default tier behavior explicitly.

    Parameters 3/5

    With 100% schema description coverage for the 'tier' parameter, the baseline is 3. The description mentions 'filtered by tier' but adds no semantic detail beyond the schema's existing documentation of the enum values and default behavior.

    Purpose 5/5

    The description explicitly states the action ('List'), resource ('RecallNest tools'), and key characteristic ('filtered by tier'). It distinguishes itself from the sibling memory-management tools by positioning itself as a discovery/directory tool rather than an operational one.

    Usage Guidelines 4/5

    Provides explicit usage context ('Use when you need to discover advanced or governance tools beyond the core set'), helping agents understand when to invoke it over operational siblings like 'store_memory' or 'search_memory'. Lacks explicit 'when not to use' guidance and named alternatives.

  • Behavior 3/5

    With no annotations provided, the description carries the full burden. It explicitly states 'Read-only' (a critical safety disclosure) and the 'semantic similarity' matching mechanism, but lacks details on the return format, error handling when no matches are found, or what constitutes an 'executable skill' structure.

    Conciseness 5/5

    Two sentences total: the first states purpose and mechanism, the second states the usage guideline and safety property. Every word earns its place; no redundancy or waste.

    Completeness 4/5

    Good coverage for a 3-parameter tool with complete schema documentation. It mentions 'executable skills' and 'stored procedure', hinting at the return type. Slight gap: no output schema exists, and the description doesn't detail the return structure or skill format.

    Parameters 3/5

    The schema has 100% description coverage, establishing a baseline of 3. The description aligns with the schema (the 'task description' it references matches the query parameter) but adds no syntax guidance, examples, or constraints beyond it.

    Purpose 5/5

    A clear, specific verb ('Retrieve'), resource ('executable skills'), and mechanism ('semantic similarity'). It explicitly contrasts with knowledge-recall tools, distinguishing itself from siblings like search_memory.

    Usage Guidelines 4/5

    Explicitly states when to use the tool ('when you need a stored procedure to act on') and contrasts it with the alternative ('not just recall knowledge'), effectively differentiating it from the semantic search tools. Naming specific sibling alternatives would earn a 5.

  • Behavior 4/5

    Since no annotations are provided, the description carries the full burden of disclosing behavioral traits. It successfully declares the operation 'Read-only' (the safety profile) and crucially warns that it 'may fire stored reminders as a side effect', essential planning context not found in the schema.

    Conciseness 5/5

    Three tightly constructed sentences: one defining the core function, one disclosing safety and side effects, and one providing usage guidance. No words are wasted, and critical information is front-loaded.

    Completeness 4/5

    Given the 100% schema coverage and lack of output schema, the description is appropriately complete, covering the search mechanism, filtering capabilities, side effects, and usage patterns. Hinting at the result structure or contrasting with retrieval siblings would improve it, but it suffices for the complexity.

    Parameters 3/5

    With 100% schema description coverage, the baseline is 3. The description mentions 'temporal filtering' (mapping to the after/before parameters) and 'semantic similarity' (contextualizing query), but does not clarify ambiguous schema entries like 'scope' or 'profile'.

    Purpose 4/5

    The description clearly states that the tool searches 'indexed memories by semantic similarity' and returns 'ranked results with optional temporal filtering', using specific verbs and resources. However, it does not explicitly distinguish itself from sibling tools like memory_drill_down or retrieve_skill.

    Usage Guidelines 4/5

    Provides explicit guidance on when to use the tool: 'proactively at the start of tasks, when debugging, writing, or when the user references past work.' This gives clear context for invocation, though it lacks explicit exclusions and named alternatives.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations, the description carries the full burden. It discloses the side effect ('stores a reminder entry') and trigger mechanism ('auto-triggers during future search_memory calls'), but lacks details on what happens when triggered (return behavior, consumption vs. recurring, limits).

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three tightly constructed sentences with zero waste: first defines function and mechanism, second discloses side effects, third provides usage guidance. Well front-loaded with the core prospective memory concept.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given no annotations or output schema, the description adequately covers the unique integration with search_memory and the persistent nature of reminders. Minor gaps remain regarding immediate return values and detailed trigger behavior (one-time vs. recurring).

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100%, establishing baseline 3. The description adds semantic context by explaining that 'trigger' keywords relate to future search_memory activation, but does not elaborate deeply on individual parameter semantics since the schema is self-documenting.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description uses a specific verb ('Set') and resource ('prospective memory reminder') and clearly distinguishes from sibling storage tools by explaining the unique auto-trigger mechanism with 'future search_memory calls' and 'trigger keywords match'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides clear usage context ('Use when you need a future nudge tied to a specific context') and implicitly differentiates from store_memory/search_memory siblings by explaining the prospective memory pattern. Would benefit from explicit 'when not to use' guidance.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden. It establishes durability ('durable memory') and temporal scope ('across future windows'), but omits behavioral details like update/merge semantics (relevant given the canonicalKey parameter), idempotency, or failure modes.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Two tightly constructed sentences with zero waste. Front-loaded with the core action, followed immediately by usage context and specific examples. Every word earns its place.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given 11 parameters with 100% schema coverage and no output schema, the description adequately covers the conceptual model (case as problem-solution) and persistence guarantees. Could be improved by noting the merge/update capability implied by canonicalKey, but sufficient for tool selection.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, establishing a baseline of 3. The description adds conceptual context mapping 'problem-and-solution pair' to the problem/solutionSteps parameters, but does not elaborate on specific parameter formats, the importance score scale, or scope syntax beyond what the schema already documents.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the action ('Store') and resource ('reusable case'/'durable memory'), distinguishing it from sibling tools like store_memory or store_skill by specifying this is for 'problem-and-solution pairs' with concrete examples (debugging fix, migration lesson).

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides clear positive guidance on when to use ('when you identify a concrete problem-and-solution pair worth reusing across future windows') with specific scenarios. Lacks explicit 'when not to use' or named alternative tools from the sibling list.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full disclosure burden. It explicitly warns of side effects: 'writes a new brief asset to disk and indexes it in the vector store for future search.' This reveals the mutation nature and persistence mechanism. Could improve by mentioning idempotency or overwrite behavior.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three sentences with zero waste: first defines the action (create/retrieve/summarize/persist), second gives usage context (consolidation), third discloses side effects. Front-loaded with core verb and resource. No redundant phrases.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given 7 parameters and a complex multi-step operation (retrieval + summarization + persistence), the description covers the conceptual model well. However, with no output schema provided, it fails to specify what the tool returns (the brief content itself or an asset reference), which is essential for agent invocation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema fully documents all 7 parameters (query, limit, scope, sessionId, allScopes, profile, title). The description focuses on high-level behavior rather than parameter semantics, which is appropriate given the comprehensive schema. Baseline 3 is correct.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description explicitly states the tool creates a structured memory brief by 'retrieving and summarizing relevant memories, then persist it as a reusable asset.' It clearly distinguishes this from sibling tools by emphasizing consolidation into a 'single retrievable document' versus simple storage or search.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance: 'Use this when you want to consolidate scattered knowledge on a topic into a single retrievable document.' This effectively differentiates it from search_memory (simple retrieval) and store_memory (raw storage). Lacks explicit 'when not to use' exclusions.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Since no annotations are provided, the description carries the full burden. It successfully discloses the mutation side effects ('persists a new skill entry and indexes it'), confirming this is a write operation. However, it omits critical behavioral details like idempotency (behavior on duplicate names), conflict resolution, or return value structure.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three sentences with zero waste: first defines the action and components, second discloses side effects, third provides usage context. Information is front-loaded with the core operation and efficiently structured.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (10 parameters including nested objects and enums) and lack of annotations/output schema, the description adequately covers the conceptual model but has gaps regarding operational behaviors like error conditions, name collision handling, or the indexing mechanism mentioned.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    While the schema has 100% coverage (baseline 3), the description adds conceptual value by grouping parameters into logical categories (trigger conditions, implementation, verification steps) that help the agent understand the relationship between 'triggerPattern', 'implementation', and 'verification' fields beyond their individual schema definitions.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action (Store) and resource (executable skill) with constituent components (trigger conditions, implementation, verification steps). It effectively distinguishes from siblings like 'store_memory' (general memories) and 'store_case' (cases) by emphasizing the 'executable' nature and automation focus.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit positive guidance ('Use when you identify a reusable procedure worth automating across sessions'). However, it lacks explicit negative constraints or comparison with alternatives like 'auto_capture' (for automatic capture) or 'store_workflow_pattern' (for patterns vs. executable skills).

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations are provided, so the description carries the full burden. It declares a 'Read-only' safety trait and clarifies the diagnostic output dimensions. Could improve by describing the output format (text vs. structured) or whether explanations are logged or persisted.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three tight clauses: purpose definition, safety declaration, and usage trigger. No filler words. Front-loaded with the core value proposition ('Explain why memories matched').

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Adequate for a 7-parameter debugging tool without output schema. Mentions key output components (retrieval path, freshness, etc.) compensating for missing output schema. Missing only explicit mention of the contrasting tool (search_memory) and detailed output format.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% description coverage with detailed enums and examples. Description focuses on behavior rather than parameters, which is appropriate given the schema completeness. Baseline 3 is correct as no additional param semantics are needed.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    States specific action ('Explain why memories matched') and details exactly what aspects get explained ('retrieval path, freshness, scope, and matched terms'). Distinguishes from sibling search_memory by emphasizing diagnostic explanation rather than retrieval.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use condition ('when search results seem unexpected and you need to debug ranking or scope filtering'). Implies search_memory is the regular alternative but does not explicitly name it, which would strengthen guidance.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations are provided, so the description carries the full burden. It successfully declares 'Read-only' safety status and clarifies the L1/overview mapping. However, it lacks disclosure of error behavior (e.g., invalid ID, ambiguous prefix matches) or rate limiting.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Two sentences with zero waste. Front-loaded with core action and safety declaration ('Read-only'). Second sentence provides precise workflow trigger. Every word earns its place.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Adequate for a simple 2-parameter retrieval tool. Covers safety (read-only), workflow integration, and content depth levels. Minor gap in not describing error states or return format, though this is less critical given the straightforward 'retrieve content' purpose and lack of output schema.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100%, establishing a baseline of 3. The description adds valuable semantic context beyond the schema by mapping 'L1' to 'overview' and implying usage patterns (complete text vs. overview), enriching the agent's understanding of when to select each enum value.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Specific verb ('Retrieve'), resource ('single memory entry'), and scope ('full or overview-level content'). Distinguishes from siblings like search_memory by emphasizing 'single' entry versus search results, and contrasts 'full' content with 'compact summaries' from search.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('when search returned compact summaries and you need the complete text or L1 overview'), establishing a clear workflow. However, it implies rather than explicitly names the sibling search tool (e.g., search_memory) as the prerequisite step.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
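The search-then-drill-down workflow implied above can be sketched with a small stub. The client class and its internal store are hypothetical illustrations; only the tool names (search_memory, memory_drill_down) and the idea that a level selector distinguishes full text from an L1 overview come from the review.

```python
# Hypothetical stub of the two-step retrieval pattern:
# search returns compact summaries; drill-down fetches full text or an overview.
class MemoryClient:
    def __init__(self, store):
        self.store = store  # {memory_id: full_text}

    def search_memory(self, query):
        # Returns compact hits, not full content (mirrors the reviewed behavior).
        return [{"id": mid, "summary": text[:20]}
                for mid, text in self.store.items() if query in text]

    def memory_drill_down(self, memory_id, level="full"):
        # "full" returns the complete text; any other level (e.g. "L1")
        # is treated here as a short overview -- an assumption for the sketch.
        text = self.store[memory_id]
        return text if level == "full" else text[:40]

client = MemoryClient(
    {"m1": "Postgres migration lesson: always run ANALYZE after bulk load."}
)
hits = client.search_memory("migration")
full = client.memory_drill_down(hits[0]["id"], level="full")
```

The point of the sketch is the division of labor: search stays cheap by returning summaries, and the drill-down call is only issued once the agent knows which entry it needs.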

  • Behavior 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully identifies the operation as 'Read-only' and conceptually describes the returned data (counts by source/category). However, it lacks details about the return format structure, performance characteristics, or error conditions that would be expected when no annotations or output schema exist.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description consists of two highly efficient sentences that are front-loaded with the core function. Every phrase serves a distinct purpose: the first defines the output content, while the second specifies behavior and use case. No redundancy or filler text is present.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's low complexity (zero parameters) and the absence of an output schema, the description adequately explains what conceptual information is returned. However, it could slightly improve by hinting at the data structure format (e.g., JSON object with fields) to fully compensate for the missing output schema.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The input schema contains zero parameters, which establishes a baseline of 4. The description appropriately does not invent parameter details, maintaining consistency with the empty schema while implying through 'aggregate statistics' that no filtering parameters are required for this global overview.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly specifies the verb (Show), resource (memory database), and exact scope (aggregate statistics, total entries, counts by source and category). It effectively distinguishes from siblings like search_memory or memory_drill_down by emphasizing 'aggregate' overview rather than specific retrieval or analysis.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit usage context ('Use when you need an overview of memory store health or size') and notes the read-only nature. However, it does not explicitly name alternative tools for different use cases, such as when to prefer data_checkup or memory_drill_down instead.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations are provided, so the description carries the full disclosure burden. It successfully discloses the critical side effect ('creates a new durable entry linked to the source evidence') and mentions 'authority upgrade' to indicate data durability changes. It does not clarify whether the source evidence is retained or deleted, nor mention idempotency or error conditions.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three efficient clauses: action definition, side effect disclosure, and usage guidance. Every sentence earns its place with zero redundancy. The front-loaded structure places the core verb ('Promote') immediately, followed by specific qualifiers.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a complex tier-migration operation with 8 parameters and no output schema, the description adequately explains the conceptual model (evidence vs. durable memory) and linkage behavior. Could be improved by clarifying the fate of the source evidence memory post-promotion and mentioning the required scope parameter's role in targeting.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, with all 8 parameters fully documented (descriptions, enums, defaults). The description provides conceptual context but does not add parameter-specific semantics beyond the schema, which is appropriate given the high schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the tool 'Promote[s] an evidence memory into durable memory with an authority upgrade,' specifying the exact transformation (evidence→durable) and distinguishing it from sibling creation tools like store_memory by emphasizing the 'promotion' and 'upgrade' aspects.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit usage context ('Use when a transcript snippet or imported artifact contains a fact worth keeping across windows'), identifying the specific scenario for promotion. Lacks explicit contrast with alternatives like store_memory or brief_memory, though the 'promotion' metaphor implies working with existing evidence rather than new data.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full disclosure burden. It successfully reveals the side effect nature ('persists up to 20 entries'), the deduplication behavior, and the entry limit. Minor gap: does not specify failure behavior (partial vs atomic) or return value structure.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three efficiently structured sentences with zero waste: first states purpose, second discloses behavioral limits, third provides usage guidelines. Front-loaded with the core verb and resource.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given zero schema descriptions, no annotations, no output schema, and complex nested parameters (memories array with enum fields), the description is incomplete. It covers the batch operation concept well but leaves critical data structure semantics (scope usage, category types, importance scale) undocumented.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 0%, requiring the description to compensate. It explains the '20 entries' limit (mapping to memories.maxItems) and deduplication, but fails to explain required parameters 'scope' (namespace semantics) and 'source' (provenance), or the memory object fields (category enum, importance range).

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action (Store multiple memories), mechanism (single call with deduplication), and scope (batch operation). It explicitly distinguishes itself from the sibling tool 'store_memory' by mentioning it is 'more efficient than repeated store_memory calls'.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance ('when you have several facts to store at once') and clearly names the alternative ('repeated store_memory calls'), enabling the agent to make correct routing decisions between batch and single-item storage.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
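The cap-and-dedup behavior attributed to batch_store can be sketched as a client-side helper. The 20-entry limit comes from the review; the choice of a content hash as the deduplication key, and the helper name itself, are assumptions for illustration.

```python
import hashlib

MAX_BATCH = 20  # review states batch_store persists up to 20 entries per call

def prepare_batches(memories):
    """Drop duplicate contents (assumed dedup key: content hash),
    then split the survivors into batches of at most MAX_BATCH."""
    seen = set()
    unique = []
    for m in memories:
        key = hashlib.sha256(m["content"].encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(m)
    return [unique[i:i + MAX_BATCH] for i in range(0, len(unique), MAX_BATCH)]

# One duplicate plus 25 distinct facts -> 26 unique entries, two batches.
batches = prepare_batches(
    [{"content": "fact A"}, {"content": "fact A"}]
    + [{"content": f"fact {i}"} for i in range(25)]
)
```

Pre-batching like this is why a single batch_store call beats repeated store_memory calls: one round trip per 20 entries instead of one per fact, with duplicates filtered before they ever hit the store.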

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Since no annotations exist, the description carries the full burden. It explicitly states 'Read-only' for safety disclosure and details the return structure ('asset type, title, scope, creation date, and file path'). Could mention pagination behavior or empty result handling, but covers the essential behavioral traits.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four sentences with zero waste: purpose, safety flag, usage example, and return value documentation. Front-loaded with specific action and resource. Every sentence earns its place with high information density.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a single-parameter list operation with no output schema, description adequately compensates by documenting return fields and read-only status. Complexity is low; description covers all necessary dimensions (purpose, safety, usage, output) without gaps.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100% for the single 'limit' parameter, establishing baseline 3. Description implies recency/sorting behavior but does not add parameter-specific semantics (e.g., why one might use max 50 vs default 12) beyond what the schema already documents.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Specific verb ('List') with clear resource ('structured memory assets') and explicit scoping ('pinned memories and distilled briefs'). Distinguishes from sibling list_pins by indicating it covers multiple asset types, not just pins.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use ('Use when you need an inventory of persisted knowledge artifacts') with concrete example ('before creating a new brief to avoid duplicates'). Lacks explicit 'when not to use' or named alternatives (e.g., search_memory), but the contextual example effectively guides selection.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full behavioral disclosure burden. It explicitly declares 'Read-only' (critical safety information) and lists the specific diagnostic operations performed. However, it omits what the tool returns or any performance characteristics.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Two sentences with zero waste. First sentence front-loads the action and scope; second provides behavioral constraint (read-only) and usage context. Every word earns its place.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a zero-parameter diagnostic tool without output schema, the description adequately covers purpose, safety profile (read-only), and usage triggers. Could benefit from hints about return structure, but sufficient for tool selection given the simple interface.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The tool has zero parameters and 100% schema coverage trivially. Per guidelines, 0 params warrants a baseline score of 4. The description appropriately focuses on behavior rather than non-existent parameters.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Run health checks') on the specific resource ('memory database') and enumerates exactly what gets checked (vector dimensions, orphans, tier distribution, conflict backlog). This distinguishes it clearly from sibling storage/retrieval tools like store_memory or search_memory.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance ('when diagnosing data quality issues or before a consolidation run'). While it doesn't explicitly mention alternatives or when-not-to-use, the specific context provided effectively guides selection over similar diagnostic siblings like memory_stats.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full burden and successfully discloses read-only nature and output format ('structured briefing with citations'). However, it omits details about retrieval behavior (how ranking works across scopes) or performance characteristics (latency for large memory sets).

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Four tightly constructed sentences with zero waste: purpose (sentence 1), usage guidelines (sentence 2), output format (sentence 3), and safety guarantees (sentence 4). Information is front-loaded and every clause earns its place.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Comprehensive for a 6-parameter tool with 100% schema coverage and no output schema. The description compensates for missing output schema by describing return value structure ('structured briefing with citations'). Minor gap: could elaborate on retrieval profile behaviors or scope inheritance given complexity of filtering options.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, establishing a baseline of 3. The description mentions 'source map' which loosely hints at citation tracking, but does not add semantic context for specific parameters (e.g., how 'profile' tunes ranking weights or interaction between 'scope' and 'allScopes').

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with a specific verb ('Distill') and resource ('memories'), clearly defining the output format ('compact briefing with source map, key takeaways, and reusable evidence'). It effectively distinguishes from sibling tools like search_memory by contrasting 'synthesized summary' with 'raw search results'.

    Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('when you need a synthesized summary of stored knowledge on a topic rather than raw search results'), providing clear selection criteria against alternatives. Also clarifies safety profile ('Read-only — does not modify stored memories') to distinguish from store/update siblings like store_memory or batch_store.

  • Behavior 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. It successfully declares the read-only safety property and implies single-result fetching ('most recent'), but lacks details on error behavior (e.g., what happens if no checkpoint exists) or return format.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three tightly constructed sentences: purpose definition, safety declaration, and usage guidance. Every sentence earns its place with no redundancy or filler.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Adequate for a simple 2-parameter read operation. Covers core functionality and selection criteria. Minor gap: without output schema, brief mention of return value structure (checkpoint metadata vs full data) would improve completeness.

    Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    While schema has 100% description coverage, the description adds semantic value by framing the parameters as alternative filtering contexts ('session or shared scope'), implying you filter by one or the other rather than combining them, which the schema doesn't convey.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    States specific action ('Fetch') and resource ('most recent saved checkpoint'), specifies target contexts ('session or shared scope'), and declares read-only nature. The distinction from 'resume_context' helps differentiate from the sibling tool.

    Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('when you need to inspect current work state') and names the alternative to avoid ('without running a full resume_context'), providing clear guidance for tool selection.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. It explicitly states 'Read-only' (critical safety info) and describes sorting behavior ('sorted by creation date') and return structure (four specific fields). Deduct one point for missing error handling or empty-state behavior.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Two sentences total. First sentence front-loads action, resource, and key behavioral traits. Second sentence provides usage context. Zero redundancy—every word earns its place.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    No output schema exists, but description compensates by enumerating the four fields returned. With single parameter and no annotations, description adequately covers tool behavior. Minor gap regarding null/empty result handling.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema has 100% description coverage for the 'limit' parameter including default/max values. Description does not explicitly discuss the parameter, but with complete schema coverage, baseline 3 is appropriate as the schema carries the semantic load.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Description explicitly states 'List pinned memory assets' with specific verb and resource. It distinguishes from siblings by referencing 'explicitly pinned via pin_memory' and specifies output fields (title, scope, importance score, file path), clearly defining scope.

    Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Provides explicit when-to-use guidance: 'review high-value memories that were explicitly pinned via pin_memory' and 'check if a topic already has a pinned reference before creating a new one.' Names sibling tool pin_memory directly as the complementary operation.

  • Behavior 5/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full burden and excels by disclosing specific side effects: 'boosts importance to 0.95, writes pin asset file, and indexes it.' It quantifies the mutation and explains the persistent indexing behavior, giving agents precise expectations about state changes.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Three sentences with zero waste: sentence 1 defines purpose, sentence 2 lists side effects, sentence 3 gives usage guidance. Front-loaded with the core action, no filler words, appropriately dense for the complexity.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given 8 parameters and file I/O mutation with no annotations/output schema, the description adequately covers the core behavioral contract (importance boost, file write, indexing). Minor gap: doesn't address idempotency (re-pinning?), error states, or whether the operation is reversible, which annotations would typically cover.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, establishing a baseline of 3. The description reinforces the 'retrieved memory' semantic for memory_id and connects to 'search result' usage, but doesn't add syntax details, validation rules, or parameter interaction logic beyond what the schema already provides.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description opens with a specific verb ('Pin') and clear resource ('retrieved memory'), defining the transformation to 'high-importance reusable asset on disk.' It implicitly distinguishes from siblings like store_memory (creates new) vs pin_memory (promotes existing) and search_memory (read-only).

    Usage Guidelines 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states 'Use when a search result is critical and should be surfaced in future recalls,' providing clear contextual triggers. However, it doesn't explicitly name sibling alternatives (e.g., 'use store_memory for new memories, pin_memory for existing ones') or mention prerequisites like requiring a valid memory_id from search.

  • Behavior 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations provided, so description carries full burden. It successfully establishes persistence semantics ('durable', 'survive future windows') but lacks operational details: no mention of merge/update behavior (implied by canonicalKey parameter), failure modes, rate limits, or atomicity guarantees.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Two sentences with zero waste. First sentence establishes action, trigger conditions, and durability requirement; second establishes exclusion criteria. Front-loaded with essential constraints.

    Completeness 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Adequate for 7-parameter tool with rich schema. Covers storage intent and durability semantics comprehensively. Minor gap: does not clarify upsert behavior suggested by canonicalKey parameter or success/failure return patterns (though no output schema exists).

    Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100%, establishing baseline 3. Description adds value by mapping concrete content examples (preferences, identity facts, etc.) to the category enum and text parameter, helping agents understand what constitutes valid input beyond raw schema definitions.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    Specific verb 'Store' + resource 'durable memory' + concrete content examples (preference, identity fact, project entity, pattern, solved case). Explicitly distinguishes durable storage from transient alternatives via sibling differentiation ('survive future windows').

    Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicit positive guidance ('when the user shares...') and negative constraint ('Do not use this for transient task state'). Clear when-to-use vs when-not-to-use boundaries implicitly reference sibling tools like brief_memory/checkpoint_session for transient needs.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden. It successfully discloses the ephemeral nature ('compact checkpoint', 'without polluting durable memory') and recovery intent ('next session to recover'). However, it lacks details on retention policy, whether repeated calls overwrite or append, and specific lifecycle behavior.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Two sentences, zero waste. Front-loaded with the core action, followed immediately by the specific usage context. Every word serves either definition or differentiation.

    Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a 10-parameter storage tool with 100% schema coverage and no output schema, the description is complete. It explains the conceptual model (checkpoint vs. durable memory) and recovery workflow, which is sufficient given the rich parameter documentation in the schema itself.

    Parameters 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema coverage is 100%, establishing a baseline of 3. The description adds valuable semantic context by explicitly mapping the conceptual elements ('decisions, open loops, and next actions') to the workflow of resuming work across sessions, helping the agent understand why each parameter matters for the checkpoint pattern.

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description provides a specific verb ('Store') and resource ('compact checkpoint of the current work state'). It effectively distinguishes from siblings like store_memory or pin_memory by emphasizing the 'checkpoint' concept and explicitly mentioning it avoids 'polluting durable memory', clearly positioning it as ephemeral session state management.

    Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    Explicitly states when to use ('when a task spans windows or terminals') and implicitly defines when not to use (for durable/long-term storage, use other memory tools instead). The 'without polluting durable memory' phrase provides clear guidance on tool selection against the 20+ sibling storage tools.

GitHub Badge

Glama performs regular codebase and documentation scans to:

  • Confirm that the MCP server is working as expected.
  • Confirm that there are no obvious security issues.
  • Evaluate tool definition quality.

Our badge communicates server capabilities, safety, and installation instructions.

Card Badge

recallnest MCP server card badge (copy the embed snippet to your README.md).

Score Badge

recallnest MCP server score badge (copy the embed snippet to your README.md).

How to claim the server?

If you are the author of the server, you simply need to authenticate using GitHub.

However, if the MCP server belongs to an organization, you first need to add a glama.json file to the root of your repository:

{
  "$schema": "https://glama.ai/mcp/schemas/server.json",
  "maintainers": [
    "your-github-username"
  ]
}

Then, authenticate using GitHub.

Browse examples.

How to make a release?

A "release" on Glama is not the same as a GitHub release. To create a Glama release:

  1. Claim the server if you haven't already.
  2. Go to the Dockerfile admin page, configure the build spec, and click Deploy.
  3. Once the build test succeeds, click Make Release, enter a version, and publish.

This process allows Glama to run security checks on your server and enables users to deploy it.

How to add a LICENSE?

Please follow the instructions in the GitHub documentation.

Once GitHub recognizes the license, the system will automatically detect it within a few hours.

If the license does not appear on the server after some time, you can manually trigger a new scan using the MCP server admin interface.

How to sync the server with GitHub?

Servers are automatically synced at least once per day, but you can also sync manually at any time to instantly update the server profile.

To manually sync the server, click the "Sync Server" button in the MCP server admin interface.

How is the quality score calculated?

The overall quality score combines two components: Tool Definition Quality (70%) and Server Coherence (30%).

Tool Definition Quality measures how well each tool describes itself to AI agents. Every tool is scored 1–5 across six dimensions: Purpose Clarity (25%), Usage Guidelines (20%), Behavioral Transparency (20%), Parameter Semantics (15%), Conciseness & Structure (10%), and Contextual Completeness (10%). The server-level definition quality score is calculated as 60% mean TDQS + 40% minimum TDQS, so a single poorly described tool pulls the score down.

Server Coherence evaluates how well the tools work together as a set, scoring four dimensions equally: Disambiguation (can agents tell tools apart?), Naming Consistency, Tool Count Appropriateness, and Completeness (are there gaps in the tool surface?).

Tiers are derived from the overall score: A (≥3.5), B (≥3.0), C (≥2.0), D (≥1.0), F (<1.0). B and above is considered passing.
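The weighting described above can be sketched in a few lines of Python. This is an illustrative reconstruction; the function and key names are assumptions, and only the weights, percentages, and tier cutoffs come from this section.

```python
# Illustrative sketch of the published scoring formula. Names are
# hypothetical; the weights and cutoffs are taken from the text above.

DIM_WEIGHTS = {
    "purpose": 0.25,       # Purpose Clarity
    "usage": 0.20,         # Usage Guidelines
    "behavior": 0.20,      # Behavioral Transparency
    "parameters": 0.15,    # Parameter Semantics
    "conciseness": 0.10,   # Conciseness & Structure
    "completeness": 0.10,  # Contextual Completeness
}

def tool_tdqs(scores: dict) -> float:
    """Weighted 1-5 score for one tool across the six dimensions."""
    return sum(DIM_WEIGHTS[d] * s for d, s in scores.items())

def server_definition_quality(tools: list) -> float:
    """60% mean TDQS + 40% minimum TDQS: one weak tool drags the score."""
    vals = [tool_tdqs(t) for t in tools]
    return 0.6 * (sum(vals) / len(vals)) + 0.4 * min(vals)

def overall_score(definition_quality: float, coherence: float) -> float:
    """70% Tool Definition Quality, 30% Server Coherence."""
    return 0.7 * definition_quality + 0.3 * coherence

def tier(score: float) -> str:
    """Map an overall score to its letter tier."""
    for cutoff, letter in ((3.5, "A"), (3.0, "B"), (2.0, "C"), (1.0, "D")):
        if score >= cutoff:
            return letter
    return "F"
```

For example, a tool scored 5/4/4/4/5/4 on the six dimensions (in the weight order above) gets a TDQS of 4.35, and a server with definition quality 4.2 and coherence 3.5 lands at 0.7 × 4.2 + 0.3 × 3.5 = 3.99, tier A.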

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AliceLJY/recallnest'
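The same endpoint can be called from Python's standard library. Only the URL pattern comes from the curl example above; the shape of the JSON response is not assumed here.

```python
# Minimal sketch of calling the MCP directory API from Python's stdlib.
# The endpoint is the one shown in the curl example above.
import json
from urllib.request import urlopen

def server_info_url(owner: str, server: str) -> str:
    """Build the directory API URL for a given server."""
    return f"https://glama.ai/api/mcp/v1/servers/{owner}/{server}"

def fetch_server_info(owner: str, server: str) -> dict:
    """GET the server's profile as parsed JSON (requires network access)."""
    with urlopen(server_info_url(owner, server)) as resp:
        return json.load(resp)
```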

If you have feedback or need assistance with the MCP directory API, please join our Discord server.