Glama

Server Quality Checklist

Profile completion: 67%
A complete profile improves this server's visibility in search results.
  • Latest release: v1.0.0

  • Disambiguation: 4/5

    Most tools have distinct purposes, but there is some overlap that could cause confusion. For example, 'list_passages' and 'search_archival_memory' both deal with archival memory, and 'get_agent_summary' and 'retrieve_agent' both provide agent details with unclear differentiation in their descriptions. However, the descriptions generally help clarify the intended use cases.

    Naming Consistency: 5/5

    The tool names follow a highly consistent verb_noun pattern throughout, such as 'create_agent', 'list_agents', 'modify_agent', and 'delete_agent'. There are no deviations in naming conventions, making the set predictable and easy to understand.

    Tool Count: 2/5

    With 36 tools, the count is excessive for the domain of agent management and memory handling. This many tools can overwhelm users and agents, making it difficult to navigate and select the right tool efficiently. A more streamlined set would improve usability.

    Completeness: 5/5

    The tool set provides comprehensive coverage for managing agents, tools, memory, and conversations in the Letta system. It includes full CRUD operations for agents, memory blocks, and passages, along with advanced features like cloning, importing/exporting, and searching. No obvious gaps are present for the stated purpose.

  • Average 4.1/5 across 36 of 36 tools scored. Lowest: 3.1/5.

    See the Tool Scores section below for per-tool breakdowns.

    • 3 of 5 issues responded to in the last 6 months
    • No commit activity data available
    • Last stable release on
    • No critical vulnerability alerts
    • No high-severity vulnerability alerts
    • No code scanning findings
    • CI is passing
  • This repository is licensed under MIT License.

  • This repository includes a README.md file.

  • No tool usage detected in the last 30 days. Usage tracking helps demonstrate server value.

    Tip: use the "Try in Browser" feature on the server page to seed initial usage.

  • Add a glama.json file to provide metadata about your server.

  • If you are the author, simply .

    If the server belongs to an organization, first add glama.json to the root of your repository:

    {
      "$schema": "https://glama.ai/mcp/schemas/server.json",
      "maintainers": [
        "your-github-username"
      ]
    }

    Then . Browse examples.

  • Add related servers to improve discoverability.

How to sync the server with GitHub?

Servers are automatically synced at least once per day, but you can also sync manually at any time to instantly update the server profile.

To manually sync the server, click the "Sync Server" button in the MCP server admin interface.

How is the quality score calculated?

The overall quality score combines two components: Tool Definition Quality (70%) and Server Coherence (30%).

Tool Definition Quality measures how well each tool describes itself to AI agents. Every tool receives a Tool Definition Quality Score (TDQS) by being scored 1–5 across six weighted dimensions: Purpose Clarity (25%), Usage Guidelines (20%), Behavioral Transparency (20%), Parameter Semantics (15%), Conciseness & Structure (10%), and Contextual Completeness (10%). The server-level definition quality score is calculated as 60% mean TDQS + 40% minimum TDQS, so a single poorly described tool pulls the score down.
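
The weighting described above can be sketched in a few lines of Python. This is only an illustration of the stated percentages, not the site's actual implementation; the function names and the short dimension keys (mapped by assumption onto the labels used in the Tool Scores cards) are hypothetical.

```python
# Weights as stated in the paragraph above. The short keys are an assumed
# mapping onto the card labels (e.g. "behavior" = Behavioral Transparency).
TDQS_WEIGHTS = {
    "purpose": 0.25,
    "usage_guidelines": 0.20,
    "behavior": 0.20,
    "parameters": 0.15,
    "conciseness": 0.10,
    "completeness": 0.10,
}


def tool_tdqs(scores):
    """Weighted 1-5 score for one tool across the six dimensions."""
    return sum(weight * scores[dim] for dim, weight in TDQS_WEIGHTS.items())


def server_definition_quality(tdqs_per_tool):
    """60% mean TDQS + 40% minimum TDQS, as described above."""
    mean = sum(tdqs_per_tool) / len(tdqs_per_tool)
    return 0.6 * mean + 0.4 * min(tdqs_per_tool)


# Dimension scores taken from the first card in the Tool Scores section.
first_card = {
    "purpose": 4,
    "usage_guidelines": 2,
    "behavior": 2,
    "parameters": 3,
    "conciseness": 5,
    "completeness": 3,
}
print(round(tool_tdqs(first_card), 2))  # 3.05
```

Because the minimum carries 40% of the weight, improving the single weakest tool description moves the server-level score more than an equal improvement to any other one tool.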

Server Coherence evaluates how well the tools work together as a set, scoring four dimensions equally: Disambiguation (can agents tell tools apart?), Naming Consistency, Tool Count Appropriateness, and Completeness (are there gaps in the tool surface?).

Tiers are derived from the overall score: A (≥3.5), B (≥3.0), C (≥2.0), D (≥1.0), F (<1.0). B and above is considered passing.
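
As a rough worked example, the following sketch combines the two components and maps the result to a tier. It assumes that the "Average 4.1/5" and "Lowest: 3.1/5" figures reported earlier on this page are the mean and minimum TDQS; the function names are hypothetical, and the result is not a score published by the site.

```python
def server_coherence(disambiguation, naming, tool_count, completeness):
    """Four coherence dimensions, weighted equally."""
    return (disambiguation + naming + tool_count + completeness) / 4


def overall_score(definition_quality, coherence):
    """Tool Definition Quality (70%) plus Server Coherence (30%)."""
    return 0.7 * definition_quality + 0.3 * coherence


def tier(score):
    """Map an overall score to a letter tier using the thresholds above."""
    for letter, floor in (("A", 3.5), ("B", 3.0), ("C", 2.0), ("D", 1.0)):
        if score >= floor:
            return letter
    return "F"


# Assumption: 4.1 and 3.1 are the mean and minimum TDQS for this server.
definition_quality = 0.6 * 4.1 + 0.4 * 3.1      # about 3.70
coherence = server_coherence(4, 5, 2, 5)        # 4.0
score = overall_score(definition_quality, coherence)
print(round(score, 2), tier(score))             # 3.79 A
```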

Tool Scores

  • Behavior: 2/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool executes prompts but lacks details on permissions, side effects, rate limits, or response format. This is inadequate for a tool that likely involves complex interactions, leaving significant gaps in understanding its behavior.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is extremely concise with two short sentences that are front-loaded and waste no words. Every sentence directly contributes to understanding the tool's purpose, making it efficient and well-structured.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness: 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (executing prompts with arguments) and the presence of an output schema, the description is minimally adequate. However, with no annotations and incomplete behavioral context, it falls short of being fully complete for safe and effective use by an AI agent.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds no additional meaning beyond implying that 'prompt_name' refers to registered templates and 'arguments' are prompt-specific, which is minimal value over the schema's detailed descriptions.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose: 4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the action ('Execute') and resource ('registered prompt template'), with specific examples of what types of prompts can be run ('wizards, workflows, and guided interactions'). However, it doesn't explicitly differentiate from sibling tools like 'prompt_agent' or 'list_prompts', which limits the score.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines: 2/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides no guidance on when to use this tool versus alternatives like 'prompt_agent' or 'list_prompts'. It mentions what the tool does but offers no context on prerequisites, exclusions, or comparative use cases with sibling tools.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior: 2/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the action is to 'List', implying a read-only operation, but doesn't describe any behavioral traits such as pagination, sorting, filtering, rate limits, or authentication requirements. For a tool with zero annotation coverage, this leaves significant gaps in understanding how it behaves.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is a single, clear sentence that directly states the tool's purpose without any fluff or redundancy. It's front-loaded with the core action and resource, making it highly efficient and easy to parse.

    Completeness: 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool has 0 parameters, 100% schema coverage, and an output schema exists (so return values are documented elsewhere), the description is minimally adequate. However, it lacks behavioral context (e.g., how prompts are organized or returned) and usage guidelines, which are important for a listing tool in a context with many siblings like 'use_prompt'. It meets basic needs but has clear gaps.

    Parameters: 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The input schema has 0 parameters with 100% coverage, so no parameter information is needed. The description appropriately doesn't discuss parameters, which is efficient. A baseline score of 4 is applied since the schema fully documents the absence of parameters, and the description doesn't add unnecessary details.

    Purpose: 4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the verb 'List' and the resource 'all available prompt templates including wizards and workflows', which is specific and unambiguous. It distinguishes the tool's scope from generic listing tools like list_agents or list_memory_blocks by focusing on prompt templates, though it doesn't explicitly differentiate from the sibling tool 'use_prompt' (which likely applies rather than lists prompts).

    Usage Guidelines: 2/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context for listing prompts (e.g., for selection before using 'use_prompt'), or exclusions. Without such information, the agent must infer usage from the tool name and sibling tools alone.

  • Behavior: 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries full burden for behavioral disclosure. It usefully states that returns are 'paginated message history' and lists content types included, which goes beyond basic retrieval. However, it doesn't mention permission requirements, rate limits, whether all message types are always included, or what the pagination structure looks like. For a read operation with 6 parameters, more behavioral context would be helpful.

    Conciseness: 4/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is appropriately concise with two sentences. The first sentence states the core purpose and return content. The second provides usage context. There are no wasted words, though it could be slightly more front-loaded by mentioning pagination earlier given its importance.

    Completeness: 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a read operation with 6 parameters and no annotations or output schema, the description provides adequate but incomplete context. It covers the basic purpose and usage but lacks details about the return format structure, error conditions, or how pagination works in practice. With no output schema, the agent must infer the response format from the description alone, which is insufficiently detailed.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents all 6 parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema. It mentions 'paginated' which relates to 'before'/'after' parameters, but doesn't explain their semantics further. Baseline 3 is appropriate when schema does the heavy lifting.

    Purpose: 4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the tool's purpose: 'Retrieve messages from an agent's conversation history.' It specifies the resource (messages) and action (retrieve), and mentions the content includes user messages, assistant responses, tool calls, and system messages. However, it doesn't explicitly differentiate from potential sibling tools like 'search_memory' or 'search_archival_memory' that might also retrieve message-like data.

    Usage Guidelines: 3/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides implied usage guidance: 'Use for reviewing past conversations or debugging agent behavior.' This suggests appropriate contexts but doesn't explicitly state when NOT to use this tool or name alternatives among the many sibling tools. It lacks clear exclusions or comparisons to tools like 'search_memory' or 'get_agent_summary' that might serve overlapping purposes.

  • Behavior: 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The description adds value by mentioning common labels, which provides context beyond the input schema. However, with no annotations provided, it doesn't disclose critical behavioral traits such as whether this operation is read-only, destructive, requires specific permissions, or has rate limits. The description doesn't contradict annotations (none exist), but it fails to fully compensate for the lack of structured behavioral hints.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is efficiently structured in two sentences: the first states the core purpose, and the second provides practical usage tips and examples. Every sentence adds value without redundancy, making it appropriately concise and front-loaded for quick understanding.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool has an output schema (not provided in context but indicated as present), the description doesn't need to explain return values. It covers the basic purpose, usage context, and parameter hints adequately. However, for a mutation tool with no annotations, it could benefit from more behavioral details (e.g., idempotency, error cases) to be fully complete, though the presence of an output schema mitigates some gaps.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The input schema has 100% description coverage, with clear documentation for all three parameters (block_id, agent_id, label). The description adds minimal value by reinforcing the label parameter with examples ('persona', 'human', 'system'), but doesn't provide additional semantic context beyond what's already in the schema. This meets the baseline of 3 for high schema coverage.

    Purpose: 4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the action ('Attach') and resource ('memory block to an agent'), making the purpose understandable. However, it doesn't explicitly differentiate this tool from similar siblings like 'attach_tool' or 'bulk_attach_tool_to_agents', which would require more specific context about what makes memory block attachment distinct from tool attachment.

    Usage Guidelines: 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides helpful context by mentioning 'list_memory_blocks to find blocks' and 'create_memory_block to make new ones', which guides users on prerequisites. It also suggests common labels like 'persona', 'human', 'system'. However, it doesn't explicitly state when to use this tool versus alternatives like 'attach_tool' or clarify if it's for single vs. bulk operations compared to 'bulk_attach_tool_to_agents'.

  • Behavior: 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The description adds some behavioral context beyond the minimal annotations (which only provide a title). It mentions the relationship with other tools (attach_tool, list_agent_tools) which helps understand workflow. However, it doesn't disclose important behavioral traits like whether this is a mutating operation, what permissions are needed, or what happens on success/failure. With no annotations covering these aspects, the description carries more burden but provides only moderate transparency.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is extremely concise with just two sentences that each serve a clear purpose. The first sentence states the core functionality, and the second provides usage guidance. There's zero wasted text, and the information is front-loaded appropriately.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given that this is a mutating tool (upload/creation) with no annotations covering safety or behavior, and with an output schema present (so return values are documented elsewhere), the description provides reasonable context. It covers the core purpose and usage workflow, though it could be more complete by addressing permissions, side effects, or error conditions that aren't covered by annotations.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    With 100% schema description coverage, the input schema already documents all 5 parameters thoroughly. The description adds no additional parameter information beyond what's in the schema. According to scoring rules, when schema coverage is high (>80%), the baseline is 3 even with no parameter info in the description.

    Purpose: 4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the action ('Upload') and resource ('a new tool to the Letta system'), making the purpose immediately understandable. However, it doesn't explicitly differentiate this tool from sibling tools like 'add_mcp_tool_to_letta' or 'attach_tool', which appear to have related functionality in the same domain.

    Usage Guidelines: 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool by mentioning two specific follow-up actions ('Use with attach_tool to add it to agents' and 'list_agent_tools to verify attachment'). This gives clear context for usage, though it doesn't explicitly state when NOT to use it or mention alternatives like 'add_mcp_tool_to_letta'.

  • Behavior: 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions that entries are 'formatted with metadata for easy filtering' and that they become 'searchable via search_archival_memory,' which adds useful context about the tool's behavior. However, it doesn't disclose important behavioral traits like whether this is a write operation (implied but not stated), what permissions are needed, or how the data is stored/persisted.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is efficiently structured in two sentences that each earn their place: the first states the core purpose and use case, the second explains the formatting benefit. There's no wasted text, and the most important information (what the tool does) is front-loaded.

    Completeness: 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a write operation with 6 parameters and no annotations or output schema, the description provides adequate but minimal context. It covers the basic purpose and relationship to search_archival_memory, but doesn't address important contextual aspects like error conditions, data validation, or what happens on successful creation. The lack of output schema means the description should ideally mention what the tool returns.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    With 100% schema description coverage, the schema already documents all 6 parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema descriptions. It mentions metadata formatting generally but doesn't explain how specific parameters like source or session_id affect this formatting.

    Purpose: 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Store a conversation entry') and resource ('in an agent's archival memory'), distinguishing it from sibling tools like search_archival_memory (which searches) or create_memory_block (which creates different memory types). It explicitly mentions the tool's purpose for recording conversations from external sources.

    Usage Guidelines: 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides clear context for when to use this tool ('to record conversations from external sources like Claude Code sessions that should be searchable via search_archival_memory'), giving a specific use case and mentioning the related search tool. However, it doesn't explicitly state when NOT to use it or compare it to similar tools like create_memory_block or attach_memory_block.

  • Behavior: 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations include a title but no hints like readOnlyHint or destructiveHint. The description adds value by implying this is a read operation ('List') and specifying the context ('configured on the Letta server'), but doesn't disclose behavioral traits such as rate limits, authentication needs, or output format. With no annotations covering safety or behavior, the description carries some burden but provides only basic operational context.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences, front-loaded with the core purpose followed by usage guidance. Every sentence earns its place by providing essential information without waste. It's appropriately sized for a simple tool with no parameters, making it highly efficient and well-structured.

    Completeness: 4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's low complexity (0 parameters, no annotations beyond title, and an output schema exists), the description is fairly complete. It explains what the tool does and how to use it, and since an output schema is present, it doesn't need to detail return values. However, it could be more comprehensive by addressing potential limitations or sibling tool relationships, but for this context, it's sufficient.

    Parameters: 4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The input schema has 0 parameters with 100% coverage, so the schema fully documents the absence of inputs. The description doesn't need to add parameter details, and it appropriately doesn't mention any. This meets the baseline for zero parameters, as the description focuses on purpose and usage without redundancy.

    Purpose: 4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the tool's purpose: 'List available embedding models configured on the Letta server.' It uses a specific verb ('List') and identifies the resource ('embedding models'), but doesn't explicitly differentiate it from sibling tools like 'list_llm_models' or 'list_agents' beyond the resource type. This makes it clear but not fully sibling-distinctive.

    Usage Guidelines: 4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides clear context for when to use the tool: 'Use with create_agent or modify_agent to set agent embedding preferences.' This gives a practical application scenario. However, it doesn't specify when not to use it or mention alternatives, such as whether other tools might list similar resources or if there are prerequisites for usage.
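
The Behavior assessment for this tool notes that its annotations carry a title but no hints such as readOnlyHint or destructiveHint. The sketch below shows roughly what a tool definition with such hints might look like, using the annotation field names from the MCP tool schema. The tool name, title, and hint value are assumptions made for illustration and are not taken from the server's actual metadata; only the description text is quoted from the assessments above.

```python
# Hypothetical tool definition. The name, title, and hint value are
# illustrative assumptions, not the server's real metadata.
tool = {
    "name": "list_embedding_models",
    "description": (
        "List available embedding models configured on the Letta server. "
        "Use with create_agent or modify_agent to set agent embedding preferences."
    ),
    "inputSchema": {"type": "object", "properties": {}},
    "annotations": {
        "title": "List Embedding Models",
        "readOnlyHint": True,  # assumed: a listing call changes nothing
    },
}

# Behavioral hint fields defined for MCP tool annotations.
BEHAVIOR_HINTS = ("readOnlyHint", "destructiveHint", "idempotentHint", "openWorldHint")


def missing_hints(tool_def):
    """Return the behavioral hints a tool definition does not declare."""
    annotations = tool_def.get("annotations", {})
    return [hint for hint in BEHAVIOR_HINTS if hint not in annotations]


print(missing_hints(tool))
```

A check like `missing_hints` could be run over every tool definition to find those that leave behavioral disclosure entirely to the description text.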

  • Behavior: 3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations are provided, so the description carries the full burden. It discloses key behavioral traits such as supporting text search with optional date filtering and performance implications of the source parameter. However, it lacks details on permissions, rate limits, error handling, or result format, leaving gaps for a search tool with no annotations.

    Conciseness: 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is appropriately sized with two sentences that are front-loaded with the core purpose and efficiently convey key usage guidance. Every sentence earns its place by adding value without redundancy, making it highly concise and well-structured.

    Completeness: 3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's moderate complexity (6 parameters, no annotations, no output schema), the description is adequate but incomplete. It covers the search scope and performance tips but lacks details on authentication, result structure, pagination, or error cases, which are important for a search tool without annotations or output schema.

    Parameters: 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds marginal value by mentioning the source parameter's impact on performance and the optional date filtering, but doesn't provide additional semantics beyond what's in the schema, meeting the baseline for high coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the tool's purpose with specific verbs ('search an agent's memory') and resources ('across both archival passages and conversation messages'), distinguishing it from siblings like 'search_archival_memory' which only searches one source. It explicitly mentions the dual memory sources, making its scope unambiguous.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides clear context for usage with the recommendation to 'use source parameter to search only archival or messages for better performance,' which guides when to use specific parameter values. However, it doesn't explicitly state when to use this tool versus alternatives like 'search_archival_memory' or 'list_messages,' missing explicit sibling differentiation.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The description adds useful context about the workflow (prerequisite and next step), but doesn't disclose behavioral traits beyond what annotations might imply. Since annotations only provide a title ('List Server Tools'), the description carries more burden but still lacks details about pagination behavior, rate limits, or error conditions. It doesn't contradict annotations, but could provide more operational context.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is perfectly concise with two sentences that each serve a clear purpose: the first states the tool's function, the second provides workflow guidance. There's zero wasted language, and the most important information (what the tool does) comes first.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given that there's an output schema (which handles return values) and 100% schema coverage for inputs, the description provides good contextual completeness for a listing tool. It explains the tool's role in the broader workflow and when to use it. The main gap is the lack of detail about pagination and filtering behavior, but the output schema likely covers the response structure.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    With 100% schema description coverage, the input schema already fully documents all 4 parameters. The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation but doesn't provide additional semantic context about how parameters interact or practical usage examples.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the verb ('List') and resource ('all available tools for a specific MCP server'), making the purpose immediately understandable. However, it doesn't explicitly differentiate from sibling tools like 'list_agent_tools' or 'list_tools' (if they exist), which would require more specific scope clarification.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool ('Use list_mcp_servers first to see available servers') and what to do after ('then add_mcp_tool_to_letta to import tools into Letta'). It clearly establishes prerequisites and next steps, helping the agent understand the workflow context.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title ('List Memory Blocks'), so the description carries the full burden of behavioral disclosure. It describes the tool as a list operation, which implies read-only behavior, but doesn't explicitly state if it's safe, requires authentication, has rate limits, or what the output format is. However, since annotations don't contradict the description (no readOnlyHint or destructiveHint provided), and the description adds some context by mentioning the system ('Letta system'), it meets the baseline for minimal transparency without annotations.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
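
As a rough illustration of the gap noted above, the sketch below checks which behavioral hints a tool's annotations leave unset. The hint names come from the MCP tool-annotation schema; the helper `missing_behavior_hints` and the revised annotation values are hypothetical and are not taken from this server.

```python
# Behavioral hints defined by the MCP tool-annotation schema.
BEHAVIOR_HINTS = ("readOnlyHint", "destructiveHint", "idempotentHint", "openWorldHint")


def missing_behavior_hints(annotations: dict) -> list[str]:
    """Return the behavioral hints that the annotations leave unset."""
    return [hint for hint in BEHAVIOR_HINTS if hint not in annotations]


# Annotations as described in the review: only a title is present.
title_only = {"title": "List Memory Blocks"}

# Hypothetical revision; the values are assumptions for a plain list operation.
with_hints = {
    "title": "List Memory Blocks",
    "readOnlyHint": True,
    "destructiveHint": False,
    "idempotentHint": True,
}

print(missing_behavior_hints(title_only))   # all four hints are missing
print(missing_behavior_hints(with_hints))   # ['openWorldHint']
```

With hints like these in place, the description would no longer need to carry the full burden of stating that the call is safe to repeat.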

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is appropriately sized and front-loaded: the first sentence states the core purpose clearly, and the second sentence efficiently provides usage guidelines without redundancy. Every sentence earns its place by adding value, and there's no wasted wording, making it highly concise and well-structured for an AI agent.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (8 parameters, no required parameters) and the presence of an output schema (which handles return values), the description is mostly complete. It covers purpose and usage guidelines adequately. However, it lacks details on behavioral aspects like authentication or rate limits, which could be relevant for a list operation in a system context. Since the output schema exists, the description doesn't need to explain return values, but some behavioral context is missing, preventing a perfect score.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, with all 8 parameters well-documented in the input schema (e.g., 'filter' for searching, 'page' for pagination). The description doesn't add any parameter-specific details beyond what the schema provides, such as explaining how filters combine or pagination defaults. According to the rules, with high schema coverage (>80%), the baseline score is 3, as the schema does the heavy lifting and no extra value is added by the description.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
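
The coverage rule cited above can be sketched as follows. This is a minimal sketch, not the scorer's actual implementation: the function names, the treatment of an empty schema as fully covered, and the example parameter descriptions are all assumptions. Only the ">80% coverage gives a baseline of 3" rule comes from the review text, so other bands are left undefined.

```python
from typing import Optional


def description_coverage(input_schema: dict) -> float:
    """Fraction of schema properties that carry a non-empty description."""
    props = input_schema.get("properties", {})
    if not props:
        return 1.0  # assumption: a schema with no parameters has nothing left undocumented
    described = sum(
        1 for spec in props.values() if str(spec.get("description", "")).strip()
    )
    return described / len(props)


def parameters_baseline(coverage: float) -> Optional[int]:
    """Baseline Parameters score; only the >80% case is stated in the review."""
    return 3 if coverage > 0.8 else None


# Hypothetical two-parameter schema; the description strings are invented.
schema = {
    "type": "object",
    "properties": {
        "filter": {"type": "string", "description": "Text to search for"},
        "page": {"type": "integer", "description": "Page number for pagination"},
    },
}

coverage = description_coverage(schema)
print(coverage, parameters_baseline(coverage))  # 1.0 3
```

A description earns more than the baseline only by adding what the schema cannot express, such as how filters combine or what the pagination defaults are.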

    Purpose4/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the tool's purpose: 'List all memory blocks available in the Letta system.' It specifies the verb ('List') and resource ('memory blocks'), and distinguishes it from siblings like 'create_memory_block' and 'update_memory_block' by mentioning them as alternatives. However, it doesn't explicitly differentiate from other list tools like 'list_passages' or 'list_agents', which slightly limits sibling distinction.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit usage guidelines by stating when to use this tool ('List all memory blocks') and when to use alternatives ('Use create_memory_block to add new ones, update_memory_block to modify, or attach_memory_block to link them to agents'). This clearly directs the agent to this tool for listing and to other tools for creation, modification, or attachment, covering both when-to-use and when-not-to-use scenarios.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    No annotations are provided, so the description carries the full burden. It mentions the search returns 'passages most similar to the query' but doesn't disclose important behavioral traits like authentication requirements, rate limits, error conditions, or what happens when no matches are found. The description is adequate but lacks depth for a search tool with 8 parameters.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is perfectly concise with three sentences that each earn their place: states the purpose, describes the return, and provides usage guidelines. It's front-loaded with the core functionality and wastes no words.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness3/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    For a search tool with 8 parameters and no output schema, the description is minimally complete. It covers the basic purpose and alternatives but lacks details about return format, pagination, error handling, or performance characteristics. Without annotations or output schema, more behavioral context would be helpful.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents all 8 parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema. It mentions semantic similarity for the query but doesn't provide additional context about parameter interactions or usage patterns.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the tool's purpose with specific verbs ('search', 'returns') and resource ('agent's archival memory'), distinguishing it from siblings like list_passages (text-based search) and create_passage (adding memories). It precisely defines the semantic similarity matching approach.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides when-to-use guidance by naming alternatives: 'Use list_passages for text-based search or pagination, create_passage to add memories.' This gives clear context for when this tool is appropriate versus other options.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The description adds useful context about the two-step process (registration + attachment) and prerequisites, but annotations are minimal (only a title), so it carries most of the behavioral burden. It doesn't disclose potential side effects like whether registration is persistent, if duplicates are handled, or error conditions.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Two sentences with zero waste: the first states the purpose and dual action, the second provides prerequisite tools. Every word earns its place, and it's front-loaded with the core functionality.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Despite minimal annotations, the tool has 100% schema coverage and an output schema (which handles return values), so the description is mostly complete for a two-step operation (registration + attachment). It could improve by mentioning error cases or idempotency, but the workflow guidance and clear purpose make it strong.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    With 100% schema description coverage, the schema already documents both parameters well. The description adds no additional parameter semantics beyond what's in the schema, so it meets the baseline of 3 without compensating for gaps.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('registers a tool from a connected MCP server as a native Letta tool AND attaches it to a specified agent'), identifies the resource (MCP tool), and distinguishes it from siblings like 'attach_tool' (which likely attaches existing tools) and 'upload_tool' (which may upload custom tools).

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    It explicitly provides when-to-use guidance by naming two alternative tools ('list_mcp_tools_by_server to find available tools, and list_agents to get agent IDs') that should be used first to gather required information, creating a clear workflow context.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title ('Bulk Attach Tool'), which doesn't cover behavioral traits like read-only/destructive hints. The description adds context about bulk operations and filtering, but doesn't disclose critical behavioral details such as whether this is a mutation (implied by 'Attaches'), permission requirements, rate limits, or what happens on partial failures. With minimal annotations, the description carries more burden but provides only basic operational context.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences, front-loaded with the core purpose and followed by prerequisite guidance. Every sentence earns its place: the first defines the tool's function and scope, the second provides essential usage context. There is zero waste or redundancy.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's moderate complexity (bulk mutation with filtering), the description is reasonably complete: it covers purpose, prerequisites, and basic operation. With an output schema present (per context signals), return values need not be explained. However, it lacks details on behavioral aspects like error handling or side effects, which are important for a bulk mutation tool.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, with all parameters well-documented in the schema itself (tool_id, agent_name_filter, agent_tag_filter). The description adds no additional parameter semantics beyond what's in the schema—it mentions filter criteria generically but doesn't elaborate on syntax, format, or behavior. Baseline 3 is appropriate since the schema does the heavy lifting.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Attaches a specified tool to multiple agents') and the resource ('multiple agents'), distinguishing it from sibling tools like 'attach_tool' (likely single agent) and 'add_mcp_tool_to_letta' (specific to Letta). It explicitly mentions the filter criteria (name or tags) for targeting agents.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool by naming alternative tools for prerequisite steps: 'Use list_agents to find agents and list_mcp_tools_by_server or upload_tool to get tool IDs.' This clearly indicates the tool should be used after identifying agents and tools via those other tools, though it doesn't explicitly state when NOT to use it (e.g., vs. single-agent attachment).

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title, so the description carries the full burden. It discloses that this is a creation/mutation operation (implied by 'creates'), but lacks details on permissions, rate limits, or error conditions. It adds some context about the cloning process but doesn't fully compensate for the absence of annotations.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is efficiently structured in three sentences: purpose, prerequisite, and alternatives/follow-up. Each sentence adds distinct value without redundancy, making it front-loaded and zero-waste.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the presence of an output schema (which handles return values), 100% schema coverage, and clear purpose/guidelines, the description is largely complete. However, as a mutation tool with no annotations, it could benefit from more behavioral context like error handling or side effects.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description doesn't add any additional meaning or clarification beyond what's in the schema, such as explaining interactions between parameters. Baseline 3 is appropriate when schema does the heavy lifting.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('creates a new agent by cloning') and resource ('configuration of an existing agent'), distinguishing it from siblings like create_agent (new from scratch) and export_agent/import_agent (separate workflow). It explicitly names the verb and target resource with precision.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool: it names list_agents as a prerequisite for finding source IDs, positions it as an alternative to the export_agent + import_agent workflow, and suggests modify_agent for subsequent adjustments. This covers both prerequisites and alternatives clearly.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The description adds useful context about the post-creation workflow (attach_tool, attach_memory_block, prompt_agent), which goes beyond the annotations. However, it doesn't disclose behavioral traits like whether this operation is idempotent, what permissions are required, or how errors are handled. With annotations limited to just a title, the description carries more burden but provides only moderate behavioral insight.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is perfectly concise with two sentences that each serve a distinct purpose: the first states the core function, and the second provides essential workflow guidance. There's zero wasted language, and the information is front-loaded with the primary action stated immediately.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given that this is a creation tool with 4 parameters, 100% schema coverage, and an output schema exists, the description provides good contextual completeness. It covers the purpose, distinguishes from siblings, and provides workflow guidance. The main gap is lack of behavioral details about the creation operation itself, but the existence of an output schema means return values don't need explanation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The description doesn't add any parameter-specific information beyond what's already in the schema (which has 100% coverage). It mentions 'specified configuration' but doesn't elaborate on what parameters are available or their meanings. Since schema coverage is complete, the description has no gaps to compensate for, so the baseline score of 3 is appropriate.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Create a new Letta agent') and resource ('with specified configuration'), distinguishing it from sibling tools like clone_agent, modify_agent, or import_agent. It explicitly defines the tool's function as initial agent creation rather than modification or duplication.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool versus alternatives by stating 'After creation, use attach_tool to add capabilities, attach_memory_block to configure memory, or prompt_agent to start conversations.' This clearly delineates the tool's role in the workflow and directs users to specific sibling tools for subsequent actions.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title, so the description carries the full burden of behavioral disclosure. It states this is a creation operation (implying mutation) and mentions common labels, but doesn't address permissions, error conditions, rate limits, or what happens when creating duplicate blocks. It adds some context but lacks comprehensive behavioral details.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is perfectly concise with two sentences that each earn their place: the first states the core purpose, the second provides usage guidance and sibling differentiation. No wasted words, and information is front-loaded appropriately.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given that an output schema exists (so return values are documented elsewhere), the description provides good context for a creation tool. It covers purpose, usage guidelines, and parameter hints, though could benefit from more behavioral details about the creation operation's implications in the system.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents all 5 parameters thoroughly. The description adds minimal value beyond the schema by mentioning common label examples ('persona', 'human', 'system'), which slightly enhances understanding of the 'label' parameter. This meets the baseline for high schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Create a new memory block') and resource ('in the Letta system'), distinguishing it from siblings like 'update_memory_block' (modification) and 'attach_memory_block' (linking). It provides concrete examples of common labels ('persona', 'human', 'system') to clarify the tool's domain.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides when-to-use guidance by naming alternatives: 'Use attach_memory_block to link to agents, or update_memory_block to modify later.' This clearly distinguishes this tool's role from sibling tools and provides context for when each should be used.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations provide only a title ('Create Archival Memory'), which doesn't cover behavioral traits. The description adds value by indicating this is for insertion into archival memory, but it doesn't disclose details like authentication needs, rate limits, or side effects beyond the basic operation. No contradiction with annotations exists.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences with zero waste: the first states the purpose, and the second provides usage guidelines. It's front-loaded with the core function and efficiently structured, making every sentence earn its place.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (a write operation with 3 parameters), annotations are minimal, but the description covers purpose and usage well. An output schema exists, so return values don't need explanation. However, it lacks details on behavioral aspects like permissions or error handling, which could be useful for a mutation tool.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema fully documents all parameters. The description doesn't add any extra meaning or context about the parameters beyond what's in the schema, such as explaining the purpose of 'include_embeddings' or providing examples. This meets the baseline for high schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Insert a memory') and resource ('archival memory store'), distinguishing it from sibling tools like list_passages, modify_passage, and delete_passage mentioned in the same sentence. It precisely communicates the tool's function without being vague or tautological.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides usage context by naming alternatives: 'Use list_passages to view existing memories, modify_passage to edit, or delete_passage to remove.' This gives clear guidance on when to use this tool versus others for related operations.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations are not provided, so the description carries the full burden. It describes the output as a 'concise summary' that includes 'core memory snippets and attached tool/source names', which adds context about what information is returned. However, it lacks details on permissions, rate limits, or error handling. With no annotations, this is moderate disclosure that could be more comprehensive.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is front-loaded with the core purpose in the first sentence, followed by two concise guidance sentences. Every sentence earns its place by providing essential usage information without redundancy, making it efficient and well-structured.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool has an output schema (which covers return values), one parameter with full schema coverage, and no annotations, the description is mostly complete. It explains the purpose, usage guidelines, and what information is included in the summary. However, it could improve by adding more behavioral context (e.g., error cases or limitations) to fully compensate for the lack of annotations.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, with the schema fully documenting the single parameter 'agent_id'. The description does not add any parameter-specific information beyond what's in the schema. According to the rules, with high schema coverage (>80%), the baseline is 3, as the description doesn't need to compensate.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
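
The baseline rule quoted above (schema description coverage over 80% gives a Parameters baseline of 3) can be sketched as follows. This is a minimal illustration, not the checklist's actual scoring code: the function names, the way coverage is counted, and the example schema are all assumptions.

```python
def schema_description_coverage(input_schema: dict) -> float:
    """Fraction of top-level properties that carry a non-empty description."""
    props = input_schema.get("properties", {})
    if not props:
        return 1.0  # assumption: a tool with no parameters counts as fully covered
    described = sum(
        1 for spec in props.values() if str(spec.get("description", "")).strip()
    )
    return described / len(props)


def parameters_baseline(coverage: float, threshold: float = 0.8):
    """Return the baseline score of 3 for high coverage, else None.

    The review text only states the high-coverage case, so the low-coverage
    result is deliberately left undefined here.
    """
    return 3 if coverage > threshold else None


# Hypothetical schema for a single-parameter tool such as get_agent_summary.
example_schema = {
    "type": "object",
    "properties": {
        "agent_id": {"type": "string", "description": "ID of the agent to summarize"}
    },
    "required": ["agent_id"],
}

coverage = schema_description_coverage(example_schema)
print(coverage, parameters_baseline(coverage))
```

Under this reading, a description earns more than the baseline only by adding what the schema cannot express, such as format examples or constraints.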

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the tool's purpose with specific verbs ('Provides a concise summary') and resources ('agent's configuration, including core memory snippets and attached tool/source names'). It distinguishes from siblings like list_agents (which finds IDs) and retrieve_agent (which might have different scope), making the purpose specific and differentiated.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides usage guidelines: 'Use list_agents to find agent IDs' for prerequisites and 'Follow up with modify_agent to change settings or attach_tool to add capabilities' for next steps. This gives clear when-to-use context and names alternatives, helping the agent select this tool appropriately.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title ('Import Agent Configuration'), so the description carries the burden. It describes the core behavior (importing and recreating agents) and hints at customization steps, but lacks details on permissions, error handling, or rate limits. No contradiction with annotations exists.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences with zero waste: the first states the purpose, and the second provides usage guidance. It is front-loaded and efficiently structured, with every sentence adding clear value.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (importing agents with customization), the description is complete enough with purpose and workflow guidance. Annotations are minimal, but the output schema exists, so return values need not be explained. It could improve by addressing potential errors or constraints.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema fully documents all parameters. The description does not add any parameter-specific details beyond what the schema provides, such as file format requirements or project context, meeting the baseline for high coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Import a serialized agent JSON file and recreate the agent in the system') and distinguishes it from sibling tools like 'export_agent', 'modify_agent', and 'attach_tool' by explaining their relationship in the workflow.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    It explicitly states when to use this tool ('Use export_agent to create the JSON file, then modify_agent or attach_tool to customize the imported agent'), providing clear workflow context and alternatives for customization, which helps differentiate it from tools like 'create_agent' or 'clone_agent'.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations provide a title ('List All Agents') but no behavioral hints like readOnlyHint or destructiveHint. The description adds value by implying this is a read operation (listing) and suggesting it's used in conjunction with creation and interaction tools, but doesn't disclose behavioral traits such as pagination, rate limits, or auth requirements. With annotations limited to a title, the description carries most of the burden but leaves gaps.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences: the first states the purpose clearly, and the second provides usage guidelines efficiently. Every sentence earns its place with no wasted words, making it appropriately sized and front-loaded.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's low complexity (one optional parameter), 100% schema coverage, and the presence of an output schema (which handles return values), the description is mostly complete. It covers purpose and usage well but could improve by adding more behavioral context (e.g., response format or limitations), keeping it from a perfect score.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100% for the single parameter 'filter', documented as 'Optional filter to search for specific agents'. The description adds no additional parameter information beyond what the schema provides, so it meets the baseline of 3 for high coverage without extra value.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the verb ('List') and resource ('all available agents in the Letta system'), making the purpose specific and unambiguous. It distinguishes this tool from siblings like 'get_agent_summary' (for details) and 'prompt_agent' (to interact), avoiding tautology.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides usage context by naming three sibling tools as alternatives for different purposes: 'create_agent' to add new ones, 'get_agent_summary' for details, and 'prompt_agent' to interact. This gives clear guidance on when to use this tool versus others.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations are minimal (only a title), so the description carries the burden. It states the tool lists tools, implying a read-only operation, but does not disclose behavioral traits like pagination, rate limits, or authentication needs. It adds some context by mentioning related tools, but lacks detailed behavioral information.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences with zero waste: the first states the purpose, and the second provides usage guidelines. It is front-loaded with the core function and efficiently structured, earning its place without unnecessary elaboration.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's low complexity (one parameter, read-only implied), high schema coverage, and presence of an output schema, the description is mostly complete. It covers purpose and usage well, but could benefit from more behavioral context (e.g., response format hints) to fully compensate for minimal annotations.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, with the single parameter 'agent_id' fully documented in the schema. The description does not add any semantic details beyond what the schema provides (e.g., format examples or constraints), so it meets the baseline for high schema coverage without extra value.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('List all tools') and resource ('for a specific agent'), distinguishing it from sibling tools like 'list_mcp_tools_by_server' (which lists tools by server) and 'attach_tool' (which adds tools). It precisely defines the tool's scope and function.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides usage guidance by naming when to use alternatives: 'Use attach_tool to add more tools or list_mcp_tools_by_server to discover available tools.' This clearly differentiates this tool from its siblings and provides actionable context for selection.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title ('Modify Agent Configuration'), so the description carries the full burden. It correctly indicates this is a mutation operation ('Update'), but doesn't disclose behavioral traits like permission requirements, rate limits, or what happens to unspecified fields. The description adds some context about tool handling but lacks comprehensive behavioral details.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is perfectly front-loaded with the core purpose in the first clause, followed by three concise usage guidelines. Every sentence earns its place by providing essential context without redundancy. The structure flows logically from what the tool does to how to use it properly.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given this is a mutation tool with minimal annotations but an output schema exists, the description provides good context about the operation and usage alternatives. However, it could benefit from mentioning that this is a partial update (implied by 'fields to update') and clarifying authentication or permission requirements. The existence of an output schema means return values don't need explanation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already fully documents both parameters (agent_id and update_data with its nested properties). The description mentions 'provided data' but doesn't add meaningful semantic context beyond what's in the schema. Baseline 3 is appropriate when schema does the heavy lifting.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Update an existing agent by ID') and resource ('agent'), distinguishing it from siblings like create_agent (creation) or delete_agent (deletion). It precisely identifies the operation as modifying an existing entity rather than creating a new one.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool versus alternatives: it directs users to get_agent_summary for current config, list_llm_models/list_embedding_models for model options, and specifically states to use attach_tool for tools instead of this tool. This clearly defines the scope and exclusions.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations are minimal (only a title), so the description carries most of the burden. It implies this is a read operation ('Get full details'), which aligns with the tool name 'read_memory_block', but doesn't explicitly state whether it's safe, requires specific permissions, or has rate limits. It adds some context about dependencies (list_memory_blocks for IDs) but lacks detailed behavioral traits beyond the basic operation.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is front-loaded with the core purpose in the first sentence, followed by two concise sentences providing usage guidelines. Every sentence adds value without redundancy, making it efficient and well-structured for quick understanding.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's simplicity (read operation with 2 parameters), high schema coverage, and the presence of an output schema (which handles return values), the description is largely complete. It covers purpose, prerequisites, and next steps. However, it could benefit from more behavioral context (e.g., error handling or permissions), keeping it from a perfect score.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, with clear descriptions for both parameters (block_id and optional agent_id). The description doesn't add any additional meaning beyond what the schema provides, such as format examples or authorization details for agent_id. With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate with extra insights.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Get full details') and resource ('a specific memory block by ID'), distinguishing it from siblings like list_memory_blocks (which lists blocks) and update_memory_block (which modifies blocks). It explicitly identifies the target resource and operation.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool ('to find block IDs, use list_memory_blocks') and what to do after reading ('use update_memory_block to modify content'). It clearly differentiates this tool from its siblings by specifying prerequisites and next steps.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title, so the description carries the burden of behavioral disclosure. It implies a mutation operation ('Update'), which is consistent with the tool name, but does not specify permissions, side effects, or response behavior. No contradictions with annotations exist, but it lacks details like whether updates are reversible or require specific auth beyond the optional agent_id parameter.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences, front-loaded with the core purpose and followed by practical usage guidance. Every sentence adds value without redundancy, making it efficient and well-structured for quick comprehension.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool has an output schema (not provided in context but indicated as present), the description need not detail return values. It covers purpose, usage guidelines, and prerequisites effectively. However, as a mutation tool with minimal annotations, it could benefit from more behavioral context (e.g., auth requirements or side effects), slightly reducing completeness.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema fully documents all parameters. The description does not add any parameter-specific details beyond what the schema provides (e.g., it doesn't explain format for 'value' or 'metadata'). Baseline 3 is appropriate as the schema handles parameter documentation adequately.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the action ('Update') and resource ('contents and metadata of a memory block'), distinguishing it from sibling tools like 'create_memory_block', 'delete_memory_block', 'read_memory_block', and 'list_memory_blocks'. It specifies both content and metadata updates, making the purpose specific and unambiguous.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides when-to-use guidance by referencing sibling tools: 'Use list_memory_blocks to find block IDs, or read_memory_block to see current content before updating.' This clearly directs the agent to use alternatives for prerequisite steps, enhancing proper tool selection in context.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations provide a title ('Get Agent Details') but no behavioral hints like readOnlyHint or destructiveHint. The description adds value by clarifying that this retrieves 'full state' and 'complete details', which suggests a comprehensive read operation. However, it doesn't disclose other behavioral traits like error conditions, rate limits, or authentication requirements, leaving some gaps.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is highly concise and well-structured with two sentences: the first states the purpose and differentiation, and the second provides usage guidance. Every sentence earns its place by adding critical information without redundancy, making it front-loaded and efficient.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's low complexity (single parameter, read-only operation implied), high schema coverage (100%), and the presence of an output schema (which handles return values), the description is complete enough. It covers purpose, differentiation, and prerequisites, addressing all necessary contextual aspects without overloading.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, with the single parameter 'agent_id' fully documented in the schema. The description doesn't add any parameter-specific details beyond what the schema provides, such as format examples or constraints. Given the high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate but doesn't need to.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the verb ('Get') and resource ('full state of a specific agent by ID'), making the purpose explicit. It also distinguishes this tool from its sibling 'get_agent_summary' by specifying that it returns 'complete details' instead of a summary, providing clear differentiation.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool vs. alternatives: it names 'get_agent_summary' as a sibling with different output detail and 'list_agents' as a prerequisite for finding agent IDs. This covers both when-to-use and when-not-to-use scenarios effectively.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The description adds valuable behavioral context beyond annotations. While annotations only provide a title ('Delete Agent'), the description discloses critical traits: the action is 'permanent' (irreversible destruction) and includes a 'WARNING' about this. This compensates for the lack of annotations like destructiveHint, providing essential safety information for a mutation tool.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
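
As a rough sketch of the structured annotations this dimension refers to, the snippet below shows a tool definition that carries the `readOnlyHint` and `destructiveHint` annotation hints alongside its title, plus a small helper that reports which hints are missing. The description string paraphrases the reviewed tool, and the hint values and the `missing_behavior_hints` helper are assumptions for illustration, not taken from the server's source.

```python
# The two behavioral hints named in the reviews above.
BEHAVIOR_HINTS = ("readOnlyHint", "destructiveHint")

# Hypothetical definition: the description is a paraphrase and the hint
# values are assumptions about how a permanent delete would be annotated.
delete_agent_tool = {
    "name": "delete_agent",
    "description": (
        "Delete a specific agent by ID. Use list_agents to find agent IDs. "
        "For bulk deletion, use bulk_delete_agents. "
        "WARNING: This action is permanent."
    ),
    "annotations": {
        "title": "Delete Agent",
        "readOnlyHint": False,
        "destructiveHint": True,
    },
}


def missing_behavior_hints(tool: dict) -> list[str]:
    """List the behavioral hints that a tool's annotations do not declare."""
    annotations = tool.get("annotations") or {}
    return [hint for hint in BEHAVIOR_HINTS if hint not in annotations]


# A title-only tool, as described in the reviews, declares neither hint.
title_only_tool = {"name": "delete_agent", "annotations": {"title": "Delete Agent"}}
print(missing_behavior_hints(title_only_tool))
print(missing_behavior_hints(delete_agent_tool))
```

When hints like these are absent, the prose description is the only place an agent can learn that a call is irreversible, which is why the warning in the description is credited here.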

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is highly efficient with three sentences that each serve a distinct purpose: stating the action, providing usage guidance, and warning about permanence. It is front-loaded with the core function and wastes no words, making it easy for an agent to parse quickly.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (a destructive mutation), the description is mostly complete. It covers purpose, usage, and critical behavioral warnings. With an output schema present, return values don't need explanation. However, it lacks details on permissions or error conditions, which could be relevant for such a high-stakes operation.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    With 100% schema description coverage (the single parameter 'agent_id' is fully documented in the schema), the description adds minimal value beyond the schema. It mentions 'by ID' but doesn't provide additional syntax, format, or constraints. The baseline score of 3 reflects adequate but not enhanced parameter documentation.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Delete') and target resource ('a specific agent by ID'), distinguishing it from siblings like 'bulk_delete_agents' (for bulk operations) and 'list_agents' (for finding IDs). It uses precise verb+resource phrasing that leaves no ambiguity about the tool's function.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description states when to use this tool ('Delete a specific agent by ID') and names the related tools ('Use list_agents to find agent IDs. For bulk deletion, use bulk_delete_agents'). It gives clear guidance on the prerequisite (finding IDs) and on which sibling tool to choose for bulk deletion.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
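
    One rough way to check for this kind of guidance is to look for sibling tool names inside a description. The sketch below is illustrative only: `mentioned_siblings` is a hypothetical helper, the description string is assembled from the fragments quoted above, and nothing here reflects how the score on this page is actually computed.

```python
import re


def mentioned_siblings(description, sibling_names):
    """Return the sibling tool names that appear as whole words in the description."""
    found = []
    for name in sibling_names:
        # \b keeps 'delete_agent' from matching inside 'bulk_delete_agents',
        # because the underscore counts as a word character.
        if re.search(r"\b" + re.escape(name) + r"\b", description):
            found.append(name)
    return found


# Description assembled from fragments quoted in the review (an assumption).
description = (
    "Delete a specific agent by ID. Use list_agents to find agent IDs. "
    "For bulk deletion, use bulk_delete_agents."
)
siblings = ["list_agents", "bulk_delete_agents", "get_agent_summary", "delete_agent"]
print(mentioned_siblings(description, siblings))
```

    A name match only shows that a sibling is mentioned; whether the sentence around it explains when to prefer that sibling still needs a human or model reader.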

  • Behavior5/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively communicates critical traits: the action is 'permanent and cannot be undone', which is essential for a destructive operation. This goes beyond what the input schema indicates, adding vital safety context.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is front-loaded with the core action ('Delete a memory block by ID'), followed by usage guidance and a critical warning. Every sentence earns its place: the first states the purpose, the second provides context, and the third delivers essential safety information, with zero wasted words.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (destructive operation with 2 parameters), no annotations, and no output schema, the description is largely complete. It covers purpose, usage context, and critical behavioral warnings. However, it could improve by mentioning potential side effects (e.g., impact on related data) or error conditions, though the warning suffices for basic use.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The schema description coverage is 100%, so the input schema already documents both parameters ('block_id' and 'agent_id') thoroughly. The description adds no additional meaning about parameters beyond implying 'block_id' is required for deletion, which is already clear from the schema. Baseline 3 is appropriate when the schema does the heavy lifting.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Delete') and resource ('a memory block by ID'), distinguishing it from siblings like 'update_memory_block' (modify) and 'read_memory_block' (view). It avoids tautology by not merely restating the tool name.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines4/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    It provides clear context for when to use this tool by referencing 'list_memory_blocks to find block IDs', which helps the agent understand prerequisites. However, it does not explicitly mention when not to use it or name alternatives like 'update_memory_block' for non-destructive changes.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The description adds valuable behavioral context beyond annotations by including 'WARNING: This action is permanent,' which discloses the irreversible nature of the deletion. Since annotations only provide a title ('Delete Archival Memory') and no other hints (e.g., destructiveHint), the description carries the burden of conveying this critical trait, though it could mention response format or error handling.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is highly concise and well-structured in two sentences: the first states the purpose and usage guideline, and the second provides a critical warning. Every sentence earns its place by adding essential information without redundancy, making it front-loaded and efficient.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity as a destructive operation with no annotations covering safety, the description is mostly complete by stating the purpose, usage, and permanence warning. Since an output schema exists, it doesn't need to explain return values, but it could improve by mentioning error cases or confirmation steps, leaving a minor gap.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The description does not add semantic details about the parameters beyond what the input schema provides, as schema description coverage is 100% with clear descriptions for agent_id and memory_id. It implies the parameters are needed but doesn't explain their format or relationships, so it meets the baseline of 3 where the schema does the heavy lifting.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Delete a memory') and resource ('from an agent's archival memory store'), distinguishing it from siblings like delete_agent or delete_memory_block by specifying the target as a passage in archival memory. It uses precise terminology that aligns with sibling tools like list_passages and create_passage.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides usage guidance by stating 'Use list_passages to find memory IDs,' which indicates a prerequisite step and distinguishes it from alternatives like search_archival_memory. It also includes a warning about permanence, helping the agent understand when to use this tool cautiously versus other read-only operations.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The annotations provide a title but no behavioral hints (readOnlyHint, destructiveHint, etc.). The description adds useful context about what gets listed (LLM models configured on the server) and their purpose (for agent model preferences), but doesn't disclose behavioral traits like rate limits, authentication needs, or pagination behavior. With no annotations covering these aspects, the description provides basic context but lacks operational details.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences with zero waste. The first sentence states the purpose, and the second provides usage guidelines. Every word earns its place, and the information is front-loaded with the core functionality stated immediately.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's simplicity (0 parameters, no annotations, but has an output schema), the description is mostly complete. It explains what the tool does and when to use it. The output schema exists, so the description doesn't need to explain return values. However, it could benefit from mentioning any limitations (e.g., whether it lists all models or only available ones), slightly reducing completeness.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The tool takes no parameters, so schema description coverage is trivially 100% and there is nothing further for the description to document. The description appropriately adds no parameter information. The baseline for a parameterless tool is 4, and nothing in the description contradicts or confuses this.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
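
    The reviews on this page cite "schema description coverage" both for tools with parameters and for parameterless tools like this one. The exact formula is not given here, so the following is only a sketch under one plausible reading: the fraction of top-level input properties that carry a non-empty description, with an empty schema counted as fully covered. The function name and the sample schema text are hypothetical.

```python
def schema_description_coverage(input_schema):
    """Fraction of top-level properties with a non-empty 'description'.

    A schema with no properties is treated as fully covered (assumption),
    since there is nothing left undocumented.
    """
    properties = input_schema.get("properties") or {}
    if not properties:
        return 1.0
    documented = sum(
        1 for spec in properties.values()
        if str(spec.get("description", "")).strip()
    )
    return documented / len(properties)


# Parameterless tool: coverage is trivially complete.
no_params = {"type": "object", "properties": {}}

# Two parameters with made-up descriptions, one of them left blank.
partly_documented = {
    "type": "object",
    "properties": {
        "block_id": {"type": "string", "description": "ID of the memory block"},
        "agent_id": {"type": "string", "description": ""},
    },
}

print(schema_description_coverage(no_params))
print(schema_description_coverage(partly_documented))
```

    Under this reading, a high coverage value says only that each parameter has some text attached, which is why the Parameters dimension still asks whether the tool description adds intent, formats, or constraints on top.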

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('List available LLM models') and identifies the resource ('configured on the Letta server'). It distinguishes from siblings like list_embedding_models or list_mcp_servers by focusing specifically on LLM models, not other types of models or servers.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly states when to use this tool: 'Use with create_agent or modify_agent to set agent model preferences.' This provides clear context about its purpose in the workflow and distinguishes it from other listing tools that serve different purposes (like list_agents or list_memory_blocks).

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior3/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations are minimal (only a title), so the description carries the burden of behavioral disclosure. It describes the action (listing servers) and hints at a workflow with another tool, but lacks details on output format, pagination, or error handling. With no annotations to cover these aspects, the description adds some context but remains basic.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is two sentences that are front-loaded and efficient. The first sentence states the core purpose, and the second provides essential usage guidance without redundancy. Every sentence adds value, making it appropriately sized and well-structured.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's low complexity (0 parameters, output schema exists), the description is largely complete. It covers purpose and usage, and the output schema will handle return values. However, it could benefit from mentioning any limitations (e.g., if the list is static or dynamic), but overall, it's sufficient for this simple tool.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters4/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    The input schema has no parameters, so no parameter documentation is needed. The description correctly omits parameter details and focuses on the tool's purpose and usage. This matches the baseline score for a parameterless tool.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the verb ('List') and resource ('all configured MCP servers on the Letta server'), making the purpose specific and unambiguous. It also distinguishes from the sibling tool 'list_mcp_tools_by_server' by explaining their relationship, which helps differentiate their roles.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides usage guidance by stating 'Use with list_mcp_tools_by_server to explore available tools from each server.' This indicates when to use this tool (to get server list) and how it complements an alternative (to then explore tools per server), offering clear context for selection.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a generic title ('Update Archival Memory'), so the description carries the full burden. It discloses key behavioral traits: this is a mutation operation (implied by 'Modify'), it requires specific IDs (agent_id, memory_id), and it has a current limitation (text-only updates). However, it doesn't mention permissions, rate limits, or whether changes are reversible.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    Two sentences, zero waste. The first sentence states the core purpose, the second adds critical constraints. Both sentences earn their place by providing essential information not obvious from the tool name or annotations.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given this is a mutation tool with no annotations covering safety/behavior, the description does well by stating the purpose, prerequisite (list_passages), and current limitation. Since an output schema exists, it doesn't need to explain return values. However, it could better address behavioral aspects like permissions or idempotency.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents all 4 parameters thoroughly. The description adds minimal value beyond the schema: it reinforces that 'update_data' is for 'text content' updates only, but doesn't provide additional syntax, format, or contextual details not already in the schema descriptions.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Modify a memory'), resource ('in the agent's archival memory store'), and scope ('Currently only supports updating the text content'). It distinguishes from sibling tools like 'create_passage' (creation), 'delete_passage' (deletion), and 'list_passages' (listing).

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides when-to-use guidance: 'Use list_passages to find memory IDs' (prerequisite) and 'Currently only supports updating the text content' (limitation vs. alternatives like 'update_memory_block' for broader updates). It clearly directs to a specific sibling tool for ID discovery.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title ('Send Message to Agent'), so the description carries the full burden. It adds valuable behavioral context: the tool requires tools to be attached first (a prerequisite), mentions using 'list_agents' for IDs (a setup step), and implies it's for interactive messaging (not destructive). However, it doesn't specify rate limits, auth needs, or response format details, leaving some gaps.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is front-loaded with the core purpose in the first sentence, followed by two concise prerequisite and setup sentences. Each sentence adds value without redundancy, making it efficient and well-structured for quick comprehension.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness4/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (interactive messaging with prerequisites), 100% schema coverage, and an output schema (implied by 'get a response'), the description is mostly complete. It covers purpose, usage, and prerequisites adequately. However, it lacks details on behavioral aspects like error handling or response structure, which could be useful despite the output schema.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, with clear descriptions for 'agent_id' and 'message'. The description adds no additional parameter semantics beyond what the schema provides, such as format examples or constraints. Since the schema does the heavy lifting, the baseline score of 3 is appropriate, as the description doesn't enhance parameter understanding.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the action ('Send a message to an agent and get a response') and the resource ('an agent'), using a specific verb and object. It distinguishes the tool from siblings like 'list_agents' (which lists agents) and 'attach_tool' (which attaches tools), making the purpose explicit and differentiated.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool ('Send a message to an agent and get a response'), its prerequisites ('Ensure the agent has necessary tools attached (see attach_tool) first'), and a related tool for finding IDs ('Use list_agents to find agent IDs'). This covers when to use the tool, what must be in place first, and which sibling tools support it.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    The description adds valuable behavioral context beyond annotations: it explains that tool names can correspond to MCP tools not yet in Letta (which will be registered first), and mentions verification with 'list_agent_tools'. Annotations only provide a title, so the description carries most of the burden and does so effectively without contradiction.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is front-loaded with the core purpose, followed by concise usage notes and verification steps in three clear sentences. Every sentence adds value without waste, making it efficient and well-structured for quick understanding.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's moderate complexity, 100% schema coverage, and the presence of an output schema, the description is complete enough. It covers purpose, usage guidelines, behavioral nuances (like MCP tool registration), and references to related tools, leaving no significant gaps for the agent to operate effectively.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds minimal parameter semantics beyond the schema, such as clarifying that tool names can be for existing Letta tools or MCP tools, but this is largely redundant with schema details. Baseline 3 is appropriate given high schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the action ('Attach') and resource ('one or more tools to an agent'), specifying that tools can be identified by ID or name. It distinguishes this tool from siblings like 'list_agent_tools' (for verification) and 'upload_tool' (for creation), providing specific differentiation.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly states when to use this tool ('to attach tools to an agent') and provides clear alternatives for related actions: 'Find tools with list_mcp_tools_by_server or create custom ones with upload_tool. Use list_agent_tools to verify attachment.' This gives comprehensive guidance on prerequisites and verification.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title, so the description carries the full burden. It discloses critical behavioral traits: the action is 'permanent' (destructive nature) and includes a 'WARNING' about irreversibility. However, it doesn't mention authentication needs, rate limits, or error handling, which could be useful additions.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is front-loaded with the core action, followed by usage guidance and a critical warning. Every sentence earns its place: the first defines purpose, the second provides prerequisites, and the third highlights risks—all without redundancy or fluff.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (destructive bulk operation) and the presence of an output schema (which handles return values), the description is complete enough. It covers purpose, usage, and risks, addressing key contextual needs without over-explaining parameters or outputs.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description adds minimal value beyond the schema by mentioning 'filter criteria (name or tags) or a specific list of IDs,' which is already implied in the schema. This meets the baseline for high schema coverage.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

    Purpose5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Deletes multiple agents') and distinguishes it from sibling tools like 'delete_agent' (singular) and 'list_agents' (read-only). It specifies the resource ('agents') and the scope ('multiple'), making the purpose unambiguous.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description explicitly provides usage guidance: 'Use list_agents first to identify agents to delete.' This tells the agent when to use this tool (after identification) and names the sibling tool to call beforehand, which helps differentiate the two.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title, so the description carries the full burden. It clearly describes the core behavior (export to JSON file with optional upload) and mentions the optional upload functionality to XBackbone. However, it doesn't mention potential side effects, rate limits, or authentication requirements beyond the XBackbone parameters.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is perfectly concise with three focused sentences: the core functionality, when to use alternatives, and a prerequisite. Every sentence earns its place with no wasted words, and the most important information (what the tool does) comes first.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the presence of a comprehensive input schema (100% coverage) and an output schema (implied by context signals), the description provides exactly what's needed: clear purpose, usage guidelines, and behavioral context. The description doesn't need to explain return values since an output schema exists.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    Schema description coverage is 100%, so the schema already fully documents all 6 parameters. The description adds no additional parameter semantics beyond what's in the schema descriptions. The baseline of 3 is appropriate when the schema does all the parameter documentation work.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
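The "schema description coverage" figure cited above can be pictured as the share of input parameters that carry a non-empty description. The sketch below is only an illustration under that assumption: the page does not state how the percentage is actually computed, and `description_coverage` and the example schema (including its parameter names) are hypothetical.

```python
def description_coverage(schema: dict) -> float:
    """Return the fraction of top-level properties with a non-empty description.

    Assumption: coverage is counted over the top-level "properties" of a
    JSON Schema object; the real scoring method is not documented here.
    """
    props = schema.get("properties", {})
    if not props:
        # Nothing to document, so treat the schema as fully covered.
        return 1.0
    described = sum(
        1 for prop in props.values()
        if str(prop.get("description", "")).strip()
    )
    return described / len(props)


# Hypothetical input schema, used only to exercise the function.
example_schema = {
    "type": "object",
    "properties": {
        "agent_id": {"type": "string", "description": "ID of the agent to export"},
        "upload": {"type": "boolean", "description": "Upload the exported file"},
    },
}

coverage = description_coverage(example_schema)
```

Under this reading, a schema at full coverage leaves the description free to focus on intent, which is why a score of 3 is treated as the baseline.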

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Export an agent's configuration to a JSON file and optionally upload it') and distinguishes it from sibling tools by explicitly naming alternatives (import_agent, clone_agent, list_agents). It specifies both the resource (agent configuration) and output format (JSON file).

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool versus alternatives: 'Use import_agent to recreate the agent later, or clone_agent for a quick copy. Use list_agents to find agent IDs.' This gives clear context for tool selection and prerequisites.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

  • Behavior 4/5

    Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

    Annotations only provide a title ('Search Archival Memory'), so the description carries most of the behavioral burden. It specifies that this is a 'paginated query' and clarifies the scope ('memories in an agent's archival memory store'), operational details that help the agent understand how to use the tool effectively.

    Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

    Conciseness 5/5

    Is the description appropriately sized, front-loaded, and free of redundancy?

    The description is perfectly concise with just two sentences that each serve distinct purposes: the first states the core functionality, and the second provides usage guidelines. There's zero wasted language, and the most important information (what the tool does) comes first, making it easy for an agent to quickly understand the tool's purpose.

    Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

    Completeness 5/5

    Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

    Given the tool's complexity (8 parameters, pagination, search capabilities) and the presence of both comprehensive schema documentation (100% coverage) and an output schema, the description provides exactly what's needed: clear purpose, usage boundaries, and key behavioral context (paginated query). The description doesn't need to explain return values since an output schema exists, making this complete for the agent's needs.

    Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

    Parameters 3/5

    Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

    With 100% schema description coverage, the input schema already comprehensively documents all 8 parameters. The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation. The description's mention of 'paginated query' aligns with the after/before/limit parameters in the schema but doesn't provide additional semantic context.

    Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
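To make the "paginated query" remark concrete, here is a minimal sketch of how a caller might walk a cursor-paginated listing using `after` and `limit`. It rests on assumptions not confirmed by this page: that `after` takes the ID of the last item already seen, and that a short or empty page signals the end. `iter_passages`, `fake_fetch`, and the record shape are hypothetical stand-ins, not the server's API.

```python
from typing import Callable, Iterator, List, Optional


def iter_passages(
    fetch_page: Callable[[Optional[str], int], List[dict]],
    limit: int = 2,
) -> Iterator[dict]:
    """Yield every record by requesting pages until one comes back short."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after, limit)
        if not page:
            return
        yield from page
        if len(page) < limit:
            # A short page means there is nothing left to fetch.
            return
        # Assumption: the cursor is the ID of the last record received.
        after = page[-1]["id"]


# In-memory stand-in for the real tool call (hypothetical data).
_STORE = [{"id": f"p{i}", "text": f"memory {i}"} for i in range(5)]


def fake_fetch(after: Optional[str], limit: int) -> List[dict]:
    start = 0
    if after is not None:
        ids = [p["id"] for p in _STORE]
        start = ids.index(after) + 1
    return _STORE[start:start + limit]


all_ids = [p["id"] for p in iter_passages(fake_fetch, limit=2)]
```

A description that spelled out this kind of cursor behavior would add semantic context beyond the schema, which is what the Parameters dimension rewards.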

    Purpose 5/5

    Does the description clearly state what the tool does and how it differs from similar tools?

    The description clearly states the specific action ('Retrieve the memories'), identifies the resource ('agent's archival memory store'), and distinguishes it from sibling tools by explicitly naming create_passage, modify_passage, and delete_passage as alternatives for different operations. This provides excellent differentiation from similar tools like search_archival_memory and search_memory.

    Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

    Usage Guidelines 5/5

    Does the description explain when to use this tool, when not to, or what alternatives exist?

    The description provides explicit guidance on when to use this tool versus alternatives: 'Use create_passage to add new memories, modify_passage to edit, or delete_passage to remove them.' This clearly defines the boundary between this retrieval tool and mutation operations, helping the agent select the right tool for the job.

    Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

GitHub Badge

Glama performs regular codebase and documentation scans to:

  • Confirm that the MCP server is working as expected.
  • Confirm that there are no obvious security issues.
  • Evaluate tool definition quality.

Our badge communicates server capabilities, safety, and installation instructions.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/oculairmedia/Letta-MCP-server'
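The same request can be made from Python with only the standard library. This is a minimal sketch: the URL pattern is taken from the curl command above, while the helper names `server_url` and `fetch_server` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, repo: str) -> str:
    """Build the directory URL for one server, escaping each path segment."""
    owner_part = urllib.parse.quote(owner, safe="")
    repo_part = urllib.parse.quote(repo, safe="")
    return f"{API_BASE}/{owner_part}/{repo_part}"


def fetch_server(owner: str, repo: str, timeout: float = 10.0) -> dict:
    """GET the server record and decode it.

    Assumption: the response body is a JSON object.
    """
    request = urllib.request.Request(server_url(owner, repo), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("oculairmedia", "Letta-MCP-server")` would issue the same GET request as the curl command; inspect the returned object before relying on any particular field.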

If you have feedback or need assistance with the MCP directory API, please join our Discord server.