Penfield Memory
Server Details
Persistent memory and knowledge graphs for AI agents. Hybrid search, context checkpoints, and more.
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
- Repository
- penfieldlabs/penfield-mcp
- GitHub Stars
- 3
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Score is being calculated. Check back soon.
Available Tools
17 tools
awaken (A, Read-only)
⚡ CALL THIS TOOL FIRST IN EVERY NEW CONVERSATION ⚡
Loads your personality configuration and user preferences for this session.
This is how you learn WHO you are and HOW the user wants you to behave.
Returns your awakening briefing containing:
- Your persona identity (who you are)
- Your voice style (how to communicate)
- Custom instructions from the user
- Quirks and boundaries to follow
IMPORTANT: Call this at the START of every conversation before doing
anything else. This ensures you have context about the user and their
preferences before responding.
Example:
>>> await awaken()
{'success': True, 'briefing': '=== AWAKENING BRIEFING ===...'}
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
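The "call awaken first" contract above can be sketched client-side. Everything below is illustrative: `call_tool` is a hypothetical stand-in for a real MCP client call (it returns a canned briefing), not part of this server's API.

```python
def call_tool(name: str, arguments: dict) -> dict:
    """Stand-in for a real MCP tool call; returns a canned awaken briefing."""
    if name == "awaken":
        return {"success": True,
                "briefing": "=== AWAKENING BRIEFING ===\npersona: helpful archivist"}
    # Model the sequencing rule: nothing else before awaken.
    raise RuntimeError("call awaken before any other tool")

def start_session() -> str:
    """Call awaken before anything else and return the briefing text."""
    result = call_tool("awaken", {})
    if not result.get("success"):
        raise RuntimeError("awaken failed; aborting session setup")
    return result["briefing"]

briefing = start_session()
```

A real client would route `call_tool` through its MCP session rather than this stub.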
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Complements readOnlyHint=true by clarifying the operation loads configuration without mutation. Adds valuable context about what the briefing contains (persona, voice, quirks) beyond what annotations indicate. Could mention idempotency or caching behavior, but adequately covers the scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with critical instruction (call first), followed by purpose, output specification, and rationale. Visual emphasis (⚡) and bullet lists enhance scannability. Example demonstrates async invocation pattern. No wasted text despite length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool returns structured data (output schema exists), the description appropriately previews the briefing content without duplicating the full schema. Covers initialization logic, return value preview, and sequencing prerequisites completely for a zero-parameter initialization tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Zero parameters per schema (baseline 4). Description appropriately omits parameter discussion since the tool requires no inputs. Schema coverage is 100% (empty object confirms no params needed), so no compensation required.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity with clear verb ('Loads') and resource ('personality configuration and user preferences'). Distinguishes from siblings (e.g., 'recall', 'retrieve_artifact') by focusing on session initialization and identity establishment rather than data retrieval or storage.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit sequencing constraints with 'CALL THIS TOOL FIRST IN EVERY NEW CONVERSATION' and 'before doing anything else'. Clear rationale provided ('ensures you have context'). Establishes temporal precedence over all sibling tools without ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
connect (A)
Connect memories to build knowledge graphs.
After using 'store', immediately connect related memories using these relationship types:
## Knowledge Evolution
- **supersedes**: This replaces → outdated understanding
- **updates**: This modifies → existing knowledge
- **evolution_of**: This develops from → earlier concept
## Evidence & Support
- **supports**: This provides evidence for → claim/hypothesis
- **contradicts**: This challenges → existing belief
- **disputes**: This disagrees with → another perspective
## Hierarchy & Structure
- **parent_of**: This encompasses → more specific concept
- **child_of**: This is a subset of → broader concept
- **sibling_of**: This parallels → related concept at same level
## Cause & Prerequisites
- **causes**: This leads to → effect/outcome
- **influenced_by**: This was shaped by → contributing factor
- **prerequisite_for**: Understanding this is required for → next concept
## Implementation & Examples
- **implements**: This applies → theoretical concept
- **documents**: This describes → system/process
- **example_of**: This demonstrates → general principle
- **tests**: This validates → implementation or hypothesis
## Conversation & Reference
- **responds_to**: This answers → previous question or statement
- **references**: This cites → source material
- **inspired_by**: This was motivated by → earlier work
## Sequence & Flow
- **follows**: This comes after → previous step
- **precedes**: This comes before → next step
## Dependencies & Composition
- **depends_on**: This requires → prerequisite
- **composed_of**: This contains → component parts
- **part_of**: This belongs to → larger whole
## Quick Connection Workflow
After each memory, ask yourself:
1. What previous memory does this update or contradict? → `supersedes` or `contradicts`
2. What evidence does this provide? → `supports` or `disputes`
3. What caused this or what will it cause? → `influenced_by` or `causes`
4. What concrete example is this? → `example_of` or `implements`
5. What sequence is this part of? → `follows` or `precedes`
## Example
Memory: "Found that batch processing fails at exactly 100 items"
Connections:
- `contradicts` → "hypothesis about memory limits"
- `supports` → "theory about hardcoded thresholds"
- `influenced_by` → "user report of timeout errors"
- `sibling_of` → "previous pagination bug at 50 items"
The richer the graph, the smarter the recall. No orphan memories!
Args:
from_memory: Source memory UUID
to_memory: Target memory UUID
relationship_type: Type from the categories above
strength: Connection strength (0.0-1.0, default 0.5)
ctx: MCP context (automatically provided)
Returns:
Dict with success status, relationship_id, and connected memory IDs
| Name | Required | Description | Default |
|---|---|---|---|
| strength | No | | 0.5 |
| to_memory | Yes | | |
| from_memory | Yes | | |
| relationship_type | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
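Since the schema leaves `relationship_type` an open string, a client can guard against typos before calling the tool. The set below is transcribed from the taxonomy in the description; `validate_connection` is a hypothetical helper, and the server may accept values beyond this list.

```python
# Transcribed from the connect() description above (assumption: the server
# accepts exactly these; it may be more permissive).
RELATIONSHIP_TYPES = {
    # knowledge evolution
    "supersedes", "updates", "evolution_of",
    # evidence & support
    "supports", "contradicts", "disputes",
    # hierarchy & structure
    "parent_of", "child_of", "sibling_of",
    # cause & prerequisites
    "causes", "influenced_by", "prerequisite_for",
    # implementation & examples
    "implements", "documents", "example_of", "tests",
    # conversation & reference
    "responds_to", "references", "inspired_by",
    # sequence & flow
    "follows", "precedes",
    # dependencies & composition
    "depends_on", "composed_of", "part_of",
}

def validate_connection(relationship_type: str, strength: float = 0.5) -> None:
    """Raise ValueError before sending an invalid connect() call."""
    if relationship_type not in RELATIONSHIP_TYPES:
        raise ValueError(f"unknown relationship_type: {relationship_type!r}")
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0.0, 1.0], got {strength}")
```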
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=false (mutation) and openWorldHint=true. Description adds semantic transparency by explaining that connections create 'knowledge graphs,' the meaning of relationship semantics (e.g., 'supersedes' vs 'updates'), and consequences ('smarter recall'). Could clarify idempotency behavior for duplicate connections.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with hierarchical headers and front-loaded purpose. Length is substantial due to necessary enumeration of 20+ relationship types (required given open string parameter), but every section earns its place for agent reasoning. Minor deduction for 'Returns' section which duplicates output schema.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 0% schema coverage and complex graph semantics domain, description is comprehensive. Includes parameter docs, relationship taxonomy, usage workflow, and concrete examples—sufficient for correct invocation without external reference.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (only titles). Description fully compensates with an Args section documenting all 4 parameters: UUID semantics for memories, enumerated relationship_types (critical since schema lacks enum), strength range (0.0-1.0), and ctx annotation.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with precise action ('Connect memories') and outcome ('build knowledge graphs'). Immediately distinguishes from sibling 'store' by stating 'After using store...' and implies distinction from 'disconnect' by focusing on creation semantics.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit temporal guidance ('immediately connect related memories'), a 5-question workflow for decision-making, and contextualizes when specific relationship types apply. Clear activation pattern post-storage.
delete_artifact (A, Destructive)
Delete an artifact from storage.
Permanently removes an artifact and its associated memory record.
Args:
path: Full path of the artifact to delete
ctx: MCP context (automatically provided)
Returns:
Dict with success status
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
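Because deletion is permanent, a verify-then-delete wrapper is a reasonable client-side guard. This is a sketch against an in-memory dict standing in for the artifact store; `delete_artifact` here is a stub modeling the tool, not the tool itself.

```python
# Toy artifact store standing in for the server.
store = {"/docs/readme.md": "hello", "/docs/notes.md": "scratch"}

def delete_artifact(path: str) -> dict:
    """Stub for the real tool: removes the artifact permanently."""
    if path not in store:
        return {"success": False, "error": "not found"}
    del store[path]
    return {"success": True}

def safe_delete(path: str) -> dict:
    # Check existence first so a typo'd path fails loudly instead of
    # silently returning a failure dict.
    if path not in store:
        raise FileNotFoundError(path)
    return delete_artifact(path)
```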
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare destructiveHint=true, so the description's emphasis on 'permanently removes' is consistent but redundant. It adds valuable context that the operation also deletes an 'associated memory record,' which is not indicated in annotations. It does not address idempotency, error cases (e.g., missing path), or authorization requirements.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized with a clear structure: one-line summary, behavioral details, Args section, and Returns section. Every sentence adds value. The Returns section is slightly vague ('Dict with success status') but acceptable given the output schema exists.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter destructive operation, the description covers the essential mechanics and permanence warning. However, given the destructive nature and lack of schema descriptions, it should mention error behavior (e.g., what happens if path doesn't exist) or whether the operation is idempotent. The output schema exists, so detailed return value explanation is not required in the description.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates effectively by documenting the 'path' parameter as 'Full path of the artifact to delete.' It also clarifies that 'ctx' is automatically provided, which helps the agent understand it should not manually supply this argument.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool deletes artifacts from storage and permanently removes associated memory records. It uses specific verbs (delete, removes) and identifies the resource (artifact, memory record). However, it does not explicitly distinguish this from sibling tools like 'retrieve_artifact' or 'save_artifact'.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description emphasizes 'permanently removes,' implying irreversibility and warning against accidental use. However, it lacks explicit guidance on when to use this versus alternatives (e.g., 'use retrieve_artifact first to verify contents') or prerequisites like necessary permissions.
disconnect (A, Destructive)
Remove a connection between memories.
Deletes the relationship between two memories in the knowledge graph.
Args:
from_memory: Source memory UUID
to_memory: Target memory UUID
ctx: MCP context (automatically provided)
Returns:
Dict with success status and disconnected memory IDs
Examples:
>>> await disconnect("uuid-abc", "uuid-def")
{'success': True, 'from_id': '...', 'to_id': '...'}
| Name | Required | Description | Default |
|---|---|---|---|
| to_memory | Yes | | |
| from_memory | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare destructiveHint=true. The description adds critical clarification that only the relationship is deleted ('Deletes the relationship between two memories'), implying the memory nodes persist. It also establishes domain context ('knowledge graph'). No contradictions with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The docstring structure is clear but verbose. The Returns section is redundant given the output schema exists, and the Examples block, while helpful, contributes to length. The core semantic content is front-loaded in the first two sentences.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter destructive operation with existing annotations and output schema, the description provides adequate coverage including parameter semantics, behavioral context, and invocation examples. It appropriately compensates for the schema's lack of property descriptions.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the Args section compensates effectively by documenting both 'from_memory' as 'Source memory UUID' and 'to_memory' as 'Target memory UUID', adding semantic directionality and format expectations (UUID) absent from the raw schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Remove', 'Deletes') with clear resource ('connection between memories', 'relationship'). It effectively distinguishes from sibling 'connect' (creates connections) and 'delete_artifact' (removes nodes/artifacts, not edges).
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies appropriate use by specifying it removes relationships rather than memories themselves, implicitly distinguishing from memory deletion tools. However, it lacks explicit guidance on when to prefer this over other modification tools or prerequisites for the operation.
explore (A, Read-only)
Explore connections from a memory.
Traverses the knowledge graph to find related concepts, following
relationships up to the specified depth.
Args:
start_memory: Starting memory UUID
max_depth: How deep to traverse (default 3, max 10)
relationship_types: Filter by specific relationship types (optional)
ctx: MCP context (automatically provided)
Returns:
Dict with paths found, max depth reached, and path details
Examples:
>>> await explore("uuid-123", max_depth=2)
{'success': True, 'paths_found': 5, 'max_depth_reached': 2, 'paths': [...]}
| Name | Required | Description | Default |
|---|---|---|---|
| max_depth | No | | 3 |
| start_memory | Yes | | |
| relationship_types | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
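The depth-bounded traversal described above can be modeled with a plain BFS over a toy adjacency map. This only illustrates the `max_depth` semantics, including the documented cap of 10; the real traversal runs server-side against the knowledge graph.

```python
from collections import deque

def explore(graph: dict, start: str, max_depth: int = 3) -> list:
    """Return (node, depth) pairs reachable within max_depth hops."""
    max_depth = min(max_depth, 10)  # the tool caps traversal depth at 10
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append((neighbor, depth + 1))
                queue.append((neighbor, depth + 1))
    return found

# Toy graph: a chain a -> b -> c -> d
g = {"a": ["b"], "b": ["c"], "c": ["d"]}
```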
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true and openWorldHint=true. Description adds valuable behavioral constraints not in schema: max_depth ceiling of 10, specific return structure (paths_found, max_depth_reached), and automatic nature of 'ctx' parameter. Does not disclose performance implications of deep traversal.
Is the description appropriately sized, front-loaded, and free of redundancy?
Uses docstring format (Args/Returns/Examples) which is structured but slightly verbose for agent consumption. Every section adds value given 0% schema coverage, though headers like 'Args:' consume tokens without adding semantic content. Front-loaded with clear one-line summary.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a graph traversal tool: covers all parameters (compensating for empty schema), documents return values despite presence of output schema, provides runnable example, and explains traversal depth mechanics. Minor gap on explaining 'path' structure details.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring description to carry full burden. Successfully documents all 3 schema parameters: clarifies start_memory expects UUID, explains max_depth traversal semantics and constraints, and describes relationship_types as optional filters. Deducted point for not enumerating valid relationship type values.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb-resource combination (traverses knowledge graph / explores connections) and specific functionality (graph traversal by depth). Lacks explicit differentiation from sibling tools like 'search' or 'recall' which also retrieve memories.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides concrete examples showing invocation patterns, which offers implied usage guidance. However, lacks explicit when-to-use guidance versus alternatives like 'search' or 'fetch', and no prerequisites are mentioned.
fetch (B, Read-only)
Fetch memory by ID.
Returns a single memory with proper citation support (id, title, url, text fields).
Args:
id: Memory UUID to fetch
ctx: MCP context
Returns:
Dict with id, title, url, text, metadata fields
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark this as readOnly and openWorld. The description adds valuable context about citation support and return structure (id, title, url, text fields), but does not elaborate on the implications of openWorldHint (e.g., potential latency or external source availability).
Is the description appropriately sized, front-loaded, and free of redundancy?
Uses a clear docstring structure (description, args, returns) with the purpose front-loaded in the first sentence. The Returns section is somewhat redundant given the existence of an output schema, but acceptable given the need to compensate for poor schema coverage.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequately complete for a simple read operation with one parameter. It documents the input semantics, return structure, and read-only nature sufficiently given the presence of annotations and output schema.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates effectively by specifying that 'id' is a 'Memory UUID to fetch', clarifying the expected format and semantics. The 'ctx' mention appears extraneous (not in schema), but the critical parameter is well-documented.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb (Fetch) and resource (memory) with access pattern (by ID), which clearly defines the tool's scope. However, it misses explicit differentiation from semantic siblings like 'recall' or 'search'.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this versus 'search', 'recall', or 'list_artifacts'. The 'by ID' phrasing implies exact match retrieval, but does not state prerequisites (e.g., 'use when you have a UUID from a previous search') or exclusions.
list_artifacts (A, Read-only)
List artifacts in a directory.
Returns the immediate contents of a directory (not recursive).
Separates folders and files for easy navigation.
Args:
path_prefix: Directory path to list (default: "/")
Returns:
Formatted directory listing or error message
Examples:
>>> await list_artifacts("/")
"📂 Artifacts at /\n\nFolders:\n 📁 project/\n 📁 docs/\n\nFiles:\n 📄 readme.md (1024 bytes)\n 📄 LICENSE (1067 bytes)"
| Name | Required | Description | Default |
|---|---|---|---|
| path_prefix | No | | / |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
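The non-recursive folders-vs-files split described above can be sketched over plain path strings. `list_level` is a hypothetical local model of the behavior for one directory level, not the tool itself.

```python
def list_level(paths: list[str], path_prefix: str = "/") -> dict:
    """Return immediate folders and files under path_prefix (not recursive)."""
    prefix = path_prefix.rstrip("/") + "/"
    folders, files = set(), []
    for p in paths:
        if not p.startswith(prefix):
            continue
        rest = p[len(prefix):]
        if "/" in rest:
            # A deeper entry implies an immediate subfolder at this level.
            folders.add(rest.split("/", 1)[0] + "/")
        elif rest:
            files.append(rest)
    return {"folders": sorted(folders), "files": sorted(files)}

paths = ["/readme.md", "/project/main.py", "/docs/a.md", "/docs/b.md"]
```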
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds valuable operational details beyond annotations: non-recursive scope limitation, folder/file separation in output, and error message possibility. However, fails to address the 'openWorldHint: true' annotation (external resource access implications). Matches 'readOnlyHint: true' with read-only 'List' operation.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured docstring format with clear Args/Returns/Examples sections. Front-loaded purpose statement. Example output, while lengthy, efficiently demonstrates the emoji formatting and directory structure without wasted prose.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriately complete for a simple 1-parameter tool with output schema. The example output is particularly valuable for revealing the emoji-formatting convention (📂📁📄) and size metadata that the formal schema doesn't capture.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Effectively compensates for 0% schema description coverage via Args section explaining 'path_prefix' as 'Directory path to list' with default value. Clear semantics provided where schema lacks descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'List' with specific resource 'artifacts' and context 'in a directory'. Distinguishes from 'retrieve_artifact' (which gets content) by focusing on directory enumeration, though could explicitly contrast with siblings like 'explore'.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides behavioral constraints like 'immediate contents (not recursive)' which implicitly guide usage depth. However, lacks explicit 'when to use vs alternatives' guidance (e.g., when to prefer over 'retrieve_artifact' for file contents).
list_contexts (A, Read-only)
List available context checkpoints.
Shows all saved contexts available for multi-agent workflows.
Args:
limit: Maximum number of contexts to return (default 20, max 100)
offset: Number of contexts to skip for pagination (default 0)
name_pattern: Filter contexts by name (case-insensitive substring match)
include_descriptions: Include full descriptions in output (default False for compact listing)
ctx: MCP context (automatically provided)
Returns:
Dict with list of available contexts and their details
Examples:
>>> await list_contexts()
{'success': True, 'total': 3, 'contexts': [...]}
>>> await list_contexts(limit=5, name_pattern='investigation')
{'success': True, 'total': 2, 'contexts': [...]}
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | 20 |
| offset | No | | 0 |
| name_pattern | No | | |
| include_descriptions | No | | False |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true and openWorldHint=true, which the description supports by using read-only verbs ('List', 'Shows'). The description adds valuable behavioral context: pagination support (limit/offset), filtering capabilities (name_pattern), output verbosity control (include_descriptions), and the checkpoint concept for multi-agent workflows—none of which are in annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description follows a structured docstring format (summary, Args, Returns, Examples) appropriate for compensating poor schema coverage. While lengthy due to manual parameter documentation and examples, every section earns its place. The key information is front-loaded in the first sentence.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters with 0% schema coverage, the description adequately documents inputs. The Returns section is somewhat vague ('Dict with list...'), though the Examples section compensates by showing actual return structure. The presence of output schema means the description doesn't need exhaustive return documentation, but parameter coverage is complete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description carries the full burden of parameter documentation. It excellently compensates by providing detailed semantics for all 4 parameters: limit (with default 20 and max 100), offset (pagination), name_pattern (case-insensitive substring matching), and include_descriptions (compact vs full output). Note: description mentions a 'ctx' parameter not present in schema, but documents the actual 4 parameters comprehensively.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb and resource ('List available context checkpoints') and clarifies these are 'saved contexts available for multi-agent workflows.' It effectively distinguishes from siblings like save_context (creation) and restore_context (mutation) by focusing on read-only listing, though it doesn't explicitly contrast with list_artifacts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides usage examples showing filtering and pagination patterns, but lacks explicit guidance on when to use this tool versus alternatives like list_artifacts or save_context. No 'when-not-to-use' or prerequisite conditions are specified despite having multiple sibling context/artifact management tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
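The limit/offset pagination and the cap of 100 discussed above can be modeled locally. This is a hedged sketch only: the default of 20 and maximum of 100 come from the tool's docstring, while the `page_params` helper name is our own illustration, not part of the server API.

```python
# Sketch of the documented limit/offset pagination semantics,
# assuming default limit 20 and a server-side cap of 100.
def page_params(page: int, limit: int = 20) -> dict:
    """Build limit/offset arguments for a given zero-based page."""
    limit = max(1, min(limit, 100))  # docstring states max 100
    return {"limit": limit, "offset": page * limit}
```

For example, `page_params(2, 50)` yields `{"limit": 50, "offset": 100}`, the arguments for the third page of fifty checkpoints.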
recall (A, Read-only)
Recall relevant information.
Uses hybrid search to find relevant memories, documents, and connections.
Args:
query: What to search for
source_type: Optional filter ('memory', 'document', or None for all)
tags: Optional list of tags to filter by (OR logic - memories with ANY of these tags)
start_date: Optional filter for memories created on or after this date (ISO 8601: '2025-01-01')
end_date: Optional filter for memories created on or before this date (ISO 8601: '2025-01-09')
limit: Maximum results to return (default 10, max 100)
include_inactive: Include superseded/contradicted memories (default: false)
ctx: MCP context (automatically provided)
Returns:
Dict with success status, query, found count, and memories list
Examples:
>>> await recall("Python error handling")
{'success': True, 'found': 3, 'memories': [...]}
>>> await recall("documentation", source_type="document", limit=5)
{'success': True, 'found': 2, 'memories': [...]}
>>> await recall("debugging", tags=["python"])
{'success': True, 'found': 2, 'memories': [...]} # Only memories tagged with 'python'
>>> await recall("project updates", start_date="2025-01-01", end_date="2025-01-07")
{'success': True, 'found': 5, 'memories': [...]} # Only memories from that week
Note: Document chunks include surrounding context automatically (2 chunks before/after).
Document results also include source_type="document", filename, document_title, and document_id
when available, making it easy to identify which document a result came from.

| Name | Required | Description | Default |
|---|---|---|---|
| tags | No | ||
| limit | No | ||
| query | Yes | ||
| end_date | No | ||
| start_date | No | ||
| source_type | No | ||
| include_inactive | No |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond the readOnlyHint annotation, the description adds valuable behavioral context: it explains the hybrid search methodology, documents that inactive memories are excluded by default, clarifies the OR logic for tags, and discloses that document results include surrounding context chunks (2 before/after).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (intro, Args, Returns, Examples, Notes). Front-loaded with purpose. The 4 examples are slightly lengthy but provide concrete value given the filtering complexity. Every section conveys necessary information without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 7 parameters with no schema descriptions and the presence of an output schema, the description is remarkably complete: it documents all inputs, summarizes return values, provides multiple usage examples, and explains edge cases (inactive memory handling, document chunk context).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage (only titles provided), the Args section fully compensates by documenting all 7 parameters with clear semantics, formats (ISO 8601 for dates), defaults (e.g., limit default 10), and logic (OR for tags).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool uses 'hybrid search to find relevant memories, documents, and connections' (specific verb+resources), but does not explicitly distinguish when to use 'recall' versus the sibling 'search' tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit 'when to use' guidance is provided, but the extensive examples section demonstrates implied usage patterns for various filtering scenarios (by date, tags, source type), which helps users understand applicable contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reflect (A, Read-only)
Reflect on recent thoughts and patterns.
Analyzes recent activity to identify patterns, topics, and insights.
Useful for understanding "what have I been thinking about?"
By default, only returns user-created memories (not document chunks).
Set include_documents=True to also include chunks from uploaded documents.
⚠️ EXPERIMENTAL:
- Importance weighting in results not yet implemented. Importance scores are stored but don't affect ranking.
Args:
time_window: Time period to analyze ('recent', 'today', 'week', 'month', '1d', '7d', '30d', '90d')
include_documents: Whether to include document chunks (default: False, only user memories)
start_date: Filter memories created on or after this date (ISO 8601: '2025-01-01' or '2025-01-01T00:00:00Z')
end_date: Filter memories created on or before this date (ISO 8601: '2025-01-09' or '2025-01-09T23:59:59Z')
ctx: MCP context (automatically provided)
Returns:
Dict with analysis including top memories, active topics, patterns
Examples:
>>> await reflect("recent")
{'success': True, 'memories_analyzed': 50, 'active_topics': [...], ...}
>>> await reflect("week", include_documents=True)
{'success': True, 'memories_analyzed': 150, ...} # includes document chunks
>>> await reflect(start_date="2025-01-01", end_date="2025-01-07")
{'success': True, 'memories_analyzed': 25, ...} # memories from first week of January

| Name | Required | Description | Default |
|---|---|---|---|
| end_date | No | ||
| start_date | No | ||
| time_window | No | recent | |
| include_documents | No |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
While annotations establish readOnlyHint=true (safe to call), the description adds crucial behavioral context: the experimental warning about non-functional importance weighting, the distinction between user memories vs document chunks by default, and the return structure (Dict with analysis).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear sections (tagline, usage, warning, Args, Returns, Examples) and front-loaded purpose. It is lengthy due to necessary parameter documentation (given empty schema) and examples, but every section earns its place with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, the description appropriately provides a high-level overview of returns rather than detailed fields. It covers parameters, behavior, experimental status, and includes concrete examples. Only missing explicit sibling differentiation to be complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage (only titles present), the description fully compensates by documenting all 4 parameters in the Args section with specific formats (ISO 8601), enum-like values for time_window, and semantic meaning (e.g., 'only user memories' for include_documents).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool 'Analyzes recent activity to identify patterns, topics, and insights' using specific verbs and resources. However, it does not explicitly differentiate from siblings like 'recall' or 'search' beyond implying analytical vs. retrieval behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides context via 'Useful for understanding 'what have I been thinking about?'' and explains when to set include_documents=True. However, it lacks explicit guidance on when to use this versus alternatives like 'search' or 'recall', or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
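The enum-like `time_window` values the review praises can be validated client-side. The accepted strings come from the docstring; the timedelta each maps to for 'recent', 'today', 'week', and 'month' is our guess for illustration, and the server's actual windows may differ.

```python
from datetime import timedelta

# Sketch: time_window values reflect() accepts, per its docstring.
# Duration mappings for the word forms are assumptions.
WINDOWS = {
    "recent": timedelta(days=1), "today": timedelta(days=1),
    "week": timedelta(days=7), "month": timedelta(days=30),
    "1d": timedelta(days=1), "7d": timedelta(days=7),
    "30d": timedelta(days=30), "90d": timedelta(days=90),
}

def window_to_delta(window: str) -> timedelta:
    try:
        return WINDOWS[window]
    except KeyError:
        raise ValueError(f"unsupported time_window: {window!r}") from None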
restore_context (A, Read-only)
Resume work from a saved cognitive context.
This provides a narrative briefing to quickly orient you to:
- The investigation that was in progress
- Key discoveries and insights made
- Current hypotheses being tested
- Open questions and blockers
- Suggested next steps
- All relevant memories with their connections
The briefing reconstructs the cognitive state, not just the data. You'll understand
not just WHAT was discovered, but WHY it matters and HOW the understanding evolved.
Example of what you'll receive:
"[API Timeout Investigation - Resuming after 2 hours]
SITUATION: You were investigating production API timeouts that occur at exactly batch_size=100.
This investigation started when user reported timeouts only in production, not staging.
PROGRESS MADE:
- Identified sharp cutoff at 100 items (not gradual degradation)
- Disproved connection pool theory (monitoring showed only 43/200 connections used)
- Found root cause: MAX_BATCH_SIZE=100 hardcoded in batch_handler.py:147
- Confirmed staging uses different config override (MAX_BATCH_SIZE=500)
EVIDENCE CHAIN:
User report → Reproduced locally → Noticed batch_size correlation → Searched codebase for
limits → Found MAX_BATCH_SIZE → Checked staging config → Discovered config difference
CORRECTED MISUNDERSTANDINGS:
- Initially thought it was Redis connection exhaustion (disproven by monitoring)
- Assumed gradual performance degradation (actually sharp cutoff)
- Thought staging/production were identical (config differs)
CURRENT HYPOTHESIS: Production deployment uses default MAX_BATCH_SIZE=100 from code, while
staging has environment variable override. Fix requires either code change or prod config update.
BLOCKED ON: Need production deployment access to apply fix. User considering whether to
change code default or add production environment variable.
RECOMMENDED NEXT STEPS:
1. Verify production environment variables (check if MAX_BATCH_SIZE is set)
2. If not set, add MAX_BATCH_SIZE=500 to production config
3. If code change preferred, update default in batch_handler.py
4. Run load test with batch_size=100-500 range to verify fix
KEY MEMORIES FOR REFERENCE:
- 'Initial timeout report from user' - Starting point of investigation
- 'MAX_BATCH_SIZE discovery' - Root cause identification
- 'Redis monitoring data' - Evidence disproving connection theory
- 'Staging config analysis' - Explanation for environment difference"
This cognitive handoff ensures you can continue the work with full understanding of
the problem space, previous attempts, and current direction. The narrative preserves not
just facts but the reasoning process, mistakes made, and lessons learned.
SPECIAL CASE: restore_context("awakening")
The name "awakening" is reserved for loading the user's personality configuration.
This loads the Awakening Briefing which includes:
- Selected persona identity and voice style
- Custom personality traits (Premium+ users)
- Any quirks and boundaries from the persona preset
Args:
name: Name or ID of context to restore. Can be:
- Context name (exact match, case-sensitive)
- Context UUID (from list_contexts output)
- "awakening" for personality briefing
limit: Maximum number of memories to restore (default 20)
ctx: MCP context (automatically provided)
Returns:
Dict with:
- success: Whether restoration succeeded
- description: The cognitive handoff briefing
- memories: List of relevant memories
- context_id: The restored context identifier

| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | ||
| limit | No |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Exceptional transparency beyond annotations. The extensive narrative example showing the '[API Timeout Investigation...]' output format precisely demonstrates what the agent receives. Explains the cognitive model (reconstructs understanding, not just facts). Clarifies the 'awakening' special case behavior (personality config vs investigation context). Adds significant value beyond the readOnlyHint annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Lengthy but justified given complexity. Structure is logical: purpose → value proposition → concrete output example → cognitive explanation → special case → Args/Returns. The large example is necessary to demonstrate the narrative output format. Could be tightened slightly but every section earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complete for a complex cognitive tool. Covers the narrative reconstruction concept, special awakening case, parameter semantics, and return structure. Given has_output_schema=true, the description appropriately focuses on explaining the cognitive model rather than return value mechanics, while still providing the example output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Fully compensates for 0% schema coverage. Documents 'name' parameter extensively: accepts exact match (case-sensitive), UUID from list_contexts, or special value 'awakening'. Documents 'limit' as memory count cap. References sibling tool list_contexts for UUID discovery, aiding tool chaining decisions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent clarity with 'Resume work from a saved cognitive context' - specific verb (resume) + resource (cognitive context). Explicitly distinguishes from siblings like 'awaken' via the 'SPECIAL CASE: restore_context(awakening)' section, and differentiates from simple memory recall by emphasizing narrative reconstruction vs data retrieval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Clear context via the detailed investigation example showing when to use (resuming complex multi-step work). Implies distinction from 'recall' (which returns raw memories) by emphasizing narrative briefing, but lacks explicit 'when not to use' or direct comparison. The awakening special case provides concrete alternative routing.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
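The three accepted forms of the `name` argument documented above (the reserved word "awakening", a UUID from list_contexts, or an exact case-sensitive context name) can be distinguished before calling. The classifier below is our own illustration of that routing, not part of the server.

```python
import uuid

# Sketch: classify a restore_context target per the docstring's three forms.
def classify_restore_target(name: str) -> str:
    if name == "awakening":          # reserved: personality briefing
        return "awakening-briefing"
    try:
        uuid.UUID(name)              # IDs come from list_contexts output
        return "uuid"
    except ValueError:
        return "context-name"        # exact, case-sensitive name match
```

Per the docstring's guidance, preferring the UUID form avoids the case-sensitive exact-match requirement of name lookups.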
retrieve_artifact (A, Read-only)
Retrieve an artifact from storage.
Fetches the content of a previously saved artifact.
Args:
path: Full path of the artifact (e.g., "/project/docs/api.md")
Returns:
Artifact content or error message
Examples:
>>> await retrieve_artifact("/readme.md")
"# README\nThis is the readme content..."| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true (safe read). The description adds context that artifacts are stored and retrieved by path, and mentions an 'error message' for failure modes. It does not address openWorldHint implications or whether paths are absolute or relative.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Docstring structure is readable but verbose. The 'Returns' section is redundant given the output schema exists and could be omitted; the Examples are helpful for a retrieval tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter retrieval tool. Description covers the undocumented parameter via Args section. Output schema handles return values, though 'error message' mention hints at error handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Excellent compensation for 0% schema coverage: describes the path parameter as 'Full path' with concrete example '/project/docs/api.md'. Agents can understand the expected format despite empty schema description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Retrieve'/'Fetches') and resource ('artifact') with scope ('from storage'). The phrase 'previously saved' helps distinguish from save_artifact, though it could clarify distinction from siblings like fetch/recall.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'previously saved artifact' implies usage constraints (only for existing saved artifacts), providing implied guidance. However, lacks explicit when-to-use versus alternatives like fetch or recall.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
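Since the docstring only implies the path format through its examples ("/project/docs/api.md" style: absolute, ending in a filename), a client-side sanity check is easy to sketch. This is an assumption about the server's validation rules, which are not actually specified.

```python
# Sketch: does a string match the path shape the artifact tools' examples
# show? Absolute, with a filename, no stray whitespace. The real server
# may be stricter or looser; this is a guess from the documented examples.
def looks_like_artifact_path(path: str) -> bool:
    return (
        path.startswith("/")        # every documented example is absolute
        and not path.endswith("/")  # must include a filename
        and path == path.strip()
    )
```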
save_artifact (A)
Save an artifact to storage.
Stores user-created content (diagrams, notes, code) in an organized
file structure. Content is also indexed for search.
Args:
content: File content to save
path: Full path including filename (e.g., "/project/docs/api.md")
Returns:
Success message or error description
Examples:
>>> await save_artifact("# README", "/readme.md")
"✅ Artifact saved: /readme.md (8 bytes)"
>>> await save_artifact("<svg>...</svg>", "/diagrams/architecture.svg")
"✅ Artifact saved: /diagrams/architecture.svg (image/svg+xml, 45 bytes)"| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | ||
| content | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=false and openWorldHint=true; the description adds valuable behavioral context beyond these hints, specifically noting that content is 'indexed for search' and stored in an 'organized file structure'. It also documents return patterns (success messages vs errors) not covered by annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Follows a clear docstring structure (summary, Args, Returns, Examples) with the purpose front-loaded in the first sentence. The examples are somewhat verbose (Python REPL style) but serve as practical documentation. No wasted sentences in the descriptive sections.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter tool with no nested complexity, the description is complete. It compensates for missing schema descriptions via Args documentation, includes return value documentation (addressing the output schema presence), and provides concrete usage examples showing expected return formats.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the Args section compensates effectively by documenting both parameters ('File content to save', 'Full path including filename') with a concrete path example. Examples further illustrate usage patterns, though detailed format constraints for content types could be clearer.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Save an artifact to storage' with specific resource types (diagrams, notes, code) and mentions indexing behavior. It distinguishes from retrieve_artifact and delete_artifacts siblings implicitly through the verb 'save', but does not differentiate from similar storage siblings like 'store' or 'save_context'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description characterizes suitable content types (user-created diagrams, notes, code), it provides no explicit guidance on when to choose this tool over similar storage siblings like 'store', 'save_context', or 'update_memory', or prerequisites for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
save_context (A)
Save your cognitive state for handoff to another agent.
Include your investigation context:
- What session/investigation is this part of?
- What role/perspective were you taking?
- Who might pick this up next? (another Claude, human, Claude Code?)
Reference specific memories that matter:
- Key discoveries (with memory IDs or quotes)
- Critical evidence memories
- Important questions that were raised
- Hypotheses that were tested
Before saving, organize your thoughts:
1. PROBLEM: What were you investigating?
2. DISCOVERED: What did you learn for certain? (reference the memories)
3. HYPOTHESIS: What do you think is happening? (cite supporting memories)
4. EVIDENCE: What memories support or contradict this?
5. BLOCKED ON: What prevented further progress?
6. NEXT STEPS: What should be investigated next?
7. KEY MEMORIES: Which specific memories are essential for understanding?
Example descriptions:
"[API Timeout Investigation - 3 hour session] Investigating production API timeouts as code
analyst. Found correlation with batch_size=100 due to hardcoded limit in batch_handler.py
(see memory: 'MAX_BATCH_SIZE discovery'). Confirmed not Redis connection issue - monitoring
showed only 43/200 connections used (memory: 'Redis connection analysis'). Earlier hypothesis
about connection pool exhaustion (memory_id: abc-123) was disproven. Key insight came from
comparing 99 vs 100 batch behavior (memory: 'batch threshold testing'). Blocked on: need
production access to verify fix. Next: Deploy with MAX_BATCH_SIZE=200 to staging first.
Essential memories for handoff: 'MAX_BATCH_SIZE discovery', 'Redis monitoring results',
'Production vs staging comparison'. Ready for handoff to SRE team for deployment."
"[Memory System Debugging - From Claude Code perspective] Worked on scoring issues where
recall wasn't finding recent memories. Discovered RRF scores (0.005-0.016) were below MCP
threshold of 0.05 (memory: 'RRF scoring analysis'). Implemented weighted linear fusion to
replace RRF (memory: 'fusion algorithm implementation'). Testing showed immediate improvement
(memory: 'fusion testing results'). This builds on earlier investigation about recall failures
(memory: 'user report of recall issues'). Critical memories for continuation: 'RRF scoring
analysis', 'ADR-023 decision', 'fusion testing results'. Next agent should verify scoring
with real queries."
"[Context Save/Restore Bug Investigation - 4 hour debugging session with user] Started with
user noticing list_contexts returned empty despite saved contexts existing. Investigation
revealed two critical bugs: (1) list_contexts was using hybrid search for 'checkpoint' word
instead of filtering by memory_type (memory: 'hybrid search misuse discovery'), (2)
restore_context hardcoded limit of 10 memories despite contexts having 20+ (memory:
'hardcoded limit bug'). Root cause analysis showed save_context grabs 20 most recent memories
regardless of relevance - fundamental design flaw (memory: 'save_context design flaw analysis').
EVIDENCE CHAIN: User reported empty list -> checked DB, contexts exist -> examined
list_contexts code -> found hybrid search looking for word 'checkpoint' -> tested /memories
endpoint with memory_type filter -> confirmed working -> implemented fix using direct endpoint.
INSIGHTS: The narrative description is doing 90% of cognitive handoff work. Memories are
supporting evidence, not primary carriers of understanding (memory: 'narrative vs memories
insight'). This suggests doubling down on narrative richness rather than perfecting memory
selection.
CORRECTED UNDERSTANDING: Initially thought memories weren't being returned. Actually they
were, just wrong ones - recent memories instead of relevant ones (memory: 'memory selection
correction').
CRITICAL MEMORIES: 'hybrid search misuse discovery', 'save_context design flaw analysis',
'narrative vs memories insight', '/memories endpoint test results'.
NEXT AGENT: Should implement Phase 2 - semantic search for relevant memories within
investigation timeframe. Ready for handoff to any Claude agent for implementation."
When referencing memories:
- **RELIABLE** — Use memory IDs: "memory_id: abc-123" (direct lookup, always works)
- **BEST-EFFORT** — Use descriptive phrases: "see memory: 'Redis connection analysis'"
(uses search + substring matching, may not resolve if the memory isn't in top results)
- Group related memories: "Essential memories: 'X', 'Y', 'Z'"
**Prefer memory_id references** whenever you have the UUID. Semantic phrase references
are a convenience that works most of the time, but may silently fail to resolve.
The response will tell you how many references resolved so you can retry with UUIDs
if needed.
Args:
name: Name for this context checkpoint
description: Detailed cognitive handoff description with memory references
ctx: MCP context (automatically provided)
Returns:
Dict with success status, context_id, and memories included

| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | ||
| description | No |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds important behavioral details beyond annotations: explains memory resolution behavior (references may silently fail, response includes resolution count), mentions the 20-memory limit in examples, and describes return structure (Dict with success, context_id, memories). Minor gap: doesn't clarify idempotency or name collision behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (bullet points, numbered list, formatted examples), but extremely verbose. The three full investigation examples, while illustrative, consume significant tokens. Content is valuable but could be more concise—agent must parse large text to extract the 7-step structure essential for usage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a complex mutation tool. Compensates for zero schema descriptions with extensive Args documentation. Covers inputs (with examples), outputs (return dict structure), behavioral constraints (memory limits, resolution logic), and operational guidance (retry with UUIDs). Appropriate given the cognitive complexity of the handoff task.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema coverage, the Args section extensively documents both user-facing parameters. 'name' is clearly defined as checkpoint identifier. 'description' documentation is exceptional with multi-paragraph explanation, formatting requirements, and three full examples. Deducted one point because 'ctx' is mentioned but unclear if it's user-facing, and schema lacks descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specific purpose: 'Save your cognitive state for handoff to another agent' provides clear verb (save), resource (cognitive state), and scope (handoff). Distinguishes from siblings (save_artifact vs save_context, recall vs restore_context) by emphasizing session investigation handoffs and the narrative structure required.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Exceptional guidance with explicit when-to-use (handoff situations), structured 7-step template (PROBLEM through KEY MEMORIES), and three detailed examples showing exact formatting. Also specifies memory reference syntax (memory_id vs descriptive phrases) and reliability differences.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search (B) · Read-only
Search for memories.
Returns results with proper citation support (id, title, url, text fields).
Args:
query: Search query
limit: Maximum results (default 10)
include_inactive: Include superseded/contradicted memories (default: false)
ctx: MCP context
Returns:
Dict with 'results' array containing id, title, url, text fields

| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | Yes | | |
| include_inactive | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
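For illustration, a `search` invocation maps onto an MCP `tools/call` request like the sketch below. The query text is made up, and `ctx` is omitted because it is injected server-side rather than supplied by the client (which is why it does not appear in the input schema):

```python
import json

# Illustrative MCP tools/call payload for the `search` tool.
request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "search",
        "arguments": {
            "query": "database latency",
            "limit": 5,
            "include_inactive": False,  # default; True also returns superseded/contradicted memories
        },
    },
}
print(json.dumps(request))
```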
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds valuable context beyond annotations: explains that inactive memories are 'superseded/contradicted', describes the citation support format (id, title, url, text), and outlines the return structure. Complements readOnlyHint and openWorldHint well without contradicting them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Uses standard Args/Returns structure, which is readable, but wastes space documenting 'ctx', which is not present in the input schema. The Returns section is redundant given that an output schema exists.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a read-only search tool: explains the citation-rich output and inactive memory semantics. However, leaves gaps on sibling differentiation and doesn't elaborate on openWorldHint implications (e.g., external vs. internal search).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema coverage, the description partially compensates: 'include_inactive' is excellently documented with semantic meaning, 'limit' mentions its purpose and default, but 'query' merely repeats the parameter name without adding syntax or format guidance.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States 'Search for memories' with specific verb and resource. Clear about the core function, but does not explicitly differentiate from similar siblings like 'recall', 'explore', or 'fetch'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use 'search' versus 'recall' or other retrieval siblings. No mention of prerequisites or when to set include_inactive to true.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
store (A)
Store important information from your work.
Write detailed, complete thoughts with context, reasoning, and evidence.
**Always use the connect tool** to link related items - this builds knowledge graphs for better recall.
## Memory Types (auto-detected, but be aware):
- **FACT**: Something observed or verified
- **INSIGHT**: A pattern or realization
- **CONVERSATION**: Dialogue or exchange content
- **CORRECTION**: Fixing prior understanding
- **REFERENCE**: Source material or citation
- **TASK**: Action item or work to be done
- **CHECKPOINT**: Conversation state snapshot
- **IDENTITY_CORE**: Immutable AI identity
- **PERSONALITY_TRAIT**: Evolvable AI traits
- **RELATIONSHIP**: User-AI relationship info
- **STRATEGY**: Learned behavior patterns
## Session Context
If in an ongoing work session, include:
- Session identifier: [Project/Session Name]
- Your perspective: "As [role]:" or "From [viewpoint]:"
- Current thread: What specific angle you're exploring
## What to Include
- **WHAT**: The discovery or thought
- **WHY**: Its significance
- **HOW**: Your reasoning process
- **EVIDENCE**: Supporting data/observations
- **CONNECTIONS**: Related memories to link
## Examples
### Technical Investigation
"[Performance Analysis] FACT: Database queries account for 73% of request latency
(measured across 10K requests). Specifically, the user_permissions JOIN takes 340ms
average. This contradicts hypothesis about caching issues (memory: 'cache analysis').
Evidence: APM traces show full table scan on permissions table. Next: investigate
missing index on foreign key."
### Learning & Research
"[ML Study Session] INSIGHT: Attention mechanisms work like dynamic routing - the model
learns WHERE to look, not just WHAT to see. This explains transformer advantages over
RNNs on long sequences (builds on memory: 'sequence modeling comparison'). The key-query-
value structure creates a learnable addressing system. Connects to: 'human attention
research', 'information retrieval basics'."
### Creative Work
"[Story Development] HYPOTHESIS: The protagonist's reluctance stems from betrayal, not
fear. Evidence: Three trust-questioning scenes, locked door symbolism throughout,
deflection patterns in collaborative dialogue. This reframes the arc from 'overcoming
fear' to 'rebuilding trust' (corrects memory: 'initial character motivation'). Would
explain the guardian's patience and emphasis on small victories."
### Problem Solving
"[Bug Hunt - Payment Flow] CORRECTION to 'timezone hypothesis': The 3am failures aren't
timezone-related but due to batch job lock contention. Evidence: Perfect correlation with
backup_jobs.log timestamps. The timezone pattern was spurious - batch runs at midnight
PST (3am EST). Solution: implement job queuing."
## Connection Phrases
- "Building on [earlier observation]..."
- "Contradicts [hypothesis in memory X]"
- "Answers [question from session Y]"
- "Confirms pattern from [memory Z]"
- "Extends thinking in [previous work]"
Note: Every stored item is a node. Every connection is an edge. Rich graphs enable powerful recall.
⚠️ EXPERIMENTAL FIELDS:
- **importance**: Stored for future ranking optimization. Currently not integrated into search results.
- **confidence**: Returned in response for analysis. Behavior and calculation method subject to change.
Args:
content: Detailed memory content with context and evidence
tags: Optional tags to categorize the memory
importance: Optional importance score (0.0-1.0) - EXPERIMENTAL
ctx: MCP context (automatically provided)
Returns:
Dict with success status, memory_id, type, importance, and confidence

| Name | Required | Description | Default |
|---|---|---|---|
| tags | No | | |
| content | Yes | | |
| importance | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
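To make the Returns contract concrete, here is a hypothetical `store` response shaped per the fields named above. Every value is illustrative, not real server output:

```python
# Hypothetical `store` response: success, memory_id, type, importance, confidence.
response = {
    "success": True,
    "memory_id": "00000000-0000-0000-0000-000000000000",  # placeholder UUID
    "type": "FACT",      # auto-detected memory type
    "importance": 0.7,   # experimental field, not yet used in ranking
    "confidence": 0.9,   # experimental field, calculation subject to change
}

# A client would keep memory_id so the `connect` tool can later link this
# node to related memories, building out the knowledge graph.
if response["success"]:
    new_node = response["memory_id"]
print(new_node)
```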
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=false and openWorldHint=true. Description adds critical behavioral context: memory types are auto-detected (not required), experimental nature of importance/confidence fields with explicit warnings ('Currently not integrated into search results'), and graph semantics ('Every stored item is a node').
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely verbose with 11 auto-detected memory types listed, 4 large examples, and multiple sections. While examples are valuable for a semantic tool, the memory type enumeration is redundant given auto-detection, and Args section is buried. Every sentence has value, but there are too many sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive coverage appropriate for a complex semantic memory tool. Explains domain model (memory types, connections, session context), return values ('Dict with success status...'), and integration with sibling tools. Rich output schema exists but description provides necessary domain context beyond raw schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. Description compensates via Args section documenting all 3 parameters: content ('Detailed memory content'), tags ('Optional tags'), and importance ('Optional importance score 0.0-1.0'). Deducted one point because parameter docs are buried at the end of a lengthy description rather than prominent.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb+resource ('Store important information') and clearly distinguishes from siblings like 'save_artifact' (files) vs semantic memories, and 'recall'/'search' (retrieval). Explicitly frames the tool as building knowledge graph nodes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly references sibling tool 'connect' ('Always use the connect tool to link related items'), establishing workflow relationship. Provides context on when to include session identifiers. Lacks explicit 'when not to use' contrast with similar storage siblings like 'save_artifact'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_memory (A)
Update an existing memory.
Modifies properties of a stored memory by its UUID.
Args:
memory_id: UUID of memory to update
content: New content (optional)
importance: New importance score (optional, 0.0-1.0)
tags: New tags (optional, replaces existing tags)
ctx: MCP context (automatically provided)
Returns:
Dict with success status and updated memory_id
Examples:
>>> await update_memory("uuid-here", importance=0.9)
{'success': True, 'memory_id': 'uuid-here'}
>>> await update_memory("uuid-123", tags=["python", "errors", "important"])
{'success': True, 'memory_id': 'uuid-123'}

| Name | Required | Description | Default |
|---|---|---|---|
| tags | No | | |
| content | No | | |
| memory_id | Yes | | |
| importance | No | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
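One caveat worth sketching: because `tags` replaces the existing list rather than appending to it, a client that wants to add a single tag must resubmit the full set. A minimal sketch, assuming the current tags were retrieved beforehand (the tag values are illustrative):

```python
# Current tags, as previously retrieved for this memory.
existing_tags = ["python", "errors"]

# Union in the new tag, then send the complete list, since update_memory
# replaces tags wholesale rather than merging.
updated_tags = sorted(set(existing_tags) | {"important"})
print(updated_tags)  # -> ['errors', 'important', 'python']

# The actual call would then be:
#   await update_memory("uuid-123", tags=updated_tags)
```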
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate mutating (readOnlyHint:false) and external (openWorldHint:true). Description adds crucial behavioral details: partial updates allowed (optional args), tags 'replace' existing (destructive warning), return dict structure, and concrete examples.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with summary, Args, Returns, and Examples. Slightly verbose due to Args block, but justified by zero schema coverage. Examples add value without excessive length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for 4-param mutation tool with output schema present. Covers inputs, partial update semantics, return structure, and usage examples. Lacks only explicit error handling details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage (only titles). Description fully compensates via Args block documenting all 4 parameters: memory_id format, content meaning, importance range (0.0-1.0), and tags behavior. Essential given schema lacks descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Update/Modifies), resource (memory), and identifier (UUID). Clear scope but does not explicitly contrast with sibling 'store' (likely for creation) beyond implying 'existing' memory.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage by requiring 'existing memory' and UUID, but lacks explicit 'when to use vs alternatives' guidance. No mention of prerequisites (e.g., memory must exist) or when to prefer this over recreating.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
- Control your server's listing on Glama, including description and metadata
- Access analytics and receive server usage reports
- Get monitoring and health status updates for your server
- Feature your server to boost visibility and reach more users
For users:
- Full audit trail — every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control — enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management — store and rotate API keys and OAuth tokens in one place
- Change alerts — get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
- Proven adoption — public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics — see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback — users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.