anneal-memory
Server Quality Checklist
- Disambiguation 5/5
Each tool represents a distinct phase of the episodic memory lifecycle with clear boundaries: recording single events, querying history, initiating session wrap, finalizing compressed state, and health monitoring. No functional overlap exists.
- Naming Consistency 3/5
Mixed patterns: prepare_wrap and save_continuity follow the verb_noun convention, while recall and record use bare verbs, and status is a noun. All use snake_case, but grammatical patterns vary significantly between tools.
- Tool Count 5/5
Five tools appropriately cover the scoped domain of session-based memory management with compression: write, query, lifecycle start, lifecycle end, and diagnostics. No bloat, redundancy, or obvious missing pieces for the specific annealing workflow.
- Completeness 4/5
Covers the core lifecycle well, though it lacks explicit tools for deleting or amending individual episodes (the store may be intentionally append-only), and reading the current continuity content requires prepare_wrap, which marks a wrap as in-progress.
Average 4.4/5 across all 5 tools.
See the tool scores section below for per-tool breakdowns.
This repository includes a README.md file.
This repository includes a LICENSE file.
Latest release: v0.1.4
No tool usage detected in the last 30 days.
This repository includes a glama.json configuration file.
This server provides 5 tools.
No known security issues or vulnerabilities reported.
This server has been verified by its author.
Tool Scores
record
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It successfully explains the accumulation behavior (episodes collect during session) and ultimate destination (compression into continuity file), but lacks operational details like idempotency, memory limits, concurrency, or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences with zero waste: sentence 1 defines the action, sentence 2 lists trigger conditions, sentence 3 gives content guidance with concrete example, sentence 4 explains session lifecycle. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 100% schema coverage and no annotations/output schema, the description appropriately focuses on systemic context—explaining how episodic recording fits into the broader continuity workflow. Missing only technical constraints (rate limits, max episode size) to achieve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, establishing a baseline of 3. The description adds usage context ('Chose X because Y') that informs the content parameter but does not elaborate syntax, formatting rules, or dependencies beyond what the schema already documents.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Record') and resource ('typed episode to memory'). It clearly distinguishes from siblings by explaining that episodes accumulate during a session and serve as raw material for compression into the continuity file at session end, positioning it distinctly from save_continuity and recall.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit positive guidance on when to call ('when important decisions are made, patterns are noticed...') and content quality expectations ('Record the reasoning, not just the fact'). Lacks explicit exclusions or named alternative tools (e.g., when to use recall instead).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
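Putting the record assessments above together, a call to the tool over MCP might look like the following JSON-RPC request. This is a hedged sketch: the content parameter is named in the review text, but the type field and its value are assumptions inferred from "typed episode", and the argument value simply follows the recommended 'Chose X because Y' pattern.

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "record",
    "arguments": {
      "type": "decision",
      "content": "Chose append-only episode storage because amendment tooling adds complexity without clear benefit."
    }
  }
}
```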
recall
- Behavior 4/5
With no annotations provided, the description carries the full behavioral burden. It successfully discloses return ordering ('ordered by timestamp, newest first') and return type ('matching episodes'). However, it omits explicit read-only safety assertions or rate limit warnings that would help an agent understand operational constraints.
- Conciseness 5/5
Four well-structured sentences: purpose (1), usage scenarios (1), return behavior (1), filter summary (1). Zero redundancy; every sentence earns its place with distinct information.
- Completeness 4/5
For a 7-parameter query tool with no output schema, the description adequately covers return format and ordering. It could improve by defining the domain-specific term 'episode' or describing behavior when no matches are found, but is sufficient for invocation.
- Parameters 3/5
Schema description coverage is 100%, establishing a baseline of 3. The description maps conceptual filter categories ('time range, type, source, and keyword') to parameters but adds no syntax, format details, or semantic constraints beyond what the schema already provides.
- Purpose 5/5
The description opens with 'Query episodes from memory with filters': a specific verb (Query), resource (episodes), and scope. It clearly distinguishes from sibling 'record' (which implies write/create) by emphasizing retrieval.
- Usage Guidelines 4/5
Provides explicit when-to-use scenarios: 'find prior context before making decisions,' 'locate specific episodes for citation during graduation,' and 'review recent work.' However, it lacks explicit when-not-to-use guidance and does not name alternatives such as 'use record instead for writing'.
prepare_wrap
- Behavior 5/5
With no annotations, the description carries the full disclosure burden and succeeds. It explicitly states the side effect 'Marks a wrap as in-progress', details the complex return structure (episodes, continuity file, stale warnings, compression instructions), and explains the cognitive purpose of the compression step.
- Conciseness 4/5
Front-loaded with purpose and trigger conditions. Five sentences total, each earning its place, including the philosophical note about 'real thinking' that clarifies the abstract compression concept. Slightly verbose but justified given the lack of an output schema.
- Completeness 4/5
Compensates well for missing annotations and output schema by describing return values and workflow. Covers domain concepts (episodes, continuity files, stale patterns). Minor gap: it doesn't specify error conditions or in-progress state conflicts.
- Parameters 3/5
The schema has 100% description coverage for both parameters (max_chars and staleness_days). The description does not repeat parameter details, which is appropriate when the schema is self-documenting. The baseline score applies.
- Purpose 5/5
The description uses the specific verb 'Prepare' with the clear resource 'compression package for session wrap'. It effectively distinguishes from siblings by contrasting with save_continuity (which persists the data) and implying this handles session boundaries rather than ongoing recording.
- Usage Guidelines 5/5
Excellent explicit guidance: 'Call this at session boundaries — when work is ending, the user says to wrap up, or the session is getting long'. Also provides a clear workflow sequence mentioning save_continuity as the required follow-up step, clarifying the tool chain.
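As a sketch, a prepare_wrap invocation at a session boundary could look like the JSON-RPC request below. The parameter names max_chars and staleness_days come from the schema notes above; the values are purely illustrative.

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "prepare_wrap",
    "arguments": {
      "max_chars": 4000,
      "staleness_days": 14
    }
  }
}
```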
status
- Behavior 4/5
With no annotations provided, the description carries the full burden. It compensates well by extensively detailing the return payload (episode counts, wrap history, continuity file size, wrap status). It implies a read-only nature via 'Get', though it could explicitly state 'safe to call' given the mutation siblings context.
- Conciseness 5/5
Three efficient clauses: purpose statement ('Get memory health metrics'), usage timing ('Call this at session start...'), and return value specification ('Returns episode counts...'). Front-loaded with no wasted words.
- Completeness 5/5
A rich description compensates for the missing output schema by exhaustively listing return fields. It explains the domain-specific concepts (wrap, continuity, episodes) necessary to understand the tool's utility. Complete for a zero-parameter diagnostic tool.
- Parameters 4/5
The tool takes zero parameters; per the calibration rules, zero parameters yields a baseline of 4. The description correctly omits parameter discussion since the schema requires no input.
- Purpose 5/5
The specific verb 'Get' plus the resource 'memory health metrics' clearly identifies this as a diagnostic read operation. It is distinguished from its action-oriented siblings (prepare_wrap, record, save_continuity) by its observational purpose.
- Usage Guidelines 4/5
Explicitly states when to call: 'at session start' and 'when diagnosing issues.' Provides clear temporal context for invocation. Could be strengthened by explicitly contrasting with the mutation siblings, but the verb choice establishes clear separation.
save_continuity
- Behavior 4/5
With no annotations provided, the description carries the full behavioral disclosure burden. It comprehensively documents the validation logic: structure checking (4 required sections), citation verification against real episodes, explanation overlap detection, citation gaming detection, and consequence handling (demotion of ungrounded graduations). It also specifies the return values 'validation results and section sizes' despite the lack of an output schema. Minor gap: it doesn't specify whether the operation is idempotent or how duplicates are handled.
- Conciseness 5/5
Five information-dense sentences with zero waste. Front-loaded with action and resource, followed by workflow context, input constraints, validation details, and return values. Each sentence serves a distinct purpose, and the structure mirrors the invocation lifecycle.
- Completeness 4/5
For a validation-intensive tool with no annotations and no output schema, the description admirably compensates by detailing the complex validation logic and return value characteristics. It would benefit from an explicit mention of error handling behavior or failure modes when validation fails, but is otherwise comprehensive.
- Parameters 4/5
Schema coverage is 100%, but the description adds crucial semantic detail by enumerating the exact 4 required section headers (## State, ## Patterns, ## Decisions, ## Context) that the schema only refers to generically as 'all 4 required sections'. This specificity is essential for correct invocation.
- Purpose 5/5
The description opens with the specific verbs 'Validate and save' and identifies the exact resource 'compressed continuity file'. It explicitly references the sibling tool 'prepare_wrap' to situate this tool in the workflow sequence, clearly distinguishing it from the recall/record/status siblings.
- Usage Guidelines 5/5
Contains the explicit temporal guidance 'Call this after compressing your episodes' and cites the prerequisite sibling tool 'prepare_wrap'. Provides clear preconditions (the prepare_wrap instructions must be used first) and validates a specific input structure, effectively establishing when to use this tool versus alternatives.
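Given the four required section headers enumerated above, a continuity file submitted to save_continuity would presumably follow this shape. The section contents here are invented for illustration, and any episode-citation syntax is omitted because it isn't documented on this page.

```markdown
## State
Working on the memory lifecycle tooling; wrap in progress.

## Patterns
Episodes that record reasoning, not just facts, compress into fewer, denser lines.

## Decisions
Chose append-only episode storage because amendment tooling adds complexity without clear benefit.

## Context
Single-user local session focused on session-wrap compression.
```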
GitHub Badge
Glama performs regular codebase and documentation scans to:
- Confirm that the MCP server is working as expected.
- Confirm that there are no obvious security issues.
- Evaluate tool definition quality.
Our badge communicates server capabilities, safety, and installation instructions.
Card Badge
Copy to your README.md:
Score Badge
Copy to your README.md:
How to claim the server?
If you are the author of the server, you simply need to authenticate using GitHub.
However, if the MCP server belongs to an organization, you first need to add a glama.json file to the root of your repository:
```json
{
  "$schema": "https://glama.ai/mcp/schemas/server.json",
  "maintainers": [
    "your-github-username"
  ]
}
```

Then, authenticate using GitHub.
Browse examples.
How to make a release?
A "release" on Glama is not the same as a GitHub release. To create a Glama release:
- Claim the server if you haven't already.
- Go to the Dockerfile admin page, configure the build spec, and click Deploy.
- Once the build test succeeds, click Make Release, enter a version, and publish.
This process allows Glama to run security checks on your server and enables users to deploy it.
How to add a LICENSE?
Please follow the instructions in the GitHub documentation.
Once GitHub recognizes the license, the system will automatically detect it within a few hours.
If the license does not appear on the server after some time, you can manually trigger a new scan using the MCP server admin interface.
How to sync the server with GitHub?
Servers are automatically synced at least once per day, but you can also sync manually at any time to instantly update the server profile.
To manually sync the server, click the "Sync Server" button in the MCP server admin interface.
How is the quality score calculated?
The overall quality score combines two components: Tool Definition Quality (70%) and Server Coherence (30%).
Tool Definition Quality measures how well each tool describes itself to AI agents. Every tool is scored 1–5 across six weighted dimensions: Purpose Clarity (25%), Usage Guidelines (20%), Behavioral Transparency (20%), Parameter Semantics (15%), Conciseness & Structure (10%), and Contextual Completeness (10%). These combine into a per-tool Tool Definition Quality Score (TDQS). The server-level definition quality score is calculated as 60% mean TDQS + 40% minimum TDQS, so a single poorly described tool pulls the score down.
Server Coherence evaluates how well the tools work together as a set, scoring four dimensions equally: Disambiguation (can agents tell tools apart?), Naming Consistency, Tool Count Appropriateness, and Completeness (are there gaps in the tool surface?).
Tiers are derived from the overall score: A (≥3.5), B (≥3.0), C (≥2.0), D (≥1.0), F (<1.0). B and above is considered passing.
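The formula above can be sketched in a few lines of Python. This is an illustrative reimplementation of the published weights and cutoffs, not Glama's actual code, and the dimension key names are ours:

```python
# Illustrative sketch of the published scoring formula; not Glama's code.

# Tool Definition Quality dimension weights (sum to 1.0).
DIM_WEIGHTS = {
    "purpose": 0.25,
    "usage": 0.20,
    "behavior": 0.20,
    "parameters": 0.15,
    "conciseness": 0.10,
    "completeness": 0.10,
}

def tdqs(scores):
    """Weighted 1-5 Tool Definition Quality Score for one tool."""
    return sum(DIM_WEIGHTS[dim] * value for dim, value in scores.items())

def overall(tool_scores, coherence_scores):
    """Combine definition quality (70%) with server coherence (30%)."""
    per_tool = [tdqs(t) for t in tool_scores]
    # 60% mean + 40% minimum, so one weak tool drags the whole score down.
    definition_quality = 0.6 * (sum(per_tool) / len(per_tool)) + 0.4 * min(per_tool)
    # Coherence averages four equally weighted dimensions (disambiguation,
    # naming consistency, tool count, completeness).
    coherence = sum(coherence_scores) / len(coherence_scores)
    return 0.7 * definition_quality + 0.3 * coherence

def tier(score):
    """Map an overall score to a letter tier; B and above is passing."""
    for cutoff, letter in ((3.5, "A"), (3.0, "B"), (2.0, "C"), (1.0, "D")):
        if score >= cutoff:
            return letter
    return "F"
```

Feeding in this server's published numbers, the five per-tool dimension scores from the Tool Scores section and coherence scores of 5, 3, 5, and 4, this sketch yields an overall score of about 4.2, which lands in tier A.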
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/phillipclapham/anneal-memory'
If you have feedback or need assistance with the MCP directory API, please join our Discord server.