
Server Configuration

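Describes the environment variables that configure the server. Neither is strictly required; ANTHROPIC_API_KEY becomes necessary only when cloud summarization is enabled.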

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| ANTHROPIC_API_KEY | No | Your Anthropic API key for cloud summarization. Required if ALLOW_CLOUD_SUMMARIZATION is true. | |
| ALLOW_CLOUD_SUMMARIZATION | No | Set to 'true' to enable cloud summarization using the Claude API. Default is 'false'. | false |
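
The dependency between the two variables can be checked before launching the server. The following is a minimal sketch, not the server's own validation logic: the function name `check_summarization_env` is hypothetical, and treating the flag case-insensitively is an assumption (the listing only documents the value 'true').

```python
from typing import Mapping


def check_summarization_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of configuration problems (empty when the config is usable)."""
    problems: list[str] = []
    # Assumption: only the literal 'true' (case-insensitive here) enables cloud mode.
    flag = env.get("ALLOW_CLOUD_SUMMARIZATION", "false").strip().lower()
    cloud_enabled = flag == "true"
    # The key is documented as required only when cloud summarization is on.
    if cloud_enabled and not env.get("ANTHROPIC_API_KEY", "").strip():
        problems.append(
            "ANTHROPIC_API_KEY is required when ALLOW_CLOUD_SUMMARIZATION is 'true'"
        )
    return problems
```

Passing `os.environ` to this function before starting the server would surface a missing key early; with both variables unset it reports no problems, matching the local-only default.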

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | {"listChanged": true} |

Tools

Functions exposed to the LLM to take actions

retrieveMemories

Semantically search stored memories and return the top matches ranked by a weighted combination of relevance, recency, and importance. A lookup, but not strictly read-only: it updates access-pattern counters on the memories it returns (see the NOTE below).

WHEN TO CALL: (1) At the start of every session — pass the current task or file as the query to pre-load relevant context. (2) Mid-session whenever a new topic, file, or decision area arises that may have prior context. Do NOT call on every user turn.

WHEN NOT TO CALL: If you already retrieved memories for this topic this session. Use getMemory if you have a specific memoryId. Use listMemories only to audit the full store, not for context loading.

Returns up to limit results (default 20). mode='auto' is the standard startup path; mode='on-demand' signals an explicit mid-session lookup. depth='deep' runs a broader semantic sweep at higher latency — use when the topic is unfamiliar. Phrase query as what you need to recall, not what you are about to do.

NOTE: this tool updates access-pattern counters on the memories it returns (used to boost frequently-recalled memories in future ranking), so it is NOT side-effect-free despite being a lookup.
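
To illustrate what "a weighted combination of relevance, recency, and importance" can look like, here is a small sketch. It is not sessionmem's actual formula: the weights, the 30-day half-life, the `Candidate` type, and the function names are all assumptions chosen for the example. Only the three ranking factors, the 1-10 importance scale, and the default limit of 20 come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    memory_id: str
    relevance: float   # semantic similarity, assumed to be scaled to [0, 1]
    age_days: float    # days since the memory was stored
    importance: int    # 1-10, as documented for storeMemory


def combined_score(
    c: Candidate,
    w_relevance: float = 0.6,      # hypothetical weight
    w_recency: float = 0.25,       # hypothetical weight
    w_importance: float = 0.15,    # hypothetical weight
    half_life_days: float = 30.0,  # hypothetical decay half-life
) -> float:
    """Blend the three factors into one score; higher ranks first."""
    recency = 0.5 ** (c.age_days / half_life_days)  # 1.0 when new, halves every half-life
    importance = c.importance / 10                  # normalize 1-10 into (0, 1]
    return w_relevance * c.relevance + w_recency * recency + w_importance * importance


def rank(candidates: list[Candidate], limit: int = 20) -> list[Candidate]:
    """Return the top `limit` candidates by combined score."""
    return sorted(candidates, key=combined_score, reverse=True)[:limit]
```

Under these assumed weights, an old but highly relevant memory can still outrank a fresh, important one that barely matches the query, which is consistent with the advice elsewhere in this listing that merely old memories are deprioritized rather than deleted.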

storeMemory

Persist a single memory unit to the local SQLite store. Accepts decisions, facts, architectural choices, warnings, and session summaries. NOT idempotent — each call creates a new record even with identical content. Writes to disk immediately.

WHEN TO CALL: After any significant decision, discovery, or conclusion that should be available in a future session. Good candidates: technology choices, non-obvious constraints, bug root-causes, architectural decisions, key facts about the codebase.

WHEN NOT TO CALL: For trivial observations, transient state, or content that duplicates what was just retrieved. Do not store entire files or full conversation transcripts.

kind categories: 'decision', 'fact', 'summary', 'warning', 'preference'. Write content to be self-contained — it must be useful without any surrounding conversation context. importance 1-10 (10 = most critical); directly affects retrieval ranking in future sessions.

RESPONSE may include warningCodes: 'session_write_limit_warning' (this session has stored many memories — stop storing trivia and prefer batchStoreMemory) and 'redaction_partial_failure' (a redaction rule errored; the write still succeeded). Treat them as advisory signals, not errors.
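
A client can validate arguments before calling storeMemory and treat the warning codes as advisory. The sketch below assumes the argument names listed in the batchStoreMemory schema note (memoryId, sessionId, sourceAdapter, kind, content, importance) and assumes the response is a dict with an optional `warningCodes` list; both helper functions are hypothetical, and requiring an integer importance is an assumption.

```python
ALLOWED_KINDS = {"decision", "fact", "summary", "warning", "preference"}
ADVISORY_CODES = {"session_write_limit_warning", "redaction_partial_failure"}


def build_store_arguments(memory_id, session_id, source_adapter, kind, content, importance):
    """Validate and assemble a storeMemory argument dict (hypothetical helper)."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    # Assumption: importance is an integer; the listing only gives the 1-10 range.
    if not isinstance(importance, int) or not 1 <= importance <= 10:
        raise ValueError("importance must be an integer from 1 to 10")
    if not content.strip():
        raise ValueError("content must not be empty")
    return {
        "memoryId": memory_id,
        "sessionId": session_id,
        "sourceAdapter": source_adapter,
        "kind": kind,
        "content": content,
        "importance": importance,
    }


def split_warning_codes(response):
    """Separate the documented advisory codes from any other codes in a response."""
    codes = response.get("warningCodes") or []
    advisory = [c for c in codes if c in ADVISORY_CODES]
    other = [c for c in codes if c not in ADVISORY_CODES]
    return advisory, other
```

If `session_write_limit_warning` appears in the advisory list, the guidance above suggests holding further writes and sending them together through batchStoreMemory rather than retrying or raising an error.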

listMemories

Return every memory stored for the current project, unfiltered and without ranking. Read-only; no side effects.

WHEN TO CALL: When you need a complete inventory of stored memories — to audit what has been saved, detect duplicates, or build a full summary of all known context.

WHEN NOT TO CALL: For normal context loading at session start — use retrieveMemories instead, which ranks by relevance. listMemories returns the entire store unfiltered and can be very large.

getMemory

Fetch a single memory record by its exact ID. Returns the full record: content, kind, importance, timestamps, and session metadata. Read-only; no side effects.

WHEN TO CALL: When you already have a specific memoryId from a prior retrieveMemories or listMemories result and need its full detail.

WHEN NOT TO CALL: For topic-based search — use retrieveMemories for that. This tool requires an exact ID and does not search by content.

forgetMemory

Permanently delete a single memory by ID. The record is removed from the local SQLite store immediately and CANNOT be recovered. Destructive and irreversible.

WHEN TO CALL: Only when a memory is known to be incorrect, dangerously outdated, or a duplicate that would mislead future sessions.

WHEN NOT TO CALL: If there is any doubt. A memory that is merely old or low-relevance does not need deletion — retrieval ranking deprioritizes it automatically.

stats

Return aggregate statistics for the current project: total stored memory count and total ingested session event count. Read-only; no side effects.

WHEN TO CALL: For diagnostic or monitoring purposes — to confirm memories were stored after a session, check store health, or report usage numbers.

WHEN NOT TO CALL: As part of normal context loading. stats returns counts only, not content; use retrieveMemories to load actual context.

resetAccessCounts

Reset access-pattern counters for all memories in the current project. Sets access_count to 0 and clears last_accessed timestamps without deleting any memories. Useful after large refactors when old access patterns no longer reflect current relevance.

WHEN TO CALL: After major codebase restructuring, project pivots, or when access-boosted rankings no longer reflect current relevance.

WHEN NOT TO CALL: During normal operation — access patterns self-correct as usage shifts.

batchStoreMemory

Persist multiple memory units in a single atomic SQLite transaction. Significantly faster than calling storeMemory repeatedly for session-end writes of 10-20 memories.

WHEN TO CALL: At session end or whenever you have multiple memories to store at once. Reduces overhead from per-insert fsync by wrapping all writes in one transaction.

WHEN NOT TO CALL: For a single memory — use storeMemory instead. For imports from external files — use importMemories.

Each item in the memories array follows the same schema as storeMemory (memoryId, sessionId, sourceAdapter, kind, content, importance). Invalid items are reported individually; valid items are still stored atomically.

NOTE: the per-item memory echoed back in the response has its content truncated to 2000 characters (a batch can return many rows). The full body is still persisted — fetch it with getMemory if you need the complete text. (Single-record storeMemory echoes the full content.)
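
Because the echoed content is capped at 2000 characters, a client that needs the full text back can work out ahead of time which items to re-fetch with getMemory. This is a minimal sketch with a hypothetical helper name; it assumes the server counts characters the same way Python's `len` does, which may not hold for text outside the Basic Multilingual Plane.

```python
ECHO_LIMIT = 2000  # documented truncation length for batchStoreMemory echoes


def ids_needing_full_fetch(submitted: dict[str, str]) -> list[str]:
    """Given memoryId -> submitted content, list the ids whose echo would be truncated."""
    return [memory_id for memory_id, content in submitted.items() if len(content) > ECHO_LIMIT]
```

Items at or under the limit are echoed in full, so only the returned ids would need a follow-up getMemory call.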

ingestSessionEvents

Push raw session events (tool calls, decisions, file edits, user turns) to sessionmem so they can be summarized at session end and counted toward token-savings analytics. Writes immediately, in a single transaction. Re-ingesting the same (sessionId, eventIndex) is a no-op, so retries are safe.

WHEN TO CALL: Periodically during a session (e.g. at task boundaries) to record what happened, OR in one batch shortly before the session ends. This is what powers automatic session-end summarization and sessionmem savings.

WHEN NOT TO CALL: For durable, individually-important facts/decisions — use storeMemory for those. Session events are transient raw material for summarization, not first-class memories.

Each event needs: id (unique), eventIndex (monotonic 0-based order within the session), eventType (e.g. 'tool_use', 'user_message'), payloadJson (a JSON string of the event body).

LIMITS: at most 500 events per call. For more than 500 events, call this tool multiple times in chunks — re-ingestion of already-stored events is safe (idempotent via the (project, session, eventIndex) UNIQUE index), so overlapping chunks never double-count.
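
The 500-event limit means a long session has to be sent in chunks. The sketch below builds events with the four documented fields and splits them into call-sized batches. The helper names and the `id` format (`sessionId-eventIndex`) are assumptions; the listing only requires that `id` be unique.

```python
import json
from typing import Any, Iterator

MAX_EVENTS_PER_CALL = 500  # documented per-call limit


def make_event(session_id: str, event_index: int, event_type: str, body: Any) -> dict:
    """Build one event; payloadJson must be a JSON string, not a nested object."""
    return {
        "id": f"{session_id}-{event_index}",  # hypothetical id scheme, unique per session
        "eventIndex": event_index,            # monotonic, 0-based within the session
        "eventType": event_type,
        "payloadJson": json.dumps(body),
    }


def chunk_events(events: list[dict], size: int = MAX_EVENTS_PER_CALL) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` events, one per tool call."""
    if not 1 <= size <= MAX_EVENTS_PER_CALL:
        raise ValueError(f"size must be between 1 and {MAX_EVENTS_PER_CALL}")
    for start in range(0, len(events), size):
        yield events[start:start + size]
```

Each yielded chunk would be sent as one ingestSessionEvents call. Since re-ingesting an already-stored event is documented as a no-op, a failed chunk can simply be sent again without tracking which events landed.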

summarizeSessionToMemory

Store an agent-authored session summary as a durable 'summary' memory in one call. Upserts on memoryId, so calling it again with the same memoryId replaces the prior summary rather than duplicating it.

WHEN TO CALL: At session end when you have already written a concise summary of what was accomplished and want to persist it directly (the simpler alternative to handleSessionEnd's automatic summarization).

WHEN NOT TO CALL: When you want sessionmem to generate the summary from ingested session events — use handleSessionEnd for that. For non-summary facts/decisions use storeMemory.

Provide: memoryId (stable id for this session's summary), sessionId, sourceAdapter, summary (the text), importance (1-10; 7 is typical for summaries).

handleSessionEnd

Run the full session-end pipeline: auto-summarize the session's ingested events into a durable memory (when enough events exist) and apply a light retention prune of stale memories. Idempotent on the summary memory (upsert by sessionId).

WHEN TO CALL: Once, at the very end of a session, after ingesting session events via ingestSessionEvents. Lets sessionmem generate and store the session summary for you.

WHEN NOT TO CALL: Mid-session, or when you have already written your own summary (use summarizeSessionToMemory instead). On Claude Code this also runs automatically via the installed SessionEnd hook, so calling it explicitly is usually unnecessary there.

Provide sessionId and sourceAdapter. memoryId (optional) pins the summary's id; omit to derive ${sessionId}-summary. config (optional) tunes autoSummarize / minimumEventThreshold / cloud summarization; omit for sensible local-only defaults.

RESPONSE status is one of: 'stored', 'skipped_threshold' (too few events), 'skipped_disabled', 'failed'. warningCodes may carry cloud/local fallback signals.
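
A caller might interpret the response along these lines. This is a sketch under assumptions: the response is taken to be a dict with a `status` key, the helper names are hypothetical, and the explanations for 'skipped_disabled' and 'failed' are paraphrases, since the listing does not describe those two statuses in detail.

```python
STATUS_MEANINGS = {
    "stored": "summary memory was written",
    "skipped_threshold": "too few ingested events to summarize",
    "skipped_disabled": "summarization was skipped because it is disabled",  # paraphrase
    "failed": "the session-end pipeline reported a failure",                 # paraphrase
}


def default_summary_id(session_id: str) -> str:
    """Mirror the documented default: the id derived when memoryId is omitted."""
    return f"{session_id}-summary"


def interpret_session_end(response: dict) -> tuple[bool, str]:
    """Return (summary_was_stored, human-readable explanation)."""
    status = response.get("status")
    if status not in STATUS_MEANINGS:
        raise ValueError(f"unexpected status: {status!r}")
    return status == "stored", STATUS_MEANINGS[status]
```

Only 'stored' indicates that a summary exists afterwards; 'skipped_threshold' is a normal outcome for short sessions and does not call for a retry.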

fetch_memories

Fallback memory retrieval for hosts that do not support MCP resources. Call this instead of accessing the sessionmem:// resource URI directly when the host lacks resource support. Semantically equivalent to retrieveMemories — returns stored memories ranked by relevance to the query. Read-only; no side effects.

WHEN TO CALL: At session start and mid-session when you need to retrieve context and the host does not support MCP resources. Do not call if the host supports MCP resources — use the sessionmem:// resource URI or retrieveMemories tool instead.

Parameter query: natural-language description of what context you need to recall (e.g. 'API design decisions', 'database schema choices').

startup_inject_memories

Fallback startup-injection for hosts that do not support MCP prompts. Call this once at the very start of a session instead of relying on the automatic sessionmem startup prompt when the host lacks prompt support. Injects the top relevant memories for the current project into the working context. No parameters required.

WHEN TO CALL: Once per session start, before any user task work begins, when the host does not surface MCP prompts automatically. Do not call if the host already surfaces the sessionmem startup prompt — calling both duplicates injected context.

Note: access counts are incremented on retrieval.

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/catfish-1234/sessionmem'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.