# recall
Search memories by text or entity name. Results are re-ranked by relevance, heat, and importance, with explanations. Use at task start to recall prior work.
## Instructions
Retrieve memories relevant to the current context using full-text search (BM25) + entity-name match, re-ranked by a composite score (relevance × heat × momentum × importance). Returns only what fits in the token budget, with match_reasons explaining WHY each memory was returned. Opportunistically refreshes stale momentum scores for entities in the result set. Supports pagination via offset/has_more. Layer aliases accepted. Use at the start of any task that might involve prior work.
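The re-ranking step can be sketched as follows. The weights (0.45 relevance, 0.25 heat, 0.15 momentum, 0.15 importance) and the heat/momentum normalizations are taken from the handler in the Implementation Reference; the `Candidate` interface and the `compositeScore` name are simplified stand-ins for illustration, not part of the actual source.

```typescript
// Minimal sketch of the composite re-rank. Weights match the handler:
// relevance 0.45, heat 0.25, momentum 0.15, importance 0.15.
interface Candidate {
  relevance: number;   // 0-1, from normalized BM25 (fixed at 0.5 for LIKE-only matches)
  heatScore: number;   // 0-100, from computeHeat()
  momentum: number;    // entity momentum_score, roughly 0-10+
  importance: number;  // 0-1; >= 0.9 means "pinned"
}

function compositeScore(c: Candidate): number {
  const heatNorm = c.heatScore / 100;                 // normalize heat to 0-1
  const momNorm = Math.min(1, c.momentum / 10);       // clamp momentum to 0-1
  return 0.45 * c.relevance + 0.25 * heatNorm + 0.15 * momNorm + 0.15 * c.importance;
}

// A pinned but weakly-matching, cold memory still scores respectably,
// which is why pinned memories tend to surface:
const pinned = compositeScore({ relevance: 0.5, heatScore: 20, momentum: 0, importance: 0.95 });
// → 0.4175
```

The importance term is what keeps pinned (importance ≥ 0.9) memories near the top even when their text match is weak.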
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | What you want to remember (free-text, entity name, or FTS5 MATCH expression) | |
| entity_name | No | Optional — narrow to a specific entity | |
| layer | No | Optional layer filter. Accepts aliases (decisions/warnings/how/etc.) as well as canonical names. | |
| band | No | Optional — only return memories whose heat_band matches. | |
| max_tokens | No | Approximate token budget (~100 tokens per returned memory). Either max_tokens or limit stops iteration, whichever fires first. | 2000 |
| limit | No | Optional hard cap on the number of memories. The effective window size is min(budget-derived count, limit). | |
| offset | No | Skip this many top results (pagination). Use has_more from the prior response to decide the next offset. | 0 |
| mark_accessed | No | Set false for preview / listing queries that should not bump heat. | true |
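As a worked illustration of how max_tokens, limit, and offset interact, the sketch below mirrors the sizing logic in the handler (each memory is budgeted at roughly 100 tokens, and limit is capped at 200). `effectiveWindow` is a hypothetical helper name used only for this example.

```typescript
// Sketch of how the returned window is sized, assuming ~100 tokens per memory
// as in the handler. effectiveWindow is a hypothetical name for illustration.
function effectiveWindow(maxTokens = 2000, limit?: number, offset = 0) {
  const tokenBudgetLimit = Math.max(1, Math.floor(Math.max(100, maxTokens) / 100));
  const hardLimit = limit !== undefined ? Math.max(1, Math.min(200, limit)) : tokenBudgetLimit;
  return { start: Math.max(0, offset), size: Math.min(tokenBudgetLimit, hardLimit) };
}

// The default 2000-token budget yields at most 20 memories:
effectiveWindow();        // { start: 0, size: 20 }
// An explicit limit below the budget wins:
effectiveWindow(2000, 5); // { start: 0, size: 5 }
```

Whichever bound ends up smaller is reported back as `stopped_by: 'tokens'` or `stopped_by: 'limit'` in the response.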
## Implementation Reference
- src/mcp/server.ts:323-494 (handler): Main handler for the 'recall' tool. Executes FTS5 full-text search + LIKE fallback, computes the composite relevance/heat/momentum/importance score, applies band filtering and pagination, and returns memories with match_reasons explaining why each result was returned.

```typescript
function handleRecall(args: any): string {
  const maxTokens = Math.max(100, Number(args.max_tokens ?? 2000));
  const approxTokensPerMemory = 100;
  const tokenBudgetLimit = Math.max(1, Math.floor(maxTokens / approxTokensPerMemory));
  const hardLimit = Number.isFinite(args.limit) ? Math.max(1, Math.min(200, args.limit)) : tokenBudgetLimit;
  const returnLimit = Math.min(tokenBudgetLimit, hardLimit);
  const offset = Math.max(0, Number(args.offset ?? 0));
  const markAccessed = args.mark_accessed !== false;
  const layer = resolveLayer(args.layer);
  const band = args.band as string | undefined;

  // Fetch extra rows for composite re-rank, pagination, and band filtering
  const fetchLimit = Math.max(returnLimit * 3, 30) + offset;
  let rows: any[] = [];
  let searchMethod: 'fts5' | 'like' | 'fts5+like' = 'like';
  const ftsQuery = toFtsQuery(args.query ?? '');
  const canUseFts = !!ftsQuery && !args.entity_name;

  if (canUseFts) {
    const ftsRows = runFtsQuery(ftsQuery, layer, fetchLimit);
    const likeRows = runLikeQuery(args.query, undefined, layer, fetchLimit);
    const seen = new Map<number, any>();
    for (const r of ftsRows) seen.set(r.id, { ...r, _via: 'fts' });
    for (const r of likeRows) {
      if (seen.has(r.id)) {
        seen.get(r.id)._via = 'fts+like'; // both matched
      } else {
        seen.set(r.id, { ...r, _via: 'like' });
      }
    }
    rows = Array.from(seen.values());
    if (ftsRows.length > 0 && likeRows.length > 0) searchMethod = 'fts5+like';
    else if (ftsRows.length > 0) searchMethod = 'fts5';
    else searchMethod = 'like';
  } else {
    rows = runLikeQuery(args.query, args.entity_name, layer, fetchLimit).map((r) => ({ ...r, _via: 'like' }));
    searchMethod = 'like';
  }

  const useFts = searchMethod !== 'like';
  const now = Math.floor(Date.now() / 1000);

  // Opportunistically refresh momentum for entities about to be surfaced
  refreshStaleMomentum(rows.map((r) => r.entity_id));

  // Re-fetch momentum after refresh (cheap single-pass update)
  if (rows.length > 0) {
    const ids = Array.from(new Set(rows.map((r) => r.entity_id)));
    const ph = ids.map(() => '?').join(',');
    const fresh = db.prepare(`SELECT id, momentum_score FROM entities WHERE id IN (${ph})`).all(...ids) as any[];
    const byId = new Map(fresh.map((f) => [f.id, f.momentum_score]));
    for (const r of rows) r.momentum_score = byId.get(r.entity_id) ?? r.momentum_score;
  }

  const bm25Values = rows.map((r) => r.bm25_score);
  const minBm = Math.min(...bm25Values, 0);
  const maxBm = Math.max(...bm25Values, 1);
  const bmSpan = Math.max(0.001, maxBm - minBm);

  const scored = rows.map((r) => {
    const daysSince = (now - r.last_accessed_at) / 86400;
    const heat = computeHeat({
      accessesLast30d: daysSince < 30 ? r.access_count : 0,
      accessesLast90d: daysSince < 90 ? r.access_count : 0,
      daysSinceLastAccess: daysSince,
      totalAccesses: r.access_count,
      baseImportance: r.importance,
    });

    // Individual weight contributions (for transparency)
    const relevance = useFts && r._via !== 'like' ? 1 - (r.bm25_score - minBm) / bmSpan : 0.5;
    const heatNorm = heat.score / 100;
    const momNorm = Math.min(1, (r.momentum_score ?? 0) / 10);
    const importanceBoost = r.importance; // 0-1

    // Composite: give a bit to importance so pinned (>=0.9) memories always rank high
    const w_rel = 0.45, w_heat = 0.25, w_mom = 0.15, w_imp = 0.15;
    const composite = w_rel * relevance + w_heat * heatNorm + w_mom * momNorm + w_imp * importanceBoost;

    // match_reasons: human-readable WHY this row is here
    const reasons: string[] = [];
    if (r._via === 'fts' || r._via === 'fts+like') {
      reasons.push(`content_match_${r._via === 'fts+like' ? 'dual' : 'fts'}`);
    }
    if (r._via === 'like' || r._via === 'fts+like') {
      if (args.entity_name || (args.query && String(r.entity_name || '').toLowerCase().includes(String(args.query).toLowerCase()))) {
        reasons.push('entity_name_match');
      } else {
        reasons.push('content_substring');
      }
    }
    if (heat.band === 'hot') reasons.push('heat:hot');
    else if (heat.band === 'warm') reasons.push('heat:warm');
    if (r.momentum_score >= 5) reasons.push('entity_active');
    if (r.importance >= 0.9) reasons.push('pinned');
    else if (r.importance >= 0.8) reasons.push('high_importance');
    if (r.protected === 1 && r.layer === 'caveat') reasons.push('caveat_protected');

    return {
      ...r,
      heat_score: heat.score,
      heat_band: heat.band,
      composite_score: composite,
      relevance_score: relevance,
      _reasons: reasons,
      _breakdown: {
        relevance: Number(relevance.toFixed(3)),
        heat: Number(heatNorm.toFixed(3)),
        momentum: Number(momNorm.toFixed(3)),
        importance: Number(importanceBoost.toFixed(3)),
      },
    };
  });

  // Apply band filter AFTER scoring (needs heat.band)
  const filtered = band ? scored.filter((s) => s.heat_band === band) : scored;

  // Sort by composite
  filtered.sort((a, b) => b.composite_score - a.composite_score);

  // Pagination: skip offset, take returnLimit
  const total = filtered.length;
  const windowed = filtered.slice(offset, offset + returnLimit);
  const hasMore = total > offset + windowed.length;

  // Mark accessed (only for the returned window, and only if asked)
  if (markAccessed && windowed.length > 0) {
    const mark = db.prepare('UPDATE memories SET last_accessed_at = ?, access_count = access_count + 1 WHERE id = ?');
    const tx = db.transaction((ids: number[]) => {
      for (const id of ids) mark.run(now, id);
    });
    tx(windowed.map((r) => r.id));
  }

  // Determine what stopped iteration — max_tokens vs limit vs offset+n=total
  let stoppedBy: 'tokens' | 'limit' | 'end' = 'end';
  if (windowed.length === returnLimit && total > offset + returnLimit) {
    stoppedBy = hardLimit <= tokenBudgetLimit ? 'limit' : 'tokens';
  }

  return JSON.stringify({
    ok: true,
    count: windowed.length,
    total_candidates: total,
    offset,
    has_more: hasMore,
    stopped_by: stoppedBy,
    search: searchMethod,
    resolved_layer: layer ?? null,
    memories: windowed.map((r) => {
      let parsedContent: unknown = r.content;
      try {
        parsedContent = JSON.parse(r.content);
      } catch { /* leave as string */ }
      return {
        id: r.id,
        entity: {
          id: r.entity_id,
          name: r.entity_name,
          kind: r.entity_kind,
          momentum: Number((r.momentum_score ?? 0).toFixed(2)),
        },
        layer: r.layer,
        content: parsedContent,
        content_raw: r.content,
        importance: r.importance,
        pinned: r.importance >= 0.9,
        heat: Number(r.heat_score.toFixed(1)),
        band: r.heat_band,
        composite: Number(r.composite_score.toFixed(3)),
        match_reasons: r._reasons,
        score_breakdown: r._breakdown,
      };
    }),
  });
}
```

- src/mcp/server.ts:78-98 (schema): Input schema for the 'recall' tool. Defines the parameters (query, entity_name, layer, band, max_tokens, limit, offset, mark_accessed) with types, descriptions, and defaults.
```typescript
{
  name: 'recall',
  description:
    'Retrieve memories relevant to the current context using full-text search (BM25) + entity-name match, re-ranked by a composite score (relevance × heat × momentum × importance). Returns only what fits in the token budget, with match_reasons explaining WHY each memory was returned. Opportunistically refreshes stale momentum scores for entities in the result set. Supports pagination via offset/has_more. Layer aliases accepted. Use at the start of any task that might involve prior work.',
  inputSchema: {
    type: 'object',
    properties: {
      query: { type: 'string', description: 'What you want to remember (free-text, entity name, or FTS5 MATCH expression)' },
      entity_name: { type: 'string', description: 'Optional — narrow to a specific entity' },
      layer: { type: 'string', description: 'Optional layer filter. Accepts aliases (decisions/warnings/how/etc.) as well as canonical names.' },
      band: { type: 'string', enum: ['hot', 'warm', 'cold', 'frozen'], description: 'Optional — only return memories whose heat_band matches.' },
      max_tokens: { type: 'number', description: 'Approx token budget. Default 2000. Either max_tokens or limit stops iteration (whichever fires first).', default: 2000 },
      limit: { type: 'number', description: 'Optional hard cap on number of memories. Stops at min(max_tokens-budget, limit).' },
      offset: { type: 'number', description: 'Skip this many top results (pagination). Use has_more from prior response to decide next offset.', default: 0 },
      mark_accessed: { type: 'boolean', default: true, description: 'Set false for preview / listing queries that should not bump heat.' },
    },
    required: ['query'],
  },
},
```

- src/mcp/server.ts:799-821 (registration): MCP wiring — the 'recall' tool name is mapped to handleRecall() in the CallToolRequestSchema switch statement (line 805).
```typescript
server.setRequestHandler(CallToolRequestSchema, async (req) => {
  const { name, arguments: args } = req.params;
  try {
    let text: string;
    switch (name) {
      case 'remember': text = handleRemember(args); break;
      case 'recall': text = handleRecall(args); break;
      case 'update_memory': text = handleUpdateMemory(args); break;
      case 'list_entities': text = handleListEntities(args); break;
      case 'forget': text = handleForget(args); break;
      case 'consolidate': text = handleConsolidate(args); break;
      case 'recall_file': text = handleRecallFile(args); break;
      case 'read_smart': text = handleReadSmart(args); break;
      default: throw new Error(`Unknown tool: ${name}`);
    }
    return { content: [{ type: 'text', text }] };
  } catch (err: any) {
    return {
      content: [{ type: 'text', text: JSON.stringify({ ok: false, error: err?.message ?? String(err) }) }],
      isError: true,
    };
  }
});
```

- src/mcp/server.ts:59-180 (registration): TOOLS array registration — the 'recall' tool object is declared within the list of tools returned by ListToolsRequestSchema (line 79).
```typescript
const TOOLS = [
  {
    name: 'remember',
    description:
      'Store a memory about an entity (person/company/project/concept/file) in one of 6 layers: goal (WHY), context (WHY-THIS-NOW), emotion (USER tone), implementation (HOW — success/failure), caveat (PAIN lesson, never forgotten), learning (GROWTH log). Use this when you discover non-obvious goals, unexpected failures, user preferences, or decisions worth preserving. Pasted assistant output or CI logs are rejected (use force=true only if you are sure).',
    inputSchema: {
      type: 'object',
      properties: {
        entity_name: { type: 'string', description: 'Name of the entity this memory is about' },
        entity_kind: { type: 'string', enum: ['person', 'company', 'project', 'concept', 'file', 'other'] },
        entity_key: { type: 'string', description: 'Optional canonical key (email, domain, file path)' },
        layer: { type: 'string', description: 'One of: goal / context / emotion / implementation / caveat / learning. Common aliases (why, decisions, warnings, how, ...) are accepted.' },
        content: { type: 'string', description: 'The memory content (plain text or JSON)' },
        importance: { type: 'number', minimum: 0, maximum: 1, description: '0.0-1.0. Set to 0.9 or higher to "pin" a memory (protects from forgetting even outside caveat layer).' },
        force: { type: 'boolean', default: false, description: 'Bypass the paste-back/CI-log quality check. Only set when you are sure the content is original user or agent thought.' },
      },
      required: ['entity_name', 'entity_kind', 'layer', 'content'],
    },
  },
  {
    name: 'recall',
    description:
      'Retrieve memories relevant to the current context using full-text search (BM25) + entity-name match, re-ranked by a composite score (relevance × heat × momentum × importance). Returns only what fits in the token budget, with match_reasons explaining WHY each memory was returned. Opportunistically refreshes stale momentum scores for entities in the result set. Supports pagination via offset/has_more. Layer aliases accepted. Use at the start of any task that might involve prior work.',
    inputSchema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'What you want to remember (free-text, entity name, or FTS5 MATCH expression)' },
        entity_name: { type: 'string', description: 'Optional — narrow to a specific entity' },
        layer: { type: 'string', description: 'Optional layer filter. Accepts aliases (decisions/warnings/how/etc.) as well as canonical names.' },
        band: { type: 'string', enum: ['hot', 'warm', 'cold', 'frozen'], description: 'Optional — only return memories whose heat_band matches.' },
        max_tokens: { type: 'number', description: 'Approx token budget. Default 2000. Either max_tokens or limit stops iteration (whichever fires first).', default: 2000 },
        limit: { type: 'number', description: 'Optional hard cap on number of memories. Stops at min(max_tokens-budget, limit).' },
        offset: { type: 'number', description: 'Skip this many top results (pagination). Use has_more from prior response to decide next offset.', default: 0 },
        mark_accessed: { type: 'boolean', default: true, description: 'Set false for preview / listing queries that should not bump heat.' },
      },
      required: ['query'],
    },
  },
  {
    name: 'update_memory',
    description:
      'Atomically edit an existing memory in-place. Preferred over forget+remember because it preserves memory_id, which matters for session_file_edits links and referential integrity. Use to correct facts, update deadlines in goal entries, refine caveats, or re-score importance. Caveat-layer memories can be updated but cannot have their protected flag removed.',
    inputSchema: {
      type: 'object',
      properties: {
        memory_id: { type: 'number', description: 'The memory.id to update' },
        content: { type: 'string', description: 'New content (plain text or JSON). If omitted, content is kept.' },
        layer: { type: 'string', description: 'Move to a different layer (aliases accepted). If omitted, layer is kept.' },
        importance: { type: 'number', minimum: 0, maximum: 1, description: 'New importance 0-1. Set to 0.9 or higher to pin.' },
      },
      required: ['memory_id'],
    },
  },
  {
    name: 'list_entities',
    description:
      'List the entities currently known to this memory store, sorted by recent activity. Use at the start of a new session ("what do I know about?") before issuing specific recall queries. Cheaper than recall for the "give me an overview" question.',
    inputSchema: {
      type: 'object',
      properties: {
        kind: { type: 'string', enum: ['person', 'company', 'project', 'concept', 'file', 'other'], description: 'Filter by entity kind.' },
        min_memories: { type: 'number', description: 'Only include entities with at least N memories. Default 1.', default: 1 },
        limit: { type: 'number', description: 'Max entities to return. Default 30.', default: 30 },
        offset: { type: 'number', default: 0 },
      },
    },
  },
  {
    name: 'forget',
    description:
      'Explicitly delete a memory by id, OR run auto-forgetting across all memories based on forgettingRisk (importance + heat + age). Caveat-layer, goal-layer, and pinned (importance>=0.9) memories are always preserved. Prefer update_memory for corrections — forget is destructive.',
    inputSchema: {
      type: 'object',
      properties: {
        memory_id: { type: 'number' },
        dry_run: { type: 'boolean', default: false, description: 'Report what would be deleted without actually deleting.' },
      },
    },
  },
  {
    name: 'consolidate',
    description:
      'Sleep-mode compression. Clusters cold low-importance memories by (entity, layer), summarizes each cluster into a single protected learning-layer entry, deletes originals, and runs a forget-sweep. Run at session end or on demand. Set dry_run=true to preview without writing.',
    inputSchema: {
      type: 'object',
      properties: {
        scope: { type: 'string', enum: ['all', 'session'], default: 'session' },
        min_age_days: { type: 'number', description: 'Override the default 7-day minimum age for clustering (set to 0 to consolidate everything immediately, useful right after a bulk import).', default: 7 },
        dry_run: { type: 'boolean', default: false, description: 'Preview what would be compressed without modifying the DB.' },
      },
    },
  },
  {
    name: 'recall_file',
    description:
      'Get the COMPLETE edit history of a file across all sessions, with per-edit user-intent context. Returns: total edit count, daily breakdown, list of distinct user intents that drove the edits, and the linked memories. Use this when you need to understand WHY a file was modified historically — far more accurate than recall() for file-centric questions because it queries session_file_edits (every physical edit) instead of summary memories.',
    inputSchema: {
      type: 'object',
      properties: {
        path_substring: { type: 'string', description: 'Substring to match against file_path (e.g. "search-services.ts" or full absolute path)' },
        max_intents: { type: 'number', description: 'Max distinct user-intent snippets to return. Default 10.', default: 10 },
      },
      required: ['path_substring'],
    },
  },
  {
    name: 'read_smart',
    description:
      'Read a file with diff-only caching. Returns: (1) full content + chunk metadata on first read, (2) "unchanged" + cached chunk list (~50 tokens) if mtime matches, (3) "unchanged_content" if mtime changed but sha256 matches (touched but not modified), (4) changed chunks with content + unchanged chunks as metadata-only if the file was truly modified. Use INSTEAD of Read for files you have read before — saves 50%+ tokens on re-reads.',
    inputSchema: {
      type: 'object',
      properties: {
        path: { type: 'string', description: 'Absolute file path' },
        force: { type: 'boolean', description: 'If true, return full content regardless of cache state', default: false },
      },
      required: ['path'],
    },
  },
];
```

- src/mcp/server.ts:255-264 (helper): Helper function toFtsQuery() sanitizes user query strings for FTS5 MATCH syntax (strips special characters, filters tokens shorter than 3 characters for the trigram tokenizer). Called by handleRecall().
```typescript
function toFtsQuery(raw: string): string {
  const cleaned = raw
    .replace(/["*:()]/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
  if (!cleaned) return '';
  const tokens = cleaned.split(' ').filter((t) => t.length >= 3);
  if (tokens.length === 0) return '';
  return tokens.map((t) => `"${t}"`).join(' OR ');
}
```
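For example, a query containing FTS5 metacharacters is reduced to a quoted OR expression, and a query whose tokens are all shorter than 3 characters yields an empty string, in which case the handler falls back to LIKE search. A self-contained copy of the helper with sample inputs:

```typescript
// Copy of toFtsQuery (src/mcp/server.ts:255-264) with example inputs.
function toFtsQuery(raw: string): string {
  // Strip FTS5 metacharacters, collapse whitespace.
  const cleaned = raw.replace(/["*:()]/g, ' ').replace(/\s+/g, ' ').trim();
  if (!cleaned) return '';
  // Drop tokens under 3 chars (too short for the trigram tokenizer).
  const tokens = cleaned.split(' ').filter((t) => t.length >= 3);
  if (tokens.length === 0) return '';
  return tokens.map((t) => `"${t}"`).join(' OR ');
}

toFtsQuery('fix auth: token refresh*'); // '"fix" OR "auth" OR "token" OR "refresh"'
toFtsQuery('a b *');                    // '' — all tokens too short; handler falls back to LIKE
```

Quoting each token and joining with OR keeps arbitrary user text from being parsed as FTS5 operator syntax.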