RecallNest
RecallNest is a shared, local-first memory layer for AI coding agents (Claude Code, Codex, Gemini CLI) that provides persistent, cross-session knowledge storage, retrieval, and management via MCP tools and HTTP API.
Store & Capture
Store durable memories (facts, preferences, entities, events, cases, patterns) with importance scoring, tags, and scopes
Batch store up to 20 memories at once with deduplication
Auto-capture memory-worthy signals from raw conversation text (zero LLM calls)
Save reusable workflow patterns and concrete problem-solution cases
Search & Retrieval
Semantic search with hybrid retrieval (vector + BM25), filtering by category, scope, date range, and temporal validity
Optional knowledge graph traversal (PPR) for relationship-aware search
Explain why memories matched a query for debugging
Distill retrieved memories into compact briefings with key takeaways and citations
Session Continuity
Checkpoint and resume work state across windows (decisions, open loops, next actions, files)
Compose full startup context combining memories, patterns, cases, and latest checkpoint
Memory Lifecycle & Governance
Promote evidence to durable memory, pin critical memories, and forget/cascade-delete entries
Run offline consolidation (dream) to cluster, merge, and prune memories
Set future-triggered reminders that auto-fire when matching keywords appear in future searches
Skills Management
Store and retrieve executable skills (bash, Python, MCP tool chains) with trigger patterns and verification steps
Scan cases/patterns for promotion candidates to skills
Assets & Exports
Create structured memory briefs as reusable indexed assets
Export briefings to markdown or JSON; export knowledge graph as interactive HTML
List and inspect all pinned memories and briefs
Health & Diagnostics
View aggregate stats, run health checks (vector dimensions, orphans, conflicts, duplicates), and get a quality health score
Drill down into individual memory entries
Integrations
Import from Claude Code, Claude.ai, ChatGPT, Slack, plaintext, Obsidian vaults, emails, and RSS feeds
Multilingual support (English built-in; Chinese, Japanese, Thai, and more via optional babel-memory package)
Provides integration examples for LangChain, enabling LangChain-based agents to store and retrieve persistent memories through the RecallNest memory layer via the HTTP API.
Offers integration examples for the OpenAI Agents SDK, allowing agents built with the OpenAI framework to access the RecallNest memory system for cross-session context persistence and memory retrieval.
RecallNest
Shared Memory Layer for Claude Code, Codex, and Gemini CLI
One memory. Three terminals. Context that survives across windows.
A local-first memory system backed by LanceDB that turns scattered conversation history into reusable knowledge — shared across your coding agents, recalled automatically.
Why RecallNest?
Coding agents forget everything between windows. Your context — project configs, debugging decisions, entity mappings — is scattered across Claude Code, Codex, and Gemini CLI with no shared memory.
RecallNest solves this: a single LanceDB-backed memory layer that all three terminals read and write. Context stored in one window is auto-recalled in another. Sessions checkpoint on exit and resume on start. Memory decays, evolves, and self-organizes — not just raw log storage.
Benchmark: LongMemEval (ICLR 2025)
Evaluated on 500 questions across 6 memory abilities (methodology):
Metric | RecallNest | Vector-only baseline | Delta |
Overall Accuracy | 29.6% | 24.2% | +5.4pp |
User Facts | 64.3% | 52.9% | +11.4pp |
Knowledge Update | 43.6% | 42.3% | +1.3pp |
Abstention Rate | 55.6% | 67.8% | -12.2pp |
Wins or ties in all 6 categories, with no regression. The hybrid retrieval pipeline (BM25 + vector + recency + RIF dedup) also lowers the abstention rate by 12.2 percentage points relative to vector-only search, which suggests it surfaces relevant context for more questions.
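The Delta column reports percentage points (absolute differences), not relative percent. The short check below recomputes the deltas from the table; the relative-gain helper is added here only for comparison and its figures are not reported by the benchmark.

```python
# Accuracy figures copied from the table above, in percent:
# (RecallNest, vector-only baseline).
results = {
    "Overall Accuracy": (29.6, 24.2),
    "User Facts": (64.3, 52.9),
    "Knowledge Update": (43.6, 42.3),
}


def delta_pp(recallnest: float, baseline: float) -> float:
    """Absolute difference in percentage points."""
    return round(recallnest - baseline, 1)


def relative_gain_pct(recallnest: float, baseline: float) -> float:
    """Gain relative to the baseline, in percent (derived, not from the source)."""
    return round(100.0 * (recallnest - baseline) / baseline, 1)


for name, (rn, base) in results.items():
    print(f"{name}: {delta_pp(rn, base):+}pp absolute, "
          f"{relative_gain_pct(rn, base):+}% relative")
```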
Quick Start
Option A: Claude Code Plugin (recommended)
/plugin marketplace add AliceLJY/recallnest
/plugin install recallnest@AliceLJY
RecallNest starts automatically with Claude Code. No manual MCP config needed.
Requires: Bun (recommended) or Node.js 18+. Dependencies install on first start.
Option B: npm install
npx recallnest --help # run directly
# or
npm install -g recallnest # install globally
recallnest doctor
Works with Node.js 18+ (via tsx) or Bun. No git clone needed.
Option C: Manual setup
git clone https://github.com/AliceLJY/recallnest.git
cd recallnest
bun install
cp config.json.example config.json
cp .env.example .env
# Edit .env → add your JINA_API_KEY
Start the server
bun run api
# → RecallNest API running at http://localhost:4318
Try it
# Store a memory
curl -X POST http://localhost:4318/v1/store \
-H "Content-Type: application/json" \
-d '{"text": "User prefers dark mode", "category": "preferences"}'
# Recall memories
curl -X POST http://localhost:4318/v1/recall \
-H "Content-Type: application/json" \
-d '{"query": "user preferences"}'
# Check stats
curl http://localhost:4318/v1/stats
Connect your terminals
bash integrations/claude-code/setup.sh
bash integrations/gemini-cli/setup.sh
bash integrations/codex/setup.sh
Each script installs MCP access and managed continuity rules, so resume_context fires automatically in fresh windows.
Index existing conversations
bun run src/cli.ts ingest --source all
bun run seed:continuity
bun run src/cli.ts doctor
Web UI
bun run src/ui-server.ts
# → http://localhost:4317
Core Capabilities
Access & Setup
Capability | Description |
CC Plugin | Install in Claude Code with one command — no manual config |
Shared Index | One LanceDB store for Claude Code, Codex, and Gemini CLI |
Dual Interface | MCP (stdio) for CLI tools + HTTP API for custom agents |
One-Click Setup | Integration scripts install MCP access and continuity rules |
Recall & Continuity
Capability | Description |
Hybrid Retrieval | 6-channel: vector + BM25 + L0/L1/L2 multi-vector + KG graph (PPR) |
4 Retrieval Profiles | default, writing, debug, fact-check — tuned for different tasks |
Session Continuity | Checkpoint and resume work state across windows (decisions, open loops, next actions, files) |
Session Distiller | 3-layer conversation compression: microcompact → LLM summary → knowledge extraction |
Conversation Import | Import from Claude Code, Claude.ai, ChatGPT, Slack, and plaintext |
Topic Tags | Intra-scope topic partitioning — auto-detected, filterable in search |
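The architecture diagram further down names RRF as the step that merges the vector and BM25 channels. As a rough illustration, here is a minimal sketch of reciprocal rank fusion over any number of ranked ID lists. The constant `k=60` is a commonly used default and an assumption here, as are the function name and the sample IDs; RecallNest's actual weighting across its six channels is not described in this README.

```python
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of memory IDs with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) to an ID's score, where rank
    starts at 1. IDs missing from a list simply get nothing from it.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] = scores.get(memory_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


# Hypothetical per-channel results for one query.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
print(rrf_fuse([vector_hits, bm25_hits]))
```

Because RRF works on ranks rather than raw scores, channels with incomparable score scales (cosine similarity, BM25) can be merged without normalization.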
Memory Lifecycle & Governance
Capability | Description |
Memory Evolution | Supersede chains, decay scoring, LLM importance, consolidation, archival |
Smart Promotion | Evidence → durable memory with conflict guards, merge resolution, and audit trail |
Privacy Tiers | 4-tier (ephemeral / private / durable / shared) |
Admission Control | Write-time gating: noise filter, importance floor, dedup, rate limiting |
Memory Lint | Contradiction, duplicate, stale, and orphan detection with health score |
Offline Consolidation | Offline memory consolidation (clustering, merging, pruning) |
Reasoning & Structure
Capability | Description |
Knowledge Graph | Entity relation graph with PPR algorithm for multi-hop questions |
Constructive Retrieval | Multi-source candidate expansion + grounded context reconstruction |
Narrative Architecture | 3-layer autobiographical metadata (life-period → general-event → specific-event) |
Skill Memory | Store, retrieve, and promote executable skills from recurring patterns |
Predictive Reminders | Behavioral-signal prediction engine surfaces "you might need this" suggestions |
6 Categories | profile, preferences, entities, events, cases, patterns — with category-aware merge strategies |
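To make the Knowledge Graph row concrete: PPR (Personalized PageRank) ranks graph nodes by how reachable they are from the entities mentioned in a query, which is what lets multi-hop neighbours surface. The sketch below is a plain power-iteration version under stated assumptions: the graph shape, the damping factor `alpha=0.85`, the iteration count, and all node names are illustrative and not taken from RecallNest's code.

```python
from typing import Dict, Iterable, List


def personalized_pagerank(
    graph: Dict[str, List[str]],
    seeds: Iterable[str],
    alpha: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power-iteration PPR. `graph` maps a node to its outgoing neighbours."""
    seed_set = set(seeds)
    if not seed_set:
        raise ValueError("at least one seed node is required")
    nodes = sorted(
        set(graph) | {m for nbrs in graph.values() for m in nbrs} | seed_set
    )
    # Restart distribution: uniform over the seed (query) entities.
    restart = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            neighbours = graph.get(n, [])
            if neighbours:
                share = alpha * rank[n] / len(neighbours)
                for m in neighbours:
                    nxt[m] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for s in nodes:
                    nxt[s] += alpha * rank[n] * restart[s]
        rank = nxt
    return rank


# Hypothetical entity graph: a two-hop chain plus an unrelated component.
kg = {
    "auth-bug": ["jwt-lib"],
    "jwt-lib": ["token-expiry"],
    "token-expiry": [],
    "unrelated": ["other"],
    "other": [],
}
scores = personalized_pagerank(kg, seeds={"auth-bug"})
print(sorted(scores.items(), key=lambda item: -item[1]))
```

Nodes two hops from the seed still receive a non-zero score, while nodes with no path from the seed stay at zero.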
Visibility & Operations
Capability | Description |
Dashboard | Web UI with stats, category distribution, growth trends, and health |
Workflow Observation | Dedicated append-only workflow health records, outside regular memory |
Structured Assets | Pins, briefs, and distilled summaries — not just raw logs |
Data Checkup | Data quality health checks on the memory store (including source health) |
Source Heartbeats | Automatic ingest health tracking per data source with staleness alerts |
Export Graph | Export interactive HTML knowledge graph visualization |
Batch Operations | Store up to 20 memories in a single call with dedup |
Connector Framework | Standard connector-v1 format for external data sources with example adapters |
New in v2.1: Philosophy-Informed Memory
v2.0 built the operational memory platform; v2.1 added philosophy-informed memory behavior.
Five upgrades derived from 9 research dimensions in philosophy of memory, each mapped to concrete engineering:
Emotion-Aware Decay (Affective Memory Theory): Memories with strong emotional content decay 20-30% slower. Keyword-based emotion detection computes salience (mnemonic significance), which feeds into the Weibull half-life formula and a rebalanced 4-factor evolution score. Zero LLM cost.
Memory Ethics Layer (Right to Be Forgotten / GDPR Art. 17): Four privacy tiers (ephemeral / private / durable / shared). Cascade forgetting engine that propagates deletion through KG triples, evolution chains, pin assets, and briefs. Full audit trail. forget_memory MCP tool for agent-driven deletion.
Autobiographical Narrative (Narrative Identity Theory / Conway's 3-layer model): Memories are tagged with a lifePeriod → generalEvent → specificEvent hierarchy, orthogonal to the existing 6 categories. Retrieval pulls narrative siblings. Context rendering groups by life period. Rule-based tagger with EN+CN support.
Constructive Retrieval (Simulation Theory / Michaelian): Instead of returning raw stored text, RecallNest now reconstructs context from an expanded candidate set: KG neighbors + evolution chains + cluster members + narrative siblings. Source-map grounded coverage replaces lexical overlap. Contradictions are detected and flagged.
Predictive Prospective Memory (Mental Time Travel / Tulving): Heuristic prediction engine that surfaces "you might need this" reminders from behavioral signals: stale checkpoint open loops, corrected workflow observations, high-frequency dormant memories, and uncovered query topics. Zero LLM cost. Reminders auto-expire after 7 days if unaccepted.
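The emotion-aware decay item can be sketched as a Weibull-shaped retention curve whose half-life is stretched by importance and emotional salience. This is only one plausible form: the README states that importance modulates the half-life and that salience extends it by up to 30%, but the exact formula, the `0.5 + importance` scaling, the base half-life, and every name below are assumptions for illustration.

```python
import math


def effective_half_life(
    base_days: float,
    importance: float,
    salience: float,
    max_emotion_boost: float = 0.30,
) -> float:
    """Half-life in days after importance and emotion modulation.

    `importance` and `salience` are clamped to [0, 1]. The importance
    scaling (0.5x to 1.5x) is a hypothetical choice; the 30% ceiling for
    emotional salience follows the figure quoted in this README.
    """
    importance = min(max(importance, 0.0), 1.0)
    salience = min(max(salience, 0.0), 1.0)
    return base_days * (0.5 + importance) * (1.0 + max_emotion_boost * salience)


def retention(age_days: float, half_life_days: float, shape: float = 1.0) -> float:
    """Weibull-style retention, parameterized so retention(half_life) == 0.5.

    shape < 1 gives a fast early drop with a long tail; shape > 1 keeps
    memories strong at first and then drops them more sharply.
    """
    return math.exp(-math.log(2.0) * (age_days / half_life_days) ** shape)


neutral = effective_half_life(base_days=30.0, importance=0.5, salience=0.0)
emotional = effective_half_life(base_days=30.0, importance=0.5, salience=1.0)
print(retention(30.0, neutral), retention(30.0, emotional))
```

At the same age, the emotionally salient memory retains more of its score because its half-life is longer, which matches the "decays slower" behavior described above.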
New in v2.2: Retrieval Quality Hardening
v2.1 added philosophy-informed behavior; v2.2 closes the last three engine-layer gaps identified by a frontier research scan (ACC, PI-LLM, TSM).
Memory Confidence Meta-tags (ACC / Dual-Process UQ): Each memory now carries structured ConfidenceMetadata (score, reliability tier: direct / inferred / hearsay). Auto-assigned from source on write (manual = 0.9, agent = 0.7, conversation_import = 0.5). Retrieval scores are weighted by confidence. resume_context tags low-confidence items with [低置信] ("low confidence").
Interference Detection + Active Forgetting Gate (PI-LLM / SleepGate): Semantic cluster detection identifies groups of near-duplicate memories competing for retrieval. Enhanced RIF keeps only top-K (default 3) per cluster; extras are demoted 50% instead of removed. Write-time pre-warning: when a scope accumulates ≥5 high-similarity active memories, the weakest is flagged pending_review. data_checkup reports interference density.
Temporal Validity Windows (TSM / TiMem / Zep): store_memory accepts validUntil (expiration) and eventTime (when the event actually happened). search_memory supports validAt (point-in-time query) and includeExpired (demote 80% instead of hide). Auto-GC applies 2× decay acceleration to expired memories.
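The confidence and validity rules above can be read as two adjustments applied to a base retrieval score. The sketch below assumes both are simple multipliers and that an unknown source falls back to the lowest listed confidence; the README gives the numbers (0.9 / 0.7 / 0.5, and an 80% demotion for expired entries) but not how they are combined, so the function and its parameter names are illustrative only.

```python
from datetime import datetime, timezone
from typing import Optional

# Source-to-confidence mapping as listed in this README.
SOURCE_CONFIDENCE = {"manual": 0.9, "agent": 0.7, "conversation_import": 0.5}


def adjusted_score(
    base_score: float,
    source: str,
    valid_until: Optional[datetime],
    valid_at: datetime,
    include_expired: bool = False,
) -> Optional[float]:
    """Return the adjusted score, or None when the memory should be hidden.

    Assumptions: confidence is applied multiplicatively, and unknown
    sources use 0.5. Expired memories are hidden unless include_expired
    is set, in which case they keep 20% of their score (an 80% demotion).
    """
    score = base_score * SOURCE_CONFIDENCE.get(source, 0.5)
    expired = valid_until is not None and valid_at > valid_until
    if expired:
        if not include_expired:
            return None
        score *= 0.2
    return score


query_time = datetime(2025, 6, 1, tzinfo=timezone.utc)
expiry = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(adjusted_score(1.0, "manual", None, query_time))
print(adjusted_score(1.0, "agent", expiry, query_time, include_expired=True))
```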
New in v2.3: Connector Ecosystem + Source Health
v2.2 hardened retrieval quality; v2.3 opens RecallNest to external data sources with a standard connector framework and operational health monitoring.
Connector-v1 Standard (GB-2): A JSON format (ConnectorOutputV1) that any external script can produce. Obsidian vaults, emails, RSS feeds, log files: normalize once, then ingest through the full dedup/embed/extract pipeline. See docs/connector-spec.md for the specification and connectors/examples/ for adapter skeletons (email, logs, RSS).
Obsidian Vault Ingestion (GB-1): First-party Obsidian connector that scans .md files, extracts frontmatter + wikilinks, and maps folder structure to tags. One command: lm ingest --obsidian /path/to/vault.
Source Health Monitoring (GB-3): Every connector ingest writes a heartbeat to data/source-heartbeat.json. data_checkup flags stale sources (>7d warning, >30d error). doctor --ci shows a per-source heartbeat summary with human-readable age.
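The staleness rule for source heartbeats (>7d warning, >30d error) is easy to express as a small classifier. The sketch below takes the last ingest time directly as an argument rather than reading data/source-heartbeat.json, since the layout of that file is not documented here; the function name and the "ok" label are assumptions.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_status(
    last_ingest: datetime,
    now: datetime,
    warn_after: timedelta = timedelta(days=7),
    error_after: timedelta = timedelta(days=30),
) -> str:
    """Classify a data source by the age of its last successful ingest.

    Thresholds are strict, matching the ">7d" and ">30d" wording: a source
    that is exactly 7 days old is still reported as "ok".
    """
    age = now - last_ingest
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warning"
    return "ok"


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
for days in (3, 8, 31):
    print(days, heartbeat_status(now - timedelta(days=days), now))
```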
Architecture
┌──────────────────────────────────────────────────────────┐
│ Client Layer │
├──────────┬──────────┬──────────┬──────────────────────────┤
│ Claude │ Gemini │ Codex │ Custom Agents / curl │
│ Code │ CLI │ │ │
└────┬─────┴────┬─────┴────┬─────┴──────┬──────────────────┘
│ │ │ │
└──── MCP (stdio) ───┘ HTTP API (port 4318)
│ │
▼ ▼
┌──────────────────────────────────────────────────────────┐
│ Integration Layer │
│ ┌─────────────────────┐ ┌────────────────────────────┐ │
│ │ MCP Server │ │ HTTP API Server │ │
│ │ 41 tools │ │ 21 endpoints │ │
│ └─────────┬───────────┘ └──────────┬─────────────────┘ │
└────────────┼─────────────────────────┼───────────────────┘
└──────────┬──────────────┘
▼
┌──────────────────────────────────────────────────────────┐
│ Core Engine │
│ │
│ ┌────────────┐ ┌────────────┐ ┌─────────────────────┐ │
│ │ Retriever │ │ Classifier │ │ Context Composer │ │
│ │ (vector + │ │ (6 cats) │ │ (resume_context) │ │
│ │ BM25 + RRF)│ │ │ │ │ │
│ └────────────┘ └────────────┘ └──────────────────────┘ │
│ ┌────────────┐ ┌────────────┐ ┌─────────────────────┐ │
│ │ Decay │ │ Conflict │ │ Capture Engine │ │
│ │ Engine │ │ Engine │ │ (evidence → durable) │ │
│ │ (Weibull) │ │ (audit + │ │ │ │
│ │ │ │ merge) │ │ │ │
│ └────────────┘ └────────────┘ └──────────────────────┘ │
└──────────────────────────┬───────────────────────────────┘
▼
┌──────────────────────────────────────────────────────────┐
│ Storage Layer │
│ ┌─────────────────────┐ ┌────────────────────────────┐ │
│ │ LanceDB │ │ Jina Embeddings v5 │ │
│ │ (vector + columnar) │ │ (1024-dim, task-aware) │ │
│ └─────────────────────┘ └────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
Internal Design
L0 / L1 / L2 Dynamic Folding — every memory stores 3 granularity layers (one-liner / bullet summary / full content); retrieval dynamically selects which layer to return based on relevance score and token budget
Weibull Decay + Emotion Modulation — memories decay along a parametric Weibull curve; importance scores modulate the half-life, and emotional salience extends it further (up to 30%)
Vector Pre-filter + LLM Dedup — 90% of dedup decisions use cheap cosine similarity (>= 0.92); only borderline cases invoke LLM judgment, keeping costs low without sacrificing accuracy
Category-Aware Merge Strategies: profile and preferences use merge-on-conflict (latest wins); events and cases use append-only (history preserved)
Display Score vs Elimination Score: dual-track retrieval in which a tier floor prevents core memories from ever dropping out, while a decay boost lets fresh memories surface temporarily without permanently displacing stable ones
Full architecture deep-dive: docs/architecture.md
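The vector pre-filter for dedup can be pictured as a three-way decision on the best cosine similarity against existing memories. The 0.92 duplicate threshold comes from the list above; the lower edge of the "borderline" band (0.85 here), the return labels, and the function names are hypothetical, since the README does not say where LLM judgment kicks in.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedup_decision(
    new_vec: Sequence[float],
    existing_vecs: List[Sequence[float]],
    duplicate_at: float = 0.92,   # threshold quoted in this README
    borderline_at: float = 0.85,  # hypothetical lower edge of the LLM band
) -> str:
    """Return "duplicate", "ask_llm", or "new" for an incoming memory."""
    best = max(
        (cosine_similarity(new_vec, vec) for vec in existing_vecs),
        default=0.0,
    )
    if best >= duplicate_at:
        return "duplicate"   # cheap path: no LLM call needed
    if best >= borderline_at:
        return "ask_llm"     # borderline: defer to LLM judgment
    return "new"


stored = [[1.0, 0.0]]
print(dedup_decision([1.0, 0.0], stored))
print(dedup_decision([0.9, 0.4358898943540674], stored))
print(dedup_decision([0.0, 1.0], stored))
```

Only the middle band costs an LLM call, which is how most decisions stay on the cheap similarity path.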
Interfaces
RecallNest serves two interfaces:
MCP — for Claude Code, Gemini CLI, and Codex (native tool access)
HTTP API — for custom agents, SDK-based apps, and any HTTP client
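For the HTTP side, a custom agent needs nothing beyond a JSON POST. The sketch below mirrors the curl calls from the Quick Start using only the Python standard library. The paths and body fields (/v1/store with text and category, /v1/recall with query) are the ones shown there; the helper names are made up, and treating the response body as JSON is an assumption.

```python
import json
import urllib.request

BASE_URL = "http://localhost:4318"  # default API address from the Quick Start


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for a RecallNest HTTP endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def store_request(text: str, category: str) -> urllib.request.Request:
    """Request that stores one memory, as in the curl example."""
    return build_request("/v1/store", {"text": text, "category": category})


def recall_request(query: str) -> urllib.request.Request:
    """Request that recalls memories matching a query."""
    return build_request("/v1/recall", {"query": query})


def send(request: urllib.request.Request, timeout: float = 10.0):
    """Send a request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# With the API server running (bun run api), usage would look like:
#   send(store_request("User prefers dark mode", "preferences"))
#   send(recall_request("user preferences"))
```

Keeping request construction separate from sending makes the client easy to test without a running server.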
Agent framework examples
Examples live in integrations/examples/:
Framework | Example | Language |
| TypeScript | |
| Python | |
| Python |
Tool | Description |
| Store an append-only workflow observation outside regular memory |
| Inspect workflow observation health or show a degraded-workflow dashboard |
| Build an evidence pack for a workflow primitive |
| Store a durable memory for future windows |
| Store a reusable workflow as durable |
| Store a reusable problem-solution pair as durable |
| Explicitly promote evidence into durable memory |
| List or inspect promotion conflict candidates |
| Summarize stale/escalated conflict priorities |
| Preview or apply conflict escalation metadata |
| Resolve a stored conflict candidate (keep / accept / merge) |
| Store the current active work state outside durable memory |
| Inspect the latest saved checkpoint by session or scope |
| Compose startup context for a fresh window |
| Proactive recall at task start |
| Explain why memories matched |
| Distill results into a compact briefing |
| Create a structured brief and re-index it |
| Promote a scoped memory into a pinned asset |
| Export a distilled memory briefing to disk |
| List pinned memories |
| List all structured assets |
| Preview outdated brief assets created before the cleanup rules |
| Archive dirty brief assets and remove their indexed rows |
| Show index statistics |
| Inspect a specific memory entry with full metadata and provenance |
| Heuristically extract and store memory signals from text (zero LLM calls) |
| Set a prospective memory reminder to surface in a future session |
| Cluster near-duplicate memories and merge them (dry-run by default) |
| Store an executable skill with trigger conditions and verification |
| Retrieve matching executable skills by semantic similarity |
| Scan cases/patterns for promotion candidates to skills |
| Discover available tools by tier (core/advanced/full) |
| Store up to 20 memories in a single call with dedup |
| Distill a conversation into structured knowledge via 3-layer pipeline |
| Import conversations from Claude Code, ChatGPT, Slack, and more |
| Run data quality health checks on the memory store |
| Run offline memory consolidation (clustering, merging, pruning) |
| Run memory quality checks: contradictions, duplicates, stale entries, orphans |
| Cascade-delete a memory with KG cleanup, pin archival, and audit trail |
| Export memories as an interactive HTML knowledge graph |
Base URL: http://localhost:4318
Endpoint | Method | Description |
| POST | Quick semantic search |
| POST | Store a new memory |
| POST | Store multiple structured memories |
| POST | Store a structured workflow pattern |
| POST | Store a structured problem-solution case |
| POST | Promote evidence into durable memory |
| GET | List or inspect promotion conflict candidates |
| GET | Summarize stale/escalated conflict priorities |
| POST | Preview or apply conflict escalation metadata |
| POST | Resolve a stored conflict candidate (keep / accept / merge) |
| POST | Store the current work checkpoint |
| POST | Store a workflow observation outside durable memory |
| GET | Fetch the latest checkpoint by session or scope |
| GET | Inspect workflow health or return a degraded-workflow dashboard |
| GET | Build a workflow evidence pack from recent issue observations |
| POST | Compose startup context for a fresh window |
| POST | Advanced search with full metadata |
| GET | Memory statistics |
| GET | Memory quality lint report |
| GET | Health check |
Full documentation: docs/api-reference.md
# Search & explore
bun run src/cli.ts search "your query"
bun run src/cli.ts explain "your query" --profile debug
bun run src/cli.ts distill "topic" --profile writing
bun run src/cli.ts stats
# Workflow observation
bun run src/cli.ts workflow-observe resume_context "Fresh window skipped continuity recovery." --outcome missed --scope project:recallnest
bun run src/cli.ts workflow-health resume_context --scope project:recallnest
bun run src/cli.ts workflow-evidence checkpoint_session --scope project:recallnest
# Conflict management
bun run src/cli.ts conflicts list
bun run src/cli.ts conflicts list --attention resolved
bun run src/cli.ts conflicts list --group-by cluster --attention resolved
bun run src/cli.ts conflicts audit
bun run src/cli.ts conflicts audit --export --format md
bun run src/cli.ts conflicts escalate --attention stale
bun run src/cli.ts conflicts show af70545a
bun run src/cli.ts conflicts resolve af70545a --keep-existing
bun run src/cli.ts conflicts resolve af70545a --merge
bun run src/cli.ts conflicts resolve --all --keep-existing --status open
# Memory health & visualization
bun run src/cli.ts lint # memory quality report
bun run src/cli.ts lint --scope project:myapp # lint a specific scope
bun run src/cli.ts graph --open # export & open knowledge graph
bun run src/cli.ts graph --max-nodes 50 # smaller graph
# Ingestion & diagnostics
bun run src/cli.ts ingest --source all
bun run src/cli.ts doctor
Multilingual Support
RecallNest works out of the box with English. For multilingual memory (Chinese, Japanese, Thai, and 20+ more), install babel-memory with the language packs you need:
# Chinese
npm install babel-memory jieba-wasm
# Japanese
npm install babel-memory @sglkc/kuromoji
# Thai
npm install babel-memory wordcut
# European languages (German, French, Spanish, Russian, etc.)
npm install babel-memory snowball-stemmers
# Multiple languages at once
npm install babel-memory jieba-wasm @sglkc/kuromoji snowball-stemmers
RecallNest auto-detects babel-memory at startup, so no configuration is needed. Without babel-memory, RecallNest still works with standard BM25 text search.
Project Status & Roadmap
RecallNest is actively maintained. All major architecture phases are complete — see the full Roadmap for current priorities and future plans.
Relationship to memory-lancedb-pro
RecallNest started as a fork of memory-lancedb-pro and shares its core ideas around hybrid retrieval, decay modeling, and memory-as-engineering-system. The key difference:
memory-lancedb-pro is an OpenClaw plugin — it adds long-term memory to a single OpenClaw agent.
RecallNest is a standalone memory layer — it serves Claude Code, Codex, and Gemini CLI simultaneously through MCP + HTTP API, with session continuity, structured assets, and conflict management built in.
Credit
Source | Contribution |
memory-lancedb-pro | Fork base: hybrid retrieval, decay modeling, and memory architecture |
Claude Code | Foundation and early project scaffolding |
OpenAI Codex | Productization and MCP expansion |
Special thanks to Qin Chao (@win4r) and the CortexReach team for the foundational work.
Part of the 小试AI open-source AI workflow:
Project | Description |
Multilingual preprocessing for BM25 — 27+ languages, zero deps | |
5-stage AI writing pipeline | |
Image generation + layout + WeChat publishing | |
Run Claude Code / Codex / Gemini in WeChat with session management | |
Telegram bots for Claude, Codex, and Gemini | |
Telegram CLI bridge for Gemini CLI | |
Docker ↔ host CLI bridge (/cc /codex /gemini) | |
OpenClaw bots configuration and memory backup | |
Build digital clones from corpus data | |
Multi-session collaboration platform for Claude Code | |
Web-based Claude chat client (PWA) — self-hosted, iPad-ready | |
One-command installer for memory + remote control | |
Complete Claude Code workflow scaffold |
License
MIT