ccRecall

by tznthou

License: Apache 2.0 TypeScript Node.js SQLite

A local memory service for Claude Code — indexes your conversation history, recalls relevant context on demand, and injects it into future sessions. Zero API cost for the core service; optional post-session extraction via Haiku adds ~$0.001/session.

Manifesto

Five stances, non-negotiable:

Local-first. Your machine, your data. No cloud, no accounts, no services beyond localhost.
SQLite is the API. One .db file. sqlite3 to inspect anything. Your memories are never a black box.
FTS5, not vectors. Code conversations search for file paths, error messages, tool names — keywords, not vibes. At hundreds of sessions, embeddings solve a problem that doesn't exist.
LLM distills, human curates. Haiku extracts post-session (~$0.001). Auto memory holds decisions you name. ccRecall holds the long tail no one writes down by hand.
Read-only toward ~/.claude/. Reads your JSONL logs, never writes to Claude Code's state. Worst case: bad search results. Never: a broken setup.

Core Concept

Every time you start a new Claude Code session, the AI forgets everything. The architecture you spent 20 minutes explaining, the bug you debugged together, the decisions you made — all gone. You start over.

CLAUDE.md and RESUME.md help, but they're static files you maintain by hand. ccRecall automates this: it reads your JSONL conversation logs, builds a searchable index, and serves relevant memories back to Claude Code through hooks and MCP tools. The AI remembers what it learned — you don't have to remind it.

ccRecall is the "memory" counterpart to ccRewind (a conversation replay GUI). ccRewind lets humans look back at what happened; ccRecall lets the AI remember what happened.

Note: This project is unrelated to spences10/ccrecall, an analytics-focused tool that happens to share the name. Because the npm package ccrecall is already taken, we publish as @tznthou/ccrecall and the CLI binary is named ccmem.

Dogfood baseline

Single-user daily-driver data, not a controlled study. Updated 2026-07-23.

Metric	Value
Running continuously	97 days
Sessions indexed	2,298 across 46 projects
Memories stored	532 (93% keyed for dedup)
Topics in knowledge map	18,495
DB on disk	50 MB

n = 1. These numbers show the system runs and accumulates data — not that every memory is useful. We don't yet have a good metric for "did this memory actually help the session." That's an open problem.

What the numbers don't show: without ccRecall, every session starts cold — you re-explain context the AI already learned yesterday. With it, startup injection surfaces the relevant memory and the session picks up mid-thought. One injection that saves a five-minute preamble pays for the entire daemon.

Features

Feature	Description
Rule-based summarization	Extracts intent, activity, outcome, and tags from sessions — no LLM calls, zero API cost
FTS5 full-text search	Sub-100ms keyword search across all conversation history, fast enough for hook injection
CJK / mixed-script search	Trigram tokenizer indexes Chinese / Japanese / Korean text; per-token AND LIKE fallback handles short queries like `UI 記憶` that trigram alone can't match
Incremental indexing	Only re-indexes sessions that changed (mtime diffing), handles resumed sessions via UUID dedup
Metacognition	`knowledge_map` aggregates topic mentions from sessions + memories. Depth derived from mention count (shallow / medium / deep). Exposed via MCP `recall_context`
Forgetting curve	Memories compress over time: raw → summary → one-liner → deleted. Confidence decays on unused memories. Background maintenance tick runs every 5 min
Cross-project memory (v0.4.1)	Memories surface across projects via topic intersection — if two projects share a `knowledge_map` topic, high-confidence memories from one appear in the other's startup injection (max 3 rows, confidence ≥ 0.8 gate)
Watch mode	chokidar-based JSONL watcher picks up new sessions within 2 s; periodic 10 min full-resync covers missed filesystem events
Rescue reindex	`/session/end` and `/session/last` both retry a reindex on miss, and `/session/last` staleness-gates via `notBefore` — no fresh-session race between the hook, the wrapper, and the daemon
Auto-start (macOS)	`ccmem install-daemon` registers a LaunchAgent so the service stays up across reboots
Read-only	Never modifies `~/.claude/` — only reads JSONL logs
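The forgetting curve in the table above can be pictured as a small state machine. The sketch below is illustrative only: the field names, the idle-day schedule, and the 30-day half-life are assumptions, and only the stage order (raw → summary → one-liner → deleted) comes from this README.

```typescript
// Hypothetical sketch: names, thresholds, and the half-life are assumptions.
type Stage = "raw" | "summary" | "one-liner" | "deleted";

interface Memory {
  stage: Stage;
  confidence: number; // confidence at the time of last use, 0..1
  lastUsedAt: number; // epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;
const HALF_LIFE_DAYS = 30; // assumed: confidence halves every 30 idle days

// Assumed schedule: the stage a memory reaches after N idle days.
const SCHEDULE: ReadonlyArray<{ stage: Stage; afterIdleDays: number }> = [
  { stage: "raw", afterIdleDays: 0 },
  { stage: "summary", afterIdleDays: 7 },
  { stage: "one-liner", afterIdleDays: 30 },
  { stage: "deleted", afterIdleDays: 90 },
];

const rank = (stage: Stage): number =>
  SCHEDULE.findIndex((step) => step.stage === stage);

// Confidence decays exponentially with idle time.
function effectiveConfidence(memory: Memory, now: number): number {
  const idleDays = Math.max(0, now - memory.lastUsedAt) / DAY_MS;
  return memory.confidence * Math.pow(0.5, idleDays / HALF_LIFE_DAYS);
}

// One maintenance tick: move forward along the schedule, never backward.
function nextStage(memory: Memory, now: number): Stage {
  const idleDays = Math.max(0, now - memory.lastUsedAt) / DAY_MS;
  let target: Stage = memory.stage;
  for (const step of SCHEDULE) {
    if (idleDays >= step.afterIdleDays && rank(step.stage) > rank(target)) {
      target = step.stage;
    }
  }
  return target;
}
```

Because both functions are pure, a background tick can call them as often as it likes without compounding the decay.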

Architecture

flowchart TB
    subgraph Input["Data Source (read-only)"]
        JSONL["~/.claude/projects/*/*.jsonl"]
    end

    subgraph Core["ccRecall Service (port 7749)"]
        Watcher["Watcher<br/>chokidar, 2 s debounce"]
        Scanner["Scanner<br/>find JSONL files"]
        Parser["Parser<br/>parse conversations"]
        Summarizer["Summarizer<br/>rule-based extraction"]
        DB[("SQLite + FTS5<br/>index & search")]
        API["HTTP API<br/>5 endpoints"]
    end

    subgraph Consumers["Consumers"]
        Hook["Claude Code Hooks<br/>SessionStart / SessionEnd"]
        Wrapper["Extraction wrapper<br/>post-session Haiku pass"]
        MCP["MCP Server<br/>recall_query / recall_save"]
    end

    JSONL --> Watcher --> Scanner --> Parser --> Summarizer --> DB
    DB <--> API
    Hook -->|"inject / end-confirm"| API
    Wrapper -->|"GET /session/last"| API
    MCP <-->|"shared SQLite via WAL"| DB

Arrows point in the direction of the call: hooks and the extraction wrapper talk to the daemon over HTTP, while the MCP server is a separate process that opens the same SQLite file directly (WAL mode) — it never goes through the HTTP API.

Session lifecycle

How the two timing-sensitive ends of a session actually flow:

sequenceDiagram
    participant CC as Claude Code
    participant H as Hooks
    participant R as ccRecall
    participant W as Wrapper
    participant X as Haiku extractor

    rect rgb(235, 244, 255)
    Note over CC,R: Session start
    CC->>H: SessionStart
    H->>R: GET /memory/startup
    R-->>H: 3-tier memory pick (< 300 tokens)
    H-->>CC: inject into context
    end

    Note over CC,R: During the session, the watcher<br/>reindexes JSONL writes (2 s debounce)

    rect rgb(255, 244, 235)
    Note over CC,X: Session end
    CC->>H: SessionEnd
    H->>R: POST /session/end (index confirm + rescue)
    W->>R: GET /session/last?notBefore=launch_ts
    R-->>W: sessionId (staleness-gated, subagent-filtered)
    W->>X: text-only transcript
    X->>R: recall_save × 0–5 (via MCP)
    end

The notBefore gate keeps a stale session from being extracted twice; the subagent filter keeps an Agent-tool side-session from shadowing the real one. Both guards exist because the wrapper queries within seconds of session close, racing the watcher.
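As a rough illustration of those two guards, the selection behind /session/last might look like the following. The row shape and field names (`lastActivityAt`, `isSubagent`) are hypothetical; the real schema may differ.

```typescript
// Hypothetical row shape; field names are assumptions for illustration.
interface SessionRow {
  sessionId: string;
  cwd: string;
  lastActivityAt: number; // epoch milliseconds
  isSubagent: boolean;
}

// Pick the most recent main session for a project path.
// notBefore: sessions with no activity since this timestamp count as stale.
function pickLastSession(
  rows: SessionRow[],
  cwd: string,
  notBefore: number,
): SessionRow | undefined {
  return rows
    .filter((r) => r.cwd === cwd)
    .filter((r) => !r.isSubagent) // subagent filter
    .filter((r) => r.lastActivityAt >= notBefore) // staleness gate
    .sort((a, b) => b.lastActivityAt - a.lastActivityAt)[0];
}
```

Returning `undefined` instead of an older session is the point of the gate: a miss can trigger a rescue reindex, whereas a stale hit would silently extract the wrong transcript.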

Tech Stack

Technology	Purpose	Notes
Node.js 20–22 + TypeScript	Runtime	ES modules, strict mode
better-sqlite3	Database	Synchronous API, zero external deps
FTS5	Full-text search	Built into SQLite, trigram tokenizer with LIKE fallback for short CJK / mixed-script queries
Native `http`	HTTP server	No Express — minimal surface, localhost only
chokidar	Filesystem watcher	Cross-platform JSONL change detection with 2 s debounce + single-flight
vitest	Testing	542 tests across 34 files, integration-style
`@modelcontextprotocol/sdk`	MCP server	stdio transport, shared SQLite via WAL
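The LIKE fallback exists because an FTS5 trigram index cannot match terms shorter than three characters. A minimal sketch of that decision follows; the table and column names (`memories`, `memories_fts`, `content`) are hypothetical, and the function only builds a query plan rather than running it.

```typescript
// Table and column names are hypothetical, chosen for illustration.
interface SearchPlan {
  kind: "fts" | "like";
  sql: string;
  params: string[];
}

// The trigram tokenizer cannot match terms shorter than three characters.
const MIN_TRIGRAM_CHARS = 3;

function planSearch(query: string): SearchPlan | null {
  const tokens = query.trim().split(/\s+/).filter((t) => t.length > 0);
  if (tokens.length === 0) return null;

  // Count code points, not UTF-16 units, so astral characters count once.
  const trigramSafe = tokens.every(
    (t) => Array.from(t).length >= MIN_TRIGRAM_CHARS,
  );

  if (trigramSafe) {
    // Quote each token as an FTS5 string (embedded quotes are doubled).
    const match = tokens
      .map((t) => `"${t.replace(/"/g, '""')}"`)
      .join(" AND ");
    return {
      kind: "fts",
      sql: "SELECT rowid FROM memories_fts WHERE memories_fts MATCH ?",
      params: [match],
    };
  }

  // Fallback: every token must appear somewhere (per-token AND LIKE).
  const escapeLike = (t: string) => t.replace(/[\\%_]/g, (ch) => `\\${ch}`);
  const where = tokens.map(() => "content LIKE ? ESCAPE '\\'").join(" AND ");
  return {
    kind: "like",
    sql: `SELECT id FROM memories WHERE ${where}`,
    params: tokens.map((t) => `%${escapeLike(t)}%`),
  };
}
```

With this plan, `UI 記憶` takes the LIKE path because both tokens are two characters long, while `authentication` stays on the indexed FTS path.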

Quick Start

First time here? The full walkthrough (install via npm → MCP setup → everyday usage) lives in docs/tutorial.md. The section below is the contributor / dev-mode path.

Prerequisites

Node.js >=20.0.0,<23.0.0
pnpm

Installation

git clone https://github.com/tznthou/ccRecall.git
cd ccRecall

pnpm install

# Start development server (auto-indexes on startup, watches ~/.claude/projects)
pnpm dev

The service starts at http://127.0.0.1:7749 and indexes all JSONL files in ~/.claude/projects/.
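Startup indexing is incremental: only files whose mtime changed are re-read, and resumed sessions are deduplicated by message UUID. The sketch below shows one way to express both checks; the types and function names are hypothetical, not the project's actual code.

```typescript
// Hypothetical types and names, for illustration only.
interface FileStamp {
  path: string;
  mtimeMs: number;
}

// Return the JSONL paths whose mtime differs from the one recorded at last index.
// Files never indexed before have no recorded mtime and are always included.
function filesToReindex(
  onDisk: FileStamp[],
  indexedMtimes: Map<string, number>,
): string[] {
  return onDisk
    .filter((f) => indexedMtimes.get(f.path) !== f.mtimeMs)
    .map((f) => f.path);
}

// A resumed session can replay earlier messages; keep only unseen UUIDs.
function unseenMessages<T extends { uuid: string }>(
  messages: T[],
  seen: Set<string>,
): T[] {
  const fresh: T[] = [];
  for (const m of messages) {
    if (seen.has(m.uuid)) continue;
    seen.add(m.uuid);
    fresh.push(m);
  }
  return fresh;
}
```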

Verify

# Health check — should show mainSessionCount > 0
curl http://127.0.0.1:7749/health

# Search your conversation history
curl "http://127.0.0.1:7749/memory/query?q=authentication&limit=5"
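The same checks can be scripted from TypeScript. This is a small sketch that assumes Node 18+ (for the global `fetch`) and assumes the endpoints respond with JSON; the helper names are hypothetical, and the response shape is left as `unknown` because it is not documented here.

```typescript
// Assumes the daemon is listening on the default address from this README.
const BASE_URL = "http://127.0.0.1:7749";

// Build a /memory/query URL with properly encoded parameters.
function queryUrl(q: string, limit = 5, project?: string): string {
  const url = new URL("/memory/query", BASE_URL);
  url.searchParams.set("q", q);
  url.searchParams.set("limit", String(limit));
  if (project !== undefined) url.searchParams.set("project", project);
  return url.toString();
}

// Fetch a URL and parse the body as JSON (assumed response format).
async function getJson(url: string): Promise<unknown> {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.json();
}
```

For example, `getJson(`${BASE_URL}/health`).then(console.log)` prints the health payload, and `getJson(queryUrl("authentication")).then(console.log)` mirrors the second curl command.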

API Endpoints

Five endpoints, each with a live caller — v0.5.0 removed the other eight (/journal/*, /memory/save, /memory/context, /metacognition/check, /session/checkpoint, /lint/warnings); they now return 404.

Endpoint	Method	Description	Caller
`/health`	GET	Service health + DB stats + integrity check status	CLI, extraction wrapper
`/memory/startup?project=...`	GET	SessionStart-tier retrieval: cold + recent-confidence + FTS fallback, token-budgeted	SessionStart hook
`/memory/query?q=...&limit=...&project=...`	GET	FTS5 search across memories with optional project filter	SessionStart hook (keyword tier)
`/session/end`	POST	Confirm the just-ended session is indexed (rescue reindex on miss)	SessionEnd hook
`/session/last?cwd=...`	GET	Most recent session metadata for a project path (staleness-gated via `notBefore`)	Extraction wrapper
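"Token-budgeted" retrieval for /memory/startup can be pictured as greedy packing across the three tiers. In the sketch below the tier priority, the four-characters-per-token estimate, and all names are assumptions; only the tier list and the roughly 300-token ceiling come from this README.

```typescript
// Hypothetical sketch: tier priority and the token estimate are assumptions.
type Tier = "cold" | "recent-confidence" | "fts-fallback";

interface Candidate {
  text: string;
  tier: Tier;
}

const TIER_PRIORITY: Record<Tier, number> = {
  cold: 0,
  "recent-confidence": 1,
  "fts-fallback": 2,
};

// Rough heuristic: about four characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Greedily keep candidates in tier order, skipping any that would exceed the budget.
function packWithinBudget(candidates: Candidate[], budget = 300): Candidate[] {
  const ordered = [...candidates].sort(
    (a, b) => TIER_PRIORITY[a.tier] - TIER_PRIORITY[b.tier],
  );
  const picked: Candidate[] = [];
  let used = 0;
  for (const c of ordered) {
    const cost = estimateTokens(c.text);
    if (used + cost > budget) continue;
    picked.push(c);
    used += cost;
  }
  return picked;
}
```

Skipping an oversized candidate rather than stopping lets a short lower-tier memory still fit into the remaining budget.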

MCP Tools

Tool	Purpose
`recall_query`	User-scoped FTS5 keyword search across memories with project-aware ranking. Cross-project memories surface via topic intersection
`recall_context`	Topic-clustered retrieval — normalizes keywords, groups memories by matched topic with depth signals, falls back to per-keyword FTS if no topic matches
`recall_save`	Store a new memory with optional `key` slug for dedup (same key updates instead of duplicating). Auto-extracts topics for cross-project retrieval
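Cross-project surfacing via topic intersection can be sketched as a filter over stored memories. The memory shape and function name are hypothetical; the cap of 3 rows and the 0.8 confidence gate are the values quoted in the Features table.

```typescript
// Hypothetical memory shape, for illustration only.
interface StoredMemory {
  id: number;
  project: string;
  topics: string[];
  confidence: number; // 0..1
}

// Memories from other projects that share at least one topic with the current one.
function crossProjectPicks(
  memories: StoredMemory[],
  currentProject: string,
  currentTopics: Set<string>,
  maxRows = 3,
  minConfidence = 0.8,
): StoredMemory[] {
  return memories
    .filter((m) => m.project !== currentProject)
    .filter((m) => m.confidence >= minConfidence)
    .filter((m) => m.topics.some((t) => currentTopics.has(t)))
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, maxRows);
}
```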

Memory types (for recall_save):

decision — explicit choice with rationale
discovery — non-obvious finding
preference — user style or convention
pattern — recurring workflow or code template
feedback — user correction on past work
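To make the `key` behaviour concrete, here is an in-memory sketch of "same key updates instead of duplicating". The five type names come from the list above; the input field names `type` and `content` and the `InMemoryStore` class are hypothetical stand-ins for the real tool schema and SQLite storage.

```typescript
type MemoryType = "decision" | "discovery" | "preference" | "pattern" | "feedback";

// Hypothetical input shape; only `key` is named in this README.
interface SaveInput {
  type: MemoryType;
  content: string;
  key?: string; // optional dedup slug
}

interface SavedMemory extends SaveInput {
  id: number;
}

class InMemoryStore {
  private rows = new Map<number, SavedMemory>();
  private idByKey = new Map<string, number>();
  private nextId = 1;

  // Saving with a key that already exists overwrites that row.
  save(input: SaveInput): { id: number; updated: boolean } {
    const existingId =
      input.key !== undefined ? this.idByKey.get(input.key) : undefined;
    if (existingId !== undefined) {
      this.rows.set(existingId, { ...input, id: existingId });
      return { id: existingId, updated: true };
    }
    const id = this.nextId++;
    this.rows.set(id, { ...input, id });
    if (input.key !== undefined) this.idByKey.set(input.key, id);
    return { id, updated: false };
  }

  get(id: number): SavedMemory | undefined {
    return this.rows.get(id);
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Memories saved without a key are always inserted as new rows, which is why keyed saves are the ones that stay deduplicated.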

Expose them to Claude Code. After pnpm build, point claude mcp add at the built server entry shown below, or at the ccmem-mcp bin from a global install:

# Using the built bin (after pnpm build)
claude mcp add ccrecall --scope user -- /absolute/path/to/ccRecall/dist/mcp/server.js

# Or using tsx for development (no build step)
claude mcp add ccrecall --scope user -- /absolute/path/to/ccRecall/node_modules/.bin/tsx /absolute/path/to/ccRecall/src/mcp/server.ts

A ready-to-copy example lives at .mcp.json.example.

See hooks/README.md for SessionStart / SessionEnd hook installation.

CLI Commands

@tznthou/ccrecall ships two binaries:

ccmem — daemon launcher + admin commands
ccmem-mcp — MCP server (registered with Claude Code via claude mcp add)

Daemon and hook lifecycle (macOS):

Command	Purpose
`ccmem`	Run the daemon in foreground
`ccmem install-daemon`	Register a LaunchAgent (auto-start at login)
`ccmem uninstall-daemon`	Stop and remove the LaunchAgent
`ccmem install-hooks`	Merge SessionStart / SessionEnd entries into `~/.claude/settings.json`
`ccmem uninstall-hooks`	Remove ccRecall's hook entries (other hooks untouched)
`ccmem cleanup --orphans`	List (and with `--yes`, delete) memories whose session rows are gone

ccmem promote / ccmem reject were removed in v0.5.0 along with the journal pipeline they reviewed.

ccRecall vs auto memory

ccRecall lives alongside Claude Code's built-in auto memory (~/.claude/projects/*/memory/). They're complementary — use them for different things.

	auto memory	ccRecall
Write path	Claude curates by hand — new `.md` file + MEMORY.md index line	Two tracks: rule-based session summaries (automatic, free) + opt-in Haiku extraction (~$0.001/session, where most searchable memories come from)
Read path	Always in session context (MEMORY.md loads at session start)	On-demand MCP query when auto memory has no entry
Signal density	High — facts worth naming	Long tail — everything the hook can extract
Typical use	"Remember X" / "always Y" — durable preferences, decisions	"Didn't we fix that?" / "last time" — cross-session recall

Default for saving: write to auto memory, and let the hooks feed ccRecall on their own. Don't call recall_save to mirror a fact you already curated; duplicate writes just create noise.

Default for querying: MEMORY.md is already in context — check the index first. Fall back to recall_query / recall_context only when the user references past work and auto memory has no matching entry.

ccRecall's value is the long tail that auto memory can't cover (nobody hand-curates 500 sessions of notes). If Claude defaults to both, auto memory wins because it's already loaded and curated. ccRecall earns its keep when the curated index misses.

What if Anthropic builds this in? Auto memory today is .md files tied to Claude Code's project structure — portable as text, but not queryable. ccRecall is a single SQLite file you can inspect with sqlite3, query with SQL, back up by copying one file, and use independently of Claude Code's config. The difference isn't local vs. cloud — it's a curated document store vs. a searchable database. If built-in memory covers your needs, use it. ccRecall is for when you want structured recall across hundreds of sessions that no one will curate by hand.

Running as a service (macOS)

ccRecall runs as a local HTTP daemon. To keep it up across reboots, register a per-user LaunchAgent:

pnpm build
node dist/index.js install-daemon        # or `ccmem install-daemon` if globally linked
node dist/index.js install-daemon --dry-run   # preview plist without writing

# verify
launchctl list | grep ccrecall
curl http://127.0.0.1:7749/health

# remove
node dist/index.js uninstall-daemon

The installer:

writes ~/Library/LaunchAgents/com.tznthou.ccrecall.plist
routes logs to ~/Library/Logs/ccrecall/ccrecall.{out,err}.log
propagates CCRECALL_PORT / CCRECALL_DB_PATH from the current shell into the plist, so the LaunchAgent uses the same settings as your interactive run
refuses to touch a plist whose Label isn't ccRecall's (safety check)
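The environment propagation step can be sketched as follows. `EnvironmentVariables` is the standard launchd plist key; the helper names are hypothetical, and the real installer may build its plist differently.

```typescript
// The two variables this README says are propagated into the plist.
const PROPAGATED_VARS = ["CCRECALL_PORT", "CCRECALL_DB_PATH"] as const;

// Keep only the propagated variables that are actually set in the shell.
function envForPlist(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const name of PROPAGATED_VARS) {
    const value = env[name];
    if (value !== undefined && value !== "") picked[name] = value;
  }
  return picked;
}

const escapeXml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

// Render the fragment that would sit inside the plist's top-level dict.
function renderEnvironmentVariables(vars: Record<string, string>): string {
  const entries = Object.entries(vars)
    .map(
      ([k, v]) =>
        `    <key>${escapeXml(k)}</key>\n    <string>${escapeXml(v)}</string>`,
    )
    .join("\n");
  return `<key>EnvironmentVariables</key>\n<dict>\n${entries}\n</dict>`;
}
```

Calling `renderEnvironmentVariables(envForPlist(process.env))` would yield the fragment for the current shell.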

Full manual-install, troubleshooting, and uninstall docs: docs/launchd.md.

Linux/Windows equivalents (systemd unit, Windows service) are planned for a future release. For now, run under nohup or your process manager of choice.

Monitoring

The daemon runs PRAGMA integrity_check on startup and every 6 hours. The result (timestamp + boolean) is cached and surfaced on /health as lastIntegrityCheckAt / lastIntegrityCheckOk. When drift is detected, the full integrity_check output is written to a timestamped file under ~/.ccrecall/integrity-alerts/.
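An external monitor could interpret those two /health fields along these lines. The sketch assumes `lastIntegrityCheckAt` is an ISO 8601 string and that both fields may be null before the first check completes; neither detail is confirmed here, and the verdict names are hypothetical.

```typescript
// Assumed payload shape; only the two field names come from this README.
interface HealthIntegrity {
  lastIntegrityCheckAt: string | null;
  lastIntegrityCheckOk: boolean | null;
}

type Verdict = "ok" | "drift" | "stale" | "unknown";

const CHECK_INTERVAL_MS = 6 * 60 * 60 * 1000; // checks run every 6 hours

// Allow one missed interval before calling the result stale (assumed slack).
function integrityVerdict(
  health: HealthIntegrity,
  now: number,
  maxAgeMs = 2 * CHECK_INTERVAL_MS,
): Verdict {
  if (health.lastIntegrityCheckAt === null || health.lastIntegrityCheckOk === null) {
    return "unknown";
  }
  const checkedAt = Date.parse(health.lastIntegrityCheckAt);
  if (Number.isNaN(checkedAt)) return "unknown";
  if (!health.lastIntegrityCheckOk) return "drift";
  return now - checkedAt > maxAgeMs ? "stale" : "ok";
}
```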

If you see a drift alert, snapshot the DB before running REINDEX. REINDEX fixes the symptom but destroys the forensic state:

cp ~/.ccrecall/ccrecall.db ~/ccrecall-drift-snapshot.db
sqlite3 ~/.ccrecall/ccrecall.db 'REINDEX;'

WAL maintenance

Each indexer batch ends with PRAGMA wal_checkpoint(TRUNCATE) so the ccrecall.db-wal sidecar is reset to 0 bytes after every reindex. On a long-running daemon you should see WAL hovering near 0 most of the time, spiking briefly while a batch runs.

If you ever see the WAL growing without bound (approaching the size of the main DB), check stderr for [indexer] WAL checkpoint busy warnings. They mean a reader has been holding a snapshot past busy_timeout across several consecutive batches, so the truncate keeps deferring. Identify the offending client and close its connection; the next clean batch will then reclaim the space.
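A quick way to watch for that condition is to compare the two file sizes. The sketch below is not part of ccRecall; the 0.5 ratio is an arbitrary alert threshold chosen for illustration, and the function names are hypothetical.

```typescript
import { statSync } from "node:fs";

// True when the WAL sidecar has grown to a large fraction of the main DB.
// The default ratio is an assumed threshold, not a project setting.
function walLooksUnbounded(dbBytes: number, walBytes: number, ratio = 0.5): boolean {
  return dbBytes > 0 && walBytes / dbBytes >= ratio;
}

// Read both sizes from disk; a missing -wal file counts as 0 bytes.
function checkWal(dbPath: string): boolean {
  const dbBytes = statSync(dbPath).size;
  const walBytes = statSync(`${dbPath}-wal`, { throwIfNoEntry: false })?.size ?? 0;
  return walLooksUnbounded(dbBytes, walBytes);
}
```

Run `checkWal` against the path in CCRECALL_DB_PATH (or wherever your database lives) from a cron job or a shell alias; a `true` result is the cue to look for the busy warnings described above.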

Project Structure

ccRecall/
├── src/
│   ├── core/
│   │   ├── types.ts              # All type definitions
│   │   ├── parser.ts             # JSONL conversation parser
│   │   ├── scanner.ts            # File system scanner
│   │   ├── summarizer.ts         # Rule-based session summarizer
│   │   ├── topic-extractor.ts    # Rule-based topic extraction
│   │   ├── database.ts           # SQLite + FTS5 (trimmed from ccRewind)
│   │   ├── indexer.ts            # Indexing pipeline orchestrator
│   │   ├── memory-service.ts     # Memory lifecycle (touch / delete / update)
│   │   ├── compression.ts        # L0→L1→L2→delete state machine
│   │   ├── maintenance-coordinator.ts  # Background compression tick
│   │   ├── watcher.ts            # chokidar JSONL watcher (Phase 4e)
│   │   └── log-safe.ts           # scrubErrorMessage — log-injection defence
│   ├── api/
│   │   ├── server.ts             # HTTP server
│   │   └── routes.ts             # Request routing + rescue reindex
│   ├── mcp/
│   │   ├── server.ts             # MCP stdio server entry (shebang bin)
│   │   └── tools.ts              # recall_query + recall_context + recall_save
│   ├── cli/
│   │   └── daemon.ts             # install-daemon / uninstall-daemon (macOS)
│   └── index.ts                  # HTTP entry point + subcommand dispatch
├── hooks/
│   ├── session-start.mjs         # Inject memories on SessionStart (stdout)
│   ├── session-end.mjs           # POST /session/end on SessionEnd
│   └── README.md                 # Hook installation guide
├── docs/
│   ├── tutorial.md               # End-user walkthrough (install → MCP → usage)
│   ├── architecture.md           # Daemon design rationale (contributor-oriented)
│   └── launchd.md                # macOS LaunchAgent install/troubleshoot
├── tests/                        # 542 tests across 34 files (parser, scanner,
│   │                             # summarizer, database, indexer, e2e, MCP,
│   │                             # memories, hooks, watcher, CLI, migrations,
│   │                             # FTS5 CJK edge cases, integrity monitor, ...)
│   └── fixtures/                 # Sample JSONL + shared test helpers
├── .mcp.json.example             # MCP client config template
└── NOTICE / SECURITY.md / CONTRIBUTING.md / CODE_OF_CONDUCT.md

Related Projects

ccRewind — Session replay GUI for Claude Code. ccRecall's core modules (parser, scanner, summarizer, database, indexer) were extracted from ccRewind.

Philosophy

Why this exists

Thariq from Anthropic's Claude Code team wrote about context management in April 2026 — 11,908 bookmarks, because everyone saved it to re-read but nobody had the tools to actually do it. He described the problem perfectly: context rot degrades model performance in long sessions, and autocompact fires at the worst possible moment.

He gave methodology, not tools. ccRecall is the tool.

The real trigger was simpler: I kept re-explaining the same architecture to Claude Code across sessions. Not because the AI is bad at remembering — it literally can't. Every session starts from zero. CLAUDE.md helps, but it's a static file I maintain by hand. The maintenance cost grows faster than the value. Sound familiar? That's exactly why humans abandon wikis too (Karpathy's LLM Wiki insight).

Design stances

Each flows from the manifesto. Each was a conscious choice against a popular alternative.

Rule-based, with a ceiling. The daemon uses heuristic extraction (regex patterns, tool usage analysis, outcome inference) for session summaries — zero API cost. For that job, "Edit x8, 5 files, committed" is more useful than a paragraph of prose. But rule-based has a ceiling: it handles structured signals (tool calls, file edits, commit messages) well; it misses discussion-only decisions, nuanced trade-offs, and anything that lives in natural language. That ceiling is why v0.4.1 added post-session Haiku extraction (~$0.001/session) — an LLM pass that catches what heuristics can't. In practice, the memories I actually search for mostly come from Haiku extraction or manual recall_save, not the rule-based layer. The daemon itself never calls an LLM; the LLM pass is opt-in and runs outside the daemon as a separate process.

FTS5, not vector search. Semantic search sounds better on paper, but for conversation logs — specific tools, file paths, error messages — keyword matching wins. FTS5 queries run in <10ms locally. No embedding model, no Chroma, no Docker container. At hundreds of sessions (not millions of documents), Karpathy's own analysis confirms: "plain index + keyword search is already sufficient under 500 sources."

HTTP + MCP dual interface. MCP tools are the most stable way to inject context into Claude (pull-based, Claude decides when to fetch). SessionStart hooks (push-based, automatic) are also stable. ccRecall runs both: HTTP for hooks, MCP for on-demand queries. Same SQLite backend, two access patterns.

Read-only, unconditionally. The ccRecall daemon never modifies ~/.claude/, never writes to session files, and never injects itself into Claude Code's config. This isn't politeness; it's a trust boundary. If a background service can write to your config, one bug could corrupt your sessions. Hooks and MCP registration happen only when the user explicitly runs the setup commands (ccmem install-hooks, claude mcp add). ccRecall doesn't install itself.

The excluded stack. No Docker — deployment friction for what should be a pnpm dev experience. No Electron — ccRecall has no UI (that's ccRewind's job). No vector database — solves a problem we don't have at this scale. These are deliberate exclusions, not missing features.

No opinionated injection. ccRecall doesn't decide what Claude should remember. It provides a search API — the injection layer presents results, Claude integrates them. Opinionated memory selection is a premature optimization that would be wrong in ways we can't predict.

Roadmap

Version	Theme	Status
v0.3.x	Manual save, automatic recall — memories come from explicit `recall_save` calls; SessionStart hook and MCP tools inject them into future sessions	Released
v0.4.x	Post-session extraction via Haiku, cross-project memory via topic intersection, extraction pipeline hardening (race gates, compression integrity, subagent filtering, security)	Released
v0.5.0	The knife field: journal/scorer/harvester pipelines removed (zero promotions ever), endpoints 13 → 5, `message_uuids` dual-hash rebuild (DB 114MB → 42MB)	Released
v0.5.2	Recall weighting: log-compressed half-life decay, FTS relevance-first ranking via `(-rank)*sqrt(EC)`	Released
v0.5.3	CJK topic alignment: Han-aware topic extraction (`\p{Script=Han}` tokenizer + 32 Chinese stopwords + particle split), session-less rebuild fix	Released

Tracked in GitHub Issues.

Changelog

Release notes and version history live in CHANGELOG.md. Every tagged version has a matching entry; the Unreleased section tracks what's landed on main but not yet published to npm.

License

Licensed under the Apache License, Version 2.0 — see LICENSE.

Author

tznthou — tznthou.com · tznthou@gmail.com
