CPersona
MCP Memory Server
Give Claude persistent memory across sessions. Single SQLite file. 16 tools. Zero LLM dependency.
Quick Start · Features · Architecture · All Tools · Zenn Book (JP)
The Problem
Claude forgets everything between sessions. Every conversation starts from zero — no context about your project, your preferences, or what you discussed yesterday.
cpersona fixes this. It's an MCP server that stores memories in a local SQLite file and retrieves them through hybrid search. Claude remembers you.
Quick Start
Prerequisites: Python 3.10+, Git
git clone https://github.com/Cloto-dev/cpersona.git
cd cpersona
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS / Linux
# source .venv/bin/activate
pip install .
Claude Desktop (add to claude_desktop_config.json):
{
  "mcpServers": {
    "embedding": {
      "command": "/path/to/.venv/bin/python",
      "args": ["/path/to/servers/embedding/server.py"],
      "env": {
        "EMBEDDING_PROVIDER": "onnx_jina_v5_nano",
        "EMBEDDING_HTTP_PORT": "8401"
      }
    },
    "cpersona": {
      "command": "/path/to/.venv/bin/python",
      "args": ["/path/to/cpersona/server.py"],
      "env": {
        "CPERSONA_DB_PATH": "/home/you/.claude/cpersona.db",
        "CPERSONA_EMBEDDING_MODE": "http",
        "CPERSONA_EMBEDDING_URL": "http://127.0.0.1:8401/embed"
      }
    }
  }
}
Windows: use .venv/Scripts/python.exe and C:/Users/you/.claude/cpersona.db
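Hand-editing nested JSON is error-prone, so the configuration above can also be generated. The following is a minimal sketch, not part of cpersona: the helper name `build_mcp_config` and its parameters are hypothetical, and the paths are the same placeholders used above.

```python
import json


def build_mcp_config(python_path, embedding_server, cpersona_server, db_path, port=8401):
    """Return the mcpServers block shown above for the given local paths.

    Hypothetical helper for illustration; cpersona does not ship it.
    """
    return {
        "mcpServers": {
            "embedding": {
                "command": python_path,
                "args": [embedding_server],
                "env": {
                    "EMBEDDING_PROVIDER": "onnx_jina_v5_nano",
                    "EMBEDDING_HTTP_PORT": str(port),
                },
            },
            "cpersona": {
                "command": python_path,
                "args": [cpersona_server],
                "env": {
                    "CPERSONA_DB_PATH": db_path,
                    "CPERSONA_EMBEDDING_MODE": "http",
                    # Must point at the port the embedding server listens on.
                    "CPERSONA_EMBEDDING_URL": f"http://127.0.0.1:{port}/embed",
                },
            },
        }
    }


config = build_mcp_config(
    python_path="/path/to/.venv/bin/python",
    embedding_server="/path/to/servers/embedding/server.py",
    cpersona_server="/path/to/cpersona/server.py",
    db_path="/home/you/.claude/cpersona.db",
)
print(json.dumps(config, indent=2))
```

Deriving the embedding URL from the same `port` value keeps the two server entries from drifting apart if the port is changed.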
Claude Code:
claude mcp add-json embedding '{"type":"stdio","command":"/path/to/.venv/bin/python","args":["/path/to/servers/embedding/server.py"],"env":{"EMBEDDING_PROVIDER":"onnx_jina_v5_nano","EMBEDDING_HTTP_PORT":"8401"}}' -s user
claude mcp add-json cpersona '{"type":"stdio","command":"/path/to/.venv/bin/python","args":["/path/to/cpersona/server.py"],"env":{"CPERSONA_DB_PATH":"/home/you/.claude/cpersona.db","CPERSONA_EMBEDDING_MODE":"http","CPERSONA_EMBEDDING_URL":"http://127.0.0.1:8401/embed"}}' -s user
That's it. Claude now has persistent memory. Ask it to store something and recall it in a later session.
Features
Hybrid Search — Three independent retrieval strategies run in parallel and merge results via Reciprocal Rank Fusion (RRF):
| Layer | Method | Strength |
|---|---|---|
| Vector | Cosine similarity (jina-v5-nano, 768d) | Semantic meaning |
| FTS5 | SQLite full-text search with trigram tokenizer | Exact terms, names, IDs |
| Keyword | Fallback pattern matching | Edge cases, partial matches |
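Reciprocal Rank Fusion scores each candidate by summing 1 / (k + rank) over every ranked list it appears in, so an item that ranks well in several layers beats one that tops a single layer. The following is a minimal sketch of that merge, not cpersona's actual code: the function name `rrf_merge`, the memory IDs, and the default `k = 60` (a value commonly used for RRF) are assumptions for illustration.

```python
from collections import defaultdict


def rrf_merge(ranked_lists, k=60):
    """Fuse several best-first lists of IDs with Reciprocal Rank Fusion.

    An item's fused score is the sum of 1 / (k + rank) over the lists
    that contain it, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical hits from the three layers (best match first).
vector_hits = ["m3", "m1", "m7"]
fts_hits = ["m1", "m3"]
keyword_hits = ["m9", "m1"]

merged = rrf_merge([vector_hits, fts_hits, keyword_hits])
top_id = merged[0][0]
```

Here `m1` comes out on top because it appears in all three lists, even though it is ranked first in only one of them. A larger `k` flattens the difference between adjacent ranks.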
Memory Types:
Declarative memory: individual facts, decisions, and instructions stored via store
Episodic memory: conversation summaries archived via archive_episode
Profile memory: accumulated user/project attributes via update_profile
Confidence Scoring — Each recalled memory gets a confidence score combining:
Cosine similarity (semantic relevance)
Dynamic time decay (adapts to corpus time range — a 1-year-old corpus and a 1-day-old corpus use different decay curves)
Recall boost (frequently useful memories surface more easily, with natural fade-out)
Completion factor (resolved topics decay faster)
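One way the four factors above could combine is sketched below. This is an illustration only, not cpersona's formula: every constant (half-life as a quarter of the corpus span, the 2x faster decay for resolved topics, the one-week boost fade, the 0.1 boost weight) and every function name is an assumption.

```python
import math


def time_decay(age_seconds, corpus_span_seconds, completed=False):
    """Exponential decay whose half-life scales with the corpus time range."""
    # Assumption: half-life is a quarter of the corpus span, at least one hour.
    half_life = max(corpus_span_seconds * 0.25, 3600.0)
    if completed:
        # Assumption: resolved topics decay twice as fast.
        half_life /= 2.0
    return 0.5 ** (age_seconds / half_life)


def recall_boost(recall_count, seconds_since_last_recall, fade_seconds=7 * 86400.0):
    """Boost that grows with the recall count and fades after the last recall."""
    fade = math.exp(-seconds_since_last_recall / fade_seconds)
    return 1.0 + 0.1 * math.log1p(recall_count) * fade


def confidence(cosine, age_seconds, corpus_span_seconds,
               recall_count=0, seconds_since_last_recall=0.0, completed=False):
    """Multiply similarity by decay and boost, clamped to [0, 1]."""
    decay = time_decay(age_seconds, corpus_span_seconds, completed)
    boost = recall_boost(recall_count, seconds_since_last_recall)
    return max(0.0, min(1.0, cosine * decay * boost))


DAY = 86400.0
# The same one-day-old memory scored against two corpus time ranges.
in_year_old_corpus = confidence(0.8, DAY, 365 * DAY)
in_day_old_corpus = confidence(0.8, DAY, 1 * DAY)
```

Under these assumptions a one-day-old memory is barely penalized in a corpus spanning a year, but heavily penalized in a corpus spanning a single day, which is the behavior the dynamic decay item describes.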
Zero LLM Dependency — cpersona is a pure data server. It never calls an LLM internally. All summarization and extraction is performed by the calling agent. This means zero API costs from cpersona itself, deterministic behavior, and no hidden latency.
Additional capabilities:
Agent namespace isolation — multiple agents share one DB without interference
Background task queue — DB-persisted, crash-recoverable async processing
JSONL export/import — full memory portability between environments
Agent-to-agent memory merge — atomic copy/move with deduplication
Auto-calibration — statistical threshold tuning via null distribution z-score (no labels needed)
Health check — 15 automated detections with auto-repair (contamination, duplicates, FTS desync, invalid data, stale tasks)
stdio + Streamable HTTP transport
Single-file SQLite — no external database required
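The auto-calibration item above can be illustrated with a label-free procedure: score random pairs of stored embeddings, treat those scores as the null distribution of "unrelated" similarity, and set the threshold a chosen number of standard deviations above its mean. This is a sketch under that assumption, not cpersona's implementation; `calibrate_threshold`, its parameters, and the synthetic vectors are hypothetical.

```python
import math
import random
import statistics


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def calibrate_threshold(embeddings, z=2.0, n_pairs=500, seed=0):
    """Estimate a similarity threshold from random (presumably unrelated) pairs."""
    rng = random.Random(seed)
    null_scores = []
    for _ in range(n_pairs):
        a, b = rng.sample(embeddings, 2)  # two distinct stored vectors
        null_scores.append(cosine(a, b))
    mean = statistics.fmean(null_scores)
    std = statistics.pstdev(null_scores)
    # A real match should score z standard deviations above the null mean.
    return mean + z * std


# Synthetic stand-ins for stored embeddings (real ones would be 768-d).
_rng = random.Random(42)
vectors = [[_rng.gauss(0, 1) for _ in range(32)] for _ in range(50)]
threshold = calibrate_threshold(vectors)
```

Raising `z` makes recall stricter; because the estimate depends only on the stored vectors, it can be recomputed whenever the corpus changes.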
Architecture
┌─────────────────────────────────────┐
│              MCP Host               │
│   (Claude Desktop / Claude Code)    │
└──────────────┬──────────────────────┘
               │ MCP (JSON-RPC)
┌──────────────▼──────────────────────┐
│              cpersona               │
│             (server.py)             │
│                                     │
│  ┌─────────┐  ┌─────────┐           │
│  │  store  │  │ recall  │  ...      │
│  └────┬────┘  └────┬────┘           │
│       │            │                │
│  ┌────▼────────────▼─────────────┐  │
│  │           SQLite DB           │  │
│  │                               │  │
│  │  memories (content + embed)   │  │
│  │  episodes (summaries)         │  │
│  │  profiles (attributes)        │  │
│  │  memories_fts (FTS5 index)    │  │
│  │  episodes_fts (FTS5 index)    │  │
│  │  task_queue (async jobs)      │  │
│  └───────────────────────────────┘  │
│                                     │
└──────────────┬──────────────────────┘
               │ HTTP
┌──────────────▼──────────────────────┐
│          Embedding Server           │
│      (jina-v5-nano ONNX, 768d)      │
└─────────────────────────────────────┘
Recall flow (RRF mode):
Query → ┌── Vector search (cosine similarity) ──┐
        ├── FTS5 search (episodes + memories) ──┼── RRF merge → Confidence scoring → Top-K
        └── Keyword fallback                  ──┘
Benchmarks
Tested on LMEB (Long-term Memory Evaluation Benchmark, results) — 22 evaluation tasks measuring memory retrieval quality:
| Embedding Model | Params | Dimensions | Mean NDCG@10 |
|---|---|---|---|
| MiniLM-L6-v2 | 22M | 384 | 36.88 |
| e5-small | 33M | 384 | 46.36 |
| jina-v5-nano | 33M | 768 | 54.14 |
jina-v5-nano improves mean NDCG@10 by about 47% relative to the MiniLM-L6-v2 baseline (54.14 vs. 36.88).
All Tools
| Tool | Description |
|---|---|
| | Store a message in agent memory |
| | Recall relevant memories (vector + FTS5 + keyword, RRF merge) |
| | Get current agent profile |
| | Save pre-computed agent profile |
| | Archive conversation episode with summary and keywords |
| | List recent memories |
| | List archived episodes |
| | Delete a single memory (ownership enforced) |
| | Delete a single episode (ownership enforced) |
| | Delete all data for an agent |
| | Auto-calibrate vector search threshold via z-score |
| | Export to JSONL (memories, episodes, profiles) |
| | Import from JSONL (idempotent via msg_id dedup) |
| | Merge one agent's data into another (atomic, with dedup) |
| | Background task queue status |
| | 15-point database health check with auto-repair |
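The import tool is described as idempotent via msg_id dedup. A common way to get that property in SQLite is a uniqueness constraint on the ID plus `INSERT OR IGNORE`, sketched below. The table layout, column names other than `msg_id`, and the function `import_jsonl` are assumptions for illustration and do not reflect cpersona's real schema.

```python
import json
import sqlite3


def import_jsonl(conn, lines):
    """Insert one memory per JSONL line; skip rows whose msg_id already exists."""
    inserted = 0
    with conn:  # one transaction: commit on success, roll back on error
        for line in lines:
            if not line.strip():
                continue  # tolerate blank lines in the export
            record = json.loads(line)
            cur = conn.execute(
                "INSERT OR IGNORE INTO memories (msg_id, agent_id, content) "
                "VALUES (?, ?, ?)",
                (record["msg_id"], record["agent_id"], record["content"]),
            )
            inserted += cur.rowcount  # 1 if inserted, 0 if ignored
    return inserted


# Hypothetical schema: msg_id is the primary key, so duplicates are rejected.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories ("
    "msg_id TEXT PRIMARY KEY, agent_id TEXT NOT NULL, content TEXT NOT NULL)"
)
export = [
    json.dumps({"msg_id": "a1", "agent_id": "demo", "content": "Prefers dark mode"}),
    json.dumps({"msg_id": "a2", "agent_id": "demo", "content": "Project uses Rust"}),
]
first_run = import_jsonl(conn, export)
second_run = import_jsonl(conn, export)  # same file again: nothing new is added
total_rows = conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
```

Running the same export twice leaves the table unchanged on the second pass, which is what makes repeated or interrupted imports safe to retry.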
Configuration
All settings via environment variables with sensible defaults:
| Variable | Default | Description |
|---|---|---|
| | | SQLite database path |
| | | Embedding mode |
| | | Embedding server URL |
| | | Vector search mode |
| | | Search strategy |
| | | RRF smoothing parameter |
| | | Include confidence metadata in results |
| | | Auto-calibrate on startup |
| | | Enable background task queue |
Stats
~3,000 LOC Python (single file, server.py)
117 tests across 12 test modules
Schema v7 (auto-migrating)
MIT License
Works With
cpersona is an MCP server — it works with any MCP-compatible host:
ClotoCore (AI agent platform, where cpersona originated)
Any custom MCP client
Part of ClotoCore
cpersona is the memory layer of ClotoCore, an open-source AI agent platform written in Rust. While cpersona is fully standalone (MIT license), it was designed to give AI agents persistent, searchable memory within the ClotoCore ecosystem.
Learn More
Zenn Book (Japanese) — Full design walkthrough and setup guide
Memory System Design — Technical specification
ClotoCore — The AI agent platform
License
MIT — free to use from any MCP host without restriction.