Mnemosure
An AI memory layer that says "I don't know" when it doesn't, and cites its source when it does.
Qwen Cloud Global AI Hackathon · Track 1 (MemoryAgent)
Across many sessions of AI-assisted work, two failures compound: the assistant forgets decisions that were made, and it hallucinates ones that were not. Mnemosure is a source-grounded memory layer that attacks both.
Its core claim: it does not invent what it cannot remember, and it does not drop what it remembers.
What it does
Stores durable facts from a conversation — decisions, changes, failures, established facts — and throws away the chatter.
Links memories over time: when a new decision overrides an old one, the old one is marked superseded; the reason for a change is linked back to the failure that caused it (because).
Recalls with a confidence level and citations. When the evidence overrides an old memory, the answer corrects the old fact instead of repeating it. When there is no evidence, it answers "not in the record" instead of guessing.
Every answer comes back as one of three confidence levels — certain / vague / unknown — with the source of each cited memory.
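As a rough illustration of the three-level contract described above, a recall result could be modeled as below. This is only a sketch: the class names, field names, and `render` helper are hypothetical and are not taken from the Mnemosure code base; only the three confidence values and the "not in the record" fallback come from this README.

```python
from dataclasses import dataclass, field
from typing import Literal

# The three confidence levels named in this README.
Confidence = Literal["certain", "vague", "unknown"]


@dataclass
class Citation:
    memory_id: str  # hypothetical identifier of the cited memory
    source: str     # where the memory came from, e.g. a session label


@dataclass
class RecallAnswer:
    confidence: Confidence
    answer: str
    citations: list[Citation] = field(default_factory=list)


def render(result: RecallAnswer) -> str:
    """Format an answer; with no evidence, refuse instead of guessing."""
    if result.confidence == "unknown":
        return "not in the record"
    cited = ", ".join(c.source for c in result.citations)
    return f"[{result.confidence}] {result.answer} (sources: {cited})"
```

A caller would branch on `confidence` first and only trust `answer` when at least one citation is attached.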
Architecture
flowchart TB
MCP["MCP server · stdio<br/>recall · remember · list_memories"]
WEB["Demo web · FastAPI<br/>/ask · /memories · /results"]
subgraph Ingest["Ingest (remember)"]
direction TB
S["Session text"] --> EX["Extract · qwen3.5-flash<br/>decision / change / failure / fact"]
EX --> EMB1["Embed · text-embedding-v4 · 1024d"]
EMB1 --> LINK["Link associations<br/>supersedes: cosine ≥ 0.35 → flash verdict<br/>because: failure, cosine ≥ 0.15 → flash is_cause"]
LINK --> STORE[("data/memories.json")]
end
subgraph Recall["Recall (recall)"]
direction TB
Q["Query"] --> EMB2["Embed query"]
EMB2 --> COS["Cosine top-6<br/>superseded included"]
COS --> RR["Rerank · qwen3-rerank<br/>top score under 0.15 → 'unknown'"]
RR --> EXP["Associative expand<br/>supersedes / because · 2 hops"]
EXP --> ANS["Answer · qwen3.7-plus · temp 0<br/>confidence + answer + citations"]
end
MCP --> S
MCP --> Q
WEB --> Q
STORE -. retrieve .-> COS
STORE -. expand .-> EXP
Ingest (mnemosure/memory/store.py): a session is passed to qwen3.5-flash, which extracts only what will matter later. Each memory is embedded, then two kinds of association are drawn: supersedes and because. For each, a cheap cosine-similarity prefilter over the embeddings proposes candidates and the flash model makes the final call, so nothing is linked on surface similarity alone. Failures are never superseded (lessons are kept forever).
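The two-stage linking described above (prefilter, then model verdict) might look roughly like the following. It is a minimal sketch, not the code in store.py: the `Memory` dataclass, `find_superseded`, and the `judge` callback are hypothetical names, and `judge` stands in for the flash-model call. Only the 0.35 threshold and the "failures are never superseded" rule come from this README.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence

SUPERSEDES_MIN_COSINE = 0.35  # threshold shown in the architecture diagram


@dataclass
class Memory:  # hypothetical shape, for illustration only
    id: str
    kind: str  # "decision" | "change" | "failure" | "fact"
    text: str
    vector: list[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def find_superseded(
    new: Memory,
    existing: Sequence[Memory],
    judge: Callable[[Memory, Memory], bool],  # stand-in for the model verdict
    threshold: float = SUPERSEDES_MIN_COSINE,
) -> list[str]:
    """Return ids of existing memories that the new memory supersedes."""
    superseded = []
    for old in existing:
        if old.kind == "failure":
            continue  # failures are never superseded
        if cosine(new.vector, old.vector) < threshold:
            continue  # too dissimilar: skip without spending a model call
        if judge(new, old):  # similarity alone never creates a link
            superseded.append(old.id)
    return superseded
```

The prefilter exists to bound the number of model calls; the verdict, not the similarity score, decides whether a link is written.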
Recall (mnemosure/memory/recall.py): the query is embedded and the top candidates are pulled — including superseded ones, because correcting a stale belief requires finding it first. qwen3-rerank re-orders by relevance; if even the best hit is too weak, the answer is unknown rather than a guess. The surviving seeds are expanded two hops along their supersedes/because links, and qwen3.7-plus composes the final answer grounded only in that evidence (temperature 0). Broad "summarize everything" questions bypass top-K and ground on all active memories so nothing is dropped.
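The "unknown" gate and the two-hop expansion can be sketched as follows. This is an assumption-laden illustration rather than the logic in recall.py: `gate`, `expand`, and the adjacency-map representation of links are hypothetical, and the sketch keeps every reranked hit as a seed once the gate passes, which may differ from the real selection rule. The 0.15 cutoff and the two-hop limit are the values shown in the diagram.

```python
from collections import deque

UNKNOWN_BELOW = 0.15  # rerank cutoff shown in the architecture diagram
MAX_HOPS = 2          # expansion depth shown in the architecture diagram


def gate(reranked: list[tuple[str, float]]) -> list[str]:
    """Given (memory_id, score) pairs sorted best-first, return seed ids.

    An empty list means the caller should answer "unknown".
    """
    if not reranked or reranked[0][1] < UNKNOWN_BELOW:
        return []
    return [memory_id for memory_id, _ in reranked]


def expand(seeds: list[str], links: dict[str, list[str]],
           max_hops: int = MAX_HOPS) -> list[str]:
    """Breadth-first walk along association links, at most max_hops deep."""
    seen = dict.fromkeys(seeds)  # dict keeps insertion order and dedupes
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        memory_id, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not follow links past the hop limit
        for neighbour in links.get(memory_id, ()):
            if neighbour not in seen:
                seen[neighbour] = None
                frontier.append((neighbour, depth + 1))
    return list(seen)
```

For example, with links `m1 -> m2 -> m3 -> m4`, expanding from `m1` reaches `m2` and `m3` but stops before `m4`.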
Models (Qwen Cloud / DashScope)
| Role | Model | Endpoint |
| --- | --- | --- |
| Brain (main answer) | qwen3.7-plus | OpenAI-compatible |
| Brain (extract / judge / classify) | qwen3.5-flash | OpenAI-compatible |
| Index (embeddings, 1024-dim) | text-embedding-v4 | OpenAI-compatible |
| Precision rerank | qwen3-rerank | Native |
The API key is read only from the environment (or .env) and is never hard-coded. Model IDs default to the values above; override them per-run with MNEMOSURE_MODEL_* env vars if a quota runs out — there is no automatic switching. Defaults live in mnemosure/config.py (the single source of truth).
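One way to picture the "defaults plus per-run override" behaviour is the sketch below. It is not the content of mnemosure/config.py: the role keys and therefore the exact variable suffixes (`MNEMOSURE_MODEL_ANSWER` and so on) are hypothetical placeholders for the `MNEMOSURE_MODEL_*` family, and the function names are invented for illustration. The model IDs and `DASHSCOPE_API_KEY` are the ones named in this README.

```python
import os
from typing import Mapping, Optional

# Model IDs as named in this README. The role keys are HYPOTHETICAL, so the
# resulting env var names (MNEMOSURE_MODEL_<ROLE>) are illustrative only.
DEFAULT_MODELS = {
    "ANSWER": "qwen3.7-plus",
    "FLASH": "qwen3.5-flash",
    "EMBED": "text-embedding-v4",
    "RERANK": "qwen3-rerank",
}


def resolve_models(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the model ID per role, letting env vars override defaults."""
    env = os.environ if env is None else env
    return {
        role: env.get(f"MNEMOSURE_MODEL_{role}") or default
        for role, default in DEFAULT_MODELS.items()
    }


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the key from the environment only; fail loudly if it is absent."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set")
    return key
```

Because nothing switches automatically, an override only takes effect for processes started with the variable set.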
Install
pip install mnemosure # core library + MCP server
pip install "mnemosure[demo]" # also pulls FastAPI/uvicorn for the demo web server
Then provide your Qwen Cloud key (export DASHSCOPE_API_KEY=...) and run the MCP server:
mnemosure-mcp # stdio MCP server
A Qwen key is required at runtime: Mnemosure is a Qwen client (extraction, embeddings, rerank, and answering all run on Qwen). Without a key the tools raise a clear error.
Where memories are stored: an installed copy starts with an empty warehouse at ~/.mnemosure/memories.json. Override the directory with MNEMOSURE_DATA_DIR. (The pre-loaded NXTBot demo snapshot lives in this repository, not in the pip package; clone the repo to see it.)
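The storage location rule above can be expressed in a few lines. This is a sketch under assumptions, not the package's storage.py: `memories_path` and `load_memories` are hypothetical helpers, and the file is assumed to hold a JSON list, which this README does not state.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def memories_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve memories.json: MNEMOSURE_DATA_DIR if set, else ~/.mnemosure."""
    env = os.environ if env is None else env
    override = env.get("MNEMOSURE_DATA_DIR")
    base = Path(override).expanduser() if override else Path.home() / ".mnemosure"
    return base / "memories.json"


def load_memories(path: Path) -> list:
    """Return stored memories, or an empty list for a fresh install.

    ASSUMPTION: the file contains a JSON list of memory records.
    """
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))
```

Pointing `MNEMOSURE_DATA_DIR` at a project-specific directory keeps separate projects from sharing one warehouse.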
Quick start (from source)
# 1) create and activate a project virtual environment
python3 -m venv .venv
source .venv/bin/activate
# 2) install dependencies
pip install -r requirements.txt
# 3) provide your Qwen Cloud (DashScope) key
cp .env.example .env # then edit .env and set DASHSCOPE_API_KEY
# 4) verify all four models are reachable
python scripts/check_models.py
Run the demo
The repository ships with a precomputed demo snapshot (data/memories.json, data/demo_results.json), so the demo works right after cloning:
python scripts/run_demo.py # → http://127.0.0.1:8000
The memory warehouse and the before/after evaluation panels render straight from the snapshot, so no API key is needed to browse them. Only /ask (live grounded recall) calls Qwen and therefore needs a key. To regenerate the snapshot from scratch (consumes quota):
python scripts/gen_demo_data.py
Use it as an MCP server
Mnemosure exposes the memory layer over the Model Context Protocol, so any MCP-capable agent (Claude Desktop, Claude Code, …) can call it as a tool.
mnemosure-mcp # if installed via pip
python -m mnemosure.mcp_server # equivalent, from a source checkout
Register it in your agent's .mcp.json (or equivalent). After pip install mnemosure, the console command is enough:
{
"mcpServers": {
"mnemosure": {
"command": "mnemosure-mcp",
"env": { "DASHSCOPE_API_KEY": "your-dashscope-api-key" }
}
}
}
Running from a source checkout instead of an install? Use "command": "/abs/path/.venv/bin/python", "args": ["-m", "mnemosure.mcp_server"], and add "PYTHONPATH": "/abs/path/to/repo" so the package is importable regardless of the launcher's working directory.
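For the source-checkout case, the full entry can be assembled and printed with a short script. This is a hedged sketch: `source_checkout_entry` is a hypothetical helper, POSIX-style paths are assumed, and placing `PYTHONPATH` inside the `env` block is an assumption (the text above does not say where it goes, but that is where environment variables normally live in this config format).

```python
import json
from pathlib import PurePosixPath


def source_checkout_entry(venv_python: str, repo_root: str,
                          api_key: str = "your-dashscope-api-key") -> dict:
    """Build an .mcp.json document that launches the server from a checkout."""
    for path in (venv_python, repo_root):
        # Absolute paths keep the launch independent of the working directory.
        if not PurePosixPath(path).is_absolute():
            raise ValueError(f"expected an absolute path, got {path!r}")
    return {
        "mcpServers": {
            "mnemosure": {
                "command": venv_python,
                "args": ["-m", "mnemosure.mcp_server"],
                "env": {
                    "DASHSCOPE_API_KEY": api_key,
                    "PYTHONPATH": repo_root,  # ASSUMPTION: set via env
                },
            }
        }
    }


if __name__ == "__main__":
    entry = source_checkout_entry("/abs/path/.venv/bin/python",
                                  "/abs/path/to/repo")
    print(json.dumps(entry, indent=2))
```

Replace the placeholder paths and key with real values before pasting the output into your agent's config.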
Tools:
| Tool | Signature | Returns |
| --- | --- | --- |
| recall | | |
| remember | | |
| list_memories | | list of active (or all) memories with source |
Note: the server itself calls Qwen for classification, recall, and grounding. It is agent-agnostic but assumes a Qwen key is present (via env or .env).
Evaluation approach
Quality is measured by labeling each answer's behavior — accurate / omission / hallucination / noise / honest — alongside our three-way confidence (certain / vague / unknown), rather than a single opaque score. The whole pipeline (extraction, supersession judgment, scoring) runs at temperature 0 for reproducibility. The demo serves a fixed snapshot so results are stable across viewings.
See mnemosure/evaluation/ (harness.py, judge.py, label.py, baseline.py, answer_key.py).
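To make the "labels rather than one score" idea concrete, a per-label tally could be computed as below. This is an illustrative sketch only and is not taken from mnemosure/evaluation/: the `tally` function and its output shape are hypothetical, while the five label names are the ones listed above.

```python
from collections import Counter
from typing import Sequence

# The five behavior labels named in this README.
LABELS = ("accurate", "omission", "hallucination", "noise", "honest")


def tally(labels: Sequence[str]) -> dict[str, tuple[int, float]]:
    """Return (count, share) per behavior label for one evaluation run."""
    unexpected = set(labels) - set(LABELS)
    if unexpected:
        raise ValueError(f"unknown labels: {sorted(unexpected)}")
    counts = Counter(labels)
    total = len(labels)
    return {
        label: (counts[label], counts[label] / total if total else 0.0)
        for label in LABELS
    }
```

Reporting the whole distribution keeps a drop in hallucinations from being hidden by a rise in omissions, which a single averaged score would blur.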
Project structure
mnemosure/
config.py # models, endpoints, key loading — single source of truth
qwen_client.py # the only gateway to Qwen (chat / embed / rerank)
mcp_server.py # MCP tools: recall · remember · list_memories (stdio)
memory/
store.py # ingest: extract → embed → link supersedes/because → save
recall.py # recall: embed → rerank → associative expand → grounded answer
forget.py # forgetting / relevance handling
storage.py # JSON-file memory warehouse
models.py # Memory / Association / Source dataclasses
evaluation/ # harness · judge · label · baseline · answer_key
demo/
server.py # FastAPI: /ask · /memories · /results
index.html # single-page demo UI
sample_sessions.py# fictional NXTBot scenario used by demo & eval
scripts/ # check_models · gen_demo_data · run_demo · demo_* helpers
data/ # memories.json + demo_results.json (demo snapshot, committed)
Deployment
An Alibaba Cloud deployment guide (containerized) is being added in a follow-up commit.
License
MIT.