qwen-memory-agent
Provides persistent memory agent capabilities using Qwen Cloud (Alibaba Cloud / DashScope) for LLM and embedding services, with tools for remembering, recalling, forgetting, and budget-constrained retrieval of user preferences across sessions.
A benchmarked, MCP-native persistent-memory agent built on Qwen Cloud (Alibaba Cloud / DashScope). Submitted to the Qwen Cloud Hackathon, Track 1 — MemoryAgent.
The agent itself decides — via Qwen function-calling — when to remember, recall, or forget. It carries user preferences across sessions, forgets superseded facts, and recalls the right memories inside a tight token budget — and proves it with numbers against naive baselines.
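As a rough illustration of the loop described above, the sketch below routes model-chosen tool calls to memory operations. The tool specs follow the OpenAI-compatible function-calling format; the parameter names (`text`, `query`, `memory_id`), the in-memory `STORE`, and the `dispatch` helper are assumptions made for this example, not the project's actual code.

```python
import itertools
import json

# Hypothetical tool specs in the OpenAI-compatible function-calling format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": name,
            "description": desc,
            "parameters": {
                "type": "object",
                "properties": {arg: {"type": "string"}},
                "required": [arg],
            },
        },
    }
    for name, desc, arg in [
        ("remember", "Store a fact about the user.", "text"),
        ("recall", "Retrieve facts relevant to a query.", "query"),
        ("forget", "Retire a stored fact by id.", "memory_id"),
    ]
]

STORE: dict[str, str] = {}   # stand-in for the real memory engine
_ids = itertools.count(1)    # monotonically increasing ids, never reused


def remember(text: str) -> dict:
    memory_id = f"m{next(_ids)}"
    STORE[memory_id] = text
    return {"memory_id": memory_id}


def recall(query: str) -> dict:
    # Naive keyword overlap; an embedding search would replace this.
    words = set(query.lower().split())
    hits = [t for t in STORE.values() if words & set(t.lower().split())]
    return {"memories": hits}


def forget(memory_id: str) -> dict:
    return {"forgotten": STORE.pop(memory_id, None) is not None}


HANDLERS = {"remember": remember, "recall": recall, "forget": forget}


def dispatch(name: str, arguments: str) -> str:
    """Run one tool call; `arguments` is the JSON string emitted by the model."""
    if name not in HANDLERS:
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps(HANDLERS[name](**json.loads(arguments)))
```

In a full loop, `TOOLS` would be sent with each chat request and the string returned by `dispatch` would be appended to the conversation as a tool message before asking the model to continue.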
Why it's different
Most memory agents are "stuff everything into RAG and hope." This one adds:
Agentic memory via Qwen function-calling: the model invokes remember/recall/forget tools through a real agent loop. It's an agent with memory, not a database with an LLM bolted on.
Supersession-aware forgetting: when a new fact contradicts an old one of the same kind, the old record is retired (not just buried under recency).
Budget-constrained recall: retrieval greedily packs the most useful memories until a configurable token budget is hit, so context stays small and relevant.
A reproducible benchmark: synthetic multi-session personas, a held-out query set, and baselines (no-memory / full-history / naive-RAG / ours), scored on recall accuracy, staleness rate, and a context-efficiency curve.
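The supersession and budget ideas in the list above can be sketched in a few lines. This is a simplified illustration, not the project's engine: the `Memory` fields, the `kind` key used to detect a replacement, and the whitespace-based `estimate_tokens` are all assumptions. In particular, it treats any new fact of the same `kind` as contradicting the old one.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    kind: str              # hypothetical grouping key, e.g. "drink"
    text: str
    score: float = 0.0     # relevance to the current query
    retired: bool = False  # True once a newer fact of the same kind arrives


def write(store: list[Memory], new: Memory) -> None:
    """Retire every active memory of the same kind, then append the new one."""
    for old in store:
        if old.kind == new.kind and not old.retired:
            old.retired = True
    store.append(new)


def estimate_tokens(text: str) -> int:
    # Crude whitespace count; a real tokenizer would give different numbers.
    return len(text.split())


def recall(store: list[Memory], budget: int) -> list[Memory]:
    """Greedily pack the highest-scoring active memories within the budget."""
    active = sorted(
        (m for m in store if not m.retired),
        key=lambda m: m.score,
        reverse=True,
    )
    packed, used = [], 0
    for m in active:
        cost = estimate_tokens(m.text)
        if used + cost <= budget:  # skip anything that would overflow
            packed.append(m)
            used += cost
    return packed
```

Retired records stay in the store but are filtered out before ranking, so an old preference cannot re-enter the context through high similarity alone.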
Architecture
flowchart TB
U["MCP client / demo UI"]
subgraph ecs["Alibaba Cloud ECS (Singapore)"]
API["FastAPI backend<br/>/chat · /health"]
AGENT["MemoryAgent loop<br/>Qwen function-calling"]
MCP["FastMCP server<br/>memory.remember / recall / forget / stats"]
ENG["Memory Engine<br/>write · retrieve · forget<br/>supersession + token-budget packing"]
QD[("Qdrant<br/>vector store")]
end
DS["Qwen Cloud / DashScope-intl<br/>reasoning model + text-embedding"]
U -->|HTTP| API
U -.->|MCP| MCP
API --> AGENT
AGENT -->|"decides which tool to call"| ENG
MCP --> ENG
AGENT <-->|"chat + tool specs"| DS
ENG <-->|"embed"| DS
ENG <--> QD
The agent loop (/chat) lets Qwen choose tool calls; the same memory engine is also exposed directly over MCP for any MCP client. The Qwen client has bounded retry/backoff for resilience.
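One common way to implement bounded retry with backoff is sketched below. The function name `call_with_retry`, its defaults, and the choice to retry on any exception are assumptions for illustration; a production client would more likely retry only on transient errors such as timeouts or rate limits.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying failures with capped exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # bounded: give up after the final attempt
            # Delay doubles each attempt (0.5, 1, 2, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from concurrent callers.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is injectable so tests can run without actually waiting.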
Stack
Python · FastAPI · Qwen function-calling agent loop · FastMCP · openai SDK → DashScope-intl · Qwen text-embedding · Qdrant.
Quickstart
uv sync
cp .env.example .env # set DASHSCOPE_API_KEY + DASHSCOPE_BASE_URL
uv run pytest -q # tests run fully mocked, zero Qwen credit spend
Benchmark results
Reproducible and fully offline — uv run python -m benchmark.run uses a deterministic
keyword embedder, so the harness measures the memory engine's ranking + supersession logic
(not embedding noise) and costs zero Qwen credits.
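To make the idea of a deterministic keyword embedder concrete, here is a minimal sketch using feature hashing. The benchmark's actual embedder is not shown in this README, so the function name `keyword_embed`, the 64-dimension default, and the SHA-256 bucketing are assumptions.

```python
import hashlib
import math
import re


def keyword_embed(text: str, dim: int = 64) -> list[float]:
    """Hash each lowercase word into one of `dim` buckets, then L2-normalise."""
    vec = [0.0] * dim
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # hashlib is stable across runs, unlike the built-in hash() for str.
        bucket = int(hashlib.sha256(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    # Inputs are unit-length (or all zero), so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))
```

Because the same text always maps to the same vector and no network call is involved, any change in benchmark scores can be attributed to the ranking and supersession logic rather than to the embedding model.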
| System | Recall accuracy | Staleness rate |
| --- | --- | --- |
| B0: no memory | 0.00 | 0.00 |
| B1: full-history stuffing | 1.00 | 0.50 |
| B2: naive top-k RAG | 1.00 | 0.50 |
| B3: ours (supersession + budget) | 1.00 | 0.00 |
B3 is the only system that recalls the current preference (1.00) and never surfaces a
superseded one (0.00 staleness). B1 and B2 match on recall but re-surface the retired
"coffee" preference on half the queries — because neither has a notion of "this fact was
replaced." Staleness = fraction of queries whose answer contained a retired fact (lower is
better), over the synthetic multi-session persona set in benchmark/generate.py.
Supersession-aware forgetting is what separates B3.
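The staleness definition above translates directly into code. The sketch below assumes each query comes with a list of retired fact strings and that a case-insensitive substring match counts as "contained"; the real harness may match differently, and the sample answers in the usage note are invented for illustration.

```python
def staleness_rate(answers: list[str], retired_facts: list[list[str]]) -> float:
    """Fraction of answers that mention at least one retired fact for that query."""
    if not answers:
        return 0.0
    stale = sum(
        any(fact.lower() in answer.lower() for fact in retired)
        for answer, retired in zip(answers, retired_facts)
    )
    return stale / len(answers)
```

For example, given the answers "You prefer green tea." and "You like coffee without sugar." with "coffee" retired for both queries, one of the two answers is stale, so the rate is 0.5.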
License
MIT — see LICENSE.