Waggle-mcp
Waggle-mcp is a graph-backed persistent memory server for MCP-compatible AI agents, enabling structured storage, intelligent retrieval, and management of knowledge across sessions.
Store knowledge nodes: Save facts, preferences, decisions, entities, concepts, questions, or notes as typed nodes in a persistent graph.
Create relationships: Link nodes with typed edges (e.g., relates_to, contradicts, depends_on, updates) to build a connected memory structure.
Query memory: Search the graph with natural language, including temporal references like 'recently' or 'last week', with configurable retrieval strategies.
Observe conversations (observe_conversation): Automatically extract and store durable information from user-assistant turns.
Decompose complex content: Break long passages into atomic memory nodes with inferred edges.
Retrieve node neighborhoods: Fetch connected context around a specific node by ID.
Manage conflicts: List and resolve contradictions or update edges without losing historical context.
Inspect node history: Audit why a memory exists, how it changed, and its evidence records.
Prime context: Hydrate a new AI session with the most relevant scoped memories at session start.
Detect topics: Identify topic clusters in the graph via community detection.
Track recent changes (graph_diff): View what was added or updated over a configurable time window.
Export & import: Backup/restore via portable JSON, interactive HTML visualizations, Obsidian-compatible Markdown vaults, or condensed context bundles for cross-tool handoffs.
Memory administration: Update or delete nodes, list context scopes, and retrieve graph health statistics.
Flexible deployment: Supports local-first SQLite with local embeddings, or scalable Neo4j with Docker/Kubernetes, with multi-tenant authentication and easy integration with clients like Claude, Cursor, Gemini CLI, and Codex.
Supports deployment and operation of the memory server in containerized environments, with configuration details available in the reference documentation.
Enables production deployment of the memory server in Kubernetes clusters, with dedicated documentation for orchestration and scaling.
Stores and retrieves decisions and reasons about MySQL usage within the knowledge graph, including contradictions and dependencies between database choices.
Provides round-trip compatibility with Obsidian-style vaults through markdown export/import functionality for editing graph nodes externally.
Stores and retrieves decisions and reasons about PostgreSQL usage within the knowledge graph, including dependencies and contradictions with other database choices.
Uses SQLite as the default local database backend for persistent storage of the knowledge graph with on-device embeddings.
Core
This repository is the public Waggle product repo: Apache-2.0 licensed, available on GitHub and PyPI, and focused on the local-first memory engine.
Quick Start
# Install globally (no venv needed)
pipx install waggle-mcp
# One-line setup — detects your MCP clients and writes config
waggle-mcp setup --yes
# Verify everything is healthy
waggle-mcp doctor
(No pipx? Run brew install pipx && pipx ensurepath first.)
setup --yes detects Claude Code, Codex, Cursor, Gemini CLI, and Antigravity, writes the MCP config, and installs automatic memory hooks where supported. Restart your client and you're live.
Windows users: Run all commands with python -X utf8 or set PYTHONUTF8=1 to avoid UnicodeEncodeError from emoji in log output.
Install Waggle
Waggle is a local MCP server that gives coding agents persistent graph memory.
Recommended:
VS Code: install the live Waggle: Local Memory for AI Agents extension from the Marketplace for one-click setup
MCP clients: use docs/install and Smithery metadata in smithery.yaml
Claude: use docs/install/claude-code.md or docs/install/claude-desktop.md
Developers:
pipx install waggle-mcp
Benchmark:
LongMemEval 500-case retrieval-only: 97.4% R@5, 89.0% Exact@5 for graph_raw retrieval (artifact)
VS Code extension features:
one-click Enable for this Workspace onboarding
installs waggle-mcp with consent if it is missing
safely creates or updates .vscode/mcp.json
preserves existing non-Waggle MCP servers
runs waggle-mcp doctor
opens Graph Studio
exports Waggle memory from the editor
Claude distribution:
Claude Code does not use an .mcpb bundle. Users add Waggle directly as an MCP server:
pipx install waggle-mcp
claude mcp add --transport stdio waggle -- waggle-mcp serve --transport stdio
Claude Desktop uses the claude-desktop-extension.mcpb bundle, which can be distributed through GitHub Releases.
Manual MCP config:
{
"mcpServers": {
"waggle": {
"command": "waggle-mcp",
"args": ["serve", "--transport", "stdio"]
}
}
}
Enterprise Evaluation
For self-hosted production review and security posture:
60-Second Demo
No MCP client needed. Run this from a fresh install:
waggle-mcp demo
This imports a pre-loaded example graph and runs 4 scripted queries locally, with no API key, network access, or MCP client required. Add --with-embeddings to use the real sentence-transformers model for higher-fidelity retrieval (requires ~420 MB download on first run).
Why Waggle
waggle-mcp is a local-first memory layer for MCP-compatible AI clients, built on a persistent knowledge graph.
The core difference from flat note storage or chunked RAG is the graph structure. Waggle doesn't just store facts — it stores the relationships between them: this decision depends on that constraint, this preference contradicts that earlier one, this requirement was updated three sessions ago. When you query, you get a subgraph with the reasoning chain attached, not just the matching text.
Without Waggle | With Waggle |
Paste context into every session | Compact subgraph retrieved at query time |
Session-local memory only | Persistent memory across all sessions |
Flat notes, no structure | Typed nodes and edges: decisions, reasons, contradictions |
"What changed?" requires replaying logs | Temporal queries, diffs, and conflict resolution are first-class |
Contradictions silently overwrite history | Both positions preserved, contradiction edge explicit |
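The last row of the table can be illustrated with a tiny in-memory graph. This is a sketch only, not Waggle's storage code: `Node`, `Edge`, and `TinyGraph` are hypothetical names invented for the example, while the node and edge type strings are the ones this README lists.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: str  # e.g. "decision", "fact", "preference"
    content: str


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    edge_type: str  # e.g. "depends_on", "contradicts", "updates"


@dataclass
class TinyGraph:
    """Hypothetical in-memory stand-in for a typed memory graph."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, source: str, target: str, edge_type: str) -> None:
        # Refuse edges that point at unknown nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking")
        self.edges.append(Edge(source, target, edge_type))

    def conflicts(self) -> list:
        """Return (source, target) node pairs joined by a contradicts edge."""
        return [
            (self.nodes[e.source], self.nodes[e.target])
            for e in self.edges
            if e.edge_type == "contradicts"
        ]


graph = TinyGraph()
graph.add_node(Node("d1", "decision", "Chose PostgreSQL over MySQL"))
graph.add_node(Node("d2", "decision", "Reconsider MySQL: team familiarity"))
graph.add_edge("d2", "d1", "contradicts")

# Both decisions remain stored; the conflict is an explicit edge.
for newer, older in graph.conflicts():
    print(f"{newer.content!r} contradicts {older.content!r}")
```

Nothing is overwritten when the second decision arrives, so a later query can surface both positions along with the edge that relates them.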
What Is In Core Today
Waggle Core is the open-source local memory foundation:
SQLite-backed graph memory
MCP server integration
CLI setup and doctor flows
local embeddings or deterministic fallback
graph querying, observation, and context priming
import/export and graph inspection utilities
Product Scope
This public repo is the product-facing Waggle surface:
MCP server and tool surface
local-first graph memory
automatic memory hooks and orchestration
.abhi export, import, diff, merge, and checkpoint handoff
Graph Studio and admin tooling
Research artifacts, benchmark harnesses, evaluation reports, and paper material now live in the private waggle-pro repo.
Architecture
MCP Client (Claude / Codex / Gemini CLI / Cursor / Antigravity / ChatGPT)
↓
waggle.server: MCP tool surface
↓
RecursiveContextController: RLM-inspired context assembly (build_context)
↓
Graph Engine: MemoryGraph (SQLite) or Neo4jMemoryGraph
↓
Embeddings: sentence-transformers (local) or deterministic fallback
Recursive Context Assembly
Waggle stores memory outside the model context window. Instead of pasting long context into every prompt, agents call build_context to get a compact, high-signal context pack assembled from the graph.
Inspired by Recursive Language Models — the idea of externalising long context into an environment and interacting with it through decomposition and targeted retrieval.
How it works:
Decompose: the query is split into targeted subqueries (decisions, constraints, implementation details, unfinished work, conflicts)
Retrieve: each subquery runs against graph, hybrid, and verbatim transcript retrieval
Expand: the graph is traversed around top nodes via typed edges (updates, contradicts, depends_on, derived_from)
Resolve: update chains and contradictions are detected; superseded nodes are flagged
Deduplicate & rank: overlapping hits are merged; high-signal node types (decisions, preferences) are boosted
Compress: everything is packed into a structured context brief under a configurable token budget
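The last two steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the RecursiveContextController implementation: the function name `pack_context`, the hit dictionary shape, the boost factor of 1.5, and the whitespace-based token estimate are all made up for the example.

```python
def pack_context(hits, token_budget, boost=None):
    """Merge duplicate hits, boost high-signal types, and fit a token budget.

    hits: list of dicts with "id", "type", "text", and "score" keys.
    """
    boost = boost or {"decision": 1.5, "preference": 1.5}

    # Deduplicate: keep the best-scoring hit per node id.
    best = {}
    for hit in hits:
        current = best.get(hit["id"])
        if current is None or hit["score"] > current["score"]:
            best[hit["id"]] = hit

    # Rank: multiply the score by a per-type boost (default 1.0).
    ranked = sorted(
        best.values(),
        key=lambda h: h["score"] * boost.get(h["type"], 1.0),
        reverse=True,
    )

    # Compress: greedily add lines while the rough token estimate fits.
    lines, used = [], 0
    for hit in ranked:
        line = f"- [{hit['type']}] {hit['text']}"
        cost = len(line.split())  # crude whitespace token estimate
        if used + cost > token_budget:
            continue
        lines.append(line)
        used += cost
    return lines


hits = [
    {"id": "n1", "type": "fact", "text": "WAL mode is enabled", "score": 0.80},
    {"id": "n2", "type": "decision", "text": "Use SQLite for local storage", "score": 0.70},
    {"id": "n1", "type": "fact", "text": "WAL mode is enabled", "score": 0.60},
]
for line in pack_context(hits, token_budget=20):
    print(line)
```

With these inputs the decision outranks the higher-scoring fact because of the boost, and the duplicate `n1` hit appears only once.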
Example MCP call:
{
"tool": "build_context",
"arguments": {
"query": "Continue implementing Waggle from where we left off",
"project": "waggle-mcp",
"token_budget": 1000,
"depth": 2
}
}
Example output:
### Waggle Recursive Context Pack
Task: Continue implementing Waggle from where we left off
Current relevant decisions:
- [decision] Use SQLite for local storage: We chose SQLite with WAL mode for local-first deployments.
- [decision] Hybrid retrieval default: Hybrid (vector + BM25 + graph) is the default retrieval mode.
Active constraints:
- [preference] No external LLM APIs required: All retrieval must work fully local.
Important implementation context:
- [fact] RecursiveContextController added: New module waggle/recursive_context.py implements build_context.
Conflicts or superseded context:
- Possible conflict: 'Use Flask' contradicts 'Use FastAPI'
Config env vars:
Variable | Default | Description |
|
| Enable/disable the feature |
|
| Default token budget |
|
| Max decomposed subqueries |
|
| Graph expansion depth |
|
| Include transcript evidence |
Tool aliases: recursive_context, assemble_context, rlm_context all resolve to build_context.
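Alias handling of this kind usually amounts to a lookup before dispatch. The sketch below assumes a plain dictionary; the alias names come from the line above, while `TOOL_ALIASES` and `canonical_tool_name` are hypothetical and do not describe how waggle.server resolves names internally.

```python
# Alias names are taken from the README; the lookup itself is a sketch.
TOOL_ALIASES = {
    "recursive_context": "build_context",
    "assemble_context": "build_context",
    "rlm_context": "build_context",
}


def canonical_tool_name(name: str) -> str:
    """Map an alias to its canonical tool name; pass other names through."""
    return TOOL_ALIASES.get(name, name)


print(canonical_tool_name("rlm_context"))  # build_context
print(canonical_tool_name("query_graph"))  # query_graph
```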
How It Works
User → Agent → observe_conversation(...) → Graph stores typed nodes + edges
User → Agent → query_graph("database") → Subgraph returned → Agent answers with linked rationale
Session 1
User: Let's use PostgreSQL. MySQL replication has been painful.
Agent: [calls observe_conversation()]
→ stores decision node: "Chose PostgreSQL over MySQL"
→ stores reason node: "MySQL replication painful"
→ links them with a depends_on edge
Session 2 (fresh context window, no history)
User: What did we decide about the database?
Agent: [calls query_graph("database decision")]
→ retrieves the decision node + linked reason from Session 1
"You decided on PostgreSQL. The reason recorded was that MySQL replication had been painful."
Session 3
User: Actually, let's reconsider — the team is more familiar with MySQL.
Agent: [calls store_node() + store_edge(new_node → old_node, "contradicts")]
→ both positions are preserved, and the contradiction is explicit
Setting Up as an MCP Server
One-time install:
pipx install waggle-mcp
No API key, cloud account, or Docker is required for local use.
Shared JSON config for clients that accept mcpServers JSON:
{
"mcpServers": {
"waggle": {
"command": "waggle-mcp",
"args": ["serve"],
"env": {
"WAGGLE_TRANSPORT": "stdio",
"WAGGLE_BACKEND": "sqlite",
"WAGGLE_DB_PATH": "~/.waggle/waggle.db",
"WAGGLE_DEFAULT_TENANT_ID": "local-default",
"WAGGLE_MODEL": "all-MiniLM-L6-v2",
"WAGGLE_STARTUP_MODE": "normal"
}
}
}
}
First run takes ~30 s: all-MiniLM-L6-v2 (~420 MB) downloads on first use. To skip the download, set "WAGGLE_MODEL": "deterministic" (offline-safe, instant start, slightly lower retrieval quality).
Claude Desktop
Config file location:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Add the mcpServers block above.
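If the config file already lists other servers, the block has to be merged rather than pasted over the file. The following is a minimal sketch of such a merge using only the standard library; `add_waggle_server` is a hypothetical helper (not part of waggle-mcp), and the entry is shortened to two of the env keys shown earlier.

```python
import json
import tempfile
from pathlib import Path

# Shortened copy of the shared config block; add the other env keys as needed.
WAGGLE_ENTRY = {
    "command": "waggle-mcp",
    "args": ["serve"],
    "env": {"WAGGLE_TRANSPORT": "stdio", "WAGGLE_BACKEND": "sqlite"},
}


def add_waggle_server(config_path: Path) -> dict:
    """Insert or replace the "waggle" entry, keeping other servers intact."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})["waggle"] = WAGGLE_ENTRY
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demonstration against a throwaway file instead of a real client config.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}',
                    encoding="utf-8")
    merged = add_waggle_server(path)
    print(sorted(merged["mcpServers"]))
```

In normal use, waggle-mcp setup --yes writes the config for detected clients, so a script like this is only relevant for manual setups.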
Claude Code
claude mcp add waggle \
--env WAGGLE_TRANSPORT=stdio \
--env WAGGLE_BACKEND=sqlite \
--env WAGGLE_DB_PATH=~/.waggle/waggle.db \
--env WAGGLE_DEFAULT_TENANT_ID=local-default \
--env WAGGLE_MODEL=all-MiniLM-L6-v2 \
-- waggle-mcp serve
Claude Code also supports automatic memory hooks; see the Automatic Memory Hooks section below.
Codex
Add to ~/.codex/config.toml:
[mcp_servers.waggle]
command = "waggle-mcp"
args = ["serve"]

[mcp_servers.waggle.env]
WAGGLE_TRANSPORT = "stdio"
WAGGLE_BACKEND = "sqlite"
WAGGLE_DB_PATH = "~/.waggle/waggle.db"
WAGGLE_DEFAULT_TENANT_ID = "local-default"
WAGGLE_MODEL = "all-MiniLM-L6-v2"
waggle-mcp setup --yes also writes a managed memory block into AGENTS.md in the current workspace so automatic memory is enabled by default for that repo.
Gemini CLI
gemini mcp add waggle \
-e WAGGLE_TRANSPORT=stdio \
-e WAGGLE_BACKEND=sqlite \
-e WAGGLE_DB_PATH=~/.waggle/waggle.db \
-e WAGGLE_DEFAULT_TENANT_ID=local-default \
-e WAGGLE_MODEL=all-MiniLM-L6-v2 \
waggle-mcp serve
After restarting, run /mcp to confirm Waggle is connected.
Cursor
Cursor Settings → Features → MCP Servers → + Add
Command: waggle-mcp
Args: serve
Env vars: same keys as the JSON block above.
Antigravity
The AI agent reads ~/.gemini/antigravity/mcp_config.json (macOS/Linux) or %USERPROFILE%\.gemini\antigravity\mcp_config.json (Windows). Add the waggle block there. The VS Code extension panel reads a different file — adding waggle there will NOT make it available to the AI agent.
Run waggle-mcp doctor to see exactly which config files exist and which ones have a waggle entry.
ChatGPT
ChatGPT custom MCP connectors require a remote HTTPS server. Deploy Waggle in HTTP mode with the Neo4j backend, expose /mcp over HTTPS, then add that URL as a custom connector in ChatGPT (Settings → Connectors → Advanced → Developer mode).
WAGGLE_TRANSPORT=http \
WAGGLE_BACKEND=neo4j \
WAGGLE_DEFAULT_TENANT_ID=workspace-default \
WAGGLE_NEO4J_URI=bolt://localhost:7687 \
WAGGLE_NEO4J_USERNAME=neo4j \
WAGGLE_NEO4J_PASSWORD=change-me \
waggle-mcp serve
Do not expose Waggle publicly without authentication.
waggle-mcp not on PATH?
pipx ensurepath # then restart your terminal
Automatic Memory: Prompt Rules
Registering Waggle as an MCP server only makes the tools available. For the agent to call them automatically, add this instruction block to your client's prompt, rules, or project instructions:
Use Waggle automatically for conversational memory.
At the start of a new session, if project, agent, or session scope is known, call prime_context.
Before answering questions that may depend on prior decisions, preferences, constraints, project state,
or earlier conversation context, call query_graph with the narrowest relevant scope.
After completed turns that contain durable information such as decisions, preferences, constraints,
requirements, user corrections, project facts, or meaningful task outcomes, call observe_conversation
automatically.
Waggle should remember relevant context automatically. If memory appears empty, the session is likely
missing the automatic memory policy or the runtime hooks that call build_context before answers and
on_assistant_turn after answers.
Do not ask the user to trigger Waggle manually. Use it in the background when relevant.
Use the same stable project value for the same codebase across sessions, or recall will fragment.
Automatic Memory Hooks (Claude Code)
For Claude Code, waggle-mcp setup --yes installs three hook scripts that capture memory deterministically — no prompt rules needed:
Hook script | Claude Code event | What it does |
|
| Tries scoped DB recall first; if the scope is cold and a session checkpoint exists, imports the |
|
| Applies Waggle's durable-ingest policy and only calls |
|
| Calls |
Each hook always exits 0 (a Waggle bug never blocks your session) and has a 5-second timeout. post_response.py scans turn text for likely secrets before storing, skips low-value chatter, and only ingests durable turns.
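The fail-open behaviour and the secret check can be sketched as follows. This is an illustration under assumptions, not the code in post_response.py: the regex patterns, `contains_likely_secret`, and `run_hook` are hypothetical, and the 5-second timeout is not modelled because the README does not say where it is enforced.

```python
import re
import sys

# Illustrative patterns only; a real scanner would cover many more formats.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),           # AWS access key id shape
    re.compile(r"eyJ[\w-]+\.[\w-]+\.[\w-]+"),  # JWT-like three-part token
    re.compile(r"(?i)password\s*[:=]\s*\S+"),  # inline password assignment
]


def contains_likely_secret(text: str) -> bool:
    """Return True if any illustrative secret pattern matches the text."""
    return any(p.search(text) for p in SECRET_PATTERNS)


def run_hook(turn_text: str, store) -> int:
    """Store a turn unless it looks sensitive; never fail the caller."""
    try:
        if not contains_likely_secret(turn_text):
            store(turn_text)
    except Exception as exc:  # fail open: log the error and continue
        print(f"hook error ignored: {exc}", file=sys.stderr)
    return 0  # the exit code is always 0


stored = []
run_hook("We decided to use PostgreSQL.", stored.append)
run_hook("password = hunter2", stored.append)
print(stored)  # ['We decided to use PostgreSQL.']
```

Returning 0 even when storage raises is what keeps a memory bug from blocking the session, at the cost of silently dropping that turn.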
# Install hooks (included in setup --yes)
waggle-mcp setup --yes
# Skip hook installation
waggle-mcp setup --yes --no-hooks
# Remove hooks
waggle-mcp uninstall-hooks
Verify It Works
After restarting your client, ask the agent:
"Store a note: we're using PostgreSQL for this project."
Then open a fresh session and ask:
"What database are we using?"
Expected:
You're using PostgreSQL for this project.
MCP Tool Reference
The full tool surface is large (~40 tools). In practice, an agent in normal use only needs the six core tools. Everything else is for human-driven inspection, graph management, and export workflows.
Core tools — what the agent calls automatically
These are the tools your prompt rules or hooks should wire up. An agent that only knows these six will handle the vast majority of memory tasks correctly.
Tool | When the agent calls it |
| After any turn containing a decision, preference, constraint, correction, or project fact. Persists the verbatim turn first, then extracts graph nodes. Returns |
| Before answering questions that may depend on prior context. Hybrid retrieval (graph + verbatim transcript) by default. Supports |
| At the start of a new session to hydrate context from the most relevant scoped memories. |
| When the user asks what changed recently. |
| When the agent needs to store a single atomic fact or explicitly link two nodes. Prefer |
observe_conversation and decompose_and_store create edges automatically. If you only call store_node, you get isolated facts with no traversal value.
Extended retrieval — agent-callable, situational
Tool | Description |
| Broad filtered subgraph for map-reduce tasks. Use when you want a large scoped slice rather than high-precision top-K. Supports |
| Fetch the neighborhood around a specific node by ID. |
| Inspect a node's evidence, validity window, and connected context. |
| Topic clusters via community detection across the full tenant graph. |
| Chronological view of memory changes for a node or query. |
| List unresolved contradiction and update edges. |
| Mark a conflict resolved. Pass |
Operator / human tools — not for routine agent use
These are for you as the operator: graph health, deduplication, export, and migration. Exposing all of these to an agent in normal use adds noise without benefit.
Graph management
Tool | Description |
| Update an existing node's content, label, or tags. |
| Delete a node and all its edges. |
| Break long content into atomic nodes and infer edges automatically. |
| Return near-duplicate node pairs above a similarity threshold for human review. |
| Merge multiple nodes into one canonical node. Repoints all edges, collects aliases. Idempotent. |
| Audit edge quality — counts, average confidence per type, top/bottom confidence edges. |
| Diagnose retrieval ranking for a query — embedding scores, window routing, tiered vs flat comparison. |
Context windows
Tool | Description |
| Known agent, project, and session scope values. |
| Chat/session-level memory containers with status and node counts. |
| Inspect one context window and its nodes. |
| Close a session window and derive cross-window edges. |
| Export the context-window graph as an interactive HTML visualization. |
| Export the memory graph as an interactive HTML visualization. |
| Node/edge counts, type breakdowns, and recent highly-connected nodes. |
Memory files — git-vocabulary interface
Waggle uses a git-inspired vocabulary for portable memory snapshots:
Tool | Git analogy | Description |
|
| Snapshot the graph to a |
|
| Load a |
|
| Compare two |
|
| Three-way merge two |
|
| Validate a |
|
| Inspect a |
|
| Execute a saved or ad hoc query against a |
| — | Load only selected or query-relevant chunks from a |
Legacy tool names (export_graph_backup, import_abhi, diff_abhi, merge_abhi, validate_abhi, inspect_abhi, query_abhi) are still accepted and automatically mapped to their canonical equivalents.
Vault
Tool | Description |
| Export the graph as an Obsidian-compatible Markdown vault. |
| Import an edited Obsidian vault back into the graph non-destructively. |
What To Ask The Agent
Ask the agent... | Tool called |
"Remember that..." |
|
"What do you know about X?" |
|
"What changed recently?" |
|
"Summarize context for a new session" |
|
"Show all stored topics" |
|
"Export my memory to a file" |
|
"Are there any duplicate nodes?" |
|
"What's the quality of my graph edges?" |
|
Edges are what make graph memory work.
observe_conversation and decompose_and_store create edges automatically. If you only call store_node, you get isolated facts, not a connected graph.
For broad summarization tasks, prefer aggregate_graph over query_graph when you want a large scoped slice of memory instead of high-precision semantic ranking.
Graph Data Model
Node types: fact, entity, concept, preference, decision, question, note
Edge types: relates_to, contradicts, depends_on, part_of, updates, derived_from, similar_to
Temporal validity: Every node supports valid_from and valid_to fields. query_graph and aggregate_graph exclude expired nodes by default. Pass include_invalidated: true to include them, or as_of: "<ISO-8601 datetime>" to query the graph at a specific point in time. resolve_conflict with a winner node ID automatically sets the losing node's valid_to to now.
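The validity rules above can be sketched as a filter over nodes. This is an illustration, not Waggle's query code: `visible_nodes` and the dictionary node shape are hypothetical, and treating `valid_to` as an exclusive bound and letting `include_invalidated` bypass the `as_of` check are assumptions the README does not confirm.

```python
from datetime import datetime, timezone


def visible_nodes(nodes, as_of=None, include_invalidated=False):
    """Filter nodes by validity window.

    Each node is a dict with optional "valid_from" / "valid_to" datetimes;
    None means the window is open on that side.
    """
    if include_invalidated:
        return list(nodes)
    moment = as_of or datetime.now(timezone.utc)
    result = []
    for node in nodes:
        start, end = node.get("valid_from"), node.get("valid_to")
        if start is not None and moment < start:
            continue  # not yet valid at the chosen moment
        if end is not None and moment >= end:
            continue  # expired at the chosen moment
        result.append(node)
    return result


utc = timezone.utc
old = {"id": "use-flask",
       "valid_from": datetime(2024, 1, 1, tzinfo=utc),
       "valid_to": datetime(2024, 6, 1, tzinfo=utc)}
new = {"id": "use-fastapi",
       "valid_from": datetime(2024, 6, 1, tzinfo=utc),
       "valid_to": None}

print([n["id"] for n in visible_nodes([old, new])])
print([n["id"] for n in visible_nodes([old, new],
                                      as_of=datetime(2024, 3, 1, tzinfo=utc))])
print([n["id"] for n in visible_nodes([old, new], include_invalidated=True)])
```

The default call returns only the current decision, the `as_of` call returns the decision that was valid in March 2024, and `include_invalidated` returns both.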
Cross-Client Handoffs & Migration
Same machine — automatic sharing
Point multiple clients at the same WAGGLE_DB_PATH (default ~/.waggle/waggle.db) and they share one brain automatically.
Session handoffs
# Explicit checkpoint before switching sessions or apps
waggle-mcp checkpoint-context --project MCP --session-id thread-123 --output ./handoff.abhi
Resume order is:
same machine / shared WAGGLE_DB_PATH: use the live SQLite memory first
if that scoped DB recall is empty: import the session .abhi checkpoint
different machine or explicit transfer: waggle-mcp pull ./handoff.abhi
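The resume order reads as a short decision function. The sketch below is a paraphrase of the list above, not code from waggle-mcp: `choose_resume_source`, its parameters, and the returned labels are hypothetical names chosen for the example.

```python
def choose_resume_source(same_machine: bool,
                         scoped_recall_count: int,
                         checkpoint_available: bool) -> str:
    """Pick where to resume memory from, following the documented order."""
    if same_machine and scoped_recall_count > 0:
        return "live-db"  # shared WAGGLE_DB_PATH already has scoped memory
    if same_machine and checkpoint_available:
        return "import-session-checkpoint"  # scoped recall was empty
    if checkpoint_available:
        return "pull-handoff-file"  # different machine or explicit transfer
    return "start-empty"


print(choose_resume_source(True, 12, True))   # live-db
print(choose_resume_source(True, 0, True))    # import-session-checkpoint
print(choose_resume_source(False, 0, True))   # pull-handoff-file
```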
Full migration
# Export
waggle-mcp export-graph-backup --output-path my_memory.json
# Import on new machine
waggle-mcp import-graph-backup --input-path my_memory.json
CLI Command Reference
Command | Description |
| Show all commands and options. |
| Best first command — tool map, workflows, and setup hints. |
| Run this if something isn't working — checks config, model cache, DB path. |
| Re-embed stale rows after a |
| Non-interactive one-line setup for all detected clients. |
| Interactive setup wizard for one client. |
| Run the MCP server (usually started by your client). |
| Run the 60-second local demo with a pre-loaded example graph. |
| Launch Graph Studio in the browser. |
| Remove the waggle-managed hooks block from Claude Code settings. |
| Export a portable Markdown/JSON context pack. |
| Export the graph as an Obsidian-style vault. |
| Ingest a rollover transcript, export a handoff bundle, and emit a session |
WAGGLE_STARTUP_MODE
Value | Behaviour | Best for |
| Model loads in background; server responds immediately | Daily use |
| ML never loads; semantic tools return | Schema inspection, tool listing |
| Server blocks until model is fully loaded | Production deployments |
Graph Studio
waggle-mcp edit-graph
A local browser-based graph editor for inspecting and editing memory directly. Features:
Dual-layer graph/conversation views in the same UI
Transcript provenance and retrieval-debug inspection for hybrid memory results
Mouse-based node dragging and shift-drag edge creation
Collapsible side panels, focus mode, and label toggling for large graphs
Live graph stats: connected nodes, isolates, cluster count
Export/import of the current graph, including .abhi preview, diff, and sharing workflows
.abhi files are JSON underneath — they support optional embedded vectors, optional AES-256-GCM encryption, deterministic content hashing, and a magic-bytes header (WGL\x01) for format identification. Legacy bare-ZIP files are read transparently.
waggle-mcp push encrypts .abhi exports by default. Export paths refuse to proceed if transcript text contains likely secrets unless you pass --force.
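Format identification by magic bytes can be sketched as a prefix check. Only the `WGL\x01` header comes from this README; `sniff_memory_file` is a hypothetical helper, the ZIP signature is the standard local-file header, and nothing here reflects the rest of the .abhi layout, which is described in the format spec.

```python
ABHI_MAGIC = b"WGL\x01"    # header bytes stated in the README
ZIP_MAGIC = b"PK\x03\x04"  # standard local-file header of a ZIP archive


def sniff_memory_file(data: bytes) -> str:
    """Classify a file by its leading bytes."""
    if data.startswith(ABHI_MAGIC):
        return "abhi"
    if data.startswith(ZIP_MAGIC):
        return "legacy-zip"
    return "unknown"


print(sniff_memory_file(b"WGL\x01{...}"))    # abhi
print(sniff_memory_file(b"PK\x03\x04rest"))  # legacy-zip
print(sniff_memory_file(b"{}"))              # unknown
```

In practice only the first four bytes of the file need to be read, for example with `open(path, "rb").read(4)`.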
Model Support
Waggle uses a local sentence-transformers model selected by WAGGLE_MODEL.
Default: all-MiniLM-L6-v2
Any locally available sentence-transformers model name works.
If the model is unavailable, Waggle falls back to deterministic SHA-256 embeddings.
WAGGLE_MODEL=all-mpnet-base-v2 waggle-mcp serve
For existing DBs, a model change is a migration event: run waggle-mcp doctor, then waggle-mcp doctor --fix if mixed embedding_model_id values are reported.
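To make the idea of a deterministic fallback concrete, here is one way a hash-derived vector could be built. This is a sketch under assumptions and not Waggle's fallback: the function name, the 16-dimension default, and the counter-prefixed hashing scheme are all invented for illustration.

```python
import hashlib
import math


def deterministic_embedding(text: str, dim: int = 16) -> list:
    """Derive a unit-length pseudo-embedding from SHA-256 digests of the text."""
    values = []
    counter = 0
    while len(values) < dim:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte from 0..255 to roughly -1..1.
        values.extend((b - 127.5) / 127.5 for b in digest)
        counter += 1
    values = values[:dim]
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]


a = deterministic_embedding("Use SQLite for local storage")
b = deterministic_embedding("Use SQLite for local storage")
c = deterministic_embedding("Use Neo4j for production")
print(a == b)  # True: same text always yields the same vector
print(a == c)  # False: different text yields a different vector
```

A vector hashed from the whole text like this is reproducible and needs no download, but it carries no semantic similarity between related phrases, which is the general trade-off of skipping the model.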
WAGGLE_DEDUP_THRESHOLD
Controls the cosine similarity threshold for automatic node deduplication at write time (default 0.88, minimum 0.85). Nodes above this threshold with matching type and scope are merged automatically. Use dedup_candidates to review near-duplicates below the auto-merge threshold.
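The write-time check can be sketched as a cosine comparison gated on type and scope. The 0.88 default and 0.85 minimum come from the paragraph above; everything else is an assumption for illustration, including the `should_auto_merge` name, the node dictionary shape, the inclusive `>=` comparison, and clamping (rather than rejecting) thresholds below the minimum.

```python
import math

AUTO_MERGE_THRESHOLD = 0.88  # default stated above
MIN_THRESHOLD = 0.85         # minimum stated above


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_auto_merge(node_a, node_b, threshold=AUTO_MERGE_THRESHOLD):
    """Merge only when type and scope match and similarity clears the bar."""
    threshold = max(threshold, MIN_THRESHOLD)  # clamp to the documented minimum
    same_kind = (node_a["type"] == node_b["type"]
                 and node_a["scope"] == node_b["scope"])
    return same_kind and cosine_similarity(node_a["vec"], node_b["vec"]) >= threshold


base = {"type": "fact", "scope": "proj", "vec": [1.0, 0.0]}
near = {"type": "fact", "scope": "proj", "vec": [0.9, 0.1]}
other_type = {"type": "decision", "scope": "proj", "vec": [0.9, 0.1]}
far = {"type": "fact", "scope": "proj", "vec": [0.5, 0.5]}

print(should_auto_merge(base, near))        # True
print(should_auto_merge(base, other_type))  # False
print(should_auto_merge(base, far))         # False
```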
Security & Privacy
Data stays local by default (~/.waggle/waggle.db). No telemetry, no cloud calls for local operation.
Memory only leaves your machine if you configure a remote backend or explicitly export/push.
Local SQLite is not encrypted at rest — use OS disk encryption if the stored history is sensitive.
Before .abhi export, Waggle scans transcript text for likely secrets (API keys, JWTs, passwords). Export is refused if secrets are found unless you pass --force.
waggle-mcp push defaults to AES-256-GCM encrypted export.
Known Limitations
Edges are load-bearing. observe_conversation and decompose_and_store create them automatically. Raw store_node calls without follow-up edges produce disconnected nodes with no traversal value.
Graph retrieval trades tokens for reasoning context. Factual lookups are often cheaper than chunked RAG; graph-expansion queries intentionally spend more tokens to carry update chains and contradictions.
Hybrid rerank is not the default. The no-rerank hybrid path is stronger right now. The rerank path is available but intentionally not the launch default.
Deduplication is similarity-based, not universal semantic equivalence. Broader production text may still require additional aliases or stricter domain guards.
Troubleshooting
Run waggle-mcp doctor first — it catches the most common issues automatically.
Symptom | Likely cause | Fix |
|
|
|
| Embedding model downloading (~420 MB) | Set |
| Windows stdout not UTF-8 |
|
| Old pre-v0.2 field names | Use |
Waggle registered but agent doesn't see it (Antigravity) | Wrong config file | Agent reads |
| Wrong Python environment | Use |
| Python/OS wheel mismatch | Use Python 3.11+; upgrade: |
Reference & Docs
Environment variables, full tool surface, admin commands, Docker setup: waggle-mcp --help or docs/reference.md
Production deployment: deploy/kubernetes/README.md
Operations and troubleshooting: docs/runbooks/
Automatic memory rules (copy-pasteable): docs/automatic-memory-rules.md
Hook integration details: docs/hooks.md
.abhi format spec: docs/abhi-format-v2.md
Contributing
This repository is maintained privately. Internal contributors can use the docs in this repo as the source of truth.