# memory-mcp

Persistent, self-organizing semantic memory for AI agents — served as an MCP server.

## What is this?

memory-mcp is a Model Context Protocol server that gives AI agents durable, searchable memory backed by PostgreSQL and pgvector. Drop it into any MCP-compatible client (Claude Code, Cursor, Windsurf, etc.) and your agent gains the ability to remember, retrieve, and reason over information across sessions — without you managing any schema or storage logic.

What it does autonomously:

- Chunks and embeds incoming text
- Categorizes memories into a hierarchical taxonomy (ltree dot-paths)
- Deduplicates against existing memories and resolves conflicts
- Synthesizes a System Primer — a compressed, always-current summary of everything it knows — and surfaces it at session start
- Expires stale memories via TTL and prompts for verification of aging facts
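Under the hood, an MCP client reaches these capabilities by sending JSON-RPC 2.0 `tools/call` requests over the Streamable HTTP transport. A minimal sketch of the envelope a client would POST to the endpoint (the JSON-RPC shape follows the MCP specification; `initialize_context` is the one tool name this README confirms):

```python
import json

def mcp_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Bootstrapping memory at session start:
payload = mcp_tool_call("initialize_context", {})
print(json.dumps(payload))
```

In practice your MCP client library builds this envelope for you; the sketch only shows what travels over the wire to `http://localhost:8766/mcp`.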
## Why memory-mcp?

| | memory-mcp | Simple vector DB | LangChain / LlamaIndex memory |
|---|---|---|---|
| Schema management | Automatic | Manual | Manual |
| Deduplication | Semantic + LLM | None | None |
| Taxonomy | Auto-assigned ltree | None | None |
| Session bootstrap | System Primer | Manual RAG | Manual |
| Conflict resolution | LLM-evaluated | None | None |
| Ephemeral context | Built-in (TTL store) | No | No |
| Self-hostable | Yes (Docker) | Varies | No |
| MCP-native | Yes | No | No |
## Architecture
AI Agent (Claude Code / Cursor / Windsurf)
│ HTTP (MCP — Streamable HTTP)
▼
┌──────────────────────────────────────────┐
│ server.py │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Production MCP │ │ Admin MCP │ │
│ │ :8766/mcp │ │ :8767/mcp │ │
│ └────────┬────────┘ └────────┬────────┘ │
│ │ tools/ │ │
│ ┌────────▼──────────────────▼────────┐ │
│ │ ingestion · search · context │ │
│ │ crud · admin_tools · context_store│ │
│ └────────────────┬───────────────────┘ │
│ │ │
│ ┌────────────────▼───────────────────┐ │
│ │ Background Workers │ │
│ │ Ingestion Queue · TTL Daemon │ │
│ │ System Primer Auto-Regeneration │ │
│ └────────────────┬───────────────────┘ │
└───────────────────┼──────────────────────┘
│ asyncpg
▼
PostgreSQL + pgvector
┌─────────────────┐
│ memories │ chunks, embeddings, ltree paths
│ memory_edges │ sequence_next, relates_to, supersedes
│ ingestion_staging│ async job queue
│ context_store │ ephemeral TTL store
└─────────────────┘
│
┌──────────▼──────────┐
│ Backup Service │ pg_dump → private GitHub repo
└─────────────────────┘

Two servers, one process:

- Production (:8766) — tools safe for the agent to call freely
- Admin (:8767) — superset including destructive tools (delete, prune, bulk-move). Point your agent at production; use admin for maintenance.
## Quickstart (Docker)
Prerequisites: Docker + Docker Compose, an OpenAI API key.
# 1. Clone
git clone https://github.com/isaacriehm/memory-mcp.git
cd memory-mcp
# 2. Configure
cp .env.example .env
$EDITOR .env # set OPENAI_API_KEY and DB_PASSWORD at minimum
# 3. Start
docker compose up -d
# Production MCP endpoint: http://localhost:8766/mcp
# Admin MCP endpoint:      http://localhost:8767/mcp

To rebuild after code changes:

docker compose up -d --build memory-api

## Connecting to an MCP Client
### Claude Code
Add to your project's .claude/settings.json or ~/.claude/settings.json:
{
"mcpServers": {
"memory": {
"type": "http",
"url": "http://localhost:8766/mcp"
}
}
}

Or via the CLI:

claude mcp add memory --transport http http://localhost:8766/mcp

Then add this instruction to your CLAUDE.md so the agent always bootstraps memory at session start:
## Memory
At the start of every session, call `initialize_context` before anything else.
This returns your System Primer — your identity, current knowledge taxonomy, and retrieval guide.
Always consult it before answering questions about prior context.

### Cursor / Windsurf
Add to your MCP settings (.cursor/mcp.json or equivalent):
{
"mcpServers": {
"memory": {
"url": "http://localhost:8766/mcp"
}
}
}

## MCP Tools

### Production Tools (:8766)
| Tool | Description |
|---|---|
| `initialize_context` | Call first every session. Returns the System Primer + verification prompts for aging memories. |
| | Ingest raw text. Automatically chunks, embeds, categorizes, and deduplicates. Supports |
| | Poll async ingestion job by |
| | Hybrid vector + BM25 search with Reciprocal Rank Fusion. Supports optional, bounded feedback rerank behind a kill switch. Filter by |
| | Record retrieval feedback ( |
| | Return all occupied taxonomy paths with memory counts. |
| | Drill into a collapsed |
| | Reconstruct a full document by following |
| | Compare two memory IDs and return semantic |
| | Inspect the full supersession chain (oldest → newest) for a memory. |
| | Return chronological decision events ( |
| | Build and store a deterministic execution handoff at |
| | Inspect recent conflict-resolution events with optional category, resolution, and time filters. |
| | Confirm an aging memory is still accurate. Advances its |
| | Rewrite a memory's content in-place (preserves identity, edges, history). |
| | Write a key/value pair to the ephemeral context store with a TTL. |
| | Retrieve an ephemeral context entry by key. |
| | List active (non-expired) context keys, optionally filtered by scope. |
| | Explicitly delete a context entry before its TTL expires. |
| | Push a context entry's expiry forward by N hours. |
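The search tool above fuses a vector ranking and a BM25 keyword ranking with Reciprocal Rank Fusion. A minimal sketch of RRF as commonly defined (the constant k = 60 is the conventional default, not a confirmed server setting):

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["m1", "m2", "m3"]   # semantic ranking
bm25_hits = ["m3", "m1", "m4"]     # keyword ranking
fused = rrf_fuse([vector_hits, bm25_hits])
# m1 ranks first: it appears near the top of both lists
```

RRF needs only ranks, not comparable scores, which is why it works well for merging cosine-similarity and BM25 results.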
### Admin-Only Tools (:8767)

| Tool | Description |
|---|---|
| | Hard-delete a memory by ID (cascades edges). |
| | Batch-delete superseded memories older than N days. |
| | Export all active memories to JSON. |
| `recategorize_memory` | Move a single memory to a new taxonomy path. |
| `bulk_move_category` | Move an entire taxonomy branch (e.g. |
| | Patch a memory's metadata JSONB in-place. |
| | Report on pool health, memory counts, ingestion queue depth. |
| | Breakdown of ingestion job statuses. |
| | Clear all completed/failed staging jobs immediately. |
## Feedback Rerank Rollout

Feedback reranking is intentionally guarded:

- Base retrieval (semantic + keyword + RRF) always stays primary.
- Feedback is a bounded secondary adjustment (FEEDBACK_MAX_DELTA, default 0.05).
- Tier floors can protect diversity in top-K (CANONICAL_MIN_IN_TOPK, HISTORICAL_MIN_IN_TOPK).
- Historical memories receive a mild base-score multiplier before rerank (HISTORICAL_BASE_SCORE_MULTIPLIER, default 0.85).
- Collection can stay on while rerank is off.
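The bounds described above amount to a clamp around the base score. A sketch under the documented defaults (the function shape is illustrative, not the server's internals):

```python
def apply_feedback_rerank(base_score: float, feedback_delta: float,
                          tier: str = "canonical",
                          max_delta: float = 0.05,             # FEEDBACK_MAX_DELTA default
                          historical_multiplier: float = 0.85  # HISTORICAL_BASE_SCORE_MULTIPLIER default
                          ) -> float:
    """Base retrieval stays primary; feedback is a bounded secondary adjustment."""
    if tier == "historical":
        base_score *= historical_multiplier  # mild penalty applied before rerank
    # Clamp the feedback contribution to +/- max_delta around the base score
    bounded = max(-max_delta, min(max_delta, feedback_delta))
    return base_score + bounded
```

However strong the accumulated feedback signal, it can move a result by at most 0.05 in either direction, which is what makes the rollout safe to reverse.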
Rollback is immediate: FEEDBACK_RERANK_ENABLED=false

## Taxonomy
Memories are organized into a dot-path hierarchy using PostgreSQL ltree. The system assigns paths automatically during ingestion. You can override with recategorize_memory or bulk_move_category.
Project classifications under projects.* are derived dynamically from active taxonomy roots during ingestion. Known roots are preferred; if no known root fits and content strongly signals a new project slug, a new projects.<slug> root can be admitted automatically.
Example paths:

profile.identity.core
profile.health.medical
projects.myapp.architecture
projects.myapp.decisions
organizations.acme.business
concepts.ai.behavior
reference.system.primer   ← auto-generated System Primer lives here

Search is subtree-aware — passing category_path: "projects.myapp" returns everything under that branch.
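Subtree matching is equivalent to a dot-aware prefix test (Postgres evaluates this with ltree operators such as `<@`; the pure-Python version below is an illustration, not the server's query path):

```python
def in_subtree(path: str, root: str) -> bool:
    """True if `path` equals `root` or lies under it in the dot hierarchy."""
    return path == root or path.startswith(root + ".")

memories = [
    "projects.myapp.architecture",
    "projects.myapp.decisions",
    "projects.other.notes",
    "profile.identity.core",
]
hits = [p for p in memories if in_subtree(p, "projects.myapp")]
```

The appended `"."` matters: without it, a hypothetical `projects.myapp2` branch would wrongly match a `projects.myapp` filter.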
## System Primer
initialize_context returns a synthesized summary stored at reference.system.primer. It includes:
A compressed user/agent profile
The full taxonomy tree with memory counts
Retrieval guidance
The primer auto-regenerates in the background when ≥10 new memories are ingested or when the previous primer is older than 1 hour. You can force regeneration via the admin tool synthesize_system_primer.
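The regeneration policy reduces to a simple either/or check. A sketch with the documented thresholds (the helper name and parameterization are illustrative):

```python
def should_regenerate_primer(new_memories: int, primer_age_seconds: float,
                             min_new: int = 10,
                             max_age_seconds: float = 3600) -> bool:
    """Regenerate when >=10 new memories arrived or the primer is over an hour old."""
    return new_memories >= min_new or primer_age_seconds > max_age_seconds
```

Either condition alone is sufficient, so a quiet system still refreshes its primer hourly while a busy one refreshes as soon as enough new material accumulates.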
## Environment Variables
Copy .env.example to .env and fill in your values.
### Required

| Variable | Description |
|---|---|
| `DATABASE_URL` | PostgreSQL connection string (e.g. |
| `OPENAI_API_KEY` | OpenAI API key for embeddings and LLM calls |
| `DB_PASSWORD` | PostgreSQL password (used by Docker Compose) |
### Optional — Models & Embeddings

| Variable | Default | Description |
|---|---|---|
| | | OpenAI embedding model |
| | | LLM for semantic section extraction and categorization |
| | | LLM for conflict/dedup evaluation |
| | | Embedding vector dimension (must match model) |
### Optional — Search & Limits

| Variable | Default | Description |
|---|---|---|
| | | Default result count for |
| | | Default result count for |
| | | Cosine similarity threshold for deduplication |
| | | Similarity threshold for conflict detection |
| | | Similarity threshold for |
| | | Minimum character length for a chunk to be stored |
| | | Max taxonomy paths assigned per ingestion |
### Optional — Feedback Rerank (Guarded)

| Variable | Default | Description |
|---|---|---|
| `FEEDBACK_RERANK_ENABLED` | | Kill switch for applying feedback rerank in |
| `FEEDBACK_MAX_DELTA` | 0.05 | Max absolute score adjustment from feedback (bounded around base score). |
| | | Exponential decay half-life for older feedback events. |
| `CANONICAL_MIN_IN_TOPK` | | Minimum canonical memories kept in top-K when available. |
| `HISTORICAL_MIN_IN_TOPK` | | Minimum historical memories kept in top-K when available. |
| | | Optional number of top-K slots reserved for underexplored candidates. |
| `HISTORICAL_BASE_SCORE_MULTIPLIER` | 0.85 | Multiplier applied to historical-tier base retrieval score before feedback rerank. |
### Optional — Tier Inference

| Variable | Default | Description |
|---|---|---|
| | | Enables LLM-suggested memory tier at ingestion (explicit/manual tier still wins). |
### Optional — OpenAI & Concurrency

| Variable | Default | Description |
|---|---|---|
| | | Per-request OpenAI timeout in seconds |
| | | Exponential-backoff retry limit |
| | | Semaphore for parallel OpenAI requests |
| | | Reasoning effort for extraction LLM |
| | | Reasoning effort for conflict LLM |
### Optional — Database

| Variable | Default | Description |
|---|---|---|
| | | asyncpg minimum pool connections |
| | | asyncpg maximum pool connections |
| | | Days to retain completed/failed staging jobs |
### Optional — Authentication

| Variable | Default | Description |
|---|---|---|
| `API_KEY` | (unset) | Static Bearer token for the production server. Also used as OAuth client secret in the minimal connector bridge. |
| `OAUTH_CLIENT_ID` | api-key | OAuth bridge client id expected from connector OAuth settings. |
| `OAUTH_CLIENT_SECRET` | API_KEY | OAuth bridge client secret expected at |
| | | Optional comma-separated allowlist for OAuth bridge redirect URIs. |
| | (auto from request URL) | Optional explicit issuer URL when behind reverse proxies/CDNs. |
### Optional — Server

| Variable | Default | Description |
|---|---|---|
| | 8766 | Production MCP server port |
| | 8767 | Admin MCP server port |
| | | FastMCP transport mode |
| | — | Set to |
### Optional — System Primer

| Variable | Default | Description |
|---|---|---|
| | | Max seconds before auto primer regeneration |
### Optional — Context Store

| Variable | Default | Description |
|---|---|---|
| | | Default TTL for context store entries |
| | | Max character length for context values |
| | | Max character length for context keys |
### Optional — Backup Service

| Variable | Description |
|---|---|
| `GITHUB_PAT` | GitHub Personal Access Token with |
| `GITHUB_BACKUP_REPO` | Target repo in |
| | Seconds between backups (default: |
## External Provider Auth

This server uses static Bearer token auth (API_KEY) as the primary security model.

Set an API key:

API_KEY=your-generated-token

Provider-side secret mapping:

- Secret key id: api-key
- Secret value: your server API_KEY value

Every request to the production server then requires Authorization: Bearer <token>.

Header template for clients that support secret interpolation:

Authorization: Bearer {{secrets.api-key}}

## Minimal OAuth Bridge for Claude Connect
For Claude connector compatibility, the server also exposes minimal OAuth routes:
- GET /authorize
- POST /token
- /.well-known/oauth-authorization-server
- /.well-known/oauth-protected-resource
Bridge behavior is intentionally minimal:
- OAuth client_id must match OAUTH_CLIENT_ID (default api-key).
- OAuth client_secret must match OAUTH_CLIENT_SECRET (defaults to API_KEY).
- A successful token exchange returns the API_KEY value as the bearer token for MCP calls.
This keeps API-key auth as the only credential while satisfying connector OAuth route expectations.
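The exchange therefore reduces to a credential equality check that hands back the API key. A sketch of that mapping (the function shape and the placeholder token are illustrative, not the server's route handler):

```python
def exchange_token(client_id: str, client_secret: str,
                   oauth_client_id: str = "api-key",        # OAUTH_CLIENT_ID default
                   api_key: str = "your-generated-token"    # API_KEY; also OAUTH_CLIENT_SECRET default
                   ) -> dict:
    """Minimal OAuth bridge: valid connector credentials yield API_KEY as the bearer token."""
    if client_id != oauth_client_id or client_secret != api_key:
        return {"error": "invalid_client"}
    # The connector then sends this token as Authorization: Bearer <token>
    return {"access_token": api_key, "token_type": "Bearer"}
```

No second credential ever exists: the "OAuth secret" and the bearer token the bridge issues are both the same API_KEY.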
MCP client config (external, with auth):
{
"mcpServers": {
"memory": {
"type": "http",
"url": "https://your-public-url/mcp",
"headers": {
"Authorization": "Bearer your-generated-token"
}
}
}
}

WireGuard / trusted network (no auth):
{
"mcpServers": {
"memory": {
"type": "http",
"url": "http://10.x.x.x:8766/mcp"
}
}
}

The same server handles both direct Bearer usage and the connector OAuth handshake mapped to API_KEY.

## Running Locally (Development)
Requirements: Python 3.11+, PostgreSQL with pgvector.
# Create and activate virtual environment
python3.11 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Configure
cp .env.example .env
$EDITOR .env
# Start the server
python -m server
# Production: http://0.0.0.0:8766
# Admin:      http://0.0.0.0:8767

## Backup Service
The backup/ directory contains a containerized PostgreSQL backup job that:

- Runs pg_dump on the configured interval (default: every 6 hours)
- Commits the dump to a private GitHub repository

The backup service starts automatically with docker compose up. Set GITHUB_PAT and GITHUB_BACKUP_REPO in your .env to enable it. If those variables are unset, the service will error on startup — remove the memory-backup service from docker-compose.yml if you don't need backups.
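Conceptually, each backup cycle is pg_dump, commit, push. A sketch that assembles that command sequence (file naming, flags, and paths are illustrative assumptions, not the actual contents of backup/):

```python
import datetime

def backup_commands(database_url: str, repo_dir: str) -> list[list[str]]:
    """Assemble the shell commands for one pg_dump -> git push backup cycle."""
    stamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dump_path = f"{repo_dir}/memory_{stamp}.sql"
    return [
        # Plain-SQL dump; --no-owner keeps the file restorable under any role
        ["pg_dump", "--no-owner", "--dbname", database_url, "--file", dump_path],
        ["git", "-C", repo_dir, "add", dump_path],
        ["git", "-C", repo_dir, "commit", "-m", f"backup {stamp}"],
        ["git", "-C", repo_dir, "push", "origin", "main"],
    ]

cmds = backup_commands("postgresql://user:pass@db/memory", "/backups/repo")
```

Running each entry through subprocess on a timer would reproduce the service's behavior; the containerized job handles interval scheduling and GITHUB_PAT authentication on top of this.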
## CLI Scripts
Standalone scripts in scripts/ (require DATABASE_URL in environment):
# Export all memories to a timestamped JSON file
python scripts/export_memories.py
# Generate an interactive graph visualization
python scripts/visualize_memories.py
open memory_map.html

## Contributing

See CONTRIBUTING.md.

## License