memory-mcp
Allows exposure of the memory-mcp server to the internet via Cloudflare Tunnel for secure remote access.
Polls Home Assistant's REST API for device states and pushes them as readings into memory-mcp.
Provides an MQTT bridge to subscribe to topics and feed sensor data into memory-mcp as time-series readings.
Enables two-way sync of memories with Obsidian-compatible Markdown files for export and import.
memory-mcp-server
Unified semantic memory + time-series intelligence layer for OpenHome abilities.
Architecture: memory tiers
Tier 1 — Semantic memory entities, memories, relations, vectors
Tier 1.5 — Episodic memory conversation sessions, turn-by-turn transcripts
Tier 1.75 — Working memory short-lived task scratchpads, promote-on-close
Tier 2 — Time-series store readings (numeric/categorical/composite), rollups, schedule
Tier 3 — Pattern engine background task: promotes stable trends → Tier 1 memories
Tier 4 — Prospective memory intentions: trigger_text → action_text, checked on each turn
Tier 5 — Spatial memory last-known object locations with confidence decay
The pattern engine closes the loop: raw sensor data (Tier 2) automatically becomes searchable, natural-language memory ("Brian's temperature preference is consistently 68°F") that any ability can recall semantically (Tier 1).
Files
File | Purpose |
| MCP server (stdio transport, all 35 tools) |
| FastAPI HTTP wrapper + admin UI mount |
| Admin UI router (served at |
| Speaker identity API ( |
| Entity graph API ( |
| Markdown two-way sync — export and import of Obsidian-compatible |
| Utility to re-embed all memories when swapping models |
| Jinja2 HTML templates for the admin UI |
| vis.js entity graph SPA template |
| Standalone tools that connect external systems to memory-mcp via HTTP |
Documentation
Doc | Contents |
| What it is, motivation, integration patterns (OpenHome, HA, MQTT, IoT) |
| Requirements, step-by-step setup, first run, verification |
| First entity, memory, reading — common operations with curl examples |
| Full HTTP API — every endpoint, request/response shapes, examples |
| Admin dashboard guide, pages, reading confidence, prune, security |
| AI backend config, provider examples, model swap guide |
| How detectors work, all 5 detector types, how to add new ones |
| Retention policy config, what gets deleted, storage estimates |
| systemd service, Docker Compose, reverse proxy, environment config |
| Keeping it healthy — backups, upgrades, model swaps, reembed.py walkthrough |
| Running tests, fixture design, what is and isn't covered |
| Common errors, what they mean, how to fix them |
| Integration index — MQTT bridge, HA state poller, OpenHome, Cloudflare |
| Background worker template — health data, environment sensors, weather |
| Pull-based HA state poller — polls HA REST API, pushes to memory-mcp |
| HA package setup — rest_commands, automations, scripts |
| OpenHome ability setup — background daemon + recall skill |
| Cloudflare Tunnel setup — safe internet exposure for cloud callers |
Setup
# Install deps
pip install -r requirements.txt
# Pull embedding and LLM models (Ollama default)
ollama pull nomic-embed-text
ollama pull llama3.2
# Run MCP server (for OpenHome abilities)
python server.py
# Run HTTP API + admin UI (for HA webhooks, Node-RED, scripts)
python api.py # listens on :8900
# admin UI at http://localhost:8900/admin/
AI Backend
Uses the OpenAI-compatible API (/v1/embeddings + /v1/chat/completions).
Works with Ollama, OpenAI, LM Studio, Together AI, or any compatible provider.
Configure via environment variables — no code changes needed:
# Default (local Ollama)
export MEMORY_AI_BASE_URL=http://localhost:11434/v1
export MEMORY_EMBED_MODEL=nomic-embed-text # 768-dim
export MEMORY_LLM_MODEL=llama3.2
# OpenAI
export MEMORY_AI_BASE_URL=https://api.openai.com/v1
export MEMORY_AI_API_KEY=sk-...
export MEMORY_EMBED_MODEL=text-embedding-3-small
export MEMORY_EMBED_DIM=1536
export MEMORY_LLM_MODEL=gpt-4o-mini
Split backends are supported, so embedding and LLM calls can go to different hosts:
# nomic-embed-text on a Raspberry Pi 4, LLM on a GPU machine
export MEMORY_AI_BASE_URL=http://pi4.local:11434/v1
export MEMORY_LLM_BASE_URL=http://gpu-host.local:11434/v1
See docs/ai-backend.md for the full configuration guide, provider examples, and split backend setup.
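As a rough illustration of what "OpenAI-compatible" means in practice, the sketch below builds (but does not send) an embeddings request from the environment variables listed above. The helper name build_embed_request is hypothetical and not taken from the project; the request body follows the OpenAI embeddings format (a model name plus an input string).

```python
import json
import os
import urllib.request


def build_embed_request(text: str) -> urllib.request.Request:
    """Build a POST to {MEMORY_AI_BASE_URL}/embeddings (hypothetical helper)."""
    base_url = os.environ.get("MEMORY_AI_BASE_URL", "http://localhost:11434/v1")
    model = os.environ.get("MEMORY_EMBED_MODEL", "nomic-embed-text")

    headers = {"Content-Type": "application/json"}
    api_key = os.environ.get("MEMORY_AI_API_KEY")
    if api_key:
        # Hosted providers expect a bearer token; a local Ollama does not.
        headers["Authorization"] = f"Bearer {api_key}"

    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/embeddings",
        data=body,
        headers=headers,
        method="POST",
    )
```

Sending the request with urllib.request.urlopen(req, timeout=30) against a running provider should return JSON whose data[0].embedding field holds the vector. Switching providers only changes the environment, not this code.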
Source trust tiers
Every memory carries a trust tier that controls conflict resolution. When a new fact is written, it can only supersede an existing contradicting memory if its trust is equal or higher. Lower-trust sources cannot overwrite what you explicitly told the system.
Tier | Label | When to use |
5 | | Direct user statements, manual entries via admin UI |
4 | | Verified sensors, signed device data |
3 | | Pattern engine promotions, LLM-extracted facts |
2 | | Working memory promotions, low-confidence extractions |
1 | | Third-party imports, unverified webhooks |
Example: a sensor (tier 4) recording "bedroom temperature is 68°F" will not overwrite an explicit user statement (tier 5) "I keep my bedroom at 66°F at night."
Set via the source_trust parameter on remember / POST /remember.
Defaults to MEMORY_TRUST_DEFAULT_REMEMBER (env var, default: 5=user).
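The supersede rule can be sketched in a few lines. This is a minimal illustration of the "equal or higher trust wins" comparison only; the StoredFact class and resolve_conflict function are hypothetical names, and how the server decides that two facts contradict each other is not covered here.

```python
from dataclasses import dataclass


@dataclass
class StoredFact:
    fact: str
    source_trust: int  # 1 (lowest) to 5 (direct user statement)


def resolve_conflict(existing: StoredFact, incoming: StoredFact) -> StoredFact:
    """Return the fact that stays current when the two contradict each other."""
    if incoming.source_trust >= existing.source_trust:
        return incoming  # equal or higher trust may supersede
    return existing      # lower-trust sources cannot overwrite


user = StoredFact("I keep my bedroom at 66°F at night.", source_trust=5)
sensor = StoredFact("bedroom temperature is 68°F", source_trust=4)
current = resolve_conflict(existing=user, incoming=sensor)
```

Here current is still the user's statement, matching the example above. Had the incoming fact also been tier 5, it would have replaced the older one.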
Confidence decay
Memory confidence decays automatically over time so stale facts fade gracefully instead of accumulating indefinitely. Decay runs every hour in the pattern engine.
Formula: confidence = confidence × 2^(−days / halflife)
Category | Default half-life | Meaning |
| 90 days | Stable — takes months to fade |
| 90 days | Same default |
| 90 days | Same default |
Location records | 24 hours | Unconfirmed location drops to 50% overnight |
Configure per-category overrides:
export MEMORY_DECAY_HALFLIFE_DAYS=90 # global default
export MEMORY_DECAY_CATEGORY_HALFLIFE='{"preference": 180, "insight": 30}'
export MEMORY_LOCATION_DECAY_HALFLIFE_HOURS=24
Use GET /fading (or get_fading_memories) to surface memories whose confidence has dropped below a threshold, as a prompt to confirm or update them.
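To make the half-life formula concrete, here is a small sketch that applies it and looks up per-category overrides from the environment variables shown above. The function names are hypothetical; only the formula and the variable names come from this README.

```python
import json
import os


def halflife_days(category: str) -> float:
    """Half-life for a category: per-category override, else the global default."""
    default = float(os.environ.get("MEMORY_DECAY_HALFLIFE_DAYS", "90"))
    overrides = json.loads(os.environ.get("MEMORY_DECAY_CATEGORY_HALFLIFE", "{}"))
    return float(overrides.get(category, default))


def decayed(confidence: float, elapsed_days: float, halflife: float) -> float:
    """confidence x 2^(-days / halflife)"""
    return confidence * 2 ** (-elapsed_days / halflife)


# One half-life halves the confidence; two half-lives quarter it.
after_one = decayed(1.0, elapsed_days=90, halflife=90)    # 0.5
after_two = decayed(1.0, elapsed_days=180, halflife=90)   # 0.25
# A location with a 24-hour half-life, left unconfirmed for one day:
location = decayed(1.0, elapsed_days=1, halflife=1)       # 0.5
```

With the override example above, halflife_days("preference") would return 180 and halflife_days("insight") would return 30, so insights fade six times faster than preferences.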
AI call timeout
export MEMORY_AI_TIMEOUT=30 # seconds; applies to both embed() and LLM calls
Increase if using a slow local model. LLM calls use max(MEMORY_AI_TIMEOUT, 60) to guarantee at least 60 seconds for generation.
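A short sketch of how the two effective timeouts relate, assuming the variable is read as a number of seconds. The function name ai_timeouts is hypothetical.

```python
import os


def ai_timeouts() -> tuple[float, float]:
    """Return (embed_timeout, llm_timeout) in seconds."""
    base = float(os.environ.get("MEMORY_AI_TIMEOUT", "30"))
    # Embedding calls use the configured value; LLM calls never get less than 60 s.
    return base, max(base, 60.0)
```

With the default of 30, embedding calls time out after 30 seconds and LLM calls after 60. Raising the variable to 120 lifts both to 120.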
Testing
pip install -r requirements.txt
python -m pytest # full suite (722 tests, no Ollama needed)
python -m pytest tests/test_tools.py # just tool tests
python -m pytest tests/test_spatial.py # just spatial/location tests
See docs/testing.md for fixture design and conventions.
OpenHome SDK config
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["/path/to/memory-mcp/server.py"]
    }
  }
}
Schema
TIER 1
  entities              id, name*, type, meta(JSON), created, updated
  memories              id, entity_id, fact, category, confidence, source, created, updated,
                        last_accessed, access_count, superseded_by
  relations             id, entity_a, entity_b, rel_type, meta(JSON), created,
                        valid_from, valid_until
  memory_vectors        rowid=memories.id, embedding FLOAT[768]  ← sqlite-vec
TIER 1.5
  sessions              id, entity_id, started_at, ended_at, summary, meta
  session_turns         id, session_id, role, content, ts
TIER 1.75
  working_memory_tasks  id, name, entity_id, status, ttl_ts, created, closed_at
  working_memory_slots  id, task_id, key, value(JSON), created, updated
TIER 2
  readings              id, entity_id, metric, unit, value_type, value_num,
                        value_cat, value_json, source, ts
                        (composite readings also decomposed into {metric}.{key} child rows)
  reading_rollups       id, entity_id, metric, bucket_type, bucket_ts,
                        count, avg_num, min_num, max_num, p10_num, p90_num, mode_cat
  rollup_watermarks     entity_id, metric, last_ts  ← incremental build tracking
  schedule_events       id, entity_id, title, start_ts, end_ts, recurrence, meta, created
TIER 3
  promoted_patterns     id, entity_id, metric, pattern_key, memory_id, detected
TIER 5
  locations             id, entity_id, container_id, container_name, confidence,
                        last_confirmed_ts, active, source, note, created
                        (active=1 → current location; active=0 → archived sighting)
Entity types (open: add any string)
person | house | room | device
Memory categories
preference | habit | routine | relationship | insight | general
Value types for readings
value_type | field | example |
| value_num | temperature=71.4, heart_rate=62 |
| value_cat | mood="calm", presence="home" |
| value_json |
|
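The mapping from a reading's value to its storage column, including the {metric}.{key} child rows for composite readings, might look roughly like the sketch below. The function to_rows is hypothetical, the value_type strings are assumed from the numeric/categorical/composite wording used elsewhere in this README, and rejecting booleans is a choice made only for this example.

```python
import json
from typing import Any


def to_rows(metric: str, value: Any) -> list[dict]:
    """Map one reading value to the row(s) that would be stored (illustrative)."""
    if isinstance(value, dict):
        # Composite: keep the whole object, then decompose into child rows.
        rows = [{"metric": metric, "value_type": "composite",
                 "value_json": json.dumps(value)}]
        for key, item in value.items():
            rows.extend(to_rows(f"{metric}.{key}", item))
        return rows
    if isinstance(value, str):
        return [{"metric": metric, "value_type": "categorical", "value_cat": value}]
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return [{"metric": metric, "value_type": "numeric", "value_num": float(value)}]
    raise TypeError(f"unsupported reading value: {value!r}")


rows = to_rows("mood", {"mood": "focused", "confidence": 0.87})
metrics = [row["metric"] for row in rows]
```

For the composite mood reading, metrics is ["mood", "mood.mood", "mood.confidence"]: one parent row holding the JSON, one categorical child, and one numeric child that can be rolled up like any other number.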
MCP Tools
Tier 1 — Semantic memory
Tool | Description |
| Store a fact about any entity (embeds + indexes it) |
| Semantic search — multi-factor: cosine × recency × confidence |
| Relevance-filtered context snapshot (preferred for ability use) |
| Full profile: memories + relationships + readings |
| Create directed relationship between entities |
| Soft-delete a relationship (sets valid_until, preserves history) |
| Delete a memory or entire entity |
| LLM-powered fact extraction from conversation text |
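The multi-factor ranking used by semantic search (cosine × recency × confidence) can be sketched as follows. Only the product of the three factors comes from this README; the exponential recency curve, its 30-day half-life, and the function names are assumptions made for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_score(query_vec, memory_vec, confidence, age_days,
                 recency_halflife_days=30.0):
    """cosine x recency x confidence, with an assumed exponential recency term."""
    recency = 2 ** (-max(0.0, age_days) / recency_halflife_days)
    return cosine(query_vec, memory_vec) * recency * confidence


fresh = recall_score([1.0, 0.0], [1.0, 0.0], confidence=0.5, age_days=0)
stale = recall_score([1.0, 0.0], [1.0, 0.0], confidence=0.5, age_days=30)
unrelated = recall_score([1.0, 0.0], [0.0, 1.0], confidence=1.0, age_days=0)
```

Because the factors multiply, a memory must be relevant, reasonably recent, and still trusted to rank well; any one factor near zero pushes the score toward zero.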
Tier 2 — Time-series
Tool | Description |
| Ingest a reading (numeric/categorical/composite) |
| Query readings: raw or hour/day/week rollups |
| Natural-language trend summary for a metric |
| Add a schedule event (one-off or recurring) |
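To show what a daily rollup contains, the sketch below buckets numeric readings by UTC day and computes the statistics named in the reading_rollups schema. The function names, the UTC bucket boundary, and the nearest-rank percentile method are assumptions; the categorical mode_cat column is left out for brevity.

```python
import math
from collections import defaultdict

DAY = 86400  # seconds


def percentile(sorted_vals: list[float], p: float) -> float:
    """Nearest-rank percentile on an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def daily_rollups(readings: list[tuple[float, float]]) -> list[dict]:
    """readings: (unix_ts, value_num) pairs for one entity and metric."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[int(ts // DAY) * DAY].append(value)
    rollups = []
    for bucket_ts in sorted(buckets):
        vals = sorted(buckets[bucket_ts])
        rollups.append({
            "bucket_type": "day", "bucket_ts": bucket_ts, "count": len(vals),
            "avg_num": sum(vals) / len(vals),
            "min_num": vals[0], "max_num": vals[-1],
            "p10_num": percentile(vals, 10), "p90_num": percentile(vals, 90),
        })
    return rollups


rollups = daily_rollups([(0, 60.0), (3600, 70.0), (DAY, 80.0)])
```

The three readings fall into two day buckets: the first with count 2 and an average of 65.0, the second with a single reading of 80.0.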
Episodic memory
Tool | Description |
| Open a conversation session for an entity |
| Append a turn (user/assistant/system) to a session |
| Close a session with optional summary |
| Retrieve full session transcript |
Working memory (Tier 1.75)
Tool | Description |
| Open a task-scoped scratchpad; optional TTL and entity association |
| Write a key/value slot into an open task |
| Read one slot by key, or all slots with task metadata |
| List tasks filtered by status (open/closed/expired/all) and entity |
| Close a task; optionally promote slots to long-term memory |
FTS keyword recall + session search
Tool | Description |
| Add |
| Full-text search across episodic session turns (FTS5/BM25) |
Token-budget context assembly
Tool | Description |
| Greedily fills a token budget with ranked memories + readings; |
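A greedy token-budget fill can be sketched as below. This is an illustration of the general technique, not the server's code: the function name, the four-characters-per-token estimate, and the choice to skip oversized items and keep trying smaller ones are all assumptions. The truncated flag mirrors the one mentioned for POST /get_context_budget.

```python
def fill_budget(ranked_items: list[str], max_tokens: int, count_tokens=None) -> dict:
    """Take items in rank order while they fit; flag if anything was left out."""
    if count_tokens is None:
        # Crude estimate (assumption): about four characters per token.
        count_tokens = lambda text: max(1, len(text) // 4)
    picked, used, truncated = [], 0, False
    for text in ranked_items:
        cost = count_tokens(text)
        if used + cost > max_tokens:
            truncated = True  # something relevant did not fit
            continue          # a later, smaller item may still fit
        picked.append(text)
        used += cost
    return {"items": picked, "tokens_used": used, "truncated": truncated}


snapshot = fill_budget(["a" * 40, "b" * 80, "c" * 8], max_tokens=12)
```

With a budget of 12, the first item (10 tokens) fits, the second (20 tokens) is skipped, and the third (2 tokens) fills the remainder, so tokens_used is 12 and truncated is True.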
Prospective / intention memory (Tier 4)
Tool | Description |
| Set a condition → action intention for an entity |
| Check if current text triggers any active intentions (FTS5) |
| Deactivate an intention |
| List active (or all) intentions for an entity |
Spatial / location memory (Tier 5)
Tool | Description |
| Store or update where an object was last seen |
| Return last known location with confidence + age ("where are my keys?") |
| Confirm object is still at its location; bumps confidence |
| Full trail of past sightings in reverse-chronological order |
Cross-tier
Tool | Description |
| Semantic search across memories AND live readings |
Maintenance
Tool | Description |
| Delete raw readings older than |
| Return memories whose confidence has fallen below a threshold, most faded first |
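Selecting fading memories amounts to a filter plus a sort. The sketch below assumes memories are dicts with a confidence key and uses a threshold of 0.5 purely as an example value; the function name is hypothetical.

```python
def fading(memories: list[dict], threshold: float = 0.5) -> list[dict]:
    """Memories below the threshold, lowest confidence (most faded) first."""
    below = [m for m in memories if m["confidence"] < threshold]
    return sorted(below, key=lambda m: m["confidence"])


candidates = fading([
    {"fact": "prefers 68°F", "confidence": 0.9},
    {"fact": "gym on Tuesdays", "confidence": 0.3},
    {"fact": "likes jazz", "confidence": 0.1},
])
```

The result lists "likes jazz" before "gym on Tuesdays" and omits the still-confident temperature preference, giving a natural order in which to ask the user for confirmation.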
HTTP API endpoints (api.py)
GET /health liveness + row counts
GET /entities list all entities
POST /remember store a memory
POST /recall semantic search (mode=vector|keyword|hybrid, recency_weight, min_confidence)
POST /get_context relevance-filtered context snapshot
GET /profile/{entity_name} full profile
POST /relate create relationship
POST /forget delete memory or entity
POST /record ingest a reading
POST /record/bulk ingest multiple readings at once
POST /query_stream query time-series
POST /get_trends trend summary
POST /schedule add schedule event
POST /cross_query unified search
POST /prune delete readings older than RETENTION_DAYS
GET /fading memories below a confidence threshold (most faded first)
POST /open_session open a conversation session for an entity
POST /log_turn append a turn (user/assistant/system) to a session
POST /close_session close a session with optional summary
GET /get_session/{id} retrieve full session transcript
POST /extract_and_remember LLM-extract facts from text and store as memories
POST /wm/open open a working-memory task scope
POST /wm/set set a key/value slot in an open task
POST /wm/get get one slot or all slots from a task
GET /wm/list list tasks (?status=open|closed|expired|all&entity_name=X)
GET /wm/{task_id} get all slots and metadata for a task
POST /wm/close close a task (promote=true bundles slots into long-term memory)
POST /locate store/update last-known location of an object
POST /find return last known location with confidence + age
POST /seen_at confirm object is still at a location; bumps confidence
GET /location_history/{name} full location trail for an object
POST /search_sessions keyword search across session turn content (FTS5/BM25)
POST /get_context_budget token-budget context snapshot (greedy fill, truncated flag)
POST /intend store a prospective intention (trigger_text → action_text)
POST /check_intentions check if text triggers any active intentions (FTS5)
POST /dismiss_intention deactivate an intention (soft-delete)
GET /intentions list intentions (?entity_name=X&active_only=true)
GET /voices/unknown list unenrolled provisional speaker entities
POST /voices/enroll rename provisional entity to real person
POST /voices/merge merge provisional entity into enrolled entity
POST /voices/update_print update voiceprint embedding (running average)
GET /graph vis.js entity relationship graph (SPA)
GET /api/graph entity graph data { nodes, edges }
GET /export/markdown export all entities as Obsidian-compatible Markdown
GET /export/markdown/{name} export single entity as .md file download
POST /import/markdown import entities from Markdown files (two-way sync)
GET /admin/ dashboard
GET /admin/entities entity list
GET /admin/entity/{name} entity detail
GET /admin/readings readings stream
POST /admin/prune prune (HTMX-friendly HTML response)
Usage examples
Ability: build context before responding to Brian
# Pull full profile (memories + latest readings + schedule)
profile = await mem.tool_get_profile("Brian")
# → inject as <memory>...</memory> in system prompt
# Or cross-query to pull what's relevant to the current question
context = await mem.tool_cross_query("how is Brian feeling today?")
Home Assistant → record sensor readings via HTTP
# configuration.yaml (rest_command)
rest_command:
  push_temperature:
    url: http://localhost:8900/record
    method: POST
    content_type: application/json
    payload: >
      {"entity_name":"{{ room }}","metric":"temperature",
      "value":{{ temp }},"unit":"F","source":"ha","entity_type":"room"}
  push_presence:
    url: http://localhost:8900/record
    method: POST
    content_type: application/json
    payload: >
      {"entity_name":"{{ person }}","metric":"presence",
      "value":"{{ state }}","source":"ha"}
  push_mood:
    url: http://localhost:8900/record
    method: POST
    content_type: application/json
    payload: >
      {"entity_name":"{{ person }}","metric":"mood",
      "value":{"mood":"{{ mood }}","confidence":{{ conf }}},"source":"avatar_ability"}
Avatar ability: store inferred mood state
# After detecting mood from conversation
await mem.tool_record(
    entity_name="Brian",
    metric="mood",
    value={"mood": "focused", "confidence": 0.87},
    source="avatar_ability",
)
# The pattern engine will promote this to a memory like:
# "Brian's mood is predominantly 'focused' (72% of days)"
Query last week of temperature with daily rollup
import time

result = await mem.tool_query_stream(
    entity_name="living_room",
    metric="temperature",
    granularity="day",
    start_ts=time.time() - 7 * 86400,  # seven days ago, as a Unix timestamp
)
Cross-entity semantic query
result = await mem.tool_cross_query("who in the house prefers a cooler environment?")
# Returns: matching memories (explicit preferences) + live temperature readings scored by relevance
Swapping embedding models
# 1. Set the new model and dimension via env vars
export MEMORY_EMBED_MODEL=mxbai-embed-large
export MEMORY_EMBED_DIM=1024
# 2. Pull the new model
ollama pull mxbai-embed-large # 1024-dim — richer but slower
# 3. Re-embed all memories (non-destructive — only rebuilds memory_vectors)
python reembed.py --dry-run # preview
python reembed.py # run it
Expanding the schema
New entity types: pass any string; SQLite won't enforce the enum
New metric names: pass any string to record(); fully dynamic
New memory categories: same; add to the enum in the schema or use free text
New pattern detectors: write _detect_*(entity_name, metric, data) → list[tuple], call it via _maybe_promote() in _promote_patterns(). See docs/pattern-engine.md.
New rollup statistics: add columns to reading_rollups and compute in _build_rollups()
Structured entity attributes: use the meta JSON column on entities (e.g. {"age": 35, "diet": "vegetarian", "wake_time": "06:30"})
Retention window: change RETENTION_DAYS in server.py. See docs/retention.md.