mem0-mcp-toggle
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@mem0-mcp-toggleRemember my email address is example@email.com"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
mem0-mcp-toggle
A local Mem0 MCP server for macOS that you can flip on/off from a menu bar switch (or the CLI). Memories are stored locally in Chroma, and fact extraction runs against any OpenAI-compatible LLM (e.g. local LM Studio).
Unofficial community tool — not affiliated with mem0ai.
It runs the MCP server as a single HTTP server managed by launchd, so multiple MCP clients (Kiro, Claude Desktop, Cursor, …) share one process — no per-client duplicate/zombie processes. Turn it off when you don't need it to free RAM (~200 MB).
menu bar ┌──────────────┐ HTTP ┌──────────────────────┐
toggle ──▶│ launchd │──────────────▶│ mem0 MCP server │
(NSSwitch)│ com.mem0mcp │ 127.0.0.1 │ Chroma + OpenAI LLM │
└──────────────┘ :8765/mcp └──────────────────────┘
│
MCP clients (Kiro/Claude/Cursor) ─────────────────┘ (connect by URL)Prerequisites
macOS 12+
Xcode Command Line Tools (for
swiftc):xcode-select --installPython 3.10+ (
python3)An OpenAI-compatible LLM endpoint. Default targets LM Studio at
http://localhost:1234/v1.⚠️ Use a NON-reasoning instruct model (e.g.
Qwen2.5-14B-Instruct,Qwen2.5-7B-Instruct,Llama-3.1-8B-Instruct). Reasoning models (Qwen3 / QwQ / R1 …) put output in a separate channel and break mem0's extraction.
Related MCP server: Local Mem0 MCP Server
What install.sh does (and what you provide)
install.sh handles all software automatically — you do not need to pre-install mem0 or any Python package. It creates an isolated virtualenv (./.venv) and installs everything from requirements.txt there (mem0ai, fastmcp, chromadb, sentence-transformers). Any system/global mem0 is irrelevant and left untouched.
The embedding model (
all-MiniLM-L6-v2) is downloaded automatically on the first memory write (needs internet once, then it's cached locally).
You provide (once):
macOS + Xcode CLT + Python 3.10+ (see Prerequisites above).
A running LLM endpoint (see below) — the only required external service.
One line in your MCP client config (see "Connect your MCP client").
Using a cloud LLM instead of LM Studio
The default targets local LM Studio, but any OpenAI-compatible API works — pass overrides at install time:
MEM0_LLM_BASE_URL=https://api.openai.com/v1 \
MEM0_LLM_API_KEY=sk-... \
MEM0_LLM_MODEL=gpt-4o-mini \
MEM0_DISABLE_JSON_RESPONSE_FORMAT=0 \
./install.shSet MEM0_DISABLE_JSON_RESPONSE_FORMAT=0 when your endpoint supports {"type":"json_object"} (e.g. OpenAI); keep 1 for LM Studio.
Install
git clone <this-repo> mem0-mcp-toggle
cd mem0-mcp-toggle
./install.shOverride defaults with env vars:
MEM0_LLM_MODEL=qwen2.5-7b-instruct \
MEM0_LLM_BASE_URL=http://localhost:1234/v1 \
MEM0_MCP_PORT=8765 \
./install.shinstall.sh creates a Python venv, installs deps, builds the menu bar app to ~/Applications/mem0 toggle.app, and installs two launchd agents:
Label | Role | Start policy |
| the mem0 HTTP MCP server | manual (starts OFF; toggle it on) |
| the menu bar switch app | starts at login |
Connect your MCP client
Add to your client's MCP config (e.g. ~/.kiro/settings/mcp.json, Claude Desktop, Cursor):
{
"mcpServers": {
"local-mem0-mcp": {
"url": "http://127.0.0.1:8765/mcp",
"type": "http",
"timeout": 300000
}
}
}Tools exposed: add_memory, search_memories, list_memories, delete_memory.
Usage
Memory works only when both are true: (1) the server is ON, and (2) your LLM endpoint is running. Here's the actual flow:
One-time, after install
Start your LLM endpoint. In LM Studio: load a non-reasoning instruct model → open the Local Server tab → Start Server (port
1234). (Or pointMEM0_LLM_BASE_URLat a cloud endpoint.)Register the MCP server in your client config once (see Connect your MCP client).
Each time you want to use memory
Turn the server ON — click the
memorychipicon in the menu bar and flip the switch. The icon brightens and the row shows the server URL (http://127.0.0.1:8765/mcp).CLI equivalent:
mem0 on
Just talk to your AI client — it calls the tools automatically. For example:
"Remember that we deploy via GitHub Actions." →
add_memory"What do you know about our deployment?" →
search_memories"List everything you remember." →
list_memories
Turn it OFF when done to free ~200 MB — flip the switch off, or
mem0 off.
Check status anytime: run mem0 (prints ON/OFF), or glance at the menu bar icon (dim = OFF).
CLI control (no GUI)
launchctl kickstart gui/$(id -u)/com.mem0mcp.server # ON
launchctl kill TERM gui/$(id -u)/com.mem0mcp.server # OFFOptional mem0 helper — add to ~/.zshrc:
mem0(){ local D="gui/$(id -u)/com.mem0mcp.server";
case "$1" in on) launchctl kickstart "$D";; off) launchctl kill TERM "$D";;
*) lsof -nP -iTCP:8765 -sTCP:LISTEN >/dev/null && echo ON || echo OFF;; esac; }Configuration (server env vars)
Set these in launchd/com.mem0mcp.server.plist.template (then re-run install) or pass to install.sh:
Var | Default | Notes |
|
| non-reasoning instruct model |
|
| OpenAI-compatible endpoint |
|
| any non-empty string for LM Studio |
|
| local embeddings |
|
| vector store location |
|
| HTTP port (also update the app constant + mcp.json if changed) |
|
| workaround for LM Studio rejecting |
Why this design
In mem0 terms, this is the official OSS quickstart pattern — run 100% locally and wrapped as MCP. The Python SDK quickstart initializes Memory.from_config(...) and calls m.add() / m.search() — exactly what this server does. We only swap every default to a local component and expose it over MCP:
Component | mem0 quickstart default | This project |
LLM | OpenAI | local LM Studio — |
Embedder | OpenAI | local HuggingFace |
Vector store | Qdrant ( | Chroma ( |
Interface | Python | MCP tools over HTTP (via FastMCP) |
So the memory/search behavior is stock mem0 — nothing custom there. Everything else is just local-first packaging (no cloud, no API key, data stays on your Mac) plus the menu bar toggle. The specific choices below come from real debugging — they're not arbitrary:
One HTTP server via
launchd(not stdio). MCP stdio spawns a separate server process per client. With multiple clients (e.g. an IDE and a CLI) reading the same config you get duplicate servers, and they orphan into zombie processes when a client crashes/quits (macOS reparents them tolaunchd). A single shared HTTP server removes duplication and keeps a single Chroma writer.Manual on/off (starts OFF). The server holds ~200 MB (embedder + runtime). A
launchd-managed single instance you toggle on demand frees that RAM when idle — with no zombies (one managed instance, not per-client spawns).A NON-reasoning instruct model is required. mem0 reads the LLM's
content. Reasoning models (Qwen3 / QwQ / R1 …) spend their output budget in a separate reasoning channel, leavingcontentempty (and they're slow), so mem0 extracts nothing. Instruct models return the JSON incontent.response_format=Noneworkaround. mem0 asks the LLM for{"type":"json_object"}, which LM Studio rejects with HTTP 400 (must be 'json_schema' or 'text'). We disable it; a good instruct model still returns valid JSON from the prompt.Long MCP
timeout(300 s). mem0's extraction prompt is large; a single long memory can take ~80 s on a 14B model. Default client timeouts (~60 s) would cut the request off before it persists.
Troubleshooting
add_memorysays success but nothing is stored → almost always the LLM. Use a non-reasoning instruct model, and make sure the LLM endpoint is actually running.Long memories don't save / time out → big inputs make mem0's extraction slow (~80 s on a 14B). The MCP
timeout: 300000covers it; for speed use a 7–8B model or store shorter facts.HTTP 400
'response_format.type' must be 'json_schema' or 'text'→ LM Studio doesn't acceptjson_object; keepMEM0_DISABLE_JSON_RESPONSE_FORMAT=1(default).Icon not in menu bar →
launchctl kickstart gui/$(id -u)/com.mem0mcp.toggle, oropen "$HOME/Applications/mem0 toggle.app".Only runs while logged in — these are LaunchAgents (per-user GUI session), not boot daemons.
Logs:
~/Library/Logs/mem0-mcp.log,~/Library/Logs/mem0-toggle.log.
Uninstall
./uninstall.shRemoves the agents + app, keeps your stored memories (~/.mem0-mcp/chroma) and venv.
License
MIT — see LICENSE. Built on mem0ai/mem0, FastMCP, Chroma, and sentence-transformers; each retains its own license.
This server cannot be installed
Resources
Unclaimed servers have limited discoverability.
Looking for Admin?
If you are the server author, to access and configure the admin panel.
Latest Blog Posts
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/ost527/mem0-mcp-toggle'
If you have feedback or need assistance with the MCP directory API, please join our Discord server