history-rag

by standingwave

Claude Code history RAG

Local semantic search over your history — Claude Code sessions, shell commands, browsing, git commits, notes, calendar events, and app usage — exposed to Claude Code as MCP tools. Everything is indexed into one vector space, so a single query ranks chat turns, terminal commands, and page visits together. Runs entirely on your machine; nothing leaves it unless you opt into the remote replica (see "Remote replica" below).

Setting this up by handing it to your coding agent? Point it at AGENT_SETUP.md instead — that's the agent runbook. This README is the human walkthrough.

Quickstart

For the impatient (full detail in the numbered sections below):

git clone https://github.com/standingwave/history-rag.git && cd history-rag
brew install ollama && brew services start ollama
ollama pull nomic-embed-text
uv venv ~/.claude/rag-venv
uv pip install --python ~/.claude/rag-venv/bin/python -r requirements.txt
~/.claude/rag-venv/bin/python index.py            # build the index
claude mcp add history -- ~/.claude/rag-venv/bin/python "$(pwd)/server.py"

Layout

config.py — shared settings (model, dimensions, DB path, Ollama URL), each overridable by env var. Imported everywhere so build and query always agree.
index.py — driver: pulls chunks from every source in SOURCES, embeds via Ollama, writes ~/.claude/history-rag.db.
sources/ — one module per content source, each yielding (id, text, record):
- claude.py — Claude Code session prompts + assistant text.
- shell.py — bash + zsh command history, deduped.
- appusage.py — daily per-app time from the tracker (macOS, optional).
- browser.py — Safari/Chrome/Helium page visits, deduped by URL.
- git.py — your own commits across local repos (opt-in via env var).
- obsidian.py — vault notes chunked by heading (opt-in via env var).
- calendar.py — calendar events with attendees (macOS, opt-in). See "Calendar events" below.
- digest.py — precomputed daily rollups: one chunk per (local day, stream) summarizing browser visits/searches per profile, claude sessions, and shell runs. See "Daily digests" below.
- common.py — helpers shared across sources (secret redaction).
appusage/ — optional macOS app-usage tracker: a launchd daemon that logs how long you spend in each app. See "App usage" below.
server.py — the MCP server. Four tools forming a disclosure ladder: history_stats (orient; locations=true reveals filterable prefixes) → search_history / list_window (relevance-ranked vs exhaustive pointers; list_window lists newest local day first with each day's summary chunks leading, aggregates with group_by=day|source|location|domain, and takes include_meta / summaries opt-ins) → expand (the reading view: full chunk + source-aware context, live from the backing store when it still exists — surrounding conversation turns, git show --stat, the whole note, the profile's same-day visits, the calendar day's agenda, the digest's full rollup).
ask.py — the ask-mode agent loop: a model works the four tools in-process and returns a cited answer. Provider-agnostic via two adapters (openai-compatible covers OpenAI/OpenRouter/Groq/Ollama-style endpoints; anthropic the Messages API), configured as named presets in [ask.models]. Used by the remote replica's Ask mode.
com.user.history-index.plist — launchd template to re-index on an interval (see "Keep it fresh").
TESTING.md — the minimal test plan, plus known bugs to pin.
deploy/lambda/ — optional read-only replica on AWS Lambda: phone access via a claude.ai connector plus a /search page. See "Remote replica".
tools/ (dev loop and maintenance):
- refresh.py: the scheduled chain (index → prune → backup → sync), each step isolated, outcomes recorded in the runs table.
- smoke.py: exercises every tool path in-process after a change; warns if the running MCP server predates your edits.
- kick.sh: triggers the launchd refresh and prints its stats block.
- backup.py: daily dated copies of the sole-copy DBs.
- sync-s3.py: pushes the index to S3 for the optional remote replica.
- hist.py: stdlib-only terminal client for that replica (hist search "the proxy bug" -k 5, hist ask "when did I…?") from any machine holding the secret URL; suggested alias hist='python3 <repo>/tools/hist.py'.
- eval-model.py / migrate-model.py: embedding-model evaluation and archive-safe switching.
- eval-embed-parity.py: verifies a hosted embedding API matches the local index's vector space.
- inspect-sessions.py: format-drift diagnostic that dumps the raw session JSONL shape if the claude source ever stops matching reality.

Config file

Machine-specific settings live outside the repo in ~/.claude/history-rag.toml (path overridable via CLAUDE_RAG_CONFIG). Precedence is env var > config file > code default, and a missing file just means defaults — so env-only setups keep working, and the file is the recommended home for anything you'd otherwise export in your shell AND inject into the launchd plist:

[sources]
enabled = ["claude", "shell", "browser", "git", "obsidian", "appusage"]

[git]
roots = ["~/dev"]

[obsidian]
vaults = ["~/Documents/Obsidian Vault"]

[shell]
histfiles = []            # archived history files

[browser]
extra = {}                # name = path, added to the built-in defaults
keep_params = {}          # per-domain query params to keep, e.g. { "youtube.com" = ["v"] }

[calendar]
apps = ["apple"]          # enables the calendar source; exclude_calendars = [...]

[digest]                  # sources (default browser/claude/shell),
                          # recompute_days (3), backfill_days (90)

[core]                    # model/dim/db/ollama — same keys as the env vars

[backup]                  # dir (default ~/.claude/backups), keep (default 7)

[sync]                    # bucket/key/region — S3 push for the remote replica

[refresh]
prune = ["calendar"]      # sources pruned on each scheduled refresh

[ask]                     # /search "Ask" mode: named model presets; keys
max_turns = 8             # env-only via each preset's key_env. See
                          # deploy/lambda/README.md "Ask mode".
# [[ask.models]]
# name = "haiku"
# backend = "anthropic"           # or "openai-compatible" (+ base_url)
# model = "claude-haiku-4-5"
# key_env = "ANTHROPIC_API_KEY"

[health]
notify = true             # macOS notification when indexing stalls (default true)

[sources].enabled picks which sources run (absent = all) — no more editing SOURCES in index.py. Unknown sections/keys warn; malformed TOML stops the run loudly. The long-lived MCP server reads config at startup, so edits need a /mcp reconnect, same as code changes.

A second optional machine-local file, ~/.claude/history-rag-instructions.md, holds answering preferences rather than indexing config: the search_history docstring tells the model to read it (if present) before presenting results, so recall-coverage and presentation rules live outside both the repo and the model's ambient context.

Sources

Every source feeds one shared index; pass source="claude", source="shell", source="appusage", source="browser", source="git", source="obsidian", source="calendar", or source="digest" to search_history to restrict a query.

Shell history reads ~/.zsh_history, ~/.bash_history, and the per-session snapshots macOS keeps in ~/.zsh_sessions/ and ~/.bash_sessions/. Live history files are capped by your shell's SAVEHIST/HISTSIZE, but the session snapshots reach further back. For history archived elsewhere (old machines, backups), point CLAUDE_RAG_HISTFILES at the extra files (colon-separated):

CLAUDE_RAG_HISTFILES="$HOME/backups/zsh_history.2019:$HOME/backups/bash_history.old" \
  ~/.claude/rag-venv/bin/python index.py

Identical commands collapse to one entry (with a run count); trivial commands (ls, cd, …) are dropped, and anything that looks like it contains a secret (passwords, tokens, API keys, user:pass@host URLs) is skipped so it's never embedded.
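A rough sketch of that filter, to make the three rules concrete. The trivial-command set and the secret regex below are stand-ins chosen for illustration; the real lists in sources/shell.py and sources/common.py are not shown in this README and are likely broader.

```python
import re
from collections import Counter

# Hypothetical lists: the project's actual filter contents are not shown here.
TRIVIAL = {"ls", "cd", "pwd", "clear", "exit"}
SECRET_RE = re.compile(
    r"(password|passwd|token|api[_-]?key|secret)\s*[=:]"  # key=value style
    r"|://[^/\s:@]+:[^/\s@]+@",                            # user:pass@host URLs
    re.IGNORECASE,
)


def collapse_commands(lines):
    """Return {command: run_count}, dropping trivial and secret-looking lines."""
    counts = Counter()
    for raw in lines:
        cmd = raw.strip()
        if not cmd:
            continue
        if cmd.split()[0] in TRIVIAL:
            continue  # trivial command: not worth indexing
        if SECRET_RE.search(cmd):
            continue  # looks like a secret: never embed it
        counts[cmd] += 1
    return dict(counts)
```

Skipping a secret-looking line entirely, rather than masking part of it, errs on the side of losing a command instead of embedding a credential.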

If atuin is installed, its store is read too (default ~/.local/share/atuin/history.db; override with [shell] atuin_db, empty string disables) — every atuin-recorded run is dated, location becomes the latest run's cwd (so location="~/dev/myrepo" filtering works for shell), meta gains cwd + exit code, and expand can show the commands around a run. Commands atuin knows are skipped when read from live histfiles to avoid double counting; archived histfiles always count. Without atuin, command timestamps only appear if zsh recorded them (setopt EXTENDED_HISTORY); bash needs HISTTIMEFORMAT set.

App usage (macOS, optional). A small tracker records how long you spend in each app so you can later ask "what was I doing the week I built X?". It's off until you install the daemon; the appusage source yields nothing without it.

The daemon samples the frontmost app and idle time every 20s (via lsappinfo and ioreg — no extra deps, no permissions), coalesces same-app stretches into segments in ~/.claude/appusage.db, and doesn't count idle (>2 min) or sleep time. sources/appusage.py feeds daily per-app totals (≥1 min) into the index, today included: the indexer re-embeds any chunk whose text changed, so today's growing total stays current while finished days settle once.
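The coalescing step can be pictured as follows. This is a simplified sketch under stated assumptions, not the daemon's code: samples are taken to be `(timestamp, app, idle_seconds)` tuples, a gap of more than two intervals is treated as sleep, and the totals helper ignores day boundaries.

```python
def coalesce(samples, interval=20, idle_cutoff=120):
    """Merge (timestamp, app, idle_seconds) samples into [app, start, end] segments.

    A sample extends the current segment only when it is the same app and
    arrives within two intervals of the segment's end; idle samples and long
    gaps (sleep) are not counted.
    """
    segments = []
    for ts, app, idle in samples:
        if idle > idle_cutoff:
            continue  # user is away: do not count this sample
        last = segments[-1] if segments else None
        if last and last[0] == app and ts - last[2] <= 2 * interval:
            last[2] = ts + interval  # extend the open segment
        else:
            segments.append([app, ts, ts + interval])
    return segments


def app_totals(segments, min_seconds=60):
    """Sum seconds per app, keeping only apps at or above the minimum."""
    totals = {}
    for app, start, end in segments:
        totals[app] = totals.get(app, 0) + (end - start)
    return {app: secs for app, secs in totals.items() if secs >= min_seconds}
```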

Install it as a launchd agent (fills the plist's absolute-path placeholders, then loads it):

PY=~/.claude/rag-venv/bin/python
DAEMON="$(pwd)/appusage/daemon.py"
sed -e "s#__PYTHON__#$PY#" -e "s#__DAEMON__#$DAEMON#" \
  appusage/com.user.appusage.plist > ~/Library/LaunchAgents/com.user.appusage.plist
launchctl load ~/Library/LaunchAgents/com.user.appusage.plist

See what it's captured any time (independent of the index):

~/.claude/rag-venv/bin/python appusage/report.py        # today + last 7 days

To stop and remove it:

launchctl unload ~/Library/LaunchAgents/com.user.appusage.plist
rm ~/Library/LaunchAgents/com.user.appusage.plist

Tuning: APPUSAGE_INTERVAL (sample seconds) and APPUSAGE_IDLE (idle cutoff) as env vars in the plist. Data is local, like everything else here.

Browser history reads Safari (default store plus any Safari 17+ profiles under ~/Library/Safari/Profiles/) and every Chrome and Helium profile found in their standard locations (Guest/System profiles skipped). It emits one chunk per (browser, profile, URL) in the form <title> — <url>, with visits of the same URL within a profile merged (counts summed, last visit as the timestamp). location is browser:profile, using the human-readable profile name from Chromium's Preferences (plain safari for Safari's profile-less default store), so searches can tell work from personal browsing. Ids hash the stable profile directory, so renaming a profile re-labels chunks without orphaning them.

Query strings and fragments are stripped (they carry tokens and churn), except params that are the page's identity, which are kept per domain[/path-prefix] rule: by default youtube.com/watch's v and youtube.com/results' search_query (otherwise every watch page would collapse into one chunk). Extend or disable rules via [browser] keep_params ({ "google.com/search" = ["q"], "example.com" = ["id"] }; an empty list turns a rule off). Path scoping matters: google.com's q is a search on /search but a redirect target on /url, and keeping the latter would index tracking links as pseudo-searches. Localhost and non-http(s) URLs are skipped, and the shared secret regex runs on the final URL, kept params included.

Other Chromium-family browsers work via CLAUDE_RAG_BROWSERS (colon-separated name=path entries; the Safari-vs-Chromium schema is sniffed from the DB, not the name):

CLAUDE_RAG_BROWSERS="arc=$HOME/Library/Application Support/Arc/User Data/Default/History" \
  ~/.claude/rag-venv/bin/python index.py --source browser

Reading Safari's History.db requires Full Disk Access for whatever process runs the indexer (System Settings → Privacy & Security → Full Disk Access → add your terminal). Without it, Safari is skipped with a note and the other browsers still index. Note Chromium browsers expire history (~90 days), so the index outlives the browser's own record — don't routinely --prune this source.
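The query-string rule for browser URLs is easier to follow as code. The sketch below assumes a simple matching scheme (domain suffix plus path prefix) and a hypothetical `normalize_url` helper; the project's actual matching logic may differ, and the secret-regex pass is left out.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Default rules named in the text; the matching logic itself is an assumption.
KEEP_PARAMS = {
    "youtube.com/watch": ["v"],
    "youtube.com/results": ["search_query"],
}


def normalize_url(url, keep_params=KEEP_PARAMS):
    """Strip query and fragment, keeping only per-rule identity params.

    Returns None for URLs that would be skipped (non-http(s) or localhost).
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme not in ("http", "https") or host in ("localhost", "127.0.0.1"):
        return None
    bare_host = host[4:] if host.startswith("www.") else host
    path = parts.path or "/"
    kept = []
    for rule, names in keep_params.items():
        domain, _, prefix = rule.partition("/")
        domain_ok = bare_host == domain or bare_host.endswith("." + domain)
        if domain_ok and path.startswith("/" + prefix):
            kept = [(k, v) for k, v in parse_qsl(parts.query) if k in names]
            break
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

With a `"google.com/search" = ["q"]` rule, a /search URL keeps its q while a /url redirect loses it, which is the path-scoping behaviour described above.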

Git commits indexes your own commit messages (subject + body, no diffs) across local repos. Off until you point it somewhere — set [git] roots in the config file (or CLAUDE_RAG_GIT_ROOTS, colon-separated) to paths that are each either a repo or a directory scanned a few levels deep for repos. "Your own" means each repo's git config user.email ([git] author / CLAUDE_RAG_GIT_AUTHOR forces one email everywhere). All refs are read, so branch-only work is captured; stash refs and merge commits are excluded. Rebase/cherry-pick copies of the same message collapse to one chunk (run count in meta, latest copy wins), and ids hash repo+message so a rebase doesn't orphan chunks — only rewording a message does (--prune --source git cleans those up). The config file is read by scheduled launchd runs too — no plist env plumbing needed.
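To see why hashing repo+message keeps ids stable across a rebase, consider this small sketch. The id format, the dictionary fields, and both function names are hypothetical; only the idea (same message maps to the same chunk, latest copy wins, copies are counted) comes from the description above.

```python
import hashlib


def commit_chunk_id(repo_path, message):
    """Id from repo + message, so a rebase (new SHA, same message) maps to the same chunk."""
    digest = hashlib.sha256(f"{repo_path}\0{message}".encode("utf-8")).hexdigest()
    return f"git:{digest[:16]}"


def collapse_commits(repo_path, commits):
    """commits: iterable of (sha, timestamp, message). Latest copy wins; copies are counted."""
    chunks = {}
    for sha, ts, message in commits:
        cid = commit_chunk_id(repo_path, message)
        entry = chunks.get(cid)
        if entry is None:
            chunks[cid] = {"sha": sha, "timestamp": ts, "text": message, "copies": 1}
        else:
            entry["copies"] += 1
            if ts > entry["timestamp"]:
                entry["sha"], entry["timestamp"] = sha, ts
    return chunks
```

Rewording a message changes the hash, which is why only a reworded commit leaves an orphan for --prune to clean up.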

Obsidian notes indexes vault markdown, one chunk per #/##/### section (deeper headings stay inside their parent; short notes stay whole). Off until you point it at vaults via [obsidian] vaults in the config file (or CLAUDE_RAG_OBSIDIAN_VAULTS, colon-separated). Chunk ids hash vault+path+heading+occurrence, not the text, so editing a section re-embeds it in place; only deleting or renaming a section leaves an orphan (--prune --source obsidian cleans those up, and unlike claude/shell/browser the vault is the durable record, so pruning here is safe). Timestamps come from date: frontmatter when present, else file mtime; frontmatter is stripped from the text. Hidden dirs (.obsidian, .trash), template folders, and credential-looking sections are skipped.
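A minimal sketch of heading-based chunking with text-independent ids follows. It is an assumption-laden illustration rather than sources/obsidian.py: the `chunk_note` name and id format are made up, and frontmatter stripping, the short-note rule, and credential filtering are omitted.

```python
import hashlib
import re
from collections import Counter

HEADING_RE = re.compile(r"^(#{1,3})\s+(.*)$")  # only #, ##, ### start a chunk


def chunk_note(vault, path, text):
    """Yield (id, heading, body) per #/##/### section; deeper headings stay in the parent."""
    sections, heading, body = [], "", []
    for line in text.splitlines():
        m = HEADING_RE.match(line)
        if m:
            sections.append((heading, body))
            heading, body = m.group(2).strip(), []
        else:
            body.append(line)
    sections.append((heading, body))

    seen = Counter()
    for heading, body in sections:
        content = "\n".join(body).strip()
        if not heading and not content:
            continue  # nothing before the first heading
        occurrence = seen[heading]  # disambiguates repeated headings in one note
        seen[heading] += 1
        key = f"{vault}\0{path}\0{heading}\0{occurrence}"
        yield hashlib.sha256(key.encode("utf-8")).hexdigest()[:16], heading, content
```

Because the body is not part of the hash, editing a section changes its text but not its id, so an incremental indexer can re-embed it in place.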

Calendar events (macOS, opt-in) indexes meetings and appointments from Apple Calendar's store — the strongest "what did I do Tuesday" anchors, and what turns a mic-detected call into which meeting. Off until [calendar] apps = ["apple"]; exclude_calendars skips noisy ones (holidays, birthdays). One chunk per event with attendee names, all past plus ~90 days ahead — timestamps can be in the future, so "what's coming up Thursday" works. location is app:calendar name (e.g. apple:Work), and expand returns that day's full agenda. Reading the store needs Full Disk Access (the same grant as Safari). Unlike claude/shell/browser, routine pruning is safe here: the source declares a bounded prune window, so [refresh] prune = ["calendar"] only ever touches the recent sync window, never archived events.

Daily digests precompute one summary chunk per (local day, stream) so "what did I do today/this week?" is a ~30-chunk read instead of a paged crawl of every raw chunk in the window: browser visits per profile (counts by site, the day's searches, notable titles — read from the browsers' per-visit tables, since indexed browser chunks only carry each URL's last visit), claude sessions (opening prompt as the topic), and shell runs by cwd. Text is templated — same inputs, same text, no re-embed — with the full rollup in meta for expand(). Only the last recompute_days (default 3) days are recomputed; older digests settle into archive and survive their backing data aging out (Chromium keeps ~90 days of visits), which is also why --prune --source digest is refused. A fresh index backfills backfill_days (default 90). Configure via [digest]: sources (subset of browser/claude/shell, [] disables), recompute_days, backfill_days.

Adding a source: drop a module in sources/ with an iter_chunks() generator that yields (id, text, {"source", "timestamp", "location", "meta"}), then add it to SOURCES in index.py. The id must be stable across runs so indexing stays incremental.
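As an illustration of that contract, here is what a small custom source might look like. The module name, the todo.txt input, the id prefix, and the ISO timestamp format are all assumptions made for the example; only the `iter_chunks()` name and the `(id, text, record)` shape with its four record keys come from the text above.

```python
"""sources/todo.py: a hypothetical example source, not part of the repo."""
import hashlib
from datetime import datetime, timezone
from pathlib import Path

TODO_FILE = Path.home() / "todo.txt"  # assumed input file, one item per line


def iter_chunks(path=TODO_FILE):
    """Yield (id, text, record) for each non-empty line of a plain-text todo file."""
    path = Path(path)
    if not path.exists():
        return  # an absent input simply yields nothing
    mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    for line in path.read_text(encoding="utf-8").splitlines():
        text = line.strip()
        if not text:
            continue
        # Hash the content, not the line number, so the id survives reordering.
        chunk_id = "todo:" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
        yield chunk_id, text, {
            "source": "todo",
            "timestamp": mtime.isoformat(),
            "location": str(path),
            "meta": {},
        }
```

The id depends only on the item text, so re-running the indexer over an unchanged file yields the same ids and nothing needs re-embedding.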

1. Prereqs

Install Ollama

macOS (Homebrew, gives easy updates):

brew install ollama
brew services start ollama      # runs the daemon in the background

Or download the .dmg from https://ollama.com/download/mac and drag to Applications (launch it once so the menu-bar daemon starts).

Linux:

curl -fsSL https://ollama.com/install.sh | sh   # sets up a systemd service

Verify the daemon is up (the indexer/server talk to it on port 11434):

ollama --version
curl http://localhost:11434/api/tags   # should return JSON, not connection refused

Pull the embedding model + Python deps

ollama pull nomic-embed-text          # 768-dim, fast

Using uv (recommended):

uv venv ~/.claude/rag-venv
uv pip install --python ~/.claude/rag-venv/bin/python -r requirements.txt

(requirements.txt is just sqlite-vec, requests, mcp[cli].) uv resolves to prebuilt wheels, avoiding the Rust/maturin source builds that break on Apple Silicon. Run index.py and register server.py with this venv's interpreter: ~/.claude/rag-venv/bin/python.

Don't have pip and not using uv? First get Python (it bundles pip). On macOS:

brew install python                   # installs python3 + pip3
python3 -m pip --version              # verify

Then install the deps (use pip3, or python3 -m pip if pip isn't on PATH):

python3 -m pip install -r requirements.txt

If pip refuses to install the deps with an "externally-managed-environment" error (common with Homebrew's Python), use a venv instead:


python3 -m venv ~/.claude/rag-venv
source ~/.claude/rag-venv/bin/activate
pip install -r requirements.txt

If you use a venv, run index.py and register server.py with that venv's python: ~/.claude/rag-venv/bin/python.

2. Build the index

Use the venv interpreter you installed deps into (bare python won't see them).

First preview what survives the filter across all sources (Claude keeps real prompts + assistant text, dropping tool calls/results/thinking/meta; shell keeps deduped non-trivial commands):

~/.claude/rag-venv/bin/python index.py --dry-run

If that looks right, build:

~/.claude/rag-venv/bin/python index.py            # incremental (safe to re-run)
~/.claude/rag-venv/bin/python index.py --rebuild  # wipe + reindex from scratch

Writes ~/.claude/history-rag.db. The DB records which embedding model built it (index_meta); both the indexer and the server refuse to touch an index whose stamp doesn't match the configured model/dim — a same-dimension model swap would otherwise corrupt search silently. Adding a source needs no rebuild (sources are additive). --rebuild is the deliberate escape hatch for a model/schema change, but note it reindexes from sources: chunks whose backing data has aged out (old session transcripts, expired browser history) are lost. For a model switch that preserves them, use tools/eval-model.py (side-by-side candidate ranking) then tools/migrate-model.py (archive-safe: re-embeds from stored chunk text).
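The model stamp check can be sketched with plain sqlite3. Only the table name index_meta comes from the text; its key/value layout, the `check_stamp` helper, and the error type are assumptions for illustration.

```python
import sqlite3


def check_stamp(db, model, dim):
    """Stamp a fresh index, or raise if the stored stamp disagrees with the config."""
    # Assumed layout: a simple key/value table. The real schema may differ.
    db.execute("CREATE TABLE IF NOT EXISTS index_meta (key TEXT PRIMARY KEY, value TEXT)")
    stored = dict(db.execute("SELECT key, value FROM index_meta"))
    wanted = {"model": model, "dim": str(dim)}
    if not stored:
        db.executemany("INSERT INTO index_meta VALUES (?, ?)", wanted.items())
        db.commit()
        return
    if {k: stored.get(k) for k in wanted} != wanted:
        raise RuntimeError(f"index built with {stored}, config wants {wanted}")
```

Comparing the model name as well as the dimension is the point: two 768-dim models would pass a dimension-only check while producing incompatible vectors.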

Each run prints one stats line per source (shell: 905 chunks, 3 embedded, 0 skipped, 0.4s), and a source that throws is logged and skipped without blocking the others. Two more flags for maintenance:

~/.claude/rag-venv/bin/python index.py --source shell          # run one source (any mode)
~/.claude/rag-venv/bin/python index.py --prune --source shell  # drop its stale chunks

--prune removes stored chunks whose id the source stopped yielding (edited notes, rewritten git history). Two safety rails: it requires --source, because the index is an archive — it keeps chunks whose backing data has aged out (Claude Code deletes old session transcripts, histfiles rotate), and a blanket prune would delete that outlived history. And it only prunes a source that completed cleanly and yielded at least one chunk, so a broken or absent source never wipes its own rows.
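Both safety rails fit in a few lines. The function below is a hypothetical restatement of the rules in the paragraph above, not the indexer's code; it works on sets of ids and returns what would be deleted.

```python
def prune_source(stored_ids, yielded_ids, source, completed_cleanly):
    """Return the set of stored ids that are safe to delete for one source."""
    if not source:
        # Rail 1: never prune across the whole archive.
        raise ValueError("--prune requires --source")
    if not completed_cleanly or not yielded_ids:
        # Rail 2: a broken or empty source must not wipe its own rows.
        return set()
    return set(stored_ids) - set(yielded_ids)
```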

3. Register the MCP server with Claude Code

Run this from the repo directory, using the venv interpreter (bare python won't find the deps). $(pwd) fills in the absolute path to server.py (the registration needs an absolute path, not a relative one):

claude mcp add history -- ~/.claude/rag-venv/bin/python "$(pwd)/server.py"

Confirm it registered:

claude mcp list          # 'history' should appear

Then in a session, Claude can call search_history("that proxy bug we hit", k=5).

4. Keep it fresh

The index only reflects sessions present at last run. Pick one:

launchd (recommended, macOS) — a periodic agent that re-indexes every 30 min, runs once at login, and catches up after sleep (cron just skips missed runs). Fill the plist's absolute-path placeholders and load it:

PY=~/.claude/rag-venv/bin/python
sed -e "s#__PYTHON__#$PY#g" -e "s#__REFRESH__#$(pwd)/tools/refresh.py#" \
  com.user.history-index.plist > ~/Library/LaunchAgents/com.user.history-index.plist
launchctl load ~/Library/LaunchAgents/com.user.history-index.plist

Each cycle runs tools/refresh.py: index → prune (the sources named in [refresh] prune, e.g. ["calendar"]) → tools/backup.py → tools/sync-s3.py, each step isolated so one failure never hides the others, with per-step outcomes recorded in the runs table and one refresh: summary line in the log. Backups are dated copies of the index and app-usage DBs in [backup] dir (default ~/.claude/backups), at most once per local day, pruned to the newest [backup] keep (default 7). The index is an archive — it holds history whose sources have expired — so back it up like it's the only copy, because it is.
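The backup rotation described above (dated copies, at most one per local day, newest N kept) might look roughly like this. It is a sketch, not tools/backup.py: the file-naming scheme and the `backup_db` helper are assumptions, and a plain file copy is used for brevity where a real tool may prefer SQLite's online backup API to copy a database that is being written.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_db(db_path, backup_dir, keep=7, today=None):
    """Copy db_path to backup_dir as <stem>-YYYY-MM-DD<suffix>, at most once per day.

    Assumes keep >= 1.
    """
    db_path, backup_dir = Path(db_path), Path(backup_dir)
    today = today or date.today()
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"{db_path.stem}-{today.isoformat()}{db_path.suffix}"
    if not target.exists():
        shutil.copy2(db_path, target)
    # ISO dates sort lexicographically, so the newest copies are last.
    copies = sorted(backup_dir.glob(f"{db_path.stem}-*{db_path.suffix}"))
    for old in copies[:-keep]:
        old.unlink()
    return target
```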

It needs Ollama running (index.py no-ops safely if it isn't). Check it fired:

tail -f /tmp/history-index.log

Every run is also recorded in a runs table inside the index itself, and history_stats surfaces the latest as a health field (last-run age, status, failing sources) — so a stalled or partially-failing refresh is reported by the model the next time you ask a history question, instead of rotting unseen in the log. On top of that, a macOS notification fires when two consecutive runs abort or the model stamp blocks indexing ([health] notify = false to turn off). The /tmp log remains as disposable per-run detail; reboot clearing is its rotation policy. Change the cadence via StartInterval (seconds) in the plist; everything else (model, prune list, sync bucket) comes from the TOML, which scheduled runs read like every other entry point. To stop: launchctl unload … then remove the plist.

manual — run when you want it current:

~/.claude/rag-venv/bin/python index.py

cron (portable / Linux) — crontab -e, then (absolute paths; cron has a minimal PATH and no ~ expansion):

*/30 * * * * /ABS/PATH/rag-venv/bin/python /ABS/PATH/tools/refresh.py >> $HOME/.claude/rag-index.log 2>&1

On macOS, cron may also need Full Disk Access (System Settings → Privacy & Security → Full Disk Access → add /usr/sbin/cron) to read ~/.claude — which is a good reason to prefer the launchd agent above.

5. Verify it works inside a Claude Code session

After indexing (step 2) and registering (step 3):

Confirm the server is connected. In a session, run the MCP status command:
```
/mcp
```
You should see history listed as connected, exposing the four tools described under Layout: history_stats, search_history, list_window, and expand. (history_stats reports per-source counts and date coverage, a quick way for Claude to see what's indexed before searching.)

Ask Claude something that needs your history. Natural-language prompts that force a lookup work best; Claude will call the tool on its own:
```
Search my past sessions: what did we decide about the sqlite-vec schema?
What have I worked on involving Ollama and embeddings?
What's that ffmpeg command I used to convert a webm? (search my shell history)
```
Claude should invoke search_history and cite matched snippets with their source / timestamp / location.

Call the tool explicitly if you want to test it directly:

Use the search_history tool with query "Attio CRM setup" and k=5

Sanity-check the raw DB (outside Claude Code) to confirm rows exist:

~/.claude/rag-venv/bin/python - <<'PY'
import sqlite3, sqlite_vec, os
db = sqlite3.connect(os.path.expanduser("~/.claude/history-rag.db"))
db.enable_load_extension(True); sqlite_vec.load(db)
print("chunks:", db.execute("SELECT COUNT(*) FROM chunks").fetchone()[0])
for row in db.execute("SELECT source, COUNT(*) FROM chunks GROUP BY source"):
    print(row)
for row in db.execute("SELECT source, timestamp, substr(text,1,70) FROM chunks LIMIT 5"):
    print(row)
PY

Troubleshooting:

/mcp doesn't list history → re-check claude mcp list; the path to server.py must be absolute and the interpreter must be the venv's.
Tool errors with a connection error → Ollama isn't running (the server embeds your query at call time). Start it: open -a Ollama.
Tool returns nothing → the index is empty or stale; re-run index.py.

Remote replica (optional)

Everything above is local-only. deploy/lambda/ adds a read-only replica on AWS Lambda: the Mac stays the single writer, the scheduled refresh pushes the index to S3 on change, and the function serves the same four MCP tools behind a secret-path URL usable as a claude.ai custom connector (phone/web/desktop).

The same endpoint serves /search, an HTML page sized for phones with three modes behind one tab bar:

Search — semantic search with source chips and date/location filters; results render as cards (source badges, native per-source formatting), each expanding inline into the full chunk + context.
Ask — a prompt to a model that works the history tools and answers with citations linking into the reading views. Models are named presets ([ask.models], any OpenAI-compatible or Anthropic endpoint); the picker offers whichever presets have keys in the function env.
Browse — the window listing with date presets (Today · 7d · 30d) and a Summaries | Everything toggle: Summaries is a diary view of just the day-shape and digest rollups, day by day.

tools/hist.py gives the same search and ask from any terminal holding the URL. Code deploys ride GitHub Actions on pushes to main (OIDC role, no stored keys). Setup, secrets, ask presets, and runbook: deploy/lambda/README.md.

Notes

One chunk per Claude message, per unique shell command, and per day-per-app. For long assistant turns you may later want sliding-window chunking; per-message is fine to start.
Indexing is incremental and self-healing: a chunk is re-embedded only when its text changed, so growing app-usage totals stay current without a rebuild.
nomic-embed-text is the speed pick. To switch an existing index to a higher-quality model (e.g. mxbai-embed-large, dim 1024), don't --rebuild — that re-reads sources and loses archived chunks. Evaluate with tools/eval-model.py, switch with tools/migrate-model.py, then set [core] model/dim in the TOML. Other overrides: CLAUDE_RAG_DB, CLAUDE_RAG_OLLAMA; see config.py.
search_history returns {query, count, results[]}, ranked best-first, with a distance (L2; lower = closer) on each hit. history_stats reports the corpus. Filter a search with source= and trim noise with max_distance=.
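To make the ranking and the two filters concrete, here is a brute-force sketch of L2 ranking that returns the same outer shape ({query, count, results[]}). The real server queries sqlite-vec rather than looping in Python, and the per-hit fields other than distance, like the `search` function itself, are assumptions.

```python
import math


def search(query, query_vec, chunks, k=5, max_distance=None, source=None):
    """Rank chunks by L2 distance to query_vec (lower = closer), best first."""
    hits = []
    for chunk in chunks:
        if source and chunk["source"] != source:
            continue  # source= restricts the query to one source
        distance = math.dist(query_vec, chunk["embedding"])
        if max_distance is not None and distance > max_distance:
            continue  # max_distance= trims weak matches
        hits.append({"text": chunk["text"], "source": chunk["source"], "distance": distance})
    hits.sort(key=lambda hit: hit["distance"])
    results = hits[:k]
    return {"query": query, "count": len(results), "results": results}
```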
