
Server Configuration

Describes the environment variables required to run the server.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `CORTEX_DIR` | No | State directory (PID files, search DB, logs) | `~/.cortex` |
| `CORTEX_SKIP_SYNC` | No | Bootstrap skips `uv sync` when set to `1` (dev only) | `0` |
| `CORTEX_OBSIDIAN_VAULT` | No | Vault location | `~/obsidian-brain` |
| `CORTEX_EXTRA_SESSION_DIRS` | No | Colon-separated extra session dirs to mine | (unset) |
| `CORTEX_MINER_SETTLE_SECONDS` | No | Seconds to wait before mining a completed session | `300` |
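A minimal sketch of setting these variables before launching the server; the values (and the extra session directories) are illustrative, not defaults the project prescribes:

```shell
# Illustrative non-default configuration (paths are examples):
export CORTEX_DIR="$HOME/.cortex"                   # PID files, search DB, logs
export CORTEX_OBSIDIAN_VAULT="$HOME/obsidian-brain" # vault location
export CORTEX_EXTRA_SESSION_DIRS="$HOME/work/sessions:$HOME/archive/sessions"
export CORTEX_MINER_SETTLE_SECONDS=300              # wait 5 min before mining a finished session
```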

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | `{"listChanged": false}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |

Tools

Functions exposed to the LLM to take actions

memory_recall

Search Cortex memories for durable knowledge relevant to a query.

Use this when you need to recall prior decisions, conventions, bug fixes, user preferences, or lessons learned from past Claude Code sessions. It searches the full Obsidian-backed memory store via SQLite FTS5, ranks results with a multi-signal scorer (relevance + recency + usage + importance), and expands related-memory links.
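The multi-signal ranking can be sketched as a weighted blend. The source names the signals (relevance, recency, usage, importance) but not the weights, decay curve, or scale, so everything numeric below is an assumption:

```python
import math
import time

def score(relevance, created_ts, access_count, importance, now=None):
    """Illustrative multi-signal score. Weights (0.5/0.2/0.15/0.15) and the
    30-day recency decay are assumptions; only the four signals come from
    the tool description."""
    now = now or time.time()
    age_days = max(0.0, (now - created_ts) / 86400)
    recency = math.exp(-age_days / 30)        # newer memories score higher
    usage = math.log1p(access_count) / 5      # diminishing returns on access count
    return 0.5 * relevance + 0.2 * recency + 0.15 * usage + 0.15 * importance

# A fresh, occasionally used memory outranks an equally relevant stale one:
fresh = score(relevance=0.9, created_ts=time.time(), access_count=3, importance=0.5)
stale = score(relevance=0.9, created_ts=time.time() - 300 * 86400, access_count=0, importance=0.5)
```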

Behaviour:

  • Read-only. Does not modify any memory, index, or sidecar file. The only side effect is a bump to the access-count telemetry sidecar (~/.cortex/telemetry.json), which influences future ranking.

  • No authentication required. Cortex is local-first; there are no credentials, tokens, or API keys.

  • No rate limits. Typical latency is under 100ms on corpora up to ~10k memories; pathological queries can take up to ~500ms.

  • Data access scope: reads from ~/obsidian-brain/cortex/memories/ and ~/.cortex/search.db. Nothing leaves the local machine.

  • Idempotent: calling twice with the same query returns the same results (modulo the access-count telemetry bump).

  • Failure modes: returns "No memories found for: " on empty result sets. Never raises to the caller; internal errors fall back to a slower file-scan path.
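The FTS5 lookup described above can be sketched with the standard-library sqlite3 module. The table schema here is hypothetical; the real layout of ~/.cortex/search.db is not documented on this page:

```python
import sqlite3

# Hypothetical schema standing in for ~/.cortex/search.db:
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE memories USING fts5(title, body)")
con.executemany(
    "INSERT INTO memories VALUES (?, ?)",
    [
        ("JWT auth uses RS256", "HS256 shares the signing secret across services."),
        ("Pytest fixtures must use tmp_path", "Avoid writing into the repo during tests."),
    ],
)
# FTS5 MATCH query, best-ranked first (rank is FTS5's built-in relevance):
rows = con.execute(
    "SELECT title FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT 5",
    ("jwt",),
).fetchall()
```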

Use memory_recall when:

  • You need specific facts ("what auth library did we pick?")

  • You want to check if a topic has prior context before making a decision

  • You're debugging and want to find if this bug was fixed before

Do NOT use for:

  • Session-level what-I-did-today logs (use transcript_search instead)

  • On-demand query-tailored briefings (use context_assemble instead)

  • Listing every memory (use memory_list instead)

Returns: Markdown-formatted memory entries, grouped under a "### Memories" header. Each entry has the memory title and body excerpt. If no matches are found, returns "No memories found for: ".

Example: memory_recall(query="jwt auth algorithm", limit=5) → returns the top 5 memories mentioning JWT auth, such as a memory documenting the decision to use RS256 in production.

memory_save

Save a durable lesson, decision, or convention to persistent memory.

Use this to capture knowledge that should survive across Claude Code sessions: user preferences, architecture decisions with rationale, environment quirks, non-obvious bug fixes, or anything you'd otherwise have to re-explain in the next session.

Behaviour:

  • MUTATION. Writes a new markdown file under ~/obsidian-brain/cortex/memories/, appends to the FTS5 index (~/.cortex/search.db), updates ~/obsidian-brain/cortex/_index.md, and writes an append-only entry to ~/.cortex/events.jsonl. Writes are atomic (tmp + fsync + os.replace) and fcntl-locked.

  • No authentication required. Local-first; no credentials.

  • No rate limits. Typical latency 50-200ms including the security scan, dedup check, and write sync.

  • Data access scope: writes stay entirely on the local filesystem. Nothing is sent over the network.

  • Not idempotent: calling twice with identical content triggers the dedup check and the second call returns a "Memory already exists" error instead of a duplicate write.

  • Failure modes: rejected inputs return a string error ("Memory already exists", "Memory rejected: "); they never raise to the caller.
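The atomic, locked write path (tmp file + fsync + os.replace, under an fcntl lock) can be sketched as follows; the function name and lock-file layout are assumptions, not Cortex's actual internals:

```python
import fcntl
import os
import tempfile

def atomic_write(path: str, text: str, lock_path: str) -> None:
    """Sketch of the described write path: exclusive fcntl lock, write to a
    tmp file, fsync, then atomically rename into place with os.replace."""
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)          # serialise concurrent writers
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        try:
            with os.fdopen(fd, "w") as f:
                f.write(text)
                f.flush()
                os.fsync(f.fileno())              # durable before the rename
            os.replace(tmp, path)                 # atomic on POSIX
        finally:
            if os.path.exists(tmp):               # only if the rename never ran
                os.remove(tmp)
            fcntl.flock(lock, fcntl.LOCK_UN)

d = tempfile.mkdtemp()
p = os.path.join(d, "mem.md")
atomic_write(p, "# note\n", os.path.join(d, ".lock"))
```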

Every save goes through:

  1. Prompt-injection + credential-exfil security scan (rejects matches)

  2. Fuzzy deduplication against existing memories (word+bigram+trigram overlap — rejects near-duplicates with clear reason)

  3. Automatic related-memory linking (adds related frontmatter field)

  4. Write to Obsidian markdown file + FTS5 index + _index.md
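Step 2's fuzzy deduplication can be sketched as a Jaccard overlap over word uni/bi/trigrams. The 0.8 threshold and exact tokenisation are assumptions; the source names only the overlap signals:

```python
def _grams(text: str) -> set:
    """Word unigrams, bigrams, and trigrams of a lowercased text."""
    words = text.lower().split()
    grams = set(words)
    grams |= {tuple(words[i:i + 2]) for i in range(len(words) - 1)}
    grams |= {tuple(words[i:i + 3]) for i in range(len(words) - 2)}
    return grams

def is_near_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    """Reject when the Jaccard overlap of the gram sets exceeds the
    (assumed) threshold."""
    ga, gb = _grams(a), _grams(b)
    if not ga or not gb:
        return False
    return len(ga & gb) / len(ga | gb) >= threshold
```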

Use memory_save for:

  • "We decided to use X because Y" (decision + rationale)

  • "User prefers small focused PRs, not big bundled ones" (preference)

  • "Database connection pool must be at least 20 for prod" (invariant)

  • "bcrypt.compare is async — always await" (gotcha)

Do NOT use for:

  • "Today I worked on X" (session logs — use transcript_search to find those)

  • Trivial facts easily re-discovered from reading code

  • Speculative or unverified claims

  • Duplicates of existing memories (the dedup check will reject them anyway)

Returns: A confirmation like "Memory saved: 63d6570e... 'JWT auth uses RS256'" on success, or an error message (starting with "Memory already exists" or "Memory rejected") on failure.

Example: memory_save(content="Use RS256 (not HS256) for JWT in production. HS256 requires sharing the signing secret across services which leaked via an env var export last quarter (#1247).", title="JWT algorithm — RS256 only in prod", tags="auth,jwt,security,postmortem", scope_id="my-webapp")

memory_list

List every memory in a scope, with counts and source breakdown.

Use this when you want to see the full inventory of what Cortex has stored — e.g. to audit which projects have the most memories, to check if a specific memory you wrote earlier is still present, or to find a memory whose exact title you remember but whose keywords are ambiguous.

Behaviour:

  • Read-only. No mutations at all, not even telemetry bumps.

  • No authentication required.

  • No rate limits. Latency scales linearly with memory count; typical sub-second on corpora up to ~10k.

  • Data access scope: reads ~/obsidian-brain/cortex/memories/ markdown files via filesystem glob. Nothing leaves the machine.

  • Idempotent and deterministic for a given filesystem state.

  • Failure modes: returns "No memories in scope: " for an empty scope. Never raises.
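The glob-based listing can be sketched like this; the real tool sorts by the creation date in frontmatter, so the mtime sort below is a stand-in:

```python
import glob
import os

def list_memories(memories_dir: str) -> list[str]:
    """Glob the memory markdown files and return their names, newest first
    (by mtime here; the real tool uses the frontmatter creation date)."""
    paths = glob.glob(os.path.join(memories_dir, "*.md"))
    paths.sort(key=os.path.getmtime, reverse=True)
    return [os.path.splitext(os.path.basename(p))[0] for p in paths]

import pathlib
import tempfile
d = tempfile.mkdtemp()
pathlib.Path(d, "old-note.md").write_text("old")
pathlib.Path(d, "new-note.md").write_text("new")
os.utime(os.path.join(d, "old-note.md"), (1000, 1000))  # force an older mtime
names = list_memories(d)
```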

Use memory_list when:

  • You want to see everything, not a ranked subset

  • You need to audit the current memory inventory

  • You're about to run a cleanup/purge operation and want a preflight

  • You suspect memory_recall is missing something and want to confirm

Do NOT use for:

  • Searching for specific content (use memory_recall — faster, ranked)

  • Assembling a context briefing (use context_assemble)

Returns: A markdown listing with the total memory count, source type breakdown (mined/user/import), and one line per memory showing its short ID, title, and project. Memories are sorted newest-first by creation date.

Example output:

47 memories
Sources: mined:32, user:12, import:3

- [63d6570e] JWT algorithm — RS256 only in prod | project:my-webapp
- [a8b12c44] Pytest fixtures must use tmp_path | project:default
- ...

memory_import

Bulk-import memories from a file, directory, or chat export.

Use this to seed Cortex with existing notes, CLAUDE.md content, documentation excerpts, or chat logs you want to make searchable. Each imported item runs through the same security scan and deduplication as memory_save, so even messy sources yield clean imports.

Behaviour:

  • MUTATION. Writes one or more memory markdown files, updates FTS5 index, _index.md, and events.jsonl. Same atomic + fcntl-locked write path as memory_save.

  • No authentication required.

  • No rate limits, but latency scales with source size — importing a 100-item directory can take several seconds.

  • Data access scope: reads the supplied source_path from the local filesystem. Guarded against path traversal: the resolved path must be inside $HOME; anything outside is rejected. Nothing is sent over the network.

  • Not idempotent: re-importing the same source triggers the dedup check, which rejects duplicates with a summary count.

  • Failure modes: invalid or non-existent source paths return a string error. Individual rejected items are counted in the summary and do not abort the whole import.
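The $HOME containment guard can be sketched as below; the function name is an assumption, but the rule matches the description (resolve the path, reject anything outside the home directory):

```python
from pathlib import Path

def check_import_source(source_path: str) -> Path:
    """Resolve the source path (expanding ~ and symlinks) and reject it
    unless it lies inside $HOME. Name is illustrative, not Cortex's API."""
    resolved = Path(source_path).expanduser().resolve()
    try:
        resolved.relative_to(Path.home())     # raises if outside $HOME
    except ValueError:
        raise ValueError(f"Import source outside $HOME rejected: {resolved}") from None
    return resolved
```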

Use memory_import for:

  • Initial bootstrap from an existing CLAUDE.md or notes folder

  • Absorbing a team-wide decision log into a project scope

  • One-off batch captures from a conversation export

Do NOT use for:

  • Incremental per-conversation saves (use memory_save for single items)

  • Mining Claude Code session logs (the background miner handles that automatically; no manual import needed)

Returns: A summary like "Imported 12 memories from 18 candidates (rejected 6 duplicates)". Errors are returned as human-readable messages.

Example: memory_import( source_path="/home/alice/notes/team-decisions.md", scope_id="my-webapp", )

transcript_search

Search raw Claude Code session transcripts for past conversation excerpts.

This is DIFFERENT from memory_recall — it searches the raw session JSONL files under ~/.claude/projects/, not the mined memory corpus. Use it when you need to find the actual back-and-forth of a prior conversation, not the distilled lesson from it.

Behaviour:

  • Read-only. Does not modify transcripts, memories, or any index.

  • No authentication required.

  • No rate limits. Latency scales with transcript corpus size; typical 100-500ms across a year of daily sessions.

  • Data access scope: reads ~/.claude/projects/**/*.jsonl via direct filesystem access. Does NOT read ~/obsidian-brain/ (that's what memory_recall and memory_list are for). Nothing is sent over the network.

  • Idempotent and deterministic for a given filesystem state.

  • Failure modes: returns "No matching sessions" on empty result sets. Sessions older than Claude Code's 30-day retention window are not searchable (they've been deleted).
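A naive version of this JSONL scan might look like the following. The record schema is an assumption (a flat "text" field); the real Claude Code transcript format is not documented on this page:

```python
import json
from pathlib import Path

def search_transcripts(projects_dir: str, query: str, limit: int = 3) -> list[dict]:
    """Scan **/*.jsonl under the projects dir for lines whose (assumed)
    'text' field contains the query, case-insensitively."""
    q = query.lower()
    hits = []
    for path in Path(projects_dir).glob("**/*.jsonl"):
        for line in path.read_text().splitlines():
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue                      # skip malformed lines
            text = str(record.get("text", ""))
            if q in text.lower():
                hits.append({"session": path.stem, "text": text})
                if len(hits) >= limit:
                    return hits
    return hits

import tempfile
d = tempfile.mkdtemp()
f = Path(d) / "proj" / "abc123.jsonl"
f.parent.mkdir()
f.write_text('{"text": "set the redis connection pool size to 20"}\n{"text": "unrelated"}\n')
hits = search_transcripts(d, "redis connection pool")
```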

Use transcript_search when:

  • You want to recall "what did I actually say three weeks ago about X"

  • A mined memory references a session and you want the full context

  • You want to find all sessions that touched a specific file or topic

  • You're verifying a memory's source_session or debugging extraction

Do NOT use for:

  • Looking up durable knowledge (use memory_recall — mined, ranked, faster)

  • Listing memories (use memory_list)

  • Query-tailored context briefings (use context_assemble)

Returns: Markdown-formatted session excerpts with the session id, date, and matched text, or "No matching sessions" if nothing matches. Transcripts older than Claude Code's 30-day retention are not searchable.

Example: transcript_search(query="redis connection pool size", limit=3)

context_assemble

Assemble a query-tailored context briefing from all available knowledge.

This is the highest-value Cortex tool. It gathers the relevant subset of memories, the project's playbook, and related session transcripts, then uses Claude Haiku to synthesise a focused markdown briefing for the given query. The result is a ready-to-read summary, NOT a raw memory dump — usually 300-800 tokens of distilled relevant knowledge.

Behaviour:

  • Read-only with respect to the Cortex memory store. Bumps access telemetry on memories it reads (same as memory_recall).

  • No authentication required by Cortex itself. The optional Haiku synthesis step shells out to the local claude CLI, which may use Claude Code credentials the user already has signed in — Cortex does not handle those credentials directly.

  • Rate limits: depend on the claude CLI backend in the healthy path. In degraded mode (claude CLI missing), there are no rate limits at all — Cortex just returns raw materials.

  • Data access scope: reads ~/obsidian-brain/cortex/memories/, ~/obsidian-brain/cortex/playbooks/*.md, ~/.cortex/search.db, and ~/.claude/projects/ transcripts. If the claude CLI is invoked, the gathered materials (up to 50KB) are sent to Haiku via that subprocess — which in turn sends them to Anthropic's API under the user's existing Claude Code session. In degraded mode nothing leaves the machine.

  • Latency: 3-15 seconds with Haiku; <500ms in degraded mode.

  • Not idempotent at the Haiku level: the same query can produce slightly different briefings across calls due to Haiku sampling. The underlying memory retrieval step IS deterministic.

  • Failure modes: returns "" on genuinely empty vaults. Never raises to the caller; Haiku failures silently fall back to returning the raw materials.

Use context_assemble when:

  • Starting a new session and you want the assistant loaded with context before the first real question (the auto-recall hook does this on UserPromptSubmit, but you can also call it manually)

  • Onboarding to a project mid-session — ask "what do I know about X?"

  • Before making a decision in an area where prior decisions exist

Do NOT use for:

  • Simple keyword lookups (use memory_recall — faster, no LLM call)

  • Listing memories (use memory_list)

  • Finding a specific past conversation (use transcript_search)

Degraded mode: if the claude CLI is not available on the host, this tool falls back to returning the raw materials (playbook + ranked memories) without Haiku synthesis, so it always returns SOMETHING useful.
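The fallback logic can be sketched as follows; the CLI flags and prompt shape are illustrative, not the invocation Cortex actually uses:

```python
import shutil
import subprocess

def assemble(materials: str, query: str, cli: str = "claude") -> str:
    """Synthesise a briefing via the claude CLI when present; otherwise
    (degraded mode) return the raw materials. Invocation details assumed."""
    if shutil.which(cli) is None:
        return materials                      # degraded mode: playbook + ranked memories
    result = subprocess.run(
        [cli, "-p", f"Summarise for: {query}\n\n{materials}"],
        capture_output=True, text=True, timeout=60,
    )
    # Any CLI failure also falls back to the raw materials:
    return result.stdout if result.returncode == 0 else materials

briefing = assemble("### Playbook\n- use RS256", "auth flow", cli="cortex-missing-cli")
```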

Returns: A markdown briefing tailored to the query. Length is typically 300-800 tokens, with headers, bullet lists, and cross-references to memory IDs where relevant.

Example: context_assemble( query="help me fix the auth flow on staging", project="my-webapp", ) → returns a brief covering: the RS256 JWT decision, the known bcrypt.compare gotcha, a link to the staging-specific env var issue from last month, etc.

Prompts

Interactive templates invoked by user choice


No prompts

Resources

Contextual data attached and managed by the client


No resources
