
Server Configuration

Describes the environment variables required to run the server.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `CORTEX_DIR` | No | State directory (PID files, search DB, logs) | `~/.cortex` |
| `CORTEX_SKIP_SYNC` | No | Bootstrap skips `uv sync` when set to `1` (dev only) | `0` |
| `CORTEX_OBSIDIAN_VAULT` | No | Vault location | `~/obsidian-brain` |
| `CORTEX_EXTRA_SESSION_DIRS` | No | Colon-separated extra session dirs to mine | (unset) |
| `CORTEX_MINER_SETTLE_SECONDS` | No | Seconds to wait before mining a completed session | `300` |
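A minimal sketch of setting these variables before launching the server; the values (and the extra session directories) are illustrative, not defaults the project prescribes:

```shell
# Illustrative non-default configuration (paths are examples):
export CORTEX_DIR="$HOME/.cortex"                   # PID files, search DB, logs
export CORTEX_OBSIDIAN_VAULT="$HOME/obsidian-brain" # vault location
export CORTEX_EXTRA_SESSION_DIRS="$HOME/work/sessions:$HOME/archive/sessions"
export CORTEX_MINER_SETTLE_SECONDS=300              # wait 5 min before mining a finished session
```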

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | `{"listChanged": false}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |

Tools

Functions exposed to the LLM to take actions

memory_recall

Search Cortex memories for durable knowledge relevant to a query.

Use this when you need to recall prior decisions, conventions, bug fixes, user preferences, or lessons learned from past Claude Code sessions. It searches the full Obsidian-backed memory store via SQLite FTS5, ranks results with a multi-signal scorer (relevance + recency + usage + importance), and expands related-memory links.
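The multi-signal ranking can be sketched as a weighted blend. The source names the signals (relevance, recency, usage, importance) but not the weights, decay curve, or scale, so everything numeric below is an assumption:

```python
import math
import time

def score(relevance, created_ts, access_count, importance, now=None):
    """Illustrative multi-signal score. Weights (0.5/0.2/0.15/0.15) and the
    30-day recency decay are assumptions; only the four signals come from
    the tool description."""
    now = now or time.time()
    age_days = max(0.0, (now - created_ts) / 86400)
    recency = math.exp(-age_days / 30)        # newer memories score higher
    usage = math.log1p(access_count) / 5      # diminishing returns on access count
    return 0.5 * relevance + 0.2 * recency + 0.15 * usage + 0.15 * importance

# A fresh, occasionally used memory outranks an equally relevant stale one:
fresh = score(relevance=0.9, created_ts=time.time(), access_count=3, importance=0.5)
stale = score(relevance=0.9, created_ts=time.time() - 300 * 86400, access_count=0, importance=0.5)
```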

Behaviour:

  • Read-only. Does not modify any memory, index, or sidecar file. The only side effect is a bump to the access-count telemetry sidecar (~/.cortex/telemetry.json), which influences future ranking.

  • No authentication required. Cortex is local-first; there are no credentials, tokens, or API keys.

  • No rate limits. Typical latency is under 100ms on corpora up to ~10k memories; pathological queries can take up to ~500ms.

  • Data access scope: reads from ~/obsidian-brain/cortex/memories/ and ~/.cortex/search.db. Nothing leaves the local machine.

  • Idempotent: calling twice with the same query returns the same results (modulo the access-count telemetry bump).

  • Failure modes: returns "No memories found for: " on empty result sets. Never raises to the caller; internal errors fall back to a slower file-scan path.
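The FTS5 lookup described above can be sketched with the standard-library sqlite3 module. The table schema here is hypothetical; the real layout of ~/.cortex/search.db is not documented on this page:

```python
import sqlite3

# Hypothetical schema standing in for ~/.cortex/search.db:
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE memories USING fts5(title, body)")
con.executemany(
    "INSERT INTO memories VALUES (?, ?)",
    [
        ("JWT auth uses RS256", "HS256 shares the signing secret across services."),
        ("Pytest fixtures must use tmp_path", "Avoid writing into the repo during tests."),
    ],
)
# FTS5 MATCH query, best-ranked first (rank is FTS5's built-in relevance):
rows = con.execute(
    "SELECT title FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT 5",
    ("jwt",),
).fetchall()
```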

Use memory_recall when:

  • You need specific facts ("what auth library did we pick?")

  • You want to check if a topic has prior context before making a decision

  • You're debugging and want to find if this bug was fixed before

Do NOT use for:

  • Session-level what-I-did-today logs (use transcript_search instead)

  • On-demand query-tailored briefings (use context_assemble instead)

  • Listing every memory (use memory_list instead)

Returns: Markdown-formatted memory entries, grouped under a "### Memories" header. Each entry has the memory title and body excerpt. If no matches are found, returns "No memories found for: ".

Example: memory_recall(query="jwt auth algorithm", limit=5) → returns the top 5 memories mentioning JWT auth, such as a memory documenting the decision to use RS256 in production.

memory_save

Save a durable lesson, decision, or convention to persistent memory.

Use this to capture knowledge that should survive across Claude Code sessions: user preferences, architecture decisions with rationale, environment quirks, non-obvious bug fixes, or anything you'd otherwise have to re-explain in the next session.

Behaviour:

  • MUTATION. Writes a new markdown file under ~/obsidian-brain/cortex/memories/, appends to the FTS5 index (~/.cortex/search.db), updates ~/obsidian-brain/cortex/_index.md, and writes an append-only entry to ~/.cortex/events.jsonl. Writes are atomic (tmp + fsync + os.replace) and fcntl-locked.

  • No authentication required. Local-first; no credentials.

  • No rate limits. Typical latency 50-200ms including the security scan, dedup check, and write sync.

  • Data access scope: writes stay entirely on the local filesystem. Nothing is sent over the network.

  • Not idempotent: calling twice with identical content triggers the dedup check and the second call returns a "Memory already exists" error instead of a duplicate write.

  • Failure modes: rejected inputs return a string error ("Memory already exists", "Memory rejected: "); they never raise to the caller.
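The atomic, locked write path (tmp file + fsync + os.replace, under an fcntl lock) can be sketched as follows; the function name and lock-file layout are assumptions, not Cortex's actual internals:

```python
import fcntl
import os
import tempfile

def atomic_write(path: str, text: str, lock_path: str) -> None:
    """Sketch of the described write path: exclusive fcntl lock, write to a
    tmp file, fsync, then atomically rename into place with os.replace."""
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)          # serialise concurrent writers
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        try:
            with os.fdopen(fd, "w") as f:
                f.write(text)
                f.flush()
                os.fsync(f.fileno())              # durable before the rename
            os.replace(tmp, path)                 # atomic on POSIX
        finally:
            if os.path.exists(tmp):               # only if the rename never ran
                os.remove(tmp)
            fcntl.flock(lock, fcntl.LOCK_UN)

d = tempfile.mkdtemp()
p = os.path.join(d, "mem.md")
atomic_write(p, "# note\n", os.path.join(d, ".lock"))
```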

Every save goes through:

  1. Prompt-injection + credential-exfil security scan (rejects matches)

  2. Fuzzy deduplication against existing memories (word+bigram+trigram overlap — rejects near-duplicates with clear reason)

  3. Automatic related-memory linking (adds related frontmatter field)

  4. Write to Obsidian markdown file + FTS5 index + _index.md
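Step 2's fuzzy deduplication can be sketched as a Jaccard overlap over word uni/bi/trigrams. The 0.8 threshold and exact tokenisation are assumptions; the source names only the overlap signals:

```python
def _grams(text: str) -> set:
    """Word unigrams, bigrams, and trigrams of a lowercased text."""
    words = text.lower().split()
    grams = set(words)
    grams |= {tuple(words[i:i + 2]) for i in range(len(words) - 1)}
    grams |= {tuple(words[i:i + 3]) for i in range(len(words) - 2)}
    return grams

def is_near_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    """Reject when the Jaccard overlap of the gram sets exceeds the
    (assumed) threshold."""
    ga, gb = _grams(a), _grams(b)
    if not ga or not gb:
        return False
    return len(ga & gb) / len(ga | gb) >= threshold
```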

Use memory_save for:

  • "We decided to use X because Y" (decision + rationale)

  • "User prefers small focused PRs, not big bundled ones" (preference)

  • "Database connection pool must be at least 20 for prod" (invariant)

  • "bcrypt.compare is async — always await" (gotcha)

Do NOT use for:

  • "Today I worked on X" (session logs — use transcript_search to find those)

  • Trivial facts easily re-discovered from reading code

  • Speculative or unverified claims

  • Duplicates of existing memories (the dedup check will reject them anyway)

Returns: A confirmation like "Memory saved: 63d6570e... 'JWT auth uses RS256'" on success, or an error message (starting with "Memory already exists" or "Memory rejected") on failure.

Example: memory_save(content="Use RS256 (not HS256) for JWT in production. HS256 requires sharing the signing secret across services which leaked via an env var export last quarter (#1247).", title="JWT algorithm — RS256 only in prod", tags="auth,jwt,security,postmortem", scope_id="my-webapp")

memory_list

List every memory in a scope, with counts and source breakdown.

Use this when you want to see the full inventory of what Cortex has stored — e.g. to audit which projects have the most memories, to check if a specific memory you wrote earlier is still present, or to find a memory whose exact title you remember but whose keywords are ambiguous.

Behaviour:

  • Read-only. No mutations at all, not even telemetry bumps.

  • No authentication required.

  • No rate limits. Latency scales linearly with memory count; typical sub-second on corpora up to ~10k.

  • Data access scope: reads ~/obsidian-brain/cortex/memories/ markdown files via filesystem glob. Nothing leaves the machine.

  • Idempotent and deterministic for a given filesystem state.

  • Failure modes: returns "No memories in scope: " for an empty scope. Never raises.
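The glob-based listing can be sketched like this; the real tool sorts by the creation date in frontmatter, so the mtime sort below is a stand-in:

```python
import glob
import os

def list_memories(memories_dir: str) -> list[str]:
    """Glob the memory markdown files and return their names, newest first
    (by mtime here; the real tool uses the frontmatter creation date)."""
    paths = glob.glob(os.path.join(memories_dir, "*.md"))
    paths.sort(key=os.path.getmtime, reverse=True)
    return [os.path.splitext(os.path.basename(p))[0] for p in paths]

import pathlib
import tempfile
d = tempfile.mkdtemp()
pathlib.Path(d, "old-note.md").write_text("old")
pathlib.Path(d, "new-note.md").write_text("new")
os.utime(os.path.join(d, "old-note.md"), (1000, 1000))  # force an older mtime
names = list_memories(d)
```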

Use memory_list when:

  • You want to see everything, not a ranked subset

  • You need to audit the current memory inventory

  • You're about to run a cleanup/purge operation and want a preflight

  • You suspect memory_recall is missing something and want to confirm

Do NOT use for:

  • Searching for specific content (use memory_recall — faster, ranked)

  • Assembling a context briefing (use context_assemble)

Returns: A markdown listing with the total memory count, source type breakdown (mined/user/import), and one line per memory showing its short ID, title, and project. Memories are sorted newest-first by creation date.

Example output:

47 memories
Sources: mined:32, user:12, import:3

- [63d6570e] JWT algorithm — RS256 only in prod | project:my-webapp
- [a8b12c44] Pytest fixtures must use tmp_path | project:default
- ...

memory_import

Bulk-import memories from a file, directory, or chat export.

Use this to seed Cortex with existing notes, CLAUDE.md content, documentation excerpts, or chat logs you want to make searchable. Each imported item runs through the same security scan and deduplication as memory_save, so even messy sources yield clean imports.

Behaviour:

  • MUTATION. Writes one or more memory markdown files, updates FTS5 index, _index.md, and events.jsonl. Same atomic + fcntl-locked write path as memory_save.

  • No authentication required.

  • No rate limits, but latency scales with source size — importing a 100-item directory can take several seconds.

  • Data access scope: reads the supplied source_path from the local filesystem. Guarded against path traversal: the resolved path must be inside $HOME; anything outside is rejected. Nothing is sent over the network.

  • Not idempotent: re-importing the same source triggers the dedup check, which rejects duplicates with a summary count.

  • Failure modes: invalid or non-existent source paths return a string error. Individual rejected items are counted in the summary and do not abort the whole import.
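The $HOME containment guard can be sketched as below; the function name is an assumption, but the rule matches the description (resolve the path, reject anything outside the home directory):

```python
from pathlib import Path

def check_import_source(source_path: str) -> Path:
    """Resolve the source path (expanding ~ and symlinks) and reject it
    unless it lies inside $HOME. Name is illustrative, not Cortex's API."""
    resolved = Path(source_path).expanduser().resolve()
    try:
        resolved.relative_to(Path.home())     # raises if outside $HOME
    except ValueError:
        raise ValueError(f"Import source outside $HOME rejected: {resolved}") from None
    return resolved
```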

Use memory_import for:

  • Initial bootstrap from an existing CLAUDE.md or notes folder

  • Absorbing a team-wide decision log into a project scope

  • One-off batch captures from a conversation export

Do NOT use for:

  • Incremental per-conversation saves (use memory_save for single items)

  • Mining Claude Code session logs (the background miner handles that automatically; no manual import needed)

Returns: A summary like "Imported 12 memories from 18 candidates (rejected 6 duplicates)". Errors are returned as human-readable messages.

Example: memory_import( source_path="/home/alice/notes/team-decisions.md", scope_id="my-webapp", )

transcript_search

Search raw Claude Code session transcripts for past conversation excerpts.

This is DIFFERENT from memory_recall — it searches the raw session JSONL files under ~/.claude/projects/, not the mined memory corpus. Use it when you need to find the actual back-and-forth of a prior conversation, not the distilled lesson from it.

Behaviour:

  • Read-only. Does not modify transcripts, memories, or any index.

  • No authentication required.

  • No rate limits. Latency scales with transcript corpus size; typical 100-500ms across a year of daily sessions.

  • Data access scope: reads ~/.claude/projects/**/*.jsonl via direct filesystem access. Does NOT read ~/obsidian-brain/ (that's what memory_recall and memory_list are for). Nothing is sent over the network.

  • Idempotent and deterministic for a given filesystem state.

  • Failure modes: returns "No matching sessions" on empty result sets. Sessions older than Claude Code's 30-day retention window are not searchable (they've been deleted).
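A naive version of this JSONL scan might look like the following. The record schema is an assumption (a flat "text" field); the real Claude Code transcript format is not documented on this page:

```python
import json
from pathlib import Path

def search_transcripts(projects_dir: str, query: str, limit: int = 3) -> list[dict]:
    """Scan **/*.jsonl under the projects dir for lines whose (assumed)
    'text' field contains the query, case-insensitively."""
    q = query.lower()
    hits = []
    for path in Path(projects_dir).glob("**/*.jsonl"):
        for line in path.read_text().splitlines():
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue                      # skip malformed lines
            text = str(record.get("text", ""))
            if q in text.lower():
                hits.append({"session": path.stem, "text": text})
                if len(hits) >= limit:
                    return hits
    return hits

import tempfile
d = tempfile.mkdtemp()
f = Path(d) / "proj" / "abc123.jsonl"
f.parent.mkdir()
f.write_text('{"text": "set the redis connection pool size to 20"}\n{"text": "unrelated"}\n')
hits = search_transcripts(d, "redis connection pool")
```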

Use transcript_search when:

  • You want to recall "what did I actually say three weeks ago about X"

  • A mined memory references a session and you want the full context

  • You want to find all sessions that touched a specific file or topic

  • You're verifying a memory's source_session or debugging extraction

Do NOT use for:

  • Looking up durable knowledge (use memory_recall — mined, ranked, faster)

  • Listing memories (use memory_list)

  • Query-tailored context briefings (use context_assemble)

Returns: Markdown-formatted session excerpts with the session id, date, and matched text, or "No matching sessions" if nothing matches. Transcripts older than Claude Code's 30-day retention are not searchable.

Example: transcript_search(query="redis connection pool size", limit=3)

context_assemble

Assemble a query-tailored context briefing from all available knowledge.

This is the highest-value Cortex tool. It gathers the relevant subset of memories, the project's playbook, and related session transcripts, then uses Claude Haiku to synthesise a focused markdown briefing for the given query. The result is a ready-to-read summary, NOT a raw memory dump — usually 300-800 tokens of distilled relevant knowledge.

Behaviour:

  • Read-only with respect to the Cortex memory store. Bumps access telemetry on memories it reads (same as memory_recall).

  • No authentication required by Cortex itself. The optional Haiku synthesis step shells out to the local claude CLI, which may use Claude Code credentials the user already has signed in — Cortex does not handle those credentials directly.

  • Rate limits: depend on the claude CLI backend in the healthy path. In degraded mode (claude CLI missing), there are no rate limits at all — Cortex just returns raw materials.

  • Data access scope: reads ~/obsidian-brain/cortex/memories/, ~/obsidian-brain/cortex/playbooks/*.md, ~/.cortex/search.db, and ~/.claude/projects/ transcripts. If the claude CLI is invoked, the gathered materials (up to 50KB) are sent to Haiku via that subprocess — which in turn sends them to Anthropic's API under the user's existing Claude Code session. In degraded mode nothing leaves the machine.

  • Latency: 3-15 seconds with Haiku; <500ms in degraded mode.

  • Not idempotent at the Haiku level: the same query can produce slightly different briefings across calls due to Haiku sampling. The underlying memory retrieval step IS deterministic.

  • Failure modes: returns "" on genuinely empty vaults. Never raises to the caller; Haiku failures silently fall back to returning the raw materials.

Use context_assemble when:

  • Starting a new session and you want the assistant loaded with context before the first real question (the auto-recall hook does this on UserPromptSubmit, but you can also call it manually)

  • Onboarding to a project mid-session — ask "what do I know about X?"

  • Before making a decision in an area where prior decisions exist

Do NOT use for:

  • Simple keyword lookups (use memory_recall — faster, no LLM call)

  • Listing memories (use memory_list)

  • Finding a specific past conversation (use transcript_search)

Degraded mode: if the claude CLI is not available on the host, this tool falls back to returning the raw materials (playbook + ranked memories) without Haiku synthesis, so it always returns SOMETHING useful.
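The fallback logic can be sketched as follows; the CLI flags and prompt shape are illustrative, not the invocation Cortex actually uses:

```python
import shutil
import subprocess

def assemble(materials: str, query: str, cli: str = "claude") -> str:
    """Synthesise a briefing via the claude CLI when present; otherwise
    (degraded mode) return the raw materials. Invocation details assumed."""
    if shutil.which(cli) is None:
        return materials                      # degraded mode: playbook + ranked memories
    result = subprocess.run(
        [cli, "-p", f"Summarise for: {query}\n\n{materials}"],
        capture_output=True, text=True, timeout=60,
    )
    # Any CLI failure also falls back to the raw materials:
    return result.stdout if result.returncode == 0 else materials

briefing = assemble("### Playbook\n- use RS256", "auth flow", cli="cortex-missing-cli")
```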

Returns: A markdown briefing tailored to the query. Length is typically 300-800 tokens, with headers, bullet lists, and cross-references to memory IDs where relevant.

Example: context_assemble( query="help me fix the auth flow on staging", project="my-webapp", ) → returns a brief covering: the RS256 JWT decision, the known bcrypt.compare gotcha, a link to the staging-specific env var issue from last month, etc.

Prompts

Interactive templates invoked by user choice


No prompts

Resources

Contextual data attached and managed by the client


No resources
