Cérebro - MCP Knowledge Base

Cérebro is a RAG (Retrieval-Augmented Generation) engine that connects your local markdown notes to AI agents via MCP (Model Context Protocol).

Instead of your AI agent forgetting what you wrote between conversations, Cérebro indexes your .md files and serves them as searchable context on demand. It works with opencode, Claude Desktop, Cursor, and any MCP-compatible client.

Cérebro powers Pink, a TUI agent that uses Cérebro as its knowledge backend.

Features

| Component | What it does |
| --- | --- |
| RAG Engine | Hybrid search (vector embeddings + BM25) over your markdown vault |
| MCP Server | Exposes search, context, stats, and index tools to any MCP client |
| ChromaDB | Local vector database - your data never leaves your machine |
| Metrics Pipeline | Tracks RAG queries, vault growth, skills usage over time (timeline.jsonl) |
| RTK | CLI output compressor by rtk-ai - ~89% savings on tool outputs |
| Headroom | Context window proxy by headroomlabs-ai - 60–95% compression |
| KM Structure | Organized knowledge management: inbox → knowledge → patterns → glossary |
| Healthcheck | CLI dashboard showing vault health, orphans, stale projects, and trends |
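
The RAG Engine entry mentions hybrid search. One common way to combine a vector ranking with a BM25 ranking is reciprocal rank fusion (RRF). The sketch below assumes that approach purely for illustration, since this README does not say how Cérebro merges the two result lists; the function name, the constant k=60, and the sample chunk IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of chunk IDs into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. A higher fused score means a better match.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score, best first; ties are broken by chunk ID.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results for one query from the two retrievers.
vector_hits = ["auth.md#2", "login.md#1", "tokens.md#4"]
bm25_hits = ["auth.md#2", "sessions.md#3", "login.md#1"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Chunks that both retrievers return rise to the top, while chunks found by only one retriever still make the list further down.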


Tutorial

Prerequisites

Before you start, make sure you have:

  • Python 3.10+ - check with python --version

  • A folder of markdown files - your notes, docs, journal, anything .md

  • Git - to clone the repo (git --version)

1. Download

git clone https://github.com/ricardopiresqa/cerebro.git
cd cerebro

2. Install dependencies

pip install -r requirements.txt

This installs ChromaDB, sentence-transformers, and everything Cérebro needs.

If you're on Windows and get encoding errors, try:

$env:PYTHONUTF8 = "1"
pip install -r requirements.txt

3. Configure

Copy the environment template:

cp .env.example .env

Or on Windows:

copy .env.example .env

Open .env and set at least CEREBRO_VAULT_PATH to the folder with your .md files:

CEREBRO_VAULT_PATH=C:/Users/you/Documents/notes

Tip: Use forward slashes (/) even on Windows.

4. Index your knowledge base

This reads every .md file, splits it into chunks, and stores vector embeddings in ChromaDB:

python src/rag_core.py --vault "%CEREBRO_VAULT_PATH%" --action index

The command above uses Command Prompt (cmd.exe) syntax for the variable. On PowerShell, pass the path directly:

python src/rag_core.py --vault "$env:USERPROFILE\Documents\notes" --action index

The first run may take a few minutes, depending on how many files you have. Subsequent runs are incremental and only process new or changed files.
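
To give a feel for what "splits it into chunks" means in this step, here is a minimal sketch that assumes fixed-size character windows with a small overlap. The real chunking strategy, sizes, and function names in rag_core.py are not shown in this README, so everything below is hypothetical.

```python
from pathlib import Path


def chunk_text(text, size=800, overlap=100):
    """Split text into windows of at most `size` characters.

    Consecutive windows share `overlap` characters, so text cut at one
    window boundary also appears at the start of the next window.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def iter_vault_chunks(vault_path):
    """Yield (relative_path, chunk_index, chunk) for every .md file."""
    vault = Path(vault_path)
    for md_file in sorted(vault.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for i, chunk in enumerate(chunk_text(text)):
            yield md_file.relative_to(vault).as_posix(), i, chunk
```

Each yielded chunk would then be embedded and stored along with its file path, which is what lets a search result point back to the note it came from.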

5. Start the MCP server

python src/rag_mcp.py

You won't see much output - that's normal. The server is waiting for MCP requests on stdin/stdout.

To stop it: press Ctrl+C.
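
The server is quiet because the MCP stdio transport exchanges JSON-RPC 2.0 messages, one per line, over the process's stdin and stdout. The sketch below shows how a client could frame an initialize request. The client name and version are made up, the protocol version string is only an example, and real clients (opencode, Claude Desktop, Cursor) handle this for you.

```python
import json


def frame_message(message):
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the line stays intact.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def parse_line(line):
    """Decode one line read from the server's stdout."""
    return json.loads(line.decode("utf-8"))


initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # example value; use what your client supports
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
    },
}

wire_bytes = frame_message(initialize_request)
print(wire_bytes.decode("utf-8"), end="")
```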

6. Connect from your AI agent

opencode

Add to your opencode.jsonc:

{
  "mcp": {
    "cerebro": {
      "type": "local",
      "command": ["python", "C:/path/to/cerebro/src/rag_mcp.py"],
      "environment": {
        "CEREBRO_VAULT_PATH": "C:/Users/you/Documents/notes"
      }
    }
  }
}

Windows note: Use the full path to rag_mcp.py and forward slashes. Replace the CEREBRO_VAULT_PATH value with the path to your vault folder.

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "cerebro": {
      "command": "python",
      "args": ["C:/path/to/cerebro/src/rag_mcp.py"],
      "env": {
        "CEREBRO_VAULT_PATH": "C:/Users/you/Documents/notes"
      }
    }
  }
}
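
If claude_desktop_config.json already lists other servers, the cerebro entry has to be merged in rather than pasted over the file. Here is a small sketch of that merge; the helper name and the pre-existing "other" server are hypothetical, and the entry mirrors the JSON shown above.

```python
import json


def add_cerebro_server(config, script_path, vault_path, python_cmd="python"):
    """Return a copy of a Claude Desktop config with a "cerebro" entry.

    Servers already listed under "mcpServers" are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["cerebro"] = {
        "command": python_cmd,
        "args": [script_path],
        "env": {"CEREBRO_VAULT_PATH": vault_path},
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_cerebro_server(
    existing,
    script_path="C:/path/to/cerebro/src/rag_mcp.py",
    vault_path="C:/Users/you/Documents/notes",
)
print(json.dumps(updated, indent=2))
```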

Cursor

In Cursor settings > Features > MCP Servers, add a new server:

| Field | Value |
| --- | --- |
| Name | cerebro |
| Type | command |
| Command | python C:/path/to/cerebro/src/rag_mcp.py |
| Environment | CEREBRO_VAULT_PATH=C:/Users/you/Documents/notes |

7. Use it

Once connected, ask your agent to search your notes. Examples:

"What did I write about authentication?"
"Search my notes for React patterns"
"Context: what was the last decision about the database?"

These map to the MCP tools:

| Tool | Description | Example |
| --- | --- | --- |
| search("query") | Find relevant chunks | search("how does auth work?") |
| context("topic") | RAG + last session merged | context("what we decided about X") |
| stats() | Number of indexed chunks | stats() |
| index() | Reindex on demand | index() |
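
Under the hood, an MCP client turns a tool invocation into a JSON-RPC tools/call message. Here is a sketch of what that message might look like for search. The method name and the name/arguments fields come from the MCP specification, while the argument key "query" is an assumption, since this README does not list the tools' parameter names.

```python
import json
from itertools import count

_next_id = count(1)  # request IDs must not repeat within a session


def tool_call(name, arguments=None):
    """Build an MCP tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# "query" is a guessed parameter name; the server's tool schema
# (returned by the tools/list method) is the authority on the real one.
search_request = tool_call("search", {"query": "how does auth work?"})
stats_request = tool_call("stats")

print(json.dumps(search_request, indent=2))
```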

8. Keep it updated

Reindex whenever you add or change files:

python src/rag_core.py --vault "%CEREBRO_VAULT_PATH%" --action index

Or from your AI agent via the index() tool.


Configuration reference

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| CEREBRO_VAULT_PATH | Yes | - | Path to your markdown folder |
| CEREBRO_CHROMA_DB_PATH | No | ~/.cerebro/db | Where ChromaDB stores vectors |
| CEREBRO_PYTHON | No | python | Python path for MCP config |
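
To show how these variables and defaults fit together, here is a hedged sketch of a settings loader. The function name and the dictionary keys are hypothetical, and Cérebro's own code may read the variables differently.

```python
import os
from pathlib import Path


def load_settings(env=None):
    """Read Cérebro's settings from environment variables.

    Defaults follow the configuration reference; names are illustrative.
    """
    env = os.environ if env is None else env
    vault = env.get("CEREBRO_VAULT_PATH")
    if not vault:
        raise KeyError("CEREBRO_VAULT_PATH is required")
    db_path = env.get("CEREBRO_CHROMA_DB_PATH", "~/.cerebro/db")
    return {
        "vault_path": Path(os.path.expanduser(vault)),
        "chroma_db_path": Path(os.path.expanduser(db_path)),
        "python": env.get("CEREBRO_PYTHON", "python"),
    }


settings = load_settings({"CEREBRO_VAULT_PATH": "C:/Users/you/Documents/notes"})
print(settings)
```

Passing a plain dict instead of os.environ, as in the last lines, makes the defaults easy to check without touching your real environment.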


Troubleshooting

"python not found" on Windows

Use the full path or check if Python is in your PATH:

where python

If missing, reinstall Python and check "Add to PATH".

Server starts but agent can't connect

Make sure CEREBRO_VAULT_PATH points to an existing folder that contains .md files, and run the index step before starting the server.
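
A quick way to rule out a bad vault path is a small preflight check like the sketch below. The helper is hypothetical and not part of Cérebro; it only tests that the folder exists and holds at least one .md file.

```python
from pathlib import Path


def check_vault(vault_path):
    """Return a list of problems found with the vault path (empty if none)."""
    problems = []
    vault = Path(vault_path)
    if not vault.is_dir():
        problems.append(f"{vault} is not an existing folder")
    elif next(vault.rglob("*.md"), None) is None:
        problems.append(f"{vault} contains no .md files")
    return problems


# Example path from this README; substitute your own vault folder.
for problem in check_vault("C:/Users/you/Documents/notes"):
    print("vault check:", problem)
```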

Indexing takes too long

First index on a large vault can take 5-10 minutes. Subsequent runs are incremental and fast.
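
Incremental runs are fast because unchanged files can be skipped. This README does not say how Cérebro detects changes; the sketch below assumes a content hash per file compared against a manifest saved from the previous run. The function names and the manifest format are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes, used as a simple change detector."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def files_to_reindex(vault_path, manifest):
    """Return .md files that are new or changed since the last run.

    `manifest` maps relative paths to the digest recorded last time and is
    updated in place, so the caller can persist it after indexing.
    """
    vault = Path(vault_path)
    changed = []
    for md_file in sorted(vault.rglob("*.md")):
        key = md_file.relative_to(vault).as_posix()
        digest = file_digest(md_file)
        if manifest.get(key) != digest:
            changed.append(key)
            manifest[key] = digest
    return changed
```

Only the paths returned here would need to be re-chunked and re-embedded, which is where the time savings on later runs would come from.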

Port already in use

Cérebro uses stdin/stdout (not TCP), so there's no port conflict. If you're using a TCP-based MCP transport, check the port.


Requirements

  • Python 3.10+

  • A folder of .md files

  • ~2GB disk space for ChromaDB (varies with vault size)


FAQ

Do I need a GPU?
No. Embeddings run on CPU.

Will this upload my data?
No. Everything runs locally. Your notes never leave your machine.

Can I use it with any LLM?
Yes. Cérebro serves context to your agent - it doesn't care which LLM the agent uses.

Is it only for Obsidian vaults?
Any folder with .md files works. Obsidian, Foam, Logseq, or plain markdown.

What is RTK?
RTK (Runtime Token Kompressor) by rtk-ai is a CLI tool written in Rust that compresses command outputs before they enter the context window, with ~89% noise removal across 2,900+ real commands.

What is Headroom?
Headroom by headroomlabs-ai is an API proxy that applies lossy or lossless compression to prompts and tool outputs (47%–92% reduction). It works with any agent.

Does Cérebro track usage metrics?
Yes. Cérebro logs RAG queries, vault growth, skill usage, and health snapshots automatically in _metrics/. These power the healthcheck CLI and can feed into external dashboards.

Can I use Cérebro without Obsidian?
Yes. Any folder of .md files works. The KM structure (inbox → knowledge → patterns → glossary) is optional but recommended for organizing knowledge at scale.


License

MIT
