local-rag-core

by linkwut-create

Headless local knowledge library and RAG substrate. It is designed to act as a secure, local, structure-preserving data layer for coding assistants (like Claude Code, Codex, and OpenCode), browsers, translators, and agentic workflows.

1. Core Principles

Headless & UI-less: This is not a chat application. It manages file ingestion, document chunking, indexing, hybrid retrieval, reranking, and source attribution.
Strict logical separation:
- private_kb: User's long-term private notes and documentation.
- packs: Modular, shareable documentation packages that can be registered, enabled, or disabled.
- private_project: Local source code repositories with filters protecting sensitive data.
Always Attributed: All retrieved chunks specify their source_type, pack_id/project_id, chunk ID, file path, section context, and normalized relevance scores.
Read-Only MCP Server: The MCP tool interface is strictly read-only (kb.search, kb.get_chunk, kb.list_packs, kb.health). Data modifications must occur via the CLI or HTTP API.

2. Technology Stack

Python: >=3.10
SQLite / FTS5: Relational registries, metadata index, and BM25 full-text search.
Zvec: Embedded in-process vector database for semantic indexing.
MCP: Low-level mcp.server.Server stdio server; read-only by design.
FastAPI & Uvicorn: High-performance HTTP microservice.
Typer: Type-safe CLI builder.
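
As a rough illustration of the SQLite / FTS5 layer, the following sketch runs a BM25-ranked keyword query with Python's built-in sqlite3 module. The table name, columns, and sample rows are hypothetical and are not taken from local-rag-core's actual schema; it also assumes the local SQLite build includes FTS5, which is common but not guaranteed.

```python
import sqlite3

# Hypothetical schema for illustration; the real tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
conn.executemany(
    "INSERT INTO chunks (path, body) VALUES (?, ?)",
    [
        ("docs/tasks.md", "FastAPI background tasks run after the response is sent"),
        ("docs/routing.md", "Path operations and routers in FastAPI"),
        ("docs/sqlite.md", "SQLite stores the registry and metadata"),
    ],
)


def keyword_search(query: str, limit: int = 5) -> list:
    """Return (path, score) rows for a full-text query, best match first."""
    # bm25() yields smaller (more negative) values for better matches,
    # so ascending order puts the strongest hit at the top.
    return conn.execute(
        "SELECT path, bm25(chunks) AS score FROM chunks "
        "WHERE chunks MATCH ? ORDER BY score LIMIT ?",
        (query, limit),
    ).fetchall()


results = keyword_search("background tasks")
print(results)
```

FTS5 treats the two query words as an implicit AND, so only rows containing both terms are returned.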

3. Installation

Minimal Install

Core engine with CLI, SQLite FTS5 search, and mock fallbacks (no ML deps):

pip install -e .

Development Install

Core + testing, linting, API/HTTP test deps:

pip install -e ".[dev]"

This is sufficient for running the default fast test lane on a clean machine. Tests marked real_model require locally cached model dependencies and are reserved for local maintenance verification.

For day-to-day development, use the fast lane:

ruff check .
pytest -q -m "not real_model"

GitHub Actions runs this fast lane on push and pull request. Use the full real-backend lane before a maintenance handoff on a machine with cached models:

pytest
rag health --strict
rag integrity --deep
python scripts/run_eval.py
python scripts/ops_verify.py

scripts/run_eval.py prewarms the retriever by default and reports the warmup latency separately from per-query latency. Use --no-warmup when you explicitly want to measure process/model cold start.

Full Install

All backends and interfaces — real embeddings, vector store, reranker, MCP, HTTP API:

pip install -e ".[all]"

Selective Install

Install only the extras you need:

# MCP Server (stdio-based, for Claude Code / Codex / Gemini)
pip install -e ".[mcp]"

# HTTP FastAPI Server
pip install -e ".[api]"

# Real neural embeddings (BAAI/bge-m3 via sentence-transformers)
pip install -e ".[embedding]"

# Real cross-encoder reranking (BAAI/bge-reranker-large via CrossEncoder)
pip install -e ".[rerank]"

# Native Zvec vector index
pip install -e ".[vector]"

# ModelScope (魔搭) model download source
pip install -e ".[modelscope]"

4. Interfaces & Usage

4.1 CLI Interface (`rag` and `rag-project`)

System Health Check

rag health

Pack Registry Management

# List all registered packs and code projects
rag pack list

# Enable or disable specific packs
rag pack enable fastapi_docs
rag pack disable python_docs

# Build, export, and import packs
rag pack build my_pack /path/to/docs --name "My Documentation Pack" --domain "tech"
rag pack export my_pack --output /path/to/my_pack.tar.gz
rag pack import /path/to/my_pack.tar.gz

General Document Ingestion

rag ingest <file-or-dir> --pack <id> --source-type <private|pack>

Code Project Ingestion (Standalone or Subcommand)

Code ingestion supports common extensions (.py, .js, .ts, .go, .rs, etc.) and respects exclusions (.git, node_modules, .env).

# Using subcommand
rag project ingest /path/to/my-code --project my-app-id

# Using standalone tool
rag-project /path/to/my-code --project my-app-id

Search

rag search "how to handle background tasks" --limit 5 --mode hybrid --rerank bge

Notes on scope:

If no --packs scope is given, the engine searches all enabled packs.
--packs pack1,pack2 restricts results to those packs plus any user private notes (source_type='private'); other private_project packs are not automatically included.
Add --no-private to exclude user private notes while still honoring the listed packs (useful for targeting a single project).
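
The scope rules above can be sketched as a small function. This is an illustration only: resolve_scope and its parameters are hypothetical names, and it assumes private notes are included whenever --no-private is absent, which the notes above state explicitly only for the --packs case.

```python
def resolve_scope(enabled_packs, packs=None, no_private=False):
    """Return (pack_ids, include_private) for one search call.

    enabled_packs: iterable of pack IDs currently enabled in the registry.
    packs: the raw --packs value, e.g. "pack1,pack2", or None if not given.
    no_private: True when --no-private was passed.
    """
    if packs is None:
        # No explicit scope: search every enabled pack.
        pack_ids = sorted(enabled_packs)
    else:
        # Explicit scope: only the listed packs, ignoring empty entries.
        pack_ids = [p.strip() for p in packs.split(",") if p.strip()]
    # Assumption: private notes ride along unless --no-private is set.
    return pack_ids, not no_private


print(resolve_scope({"fastapi_docs", "python_docs"}))
print(resolve_scope({"fastapi_docs"}, packs="my_app_project", no_private=True))
```

The second call mirrors the single-project case: one listed pack and no private notes.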

Retrieve Chunk Details

rag chunk get <chunk_id>

4.2 HTTP API Interface

Configure a write token and the filesystem roots that HTTP write operations may access. Separate multiple roots with a semicolon (;) on Windows or a colon (:) on Unix.

$env:LOCAL_RAG_API_TOKEN = "replace-with-a-long-random-token"
$env:LOCAL_RAG_ALLOWED_ROOTS = "C:\Users\Zero\Documents;D:\Knowledge"

Run the FastAPI microservice:

uvicorn local_rag_core.interfaces.api:app --host 127.0.0.1 --port 8000

All mutating endpoints require the token in the X-Local-RAG-Token request header. Path-based write endpoints reject paths outside LOCAL_RAG_ALLOWED_ROOTS. Read-only health, pack listing, search, and chunk retrieval endpoints do not require the token.
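
A minimal sketch of how a path check against LOCAL_RAG_ALLOWED_ROOTS could work, assuming the variable is split on the platform path separator and that paths are compared after resolution. The helper names parse_roots and is_allowed are hypothetical, not the project's own functions.

```python
import os
from pathlib import Path


def parse_roots(raw: str, sep: str = os.pathsep) -> list:
    """Split a LOCAL_RAG_ALLOWED_ROOTS-style string into resolved paths."""
    return [Path(part).resolve() for part in raw.split(sep) if part.strip()]


def is_allowed(path, roots) -> bool:
    """True if path resolves to one of the roots or something beneath it."""
    target = Path(path).resolve()
    # Resolving first collapses ".." segments, so traversal out of a root fails.
    return any(target.is_relative_to(root) for root in roots)


base = Path.cwd() / "kb_root"  # illustrative root, not a real default
roots = parse_roots(str(base))
print(is_allowed(base / "notes" / "a.md", roots))
print(is_allowed(base / ".." / "other" / "b.md", roots))
```

Path.is_relative_to requires Python 3.9 or newer, which is within the project's stated Python >=3.10 requirement.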

Endpoints Summary

GET /health: Returns database health check and chunk count metadata.
GET /packs: Returns registered documentation packages and projects list.
POST /packs/{pack_id}/enable: Enables a registered pack.
POST /packs/{pack_id}/disable: Disables a registered pack.
GET /chunks/{chunk_id}: Retrieves full text content and metadata of a specific chunk.
POST /project/ingest: Ingests a local project repository.
- Body: {"path": "/absolute/path", "project": "project_id"}
POST /packs/build: Compiles a documentation source directory into a self-contained pack.
- Body: {"pack_id": "my_pack_id", "source_dir": "/path/to/docs", "name": "My Pack", "domain": "tech", "description": "desc"}
POST /packs/export: Bundles an installed pack directory into a tarball archive.
- Body: {"pack_id": "my_pack_id", "output_path": "/path/to/my_pack_id.tar.gz"}
POST /packs/import: Decompresses and registers a pack archive into the local library.
- Body: {"archive_path": "/path/to/my_pack_id.tar.gz"}

POST /search: Queries the RAG substrate using keyword, semantic, or hybrid configurations.

Body:

{
  "query": "FastAPI background workers",
  "limit": 8,
  "mode": "hybrid",
  "scope_private": true,
  "scope_packs": ["fastapi_docs"],
  "rerank": "bge"
}
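
The following sketch builds requests for the endpoints above using only the standard library. It assumes the server is listening on 127.0.0.1:8000 as in the uvicorn command; build_request is a hypothetical helper, and the requests are constructed but not sent so the example works without a running server.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumed from the uvicorn command


def build_request(path, body=None, token=None, method="POST"):
    """Build a JSON request; pass token only for mutating endpoints."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if token:
        # Header names are case-insensitive on the wire; urllib stores
        # this one internally as "X-local-rag-token".
        req.add_header("X-Local-RAG-Token", token)
    return req


# Read-only search: no token needed.
search_req = build_request("/search", {
    "query": "FastAPI background workers",
    "limit": 8,
    "mode": "hybrid",
    "scope_private": True,
    "scope_packs": ["fastapi_docs"],
    "rerank": "bge",
})

# Mutating call: token required.
enable_req = build_request(
    "/packs/fastapi_docs/enable", token="replace-with-a-long-random-token"
)

# To send against a running server:
# with urllib.request.urlopen(search_req) as resp:
#     print(json.load(resp))
```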

4.3 MCP Tool Interface

The MCP server connects local-rag-core directly to LLM clients (like Claude Desktop or Claude Code) using stdio.

Run the MCP server directly:

C:\Users\Zero\AppData\Local\hermes\hermes-agent\venv\Scripts\python.exe -m local_rag_core.interfaces.mcp_server

The interpreter path above is specific to the machine it was written on. Use any Python runtime that can import both local_rag_core and the mcp SDK, and on a migrated machine update each AI tool's MCP config to that machine's verified Python path.

Registered MCP Tools

kb.health: Checks database health and indexes (No arguments).
kb.list_packs: Lists registered packages/projects (No arguments).
kb.get_chunk: Fetches the full contents of a single chunk (Args: chunk_id).
kb.search: Performs semantic, keyword, or hybrid query search.
- Args: query (str), limit (int), mode (str), no_private (bool), scope_packs (List[str]), rerank (str).

Claude Desktop Configuration

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "local-rag-core": {
      "command": "python",
      "args": [
        "-m",
        "local_rag_core.interfaces.mcp_server"
      ]
    }
  }
}
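
Because a bare "python" command may resolve to an interpreter that lacks local_rag_core or the mcp SDK, it can help to generate the config with an absolute interpreter path. This sketch prints the same structure with the current interpreter filled in; mcp_config is a hypothetical helper, and it assumes you run it from the environment where local-rag-core is installed.

```python
import json
import sys


def mcp_config(python_path: str = sys.executable) -> dict:
    """Return a claude_desktop_config.json fragment for local-rag-core."""
    return {
        "mcpServers": {
            "local-rag-core": {
                "command": python_path,
                "args": ["-m", "local_rag_core.interfaces.mcp_server"],
            }
        }
    }


# Paste the printed JSON into claude_desktop_config.json.
print(json.dumps(mcp_config(), indent=2))
```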

5. Security & Exclusions

To avoid indexing credentials or build outputs, the scanner skips any path that matches the patterns defined in should_ignore():

Directories: .git, .venv, venv, node_modules, build, dist, __pycache__
Files: .env, .pem, .key, registry.sqlite, .log
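
A filter over these patterns might look like the sketch below. It is a hypothetical re-implementation for illustration, not the project's actual should_ignore(); in particular, treating .pem, .key, and .log as file extensions and .env and registry.sqlite as exact file names is an assumption.

```python
from pathlib import PurePosixPath

IGNORED_DIRS = {".git", ".venv", "venv", "node_modules", "build", "dist", "__pycache__"}
IGNORED_NAMES = {".env", "registry.sqlite"}   # assumed exact file names
IGNORED_SUFFIXES = {".pem", ".key", ".log"}   # assumed file extensions


def should_ignore(path: str) -> bool:
    """Return True if a forward-slash path should be skipped by the scanner."""
    p = PurePosixPath(path)
    # Any path component matching an ignored directory excludes the path.
    if any(part in IGNORED_DIRS for part in p.parts):
        return True
    return p.name in IGNORED_NAMES or p.suffix in IGNORED_SUFFIXES


for candidate in ["src/app.py", "node_modules/x/index.js", "config/.env", "certs/server.pem"]:
    print(candidate, should_ignore(candidate))
```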

Pack IDs are restricted to ASCII letters, digits, dots, underscores, and hyphens. Pack archives reject absolute paths, traversal entries, links, special files, excessive member counts, and excessive extracted sizes.
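
The pack ID rule and some of the archive checks can be sketched with the standard library. The function names are hypothetical, and the sketch leaves out the member-count and extracted-size limits; a real implementation would likely add those and further ID checks.

```python
import re
import tarfile
from pathlib import PurePosixPath

# ASCII letters, digits, dots, underscores, and hyphens; at least one character.
PACK_ID_RE = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_pack_id(pack_id: str) -> bool:
    return PACK_ID_RE.fullmatch(pack_id) is not None


def is_safe_member(member: tarfile.TarInfo) -> bool:
    """Reject absolute paths, traversal entries, links, and special files."""
    path = PurePosixPath(member.name)
    if path.is_absolute() or ".." in path.parts:
        return False
    # Only regular files and directories pass; symlinks, hard links,
    # devices, and FIFOs all fail this check.
    return member.isfile() or member.isdir()


print(is_valid_pack_id("fastapi_docs"), is_valid_pack_id("bad/id"))
print(is_safe_member(tarfile.TarInfo("docs/a.md")))
print(is_safe_member(tarfile.TarInfo("../etc/passwd")))
```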

6. Embedding Backend Modes

local-rag-core supports two embedding backends, controlled by the LOCAL_RAG_EMBEDDING_BACKEND environment variable.

mock (default when `sentence-transformers` is not installed)

A deterministic word-overlap hash-based embedding generator. Each word is hashed to a pseudo-random unit vector; the document vector is the L2-normalized sum of its word vectors.

Purpose: structural testing, CI, and lightweight development.
Does NOT provide true semantic retrieval.
Forced by: LOCAL_RAG_EMBEDDING_BACKEND=mock
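
The idea behind the mock backend can be sketched in pure Python. The dimension, the SHA-256 hash, the Gaussian sampling, and the whitespace tokenizer are all assumptions made for this illustration; the project's own generator may differ in each of them.

```python
import hashlib
import math
import random

DIM = 64  # assumed dimension for the sketch


def word_vector(word: str, dim: int = DIM) -> list:
    """Map a word to a deterministic pseudo-random unit vector."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))  # seed from the hash
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def embed(text: str, dim: int = DIM) -> list:
    """L2-normalized sum of the word vectors; zero vector for empty text."""
    total = [0.0] * dim
    for word in text.lower().split():
        for i, x in enumerate(word_vector(word, dim)):
            total[i] += x
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total] if norm else total


def cosine(a, b) -> float:
    # Inputs are unit vectors, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


print(cosine(embed("background tasks"), embed("background tasks in fastapi")))
print(cosine(embed("background tasks"), embed("vector database index")))
```

Texts that share words tend to score higher because they share summed components, which is why this approach measures word overlap rather than meaning.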

bge-m3 (requires `sentence-transformers`)

Uses the BAAI/bge-m3 model loaded via sentence-transformers. This is the real semantic embedding backend.

Install: pip install -e ".[embedding]" (add ,modelscope to prioritize ModelScope downloads)
Model: BAAI/bge-m3 (overridable via EMBEDDING_MODEL)
Cache: HuggingFace default cache (~/.cache/huggingface/hub/) or LOCAL_RAG_MODEL_CACHE if set. ModelScope uses a modelscope/ subdir under the same cache root when selected.
Offline by default: set ALLOW_MODEL_DOWNLOAD=true to allow downloading the model on first use.
Forced by: LOCAL_RAG_EMBEDDING_BACKEND=bge-m3

Model Download Sources

By default (LOCAL_RAG_MODEL_SOURCE=auto), local-rag-core tries to download models from ModelScope (魔搭) first when the modelscope package is installed, and falls back to HuggingFace Hub otherwise. Model IDs are identical on both platforms (BAAI/bge-m3, BAAI/bge-reranker-large).

Source	Behavior
`auto`	ModelScope first if installed, else HuggingFace
`modelscope`	Force ModelScope; error if `modelscope` is not installed
`huggingface`	Force HuggingFace Hub

Set USE_MODELSCOPE=true as a shorthand for LOCAL_RAG_MODEL_SOURCE=modelscope. If both are set, LOCAL_RAG_MODEL_SOURCE wins.
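
The precedence described above can be expressed as a short resolver. This is a sketch under stated assumptions: resolve_model_source is a hypothetical name, the accepted truthy strings are taken to be "true" and "1", and the exact error type raised by the real code is not known.

```python
import importlib.util
import os


def resolve_model_source(env=os.environ, modelscope_installed=None) -> str:
    """Return "modelscope" or "huggingface" as the primary download source."""
    if modelscope_installed is None:
        modelscope_installed = importlib.util.find_spec("modelscope") is not None

    source = env.get("LOCAL_RAG_MODEL_SOURCE", "").strip().lower()
    if not source:
        # USE_MODELSCOPE is only consulted when LOCAL_RAG_MODEL_SOURCE is unset.
        shorthand = env.get("USE_MODELSCOPE", "").strip().lower() in {"true", "1"}
        source = "modelscope" if shorthand else "auto"

    if source == "auto":
        return "modelscope" if modelscope_installed else "huggingface"
    if source == "huggingface":
        return "huggingface"
    if source == "modelscope":
        if not modelscope_installed:
            raise RuntimeError("modelscope source requested but package is missing")
        return "modelscope"
    raise ValueError(f"unknown model source: {source!r}")


print(resolve_model_source({}, modelscope_installed=False))
```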

auto (default)

When LOCAL_RAG_EMBEDDING_BACKEND is unset or set to auto, the system selects bge-m3 if sentence-transformers is importable, otherwise falls back to mock.
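
A minimal sketch of this selection logic, assuming availability is detected by checking whether the sentence_transformers module can be found. The function name and the error raised for unknown values are hypothetical.

```python
import importlib.util


def select_embedding_backend(setting=None, st_available=None) -> str:
    """Resolve a LOCAL_RAG_EMBEDDING_BACKEND value to "mock" or "bge-m3"."""
    if st_available is None:
        # find_spec checks importability without loading the package.
        st_available = importlib.util.find_spec("sentence_transformers") is not None

    value = (setting or "auto").strip().lower()
    if value == "auto":
        return "bge-m3" if st_available else "mock"
    if value in {"mock", "bge-m3"}:
        return value  # explicitly forced
    raise ValueError(f"unknown embedding backend: {value!r}")


print(select_embedding_backend(None, st_available=False))
print(select_embedding_backend("auto", st_available=True))
```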

Configuration Reference

Env Var	Values	Default	Effect
`LOCAL_RAG_EMBEDDING_BACKEND`	`auto`, `mock`, `bge-m3`	`auto`	Selects embedding backend
`LOCAL_RAG_MODEL_SOURCE`	`auto`, `modelscope`, `huggingface`	`auto`	Primary model download source
`USE_MODELSCOPE`	`true`, `false`, `1`, `0`	`false`	Shorthand for `LOCAL_RAG_MODEL_SOURCE=modelscope`
`LOCAL_RAG_MODEL_CACHE`	any path	(HF default)	Custom model cache directory
`EMBEDDING_MODEL`	HF/MS model name	`BAAI/bge-m3`	Which model to load
`ALLOW_MODEL_DOWNLOAD`	`true`, `false`, `1`, `0`	`false`	Allow first-time model download
`DEVICE`	`cpu`, `cuda`	`cpu`	Torch device

Health Output

$ rag health
Embedding backend: mock       # sentence-transformers not installed, or forced
Embedding backend: bge-m3     # sentence-transformers available and model loaded
Model source: auto            # auto / modelscope / huggingface
ModelScope available: false   # true when modelscope package is installed
Registered packs: 22
Indexed chunks: 69341
Disabled packs: 2
Pack status counts: {"disabled": 2, "enabled": 20}
Pack source type counts: {"pack": 12, "private_project": 6, "source_code_index": 4}
Chunk source type counts: {"pack": 58437, "private_project": 10841, "source_code_index": 63}
Latest audit action: ingest_path

7. Mock / Fallback Modes

To support lightweight local development and testing, local-rag-core falls back to built-in substitutes when machine learning dependencies or vector backends are missing:

Mock Embedding: See §6 above for full details.
Simple Flat Vector Store: If the zvec binary extension is not installed, the engine uses SimpleFlatVectorStore, a pure-Python in-process cosine similarity engine storing vectors in JSON files.
Mock Reranking: If sentence-transformers / CrossEncoder cannot be used, the reranker falls back to a simple query-document word-overlap heuristic.
Verification: Always run rag health --strict to inspect which backends are active (mock vs. real). Production readiness requires installing the corresponding extras (embedding, rerank, vector) and locally cached models.
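
To make the two fallbacks concrete, here is a sketch of a flat cosine-similarity store that serializes to JSON and of a word-overlap reranking score. FlatStoreSketch and overlap_score are hypothetical stand-ins, not the project's SimpleFlatVectorStore or its reranker, and the overlap formula (the share of query words found in the document) is an assumption.

```python
import json
import math


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class FlatStoreSketch:
    """Brute-force vector store: every query scans all stored vectors."""

    def __init__(self):
        self.vectors = {}

    def add(self, chunk_id: str, vector) -> None:
        self.vectors[chunk_id] = [float(x) for x in vector]

    def to_json(self) -> str:
        return json.dumps(self.vectors)  # write this string to a file to persist

    @classmethod
    def from_json(cls, text: str) -> "FlatStoreSketch":
        store = cls()
        store.vectors = json.loads(text)
        return store

    def search(self, query, k: int = 5) -> list:
        scored = [(cid, cosine(query, vec)) for cid, vec in self.vectors.items()]
        return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


def overlap_score(query: str, document: str) -> float:
    """Fraction of distinct query words that also appear in the document."""
    q = set(query.lower().split())
    d = set(document.lower().split())
    return len(q & d) / len(q) if q else 0.0


store = FlatStoreSketch()
store.add("a", [1, 0])
store.add("b", [0, 1])
store.add("c", [1, 1])
print(store.search([1.0, 0.0], k=2))
print(overlap_score("background tasks", "fastapi background tasks run later"))
```

A linear scan like this is fine for small test corpora but scales with the number of stored vectors, which is one reason the native vector extra is recommended for production use.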

Embedding model loading is offline by default. Even when sentence-transformers is installed, only locally cached model files are used. Set ALLOW_MODEL_DOWNLOAD=true only when a model download has been explicitly approved.

8. Knowledge Governance

Long-term use of local-rag-core depends on clear boundaries between different kinds of knowledge. See KNOWLEDGE_GOVERNANCE.md for the full policy. The summary below describes the source types and basic rules.

Source Types

`source_type`	Purpose	Example Content
`private_project`	Documentation of an active project you maintain.	`README.md`, `CLAUDE.md`, `PROJECT_STATUS.md`, `CHANGELOG.md`, `docs/**/*.md`
`pack`	Reusable, shareable documentation package.	Tutorials, framework guides, methodology docs
`translator_pack`	Translation terminology, profiles, and history.	Glossaries, profiles, translation memory
`browser_saved`	Web pages explicitly saved by the user.	Curated web articles, reference pages
`browser_context`	Temporary context from the current web page.	Current page summary, used once and discarded
`source_code_index`	Indexed source code (disabled by default).	`src/**/*.py` only when explicitly enabled
`scratch`	Temporary experimental material.	Drafts, quick tests, one-off notes

Pack ID Conventions

Active project docs: <project_name>_project
- Example: local_rag_core_project, local_llm_pipeline_project
Reusable doc package: <topic>_pack
- Example: fastapi_docs_pack
Translation assets: translator_pack
Saved web pages: browser_saved

Ingestion Rules

Project docs go into private_project.
Reusable tutorials / frameworks go into pack.
Translation assets go into translator_pack.
Web pages are temporary by default; manual save is required for browser_saved.
Source code indexing is explicit-scope only. Curated source indexes use source_code_index and should be queried intentionally rather than mixed into broad documentation retrieval by default.
MCP tools are read-only. Ingestion, export, import, delete, and reindex must use the CLI or authenticated HTTP API.

Exclusions

The scanner ignores:

.git/, .venv/, venv/, __pycache__/
node_modules/, dist/, build/
storage/, data/packs/
.env, .pem, .key, .log
Large binaries and generated lock files unless explicitly required

9. Verification & Testing Commands

To run basic checks, test suites, and inspect RAG health:

# 1. Install development tools and run test suite
pip install -e ".[dev]"
pytest
ruff check .

# 2. Run system health check and verify status
rag health

# 3. Test pack building and list packages
rag pack list
rag pack build my_docs_pack ./docs --name "Documentation Pack"
rag pack list

# 4. Ingest and query
rag ingest README.md --pack readme_pack --source-type pack
rag search "automatic query routing" --limit 3 --mode keyword
rag search "local knowledge library for code assistants" --limit 3 --mode hybrid

# 5. Verify MCP Tool interface (if mcp extra installed)
C:\Users\Zero\AppData\Local\hermes\hermes-agent\venv\Scripts\python.exe -m local_rag_core.interfaces.mcp_server

# 6. Verify AI-tool entrypoint readiness
#    Checks Codex, Claude Code, Gemini CLI, OpenCode config, and stdio launch.
PYTHONIOENCODING=utf-8 python scripts/ops_verify.py
