
Provides semantic search and RAG over markdown documentation using Ollama for embeddings and chat.


rag

by FrameMuse


TypeScript

Local

rag

rag is a CLI tool and MCP server that turns codebases and documentation into a searchable, queryable knowledge base with vector search, RAG, and a structural knowledge graph.

Prerequisites

Bun runtime
Ollama running locally with an embedding model (auto-pulled if missing)

Minimum hardware

Component	Requirement
RAM	4 GB (8 GB for larger doc sets)
CPU	Any x86-64 or ARM64, 2+ cores
GPU	Optional. Any NVIDIA GPU with 2+ GB VRAM. CPU-only fallback is functional but slower
Disk	100 MB for index (scales with doc count)

Indexing 5000 chunks: ~25s on RTX 3060, ~3min on CPU-only.


Install

git clone https://github.com/FrameMuse/llm-rag.git
cd llm-rag
bun install

Add a shell alias:

alias rag='bun /path/to/llm-rag/scripts/cli.ts'

Quick start

cd my-project
rag init              # create .rag/ project scope
rag index             # chunk, embed, index all files
rag mcp search "..."  # search indexed content
rag mcp graph "..."   # query knowledge graph
rag serve             # start MCP server

Commands

Command	Description
`rag init`	Create .rag/ config, mcp.json, .gitignore
`rag index`	Chunk files, embed via Ollama, store in LanceDB
`rag serve`	Start MCP server (STDIO) for current .rag/ scope
`rag graph build`	Build knowledge graph from code and docs
`rag mcp <tool>`	One-shot CLI proxy for MCP tools
`rag info`	Show index statistics
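`rag index` chunks files before embedding them, and the Architecture flowchart labels that step a heading split. The following is a minimal sketch of what heading-based splitting could look like, not the tool's actual chunker. The `Section` type and `splitByHeadings` function are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape for one chunk: the heading text plus the body under it.
interface Section {
  heading: string;
  text: string;
}

// Split markdown into sections at ATX headings (#, ##, ...).
// Limitation of this sketch: a "#" line inside a fenced code block is also treated as a heading.
function splitByHeadings(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { heading: "", text: "" };
  const flush = () => {
    // Skip the empty placeholder that exists before any content is seen.
    if (current.heading || current.text.trim()) sections.push(current);
  };
  for (const line of markdown.split("\n")) {
    if (/^#{1,6}\s/.test(line)) {
      flush();
      current = { heading: line.replace(/^#{1,6}\s+/, "").trim(), text: "" };
    } else {
      current.text += line + "\n";
    }
  }
  flush();
  return sections;
}

const sample = "Intro text\n# Install\nRun bun install.\n## Alias\nAdd an alias.\n";
const sections = splitByHeadings(sample);
console.log(sections.map((s) => s.heading || "(preamble)").join(", "));
// prints: (preamble), Install, Alias
```

Text before the first heading is kept as its own section with an empty heading, so no content is dropped.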

rag mcp tools

Tool	Usage	Description
`search`	`rag mcp search "query" [--chunks N] [--limit N]`	Semantic search
`graph`	`rag mcp graph "topic" [--signature] [--limit N]`	Knowledge graph query
`get-document`	`rag mcp get-document <path>`	Read file content
`list-documents`	`rag mcp list-documents`	List indexed files
`config`	`rag mcp config`	Print opencode.json snippet

Project scope (.rag/)

project/
├── .rag/
│   ├── config.json       # { name, embedModel, ragModel, visionModel, pattern, chunks, temperature }
│   ├── mcp.json          # MCP config snippet for opencode.json
│   ├── .gitignore        # *
│   ├── data/
│   │   ├── lancedb/      # Vector index (generated by rag index)
│   │   └── graph.json    # Knowledge graph (generated by rag index)
├── *.md
├── src/
└── ...

Each project keeps its index and graph local. rag discovers .rag/ by walking up from the current directory, the same way git locates .git/.
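The walk-up discovery can be sketched as follows. This is an illustration under assumptions, not the project's code: `findRagRoot` is a hypothetical helper, and it assumes the nearest ancestor containing `.rag/` wins.

```typescript
import { existsSync, mkdirSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join, resolve } from "node:path";

// Hypothetical helper: walk up from `start` until a directory containing `.rag/` is found.
// Returns that directory, or null when the filesystem root is reached without a match.
function findRagRoot(start: string): string | null {
  let dir = resolve(start);
  while (true) {
    if (existsSync(join(dir, ".rag"))) return dir;
    const parent = dirname(dir);
    if (parent === dir) return null; // dirname of the root is the root itself
    dir = parent;
  }
}

// Usage: a nested directory resolves to the ancestor that holds .rag/.
const root = mkdtempSync(join(tmpdir(), "rag-demo-"));
mkdirSync(join(root, ".rag"));
const nested = join(root, "src", "deep");
mkdirSync(nested, { recursive: true });
console.log(findRagRoot(nested) === root); // prints: true
```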

MCP integration

{
  "mcp": {
    "my-project": {
      "type": "local",
      "command": ["rag", "serve"],
      "cwd": "/path/to/project",
      "enabled": true
    }
  }
}

The MCP server exposes 8 tools:

Tool	Purpose
`search`	Vector search
`graph_find`	Search graph nodes
`graph_neighbors`	Node connections
`graph_god_refs`	Core abstractions
`graph_path`	Shortest path
`graph_communities`	List communities
`list_documents`	List indexed files
`get_document`	Read file content

Run rag mcp config from the project directory to print the JSON snippet above with cwd pre-filled.
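For reference, the shape of that snippet can be expressed as a small typed builder. This is only a sketch that mirrors the JSON shown above. `McpEntry` and `buildMcpSnippet` are hypothetical names, and how `rag mcp config` actually produces its output is not documented here.

```typescript
// One entry in the "mcp" map, mirroring the fields in the JSON snippet.
interface McpEntry {
  type: "local";
  command: string[];
  cwd: string;
  enabled: boolean;
}

// Hypothetical helper: build the snippet for a project name and working directory.
function buildMcpSnippet(name: string, cwd: string): { mcp: Record<string, McpEntry> } {
  return {
    mcp: {
      [name]: { type: "local", command: ["rag", "serve"], cwd, enabled: true },
    },
  };
}

// In a real CLI the cwd would likely come from the discovered project directory.
const snippet = buildMcpSnippet("my-project", "/path/to/project");
console.log(JSON.stringify(snippet, null, 2));
```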

Architecture

flowchart LR
  MD[.md files] --> Chunker
  MD2[.ts/.js files] --> AST
  AST -->|declarations| Graph
  MD -->|headings + links| Graph
  Chunker -->|heading split| Chunks
  Chunks -->|Ollama embed| Vectors
  Vectors -->|store| LanceDB
  Query -->|embed| LanceDB
  LanceDB -->|search| Results
  Question -->|embed + search| Context
  Context -->|Ollama chat| Answer
  Graph -->|structural context| Answer

Vector RAG: chunks embedded → vector search → top K → LLM synthesis
Knowledge graph: TS/JS AST and MD headings/links → nodes + edges → structural queries
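The "vector search → top K" step can be illustrated with a brute-force cosine-similarity ranking. This is a sketch of the general technique only: in rag the search is delegated to LanceDB, and the `EmbeddedChunk` type, the function names, and the toy 3-dimensional vectors below are assumptions made for the example.

```typescript
// Hypothetical shape: a chunk id paired with its embedding vector.
interface EmbeddedChunk {
  id: string;
  vector: number[];
}

// Cosine similarity between two vectors of equal length; 0 when either is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every chunk against the query and keep the k best matches.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosine(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

const indexed: EmbeddedChunk[] = [
  { id: "install", vector: [1, 0, 0] },
  { id: "config", vector: [0, 1, 0] },
  { id: "graph", vector: [0.9, 0.1, 0] },
];
console.log(topK([1, 0, 0], indexed, 2).map((c) => c.id).join(", "));
// prints: install, graph
```

The selected chunks would then be passed to the chat model as context, which is the "LLM synthesis" step.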

Knowledge graph

The knowledge graph extracts structural relationships from TypeScript, JavaScript, and Markdown files:

TS/JS: functions, classes, interfaces, types, enums, imports, extends, class members
MD: headings, frontmatter titles, cross-document links
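To make the Markdown side concrete, here is a rough sketch of turning headings and cross-document links into nodes and edges. The node and edge shapes, the `contains` and `links_to` edge type names, and `extractMarkdownGraph` itself are all assumptions for illustration. The real graph schema may differ, and frontmatter titles are not handled here.

```typescript
interface GraphNode {
  id: string;
  kind: "document" | "heading";
  label: string;
}

interface GraphEdge {
  from: string;
  to: string;
  type: "contains" | "links_to";
}

// Extract one document node, one node per heading, and one edge per link to another .md file.
// Limitation of this sketch: headings inside fenced code blocks are not excluded.
function extractMarkdownGraph(path: string, source: string): { nodes: GraphNode[]; edges: GraphEdge[] } {
  const nodes: GraphNode[] = [{ id: path, kind: "document", label: path }];
  const edges: GraphEdge[] = [];
  for (const line of source.split("\n")) {
    const heading = /^(#{1,6})\s+(.+?)\s*$/.exec(line);
    if (heading) {
      const id = path + "#" + heading[2];
      nodes.push({ id, kind: "heading", label: heading[2] });
      edges.push({ from: path, to: id, type: "contains" });
    }
    // Inline links whose target ends in .md (optionally followed by #anchor).
    const linkPattern = /\[[^\]]*\]\(([^)#\s]+\.md)(#[^)\s]*)?\)/g;
    let link = linkPattern.exec(line);
    while (link !== null) {
      edges.push({ from: path, to: link[1], type: "links_to" });
      link = linkPattern.exec(line);
    }
  }
  return { nodes, edges };
}

const doc = "# Install\nSee [setup](docs/setup.md) first.\n## Usage\n";
const graph = extractMarkdownGraph("README.md", doc);
console.log(graph.nodes.length, graph.edges.length); // prints: 3 3
```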

Two-tier design

Free-form — shows everything the graph knows about a topic in one report:

rag mcp graph "render"
→ Matching references + top match detail + connections + community + god rank + surprises

Subcommands — focused queries when you know what you need:

Subcommand	Description
`rag mcp graph god-refs [--limit N]`	Most connected core abstractions
`rag mcp graph communities`	List all directory-based communities
`rag mcp graph community <id>`	Show all references in a community
`rag mcp graph surprises [--limit N]`	Cross-community surprising connections
`rag mcp graph cycles`	Detect circular imports
`rag mcp graph neighbors <node>`	Connections for a node
`rag mcp graph path <from> <to>`	Shortest path between two nodes
`rag mcp graph list`	Reference and edge counts

Flags:

--signature — show declaration signatures (e.g., function render(ctx: CanvasCtx): void)
--limit N — max results to show (default 10)
--dir in|out|both — direction for neighbors (default both)
--type <edgeType> — filter edges by type

The graph is built automatically at the end of each rag index run and updated incrementally in --watch mode.
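A query such as `rag mcp graph path <from> <to>` is a shortest-path search over the graph's edges. The sketch below shows the standard breadth-first approach. It assumes edges are followed in both directions and uses made-up file names, neither of which is confirmed by the tool's documentation.

```typescript
type Edge = [string, string];

// Breadth-first search: returns the node sequence of a shortest path, or null if none exists.
function shortestPath(edges: Edge[], from: string, to: string): string[] | null {
  // Build an undirected adjacency list (assumption: direction is ignored for path queries).
  const adjacency = new Map<string, string[]>();
  const connect = (a: string, b: string) => {
    adjacency.set(a, [...(adjacency.get(a) ?? []), b]);
  };
  for (const [a, b] of edges) {
    connect(a, b);
    connect(b, a);
  }

  // `previous` doubles as the visited set and the back-pointer table.
  const previous = new Map<string, string | null>();
  previous.set(from, null);
  const queue: string[] = [from];
  while (queue.length > 0) {
    const node = queue.shift() as string;
    if (node === to) {
      const path: string[] = [];
      let current: string | null = node;
      while (current !== null) {
        path.unshift(current);
        current = previous.get(current) ?? null;
      }
      return path;
    }
    for (const next of adjacency.get(node) ?? []) {
      if (!previous.has(next)) {
        previous.set(next, node);
        queue.push(next);
      }
    }
  }
  return null;
}

// Hypothetical import edges between files.
const importEdges: Edge[] = [
  ["cli.ts", "index.ts"],
  ["index.ts", "chunker.ts"],
  ["cli.ts", "serve.ts"],
];
console.log((shortestPath(importEdges, "serve.ts", "chunker.ts") ?? []).join(" -> "));
// prints: serve.ts -> cli.ts -> index.ts -> chunker.ts
```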

Vision (image captioning)

Images are captioned via qwen3-vl during index phase 2 (text first, then images in parallel with 4 workers). The caption text is embedded and stored alongside text chunks, making images searchable by description.

Supported: .png .jpg .jpeg .gif .webp .svg (SVG via sharp).

Requires qwen3-vl pulled in Ollama.
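Captioning images "in parallel with 4 workers" is a bounded-concurrency pattern. The following sketch shows one common way to implement it. `mapWithWorkers` and the stub captioner are hypothetical, and the real indexer may schedule its work differently.

```typescript
// Run `task` over all items with at most `workers` tasks in flight; results keep input order.
async function mapWithWorkers<T, R>(
  items: T[],
  workers: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor: each worker claims the next unprocessed index
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index]);
    }
  }
  const count = Math.min(workers, items.length);
  await Promise.all(Array.from({ length: count }, () => worker()));
  return results;
}

// Stub standing in for a call to the vision model.
const captionStub = async (path: string): Promise<string> => "caption for " + path;

mapWithWorkers(["a.png", "b.jpg", "c.webp"], 4, captionStub).then((captions) => {
  console.log(captions.join(" | "));
  // prints: caption for a.png | caption for b.jpg | caption for c.webp
});
```

Because JavaScript runs the claim step (`next++`) synchronously, the shared cursor needs no lock.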

Configuration

.rag/config.json:

{
  "name": "my-project",
  "embedModel": "mxbai-embed-large",
  "ragModel": "llama3.2:3b",
  "visionModel": "qwen3-vl",
  "pattern": "",
  "chunks": 8,
  "temperature": 0.3
}

Missing models are pulled automatically. The --chunks flag overrides the chunks setting for a single query.
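As an illustration of the config shape and the per-query override, here is a small sketch. The `RagConfig` interface is inferred from the example file above, treating every field as required is an assumption, and `parseConfig` and `resolveChunks` are hypothetical helpers rather than part of rag.

```typescript
// Field names taken from the example .rag/config.json; types inferred from its values.
interface RagConfig {
  name: string;
  embedModel: string;
  ragModel: string;
  visionModel: string;
  pattern: string;
  chunks: number;
  temperature: number;
}

// Parse and validate a config file's contents, failing loudly on a wrong or missing field.
function parseConfig(json: string): RagConfig {
  const raw = JSON.parse(json) as Record<string, unknown>;
  const str = (key: string): string => {
    const value = raw[key];
    if (typeof value !== "string") throw new Error(key + " must be a string");
    return value;
  };
  const num = (key: string): number => {
    const value = raw[key];
    if (typeof value !== "number") throw new Error(key + " must be a number");
    return value;
  };
  return {
    name: str("name"),
    embedModel: str("embedModel"),
    ragModel: str("ragModel"),
    visionModel: str("visionModel"),
    pattern: str("pattern"),
    chunks: num("chunks"),
    temperature: num("temperature"),
  };
}

// A --chunks value, when given, takes precedence over the configured one.
function resolveChunks(config: RagConfig, chunksFlag?: number): number {
  return chunksFlag ?? config.chunks;
}

const exampleConfig = parseConfig(
  '{"name":"my-project","embedModel":"mxbai-embed-large","ragModel":"llama3.2:3b",' +
    '"visionModel":"qwen3-vl","pattern":"","chunks":8,"temperature":0.3}',
);
console.log(resolveChunks(exampleConfig), resolveChunks(exampleConfig, 3)); // prints: 8 3
```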

License

MIT
