EyeLevel RAG MCP Server

A local Retrieval-Augmented Generation (RAG) system implemented as an MCP (Model Context Protocol) server. This server allows you to ingest markdown files into a local knowledge base and perform semantic search to retrieve relevant context for LLM queries.

Features

  • Local RAG Implementation: No external APIs or paid services required; everything runs on your machine

  • Markdown File Support: Ingest and search through .md files

  • Semantic Search: Uses sentence transformers for embedding-based similarity search

  • Persistent Storage: Automatically saves and loads the vector index using FAISS

  • Chunk Management: Intelligently splits documents into searchable chunks

  • Multiple Documents: Support for ingesting and searching across multiple markdown files

Installation

  1. Clone this repository

  2. Install dependencies using uv:

    uv sync

Dependencies

  • sentence-transformers: For creating text embeddings

  • faiss-cpu: For efficient vector similarity search

  • numpy: For numerical operations

  • mcp[cli]: For the MCP server framework

Available Tools

1. search_doc_for_rag_context(query: str)

Searches the knowledge base for relevant context based on a user query.

Parameters:

  • query (str): The search query

Returns:

  • Relevant text chunks with relevance scores

2. ingest_markdown_file(local_file_path: str)

Ingests a markdown file into the knowledge base.

Parameters:

  • local_file_path (str): Path to the markdown file to ingest

Returns:

  • Status message indicating success or failure

3. list_indexed_documents()

Lists all documents currently in the knowledge base.

Returns:

  • Summary of indexed files and chunk counts

4. clear_knowledge_base()

Clears all documents from the knowledge base.

Returns:

  • Confirmation message
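
For reference, tools with these signatures would typically be exposed through the FastMCP interface from mcp[cli] roughly as in the sketch below. This is illustrative only, not the exact code in main.py, and the server name and docstrings are assumptions.

    # Illustrative sketch -- the real implementation lives in main.py
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("eyelevel-rag")  # hypothetical server name

    @mcp.tool()
    def search_doc_for_rag_context(query: str) -> str:
        """Return the most relevant indexed chunks for the query."""
        # embed the query, search the FAISS index, format the hits
        ...

    @mcp.tool()
    def ingest_markdown_file(local_file_path: str) -> str:
        """Chunk, embed, and index a markdown file."""
        ...

    if __name__ == "__main__":
        mcp.run()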

Usage

  1. Start the server:

    python main.py

  2. Ingest markdown files: Use the ingest_markdown_file tool to add your .md files to the knowledge base.

  3. Search for context: Use the search_doc_for_rag_context tool to find relevant information for your queries.

How It Works

  1. Document Processing: Markdown files are split into chunks based on paragraphs and sentence boundaries

  2. Embedding Creation: Text chunks are converted to embeddings using the all-MiniLM-L6-v2 model

  3. Vector Storage: Embeddings are stored in a FAISS index for fast similarity search

  4. Retrieval: User queries are embedded and matched against the stored vectors to find relevant content
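
The core pipeline can be pictured in a few lines of sentence-transformers and FAISS. This is a minimal sketch assuming an L2 flat index and paragraph-level chunks; the actual chunking and scoring in main.py may differ.

    import faiss
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    # 1-2. Chunk the document and embed each chunk
    chunks = ["First paragraph of a markdown file...", "Second paragraph..."]
    embeddings = np.asarray(model.encode(chunks), dtype="float32")

    # 3. Store the embeddings in a FAISS index
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings)

    # 4. Embed the query and retrieve the closest chunks
    query = np.asarray(model.encode(["What does the first paragraph say?"]), dtype="float32")
    distances, ids = index.search(query, 2)
    for dist, i in zip(distances[0], ids[0]):
        print(f"{dist:.3f}  {chunks[i]}")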

File Structure

  • main.py: Main server implementation with RAG functionality

  • pyproject.toml: Project dependencies and configuration

  • rag_index.faiss: FAISS vector index (created automatically)

  • rag_documents.pkl: Serialized documents and metadata (created automatically)
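
The two persisted files pair a FAISS index with pickled chunk metadata. Saving and restoring them generally follows the pattern below; the chunks and sources names are placeholders, not necessarily what main.py uses.

    import pickle
    import faiss

    # Persist the vector index and the chunk metadata side by side
    faiss.write_index(index, "rag_index.faiss")
    with open("rag_documents.pkl", "wb") as f:
        pickle.dump({"chunks": chunks, "sources": sources}, f)

    # On the next start-up, reload both so no re-ingestion is needed
    index = faiss.read_index("rag_index.faiss")
    with open("rag_documents.pkl", "rb") as f:
        data = pickle.load(f)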

Configuration

The RAG system uses the all-MiniLM-L6-v2 sentence transformer model by default. This model provides a good balance between speed and quality for semantic search tasks.
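
If you want to trade speed for quality, the model name is the main knob. Assuming main.py constructs the embedder directly with sentence-transformers, swapping it is a one-line change; the variable name and alternative model below are illustrative.

    from sentence_transformers import SentenceTransformer

    # Default: small and fast
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    # Alternative: larger model, better retrieval quality, slower ingestion
    # embedder = SentenceTransformer("all-mpnet-base-v2")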

Example Workflow

  1. Prepare your markdown files with the content you want to search

  2. Use ingest_markdown_file to add each file to the knowledge base

  3. Use search_doc_for_rag_context to find relevant context for your questions

  4. The retrieved context can be used by an LLM to provide informed answers
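
From a Python MCP client, that workflow might look like the following sketch using the mcp SDK's stdio client; the file path and query are placeholders.

    import asyncio
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    async def main():
        server = StdioServerParameters(command="python", args=["main.py"])
        async with stdio_client(server) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                # Step 2: add a file to the knowledge base
                await session.call_tool("ingest_markdown_file",
                                        {"local_file_path": "docs/notes.md"})
                # Step 3: retrieve context for a question
                result = await session.call_tool("search_doc_for_rag_context",
                                                 {"query": "How does chunking work?"})
                print(result.content)

    asyncio.run(main())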

Notes

  • The first time you run the server, it will download the sentence transformer model

  • The vector index is automatically saved and loaded between sessions

  • Long documents are automatically chunked to optimize search performance

  • The system supports multiple markdown files and maintains source file metadata
