# EyeLevel RAG MCP Server
A local Retrieval-Augmented Generation (RAG) system implemented as an MCP (Model Context Protocol) server. This server allows you to ingest markdown files into a local knowledge base and perform semantic search to retrieve relevant context for LLM queries.
## Features
- **Local RAG Implementation**: Runs entirely on your machine; no external APIs or paid services required
- **Markdown File Support**: Ingest and search through `.md` files
- **Semantic Search**: Uses sentence transformers for embedding-based similarity search
- **Persistent Storage**: Automatically saves and loads the vector index using FAISS
- **Chunk Management**: Splits documents into searchable chunks at paragraph and sentence boundaries
- **Multiple Documents**: Support for ingesting and searching across multiple markdown files
## Installation
1. Clone this repository
2. Install dependencies using uv:
```bash
uv sync
```
## Dependencies
- `sentence-transformers`: For creating text embeddings
- `faiss-cpu`: For efficient vector similarity search
- `numpy`: For numerical operations
- `mcp[cli]`: For the MCP server framework
## Available Tools
### 1. `search_doc_for_rag_context(query: str)`
Searches the knowledge base for relevant context based on a user query.
**Parameters:**
- `query` (str): The search query
**Returns:**
- Relevant text chunks with relevance scores
### 2. `ingest_markdown_file(local_file_path: str)`
Ingests a markdown file into the knowledge base.
**Parameters:**
- `local_file_path` (str): Path to the markdown file to ingest
**Returns:**
- Status message indicating success or failure
### 3. `list_indexed_documents()`
Lists all documents currently in the knowledge base.
**Returns:**
- Summary of indexed files and chunk counts
### 4. `clear_knowledge_base()`
Clears all documents from the knowledge base.
**Returns:**
- Confirmation message
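Under the `mcp[cli]` framework, each tool is an ordinary Python function exposed to clients via a decorator. The following is a minimal, hypothetical sketch of how two of the tools above could be registered with the SDK's FastMCP API; the in-memory `_chunks` list stands in for the real FAISS-backed index in `main.py`, and the function bodies are illustrative only.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("eyelevel-rag")

# Hypothetical in-memory stand-in for the FAISS-backed index in main.py.
_chunks: list[str] = []

@mcp.tool()
def ingest_markdown_file(local_file_path: str) -> str:
    """Read a markdown file and add its text to the knowledge base."""
    with open(local_file_path, encoding="utf-8") as f:
        _chunks.append(f.read())
    return f"Ingested {local_file_path}"

@mcp.tool()
def search_doc_for_rag_context(query: str) -> str:
    """Return stored text containing the query (the real server uses embeddings)."""
    hits = [c for c in _chunks if query.lower() in c.lower()]
    return "\n\n".join(hits) if hits else "No matching context found."

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```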
## Usage
1. **Start the server:**
```bash
python main.py
```
2. **Ingest markdown files:**
Use the `ingest_markdown_file` tool to add your `.md` files to the knowledge base.
3. **Search for context:**
Use the `search_doc_for_rag_context` tool to find relevant information for your queries.
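The same tools can also be driven programmatically from any MCP client. Below is a hedged sketch using the stdio client from the `mcp` Python SDK; the tool names match this server, but the file path and query are illustrative.

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch the server as a subprocess and talk to it over stdio.
    params = StdioServerParameters(command="python", args=["main.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Add a file to the knowledge base, then query it.
            await session.call_tool(
                "ingest_markdown_file",
                {"local_file_path": "docs/notes.md"},  # illustrative path
            )
            result = await session.call_tool(
                "search_doc_for_rag_context", {"query": "vector storage"}
            )
            print(result.content)

asyncio.run(main())
```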
## How It Works
1. **Document Processing**: Markdown files are split into chunks based on paragraphs and sentence boundaries
2. **Embedding Creation**: Text chunks are converted to embeddings using the `all-MiniLM-L6-v2` model
3. **Vector Storage**: Embeddings are stored in a FAISS index for fast similarity search
4. **Retrieval**: User queries are embedded and matched against the stored vectors to find relevant content
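As a rough illustration of that pipeline (not the actual code in `main.py`), the following self-contained sketch chunks one file, embeds the chunks, indexes them with FAISS, and answers a query; the filename and query string are placeholders.

```python
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # downloaded on first use

# 1. Chunk: split a markdown file on blank lines (paragraph boundaries).
text = open("notes.md", encoding="utf-8").read()  # placeholder filename
chunks = [p.strip() for p in text.split("\n\n") if p.strip()]

# 2. Embed: one 384-dimensional vector per chunk.
embeddings = model.encode(chunks, convert_to_numpy=True).astype("float32")

# 3. Store: an exact (brute-force) L2 index.
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# 4. Retrieve: embed the query and take the nearest chunks.
query = model.encode(["how is the vector index stored?"], convert_to_numpy=True).astype("float32")
distances, ids = index.search(query, 3)
for rank, (dist, idx) in enumerate(zip(distances[0], ids[0]), start=1):
    if idx == -1:  # fewer than k chunks indexed
        continue
    print(f"{rank}. (distance {dist:.3f}) {chunks[idx][:80]}")
```

`IndexFlatL2` performs exact search, which is plenty fast at the scale of a handful of markdown files; approximate indexes only pay off on much larger corpora.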
## File Structure
- `main.py`: Main server implementation with RAG functionality
- `pyproject.toml`: Project dependencies and configuration
- `rag_index.faiss`: FAISS vector index (created automatically)
- `rag_documents.pkl`: Serialized documents and metadata (created automatically)
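For reference, here is a hedged sketch of how these two files are typically written and reloaded with FAISS and `pickle`; the filenames match this project, but the helper functions are illustrative, not the actual code in `main.py`.

```python
import os
import pickle

import faiss

INDEX_PATH = "rag_index.faiss"
DOCS_PATH = "rag_documents.pkl"

def save_state(index: faiss.Index, documents: list[dict]) -> None:
    # FAISS has its own binary format; the chunk text and metadata
    # are pickled alongside it.
    faiss.write_index(index, INDEX_PATH)
    with open(DOCS_PATH, "wb") as f:
        pickle.dump(documents, f)

def load_state() -> tuple["faiss.Index | None", list[dict]]:
    if not (os.path.exists(INDEX_PATH) and os.path.exists(DOCS_PATH)):
        return None, []
    index = faiss.read_index(INDEX_PATH)
    with open(DOCS_PATH, "rb") as f:
        documents = pickle.load(f)
    return index, documents
```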
## Configuration
The RAG system uses the `all-MiniLM-L6-v2` sentence transformer model by default. This model provides a good balance between speed and quality for semantic search tasks.
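If you want to trade speed for quality, any other sentence-transformers model ID can be loaded the same way. This assumes the model name is set in one place in `main.py`, which may not match the actual code; `all-mpnet-base-v2` below is just one common alternative.

```python
from sentence_transformers import SentenceTransformer

# A higher-quality (but slower) alternative to all-MiniLM-L6-v2.
model = SentenceTransformer("all-mpnet-base-v2")
```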
## Example Workflow
1. Prepare your markdown files with the content you want to search
2. Use `ingest_markdown_file` to add each file to the knowledge base
3. Use `search_doc_for_rag_context` to find relevant context for your questions
4. The retrieved context can be used by an LLM to provide informed answers
## Notes
- The first time you run the server, it will download the sentence transformer model
- The vector index is automatically saved and loaded between sessions
- Long documents are automatically chunked to optimize search performance
- The system supports multiple markdown files and maintains source file metadata