# mcp-lancedb

🗄️ LanceDB MCP Server for LLMs
A Model Context Protocol (MCP) server that lets LLMs interact directly with the documents they have on disk through agentic RAG and hybrid search in LanceDB. Ask LLMs questions about the dataset as a whole or about specific documents.
## ✨ Features
- 🔍 LanceDB-powered serverless vector index and document summary catalog.
- 📊 Efficient use of LLM tokens: the LLM itself looks up only what it needs, when it needs it.
- 📈 Security: the index is stored locally, so no data is transferred to the cloud when using a local LLM.
## 🚀 Quick Start
To get started, create a local directory to store the index and add a server entry for it to your Claude Desktop config file:
- MacOS: `~/Library/Application\ Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%/Claude/claude_desktop_config.json`
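A minimal sketch of such a configuration is shown below, assuming you run the built server directly with Node. The `lancedb` key and the path placeholders are illustrative; adjust `command` and `args` to match how you actually run the server (compare the inspector command under Local Development Mode).

```json
{
  "mcpServers": {
    "lancedb": {
      "command": "node",
      "args": [
        "PATH_TO_REPO/dist/index.js",
        "PATH_TO_LOCAL_INDEX_DIR"
      ]
    }
  }
}
```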
### Prerequisites
- Node.js 18+
- npx
- An MCP client (e.g., the Claude Desktop app)
- Summarization and embedding models installed (see `config.ts`; by default we use Ollama models):
  - `ollama pull snowflake-arctic-embed2`
  - `ollama pull llama3.1:8b`
## Demo
<img src="https://github.com/user-attachments/assets/90bfdea9-9edd-4cf6-bb04-94c9c84e4825" width="50%">

## Local Development Mode

Use `npm run build` to build the project.

Use `npx @modelcontextprotocol/inspector dist/index.js PATH_TO_LOCAL_INDEX_DIR` to run the MCP tool inspector.
## Seed Data
The seed script creates two tables in LanceDB: one for the catalog of document summaries, and one for the vectorized document chunks. To run the seed script, use the following command:
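Here is a sketch of such an invocation, assuming an npm script named `seed` that accepts the index directory and a documents directory. The script name and flag names are assumptions; check `package.json` for the actual invocation.

```sh
npm run seed -- --dbpath PATH_TO_LOCAL_INDEX_DIR --filesdir PATH_TO_YOUR_DOCS
```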
You can use sample data from the `docs/` directory. Feel free to adjust the default summarization and embedding models in the `config.ts` file. If you need to recreate the index, simply rerun the seed script with the `--overwrite` option.
### Catalog
- Document summary
- Metadata
### Chunks
- Vectorized document chunk
- Metadata
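As a rough illustration of what these two tables hold, the record shapes might look like the TypeScript sketch below. The field names are assumptions for illustration only; the actual schema is defined by the seed script and `config.ts`.

```typescript
// Illustrative record shapes only; the real schema is defined by the seed script.
interface CatalogRecord {
  summary: string;              // LLM-generated summary of the whole document
  vector: number[];             // embedding of the summary, used by catalog_search
  metadata: { source: string }; // e.g. the source file path
}

interface ChunkRecord {
  text: string;                 // raw text of one document chunk
  vector: number[];             // embedding of the chunk, used by chunks_search
  metadata: { source: string }; // links the chunk back to its document
}
```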
## 🎯 Example Prompts
Try prompts with Claude that exercise both levels of the index: ask what documents are available in the catalog, ask for a summary of a specific document, or ask a question whose answer has to be pulled from chunks across several documents.
## 📝 Available Tools
The server provides the following tools for interacting with the index:

### Catalog Tools

- `catalog_search`: search for relevant documents in the catalog.

### Chunks Tools

- `chunks_search`: find relevant chunks based on a specific document from the catalog.
- `all_chunks_search`: find relevant chunks from all known documents.
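For orientation, an MCP client invokes these tools with a standard `tools/call` request. The argument name `text` below is an assumption; the actual input schema is declared by the server.

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "catalog_search",
    "arguments": { "text": "documents about quarterly revenue" }
  }
}
```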
## 📜 License
This project is licensed under the MIT License - see the LICENSE file for details.