# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
This is an MCP (Model Context Protocol) server implementation that uses FAISS (Facebook AI Similarity Search) as a local vector database for Retrieval-Augmented Generation (RAG) applications.
The goal is to create a self-contained, local RAG system where:
- FAISS handles vector storage and similarity search operations
- The MCP server exposes FAISS functionality as tools for AI agent interaction
- Documents can be ingested, chunked, embedded, and stored locally
- AI agents can query the vector store using natural language
## Architecture
The system has three main components:
1. **FAISS Vector Store**: Local vector database that stores and indexes document embeddings
   - Supports in-memory or disk-persisted indexes
   - Measures similarity with L2 distance or inner product (cosine similarity via normalized vectors)
2. **MCP Server**: Provides the tool interface for agent interaction
   - Tool: `ingest_document` - handles document chunking, embedding generation, and storage in FAISS
   - Tool: `query_rag_store` - performs similarity searches to retrieve relevant document chunks
3. **Agent Integration**: Enables natural language interaction with the vector store
   - AI agents use MCP tools to interact with FAISS-backed storage
   - Retrieved chunks augment agent responses for RAG
## Key Design Principles
- **Local-first**: All storage and operations happen locally, no external vector DB services required
- **MCP Protocol**: Follows Model Context Protocol specifications for tool definitions and agent interaction
- **Embedding-based Search**: Uses vector embeddings for semantic similarity search rather than keyword matching
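To ground this last principle, here is a minimal, self-contained sketch of embedding-based similarity search with FAISS and sentence-transformers. It is illustrative only; the server wraps equivalent logic behind its tools:
```python
import faiss
from sentence_transformers import SentenceTransformer

# Embed a few documents with the project's default model.
model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["FAISS indexes vectors.", "MCP exposes tools to agents."]
embeddings = model.encode(docs)  # shape (n_docs, 384) for this model

# Exact L2-distance index over the raw embeddings.
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# Semantic query: nearest chunks by embedding distance, not keyword match.
query_vec = model.encode(["How do agents call tools?"])
distances, ids = index.search(query_vec, 2)
print(ids[0], distances[0])
```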
## Development Commands
### Setup
```bash
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e .
```
### Running the Server
```bash
# Using the installed command (easiest)
local-faiss-mcp --index-dir /path/to/index/directory
# With custom embedding model
local-faiss-mcp --index-dir /path/to/index/directory --embed all-mpnet-base-v2
# As a Python module
python -m local_faiss_mcp --index-dir /path/to/index/directory
# Direct execution (for development)
python local_faiss_mcp/server.py --index-dir /path/to/index/directory
```
**Command-line Arguments:**
- `--index-dir`: Directory to store FAISS index and metadata (default: current directory)
- `--embed`: Hugging Face embedding model name (default: `all-MiniLM-L6-v2`)
### Testing
```bash
# Run tests
pytest
```
### Building and Publishing
```bash
# Build package
python -m build
# Check package
twine check dist/*
# Publish to PyPI (see PUBLISHING.md for details)
twine upload dist/*
```
## Implementation Details
### File Structure
- `local_faiss_mcp/`: Main package directory
  - `server.py`: MCP server implementation with FAISSVectorStore class
  - `__init__.py`: Package initialization
  - `__main__.py`: Entry point for `python -m local_faiss_mcp`
- `test_standalone.py`: Standalone tests
- `pyproject.toml`: Project configuration and dependencies
- `requirements.txt`: Python dependencies
- `PUBLISHING.md`: Guide for publishing to PyPI
### Data Files (Created at Runtime)
- `faiss.index`: Persisted FAISS index (created in `--index-dir`)
- `metadata.json`: Document metadata storage (created in `--index-dir`)
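The persistence pattern these two files imply can be sketched as follows. This is illustrative (the real logic lives in the store's `save()` method, and the metadata shape here is hypothetical):
```python
import json
import faiss

index = faiss.IndexFlatL2(384)   # matches all-MiniLM-L6-v2's dimension
metadata = {"chunks": []}        # hypothetical metadata shape

# Persist the index and its parallel metadata.
faiss.write_index(index, "faiss.index")
with open("metadata.json", "w") as f:
    json.dump(metadata, f)

# Reload on startup if the files exist.
index = faiss.read_index("faiss.index")
with open("metadata.json") as f:
    metadata = json.load(f)
```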
### Configuring Index Directory
The server accepts an optional `--index-dir` argument to specify where to store the FAISS index and metadata files. This is particularly useful for:
- Project-specific vector stores (each project has its own index)
- Shared vector stores across multiple projects
- Organizing indexes by topic or domain
**Example MCP configurations:**
User-wide (`~/.claude/.mcp.json`):
```json
{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "local-faiss-mcp",
      "args": [
        "--index-dir",
        "/path/to/vector_indexes/general"
      ]
    }
  }
}
```
Project-specific (`./.mcp.json` in project root):
```json
{
  "mcpServers": {
    "project-faiss": {
      "command": "local-faiss-mcp",
      "args": [
        "--index-dir",
        "./.vector_store"
      ]
    }
  }
}
```
Alternative using Python module (if command not in PATH):
```json
{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "python",
      "args": [
        "-m", "local_faiss_mcp",
        "--index-dir",
        "./.vector_store"
      ]
    }
  }
}
```
### FAISSVectorStore Class (server.py)
The core class managing vector operations:
- **Embedding Model**: Configurable via the `embedding_model_name` parameter (default: `all-MiniLM-L6-v2`)
  - Supports any Hugging Face sentence-transformers model
  - Automatically detects the embedding dimension
  - Validates dimension compatibility when loading existing indexes
- **Index Type**: `IndexFlatL2` - exact L2 distance search
- **Chunking Strategy**: 500 words per chunk with a 50-word overlap (see the sketch after this list)
- **Persistence**: Auto-saves index and metadata after ingestion
- **Directory Management**: Automatically creates the index directory if it doesn't exist
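A minimal sketch of the chunking and dimension-detection behavior described above, assuming the documented parameters (500-word chunks, 50-word overlap); the real logic lives in `chunk_text()` and the store's initializer:
```python
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-based chunks (illustrative sketch)."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]

# The embedding dimension can be read directly off the loaded model, which is
# how auto-detection is typically done with sentence-transformers (384 here).
model = SentenceTransformer("all-MiniLM-L6-v2")
dim = model.get_sentence_embedding_dimension()
```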
Key methods (a usage sketch follows the list):
- `__init__(index_path, metadata_path, embedding_model_name)`: Initializes the store with the given file paths and embedding model
- `chunk_text()`: Splits documents into overlapping chunks
- `ingest()`: Embeds and stores document chunks in FAISS
- `query()`: Performs similarity search and retrieves top-k results
- `save()`: Persists index and metadata to disk
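Assuming the signatures above, and assuming `ingest()` and `query()` mirror the tools' arguments (an inference from this document, not a verified excerpt), typical in-process usage might look like:
```python
from local_faiss_mcp.server import FAISSVectorStore

# Paths mirror the runtime data files described earlier.
store = FAISSVectorStore(
    index_path="./.vector_store/faiss.index",
    metadata_path="./.vector_store/metadata.json",
    embedding_model_name="all-MiniLM-L6-v2",
)

# Assumed to mirror ingest_document's arguments (document text, optional source).
store.ingest("FAISS provides exact and approximate vector search.", source="notes.md")

# Assumed to mirror query_rag_store's arguments (query text, optional top_k).
results = store.query("What kind of search does FAISS provide?", top_k=3)
print(results)
```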
### MCP Tools
Two tools are defined in `server.py`:
1. **ingest_document**
   - Accepts `document` (text) and optional `source` (identifier)
   - Returns success status and chunk count
   - Automatically saves to disk in the configured index directory
2. **query_rag_store**
   - Accepts `query` (text) and optional `top_k` (number)
   - Returns ranked results with distance scores
   - Handles an empty index gracefully
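In the low-level `mcp` Python SDK, handlers for tools of this shape are registered with decorators; the following is a sketch under the assumption that `server.py` uses that API (schema abbreviated, not a verbatim excerpt):
```python
from mcp.server import Server
from mcp.types import Tool, TextContent

server = Server("local-faiss-mcp")

@server.list_tools()
async def list_tools() -> list[Tool]:
    # Abbreviated: only query_rag_store's schema is shown.
    return [
        Tool(
            name="query_rag_store",
            description="Similarity search over the local FAISS index.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "top_k": {"type": "number"},
                },
                "required": ["query"],
            },
        ),
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "query_rag_store":
        # `store` is the FAISSVectorStore instance from the usage sketch above.
        results = store.query(arguments["query"], top_k=arguments.get("top_k", 5))
        return [TextContent(type="text", text=str(results))]
    raise ValueError(f"Unknown tool: {name}")
```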
### MCP Prompts
Two prompts are defined in `server.py` for answer extraction and summarization:
1. **extract-answer**
   - Helps extract relevant answers from retrieved document chunks
   - Arguments: `query` (the original question), `chunks` (JSON array of retrieved documents)
   - Provides structured answers with source citations
   - Use after `query_rag_store` to get well-formatted responses
2. **summarize-documents**
   - Summarizes information from multiple document chunks
   - Arguments: `topic`, `chunks` (JSON array), optional `max_length`
   - Creates focused summaries with proper attribution
   - Useful for synthesizing information across multiple sources
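Prompt registration follows the same decorator pattern in the low-level SDK; here is a hedged sketch for `extract-answer` (the prompt wording is invented for illustration, the real template lives in `server.py`):
```python
from mcp.server import Server
from mcp.types import (GetPromptResult, Prompt, PromptArgument,
                       PromptMessage, TextContent)

server = Server("local-faiss-mcp")

@server.list_prompts()
async def list_prompts() -> list[Prompt]:
    return [
        Prompt(
            name="extract-answer",
            description="Extract a cited answer from retrieved chunks.",
            arguments=[
                PromptArgument(name="query", description="The original question", required=True),
                PromptArgument(name="chunks", description="JSON array of retrieved documents", required=True),
            ],
        ),
    ]

@server.get_prompt()
async def get_prompt(name: str, arguments: dict) -> GetPromptResult:
    # Invented wording for illustration only.
    text = (f"Using only the chunks below, answer the question and cite sources.\n"
            f"Question: {arguments['query']}\nChunks: {arguments['chunks']}")
    return GetPromptResult(
        messages=[PromptMessage(role="user", content=TextContent(type="text", text=text))],
    )
```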
**Typical workflow:**
1. Use the `query_rag_store` tool to retrieve relevant chunks
2. Use the `extract-answer` or `summarize-documents` prompt with the results
3. Get structured, citation-backed responses
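From a client's perspective, this workflow is two MCP calls; a sketch using the `mcp` client SDK (the paths and queries are illustrative):
```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_workflow() -> None:
    params = StdioServerParameters(command="local-faiss-mcp",
                                   args=["--index-dir", "./.vector_store"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Step 1: retrieve relevant chunks.
            result = await session.call_tool(
                "query_rag_store", {"query": "What is FAISS?", "top_k": 3})
            # Step 2: render the extraction prompt with the retrieved chunks.
            prompt = await session.get_prompt(
                "extract-answer",
                {"query": "What is FAISS?", "chunks": str(result.content)})
            # Step 3: feed these messages to the model for a cited answer.
            print(prompt.messages)

asyncio.run(run_workflow())
```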
### Server Initialization (server.py:main)
The server parses command-line arguments and initializes the vector store with the specified index directory:
- Resolves the index directory path (creates if needed)
- Constructs full paths for `faiss.index` and `metadata.json`
- Initializes FAISSVectorStore with these paths
- Starts the MCP server via stdio transport
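A sketch of what `main` plausibly does, assembled from the steps above (the argument names come from this document; `FAISSVectorStore` and `server` are as in the earlier sketches, defined in the same file):
```python
import argparse
import asyncio
from pathlib import Path
from mcp.server.stdio import stdio_server

async def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--index-dir", default=".",
                        help="Directory for faiss.index and metadata.json")
    parser.add_argument("--embed", default="all-MiniLM-L6-v2",
                        help="Hugging Face embedding model name")
    args = parser.parse_args()

    # Resolve the index directory and create it if needed.
    index_dir = Path(args.index_dir).resolve()
    index_dir.mkdir(parents=True, exist_ok=True)

    # FAISSVectorStore is defined earlier in server.py; the tool handlers
    # registered on `server` close over this instance.
    store = FAISSVectorStore(
        index_path=str(index_dir / "faiss.index"),
        metadata_path=str(index_dir / "metadata.json"),
        embedding_model_name=args.embed,
    )

    # Serve MCP requests over stdio.
    async with stdio_server() as (read, write):
        await server.run(read, write, server.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())
```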