Enables indexing and semantic search of local files (PDF, TXT, CSV, Markdown, etc.) using embeddings and optional FAISS indexing for fast similarity-based document retrieval
MCP Data Server — Local file search you can call from Claude (or CLI)
MCP Data Server indexes files on your machine (PDF, TXT, CSV, Markdown, etc.) and lets you search them with embeddings. You can use it from:
- a friendly CLI (`ls`, `index`, `search`)
- an MCP server over stdio (so Claude Desktop/Cursor can call your tools)
Works great on Windows 11. Also tested on macOS/Linux (see notes).
Table of contents
- Features
- Prerequisites
- Quick start (Windows)
- Quick start (macOS/Linux)
- Usage (CLI)
- Use with Claude Desktop (MCP)
- Configuration
- Project structure
- Development (lint, type, test)
- Troubleshooting
- Contributing
- License
Features
- 🔎 Local search with SentenceTransformers embeddings (cosine similarity)
- ⚡ Optional FAISS index for fast Top-K search
- 🧰 Simple CLI: `ls`, `index`, `search`
- 🔌 MCP server so Claude Desktop can call tools: `list_docs_tool`, `index_docs_tool`, `search_chunks_tool`, `read_doc_tool`
- 🧩 Extensible loaders/chunkers; add new formats easily
- ✅ Batteries-included dev setup: Ruff, Black, MyPy, PyTest, pre-commit
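To illustrate what cosine-similarity Top-K search means in the first two features, here is a minimal sketch in plain Python. It is not this project's implementation: the function names `cosine_similarity` and `top_k` and the toy 3-dimensional vectors are assumptions made for illustration. Real embeddings from SentenceTransformers have hundreds of dimensions, and an optional FAISS index would replace the brute-force loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=3):
    """Return (index, score) pairs for the k chunks most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" standing in for real chunk vectors (hypothetical data).
chunks = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.1, 0.0]
results = top_k(query, chunks, k=2)
```

Brute-force scoring like this is linear in the number of chunks, which is usually fine for small collections. An index becomes worthwhile as the collection grows.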
Prerequisites
- Python 3.11+ (3.11 recommended)
- Windows 11 (PowerShell); macOS/Linux are fine too (bash)
- ~3 GB free disk space on first run (model cache)
- (Optional) FAISS CPU wheels, installed automatically via `faiss-cpu`
Quick start (Windows)
Folder in this repo where you put files to index: `./data/`
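As a rough sketch of how files in that folder might be discovered before indexing, the snippet below walks a directory and keeps files with a supported extension. The helper name `find_indexable_files` and the extension set are assumptions drawn from the formats named above (PDF, TXT, CSV, Markdown). The project's own loaders may accept more types or use different rules.

```python
from pathlib import Path

# Assumed extension list, based only on the formats this README names.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".csv", ".md"}


def find_indexable_files(data_dir="./data"):
    """Return a sorted list of files under data_dir with a supported extension."""
    root = Path(data_dir)
    if not root.is_dir():
        return []  # nothing to index if the folder does not exist yet
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


if __name__ == "__main__":
    for path in find_indexable_files():
        print(path)
```

Matching on the lowercased suffix means files such as `REPORT.PDF` are picked up as well.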