Files-DB-MCP: Vector Search for Code Projects

A local vector database system that gives LLM coding agents fast semantic search over software projects via the Model Context Protocol (MCP).

Features

Zero Configuration - Auto-detects project structure with sensible defaults
Real-Time Monitoring - Continuously watches for file changes
Vector Search - Semantic search for finding relevant code
MCP Interface - Compatible with Claude Code and other LLM tools
Open Source Models - Uses Hugging Face models for code embeddings

Installation

Option 1: Clone and Setup (Recommended)

# Using SSH (recommended if you have SSH keys set up with GitHub)
git clone git@github.com:randomm/files-db-mcp.git ~/.files-db-mcp && bash ~/.files-db-mcp/install/setup.sh

# Using HTTPS (if you don't have SSH keys set up)
git clone https://github.com/randomm/files-db-mcp.git ~/.files-db-mcp && bash ~/.files-db-mcp/install/setup.sh

Option 2: Automated Installation Script

curl -fsSL https://raw.githubusercontent.com/randomm/files-db-mcp/main/install/install.sh | bash

Usage

After installation, run the following command in any project directory:

files-db-mcp

The service will:

Detect your project files
Start indexing in the background
Begin responding to MCP search queries immediately
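
The startup behavior above (index in the background, answer queries right away) can be pictured with a small sketch. This is an illustration only, not the project's implementation: the `BackgroundIndex` class and its methods are hypothetical, and a plain substring match stands in for vector search. It shows why early queries may return partial results until indexing finishes.

```python
import threading
from pathlib import Path


class BackgroundIndex:
    """Toy index that can be queried while it is still being filled.

    Hypothetical illustration; substring matching stands in for vector search.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._docs = {}  # relative path -> file contents
        self._done = threading.Event()

    def start(self, root: Path) -> threading.Thread:
        """Begin indexing files under `root` without blocking the caller."""
        worker = threading.Thread(target=self._index, args=(root,), daemon=True)
        worker.start()
        return worker

    def _index(self, root: Path) -> None:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                text = path.read_text(encoding="utf-8", errors="ignore")
                with self._lock:
                    self._docs[str(path.relative_to(root))] = text
        self._done.set()

    def search(self, term: str) -> dict:
        """Return matches from whatever has been indexed so far."""
        with self._lock:
            hits = sorted(name for name, text in self._docs.items() if term in text)
        return {"hits": hits, "indexing_complete": self._done.is_set()}
```

A caller can check `indexing_complete` in the response to decide whether to retry a query that returned nothing.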

Requirements

Docker
Docker Compose

Configuration

Files-DB-MCP works without configuration, but you can customize it with environment variables:

EMBEDDING_MODEL - Change the embedding model (default: 'jinaai/jina-embeddings-v2-base-code' or project-specific model)
FAST_STARTUP - Set to 'true' to use a smaller model for faster startup (default: 'false')
QUANTIZATION - Enable/disable quantization (default: 'true')
BINARY_EMBEDDINGS - Enable/disable binary embeddings (default: 'false')
IGNORE_PATTERNS - Comma-separated list of files/dirs to ignore
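
When the tool is launched from a script rather than a shell, the variables above can be passed through the process environment. The sketch below is one way to do that under stated assumptions: `build_env` and `launch` are hypothetical helper names, and the example ignore entries are placeholders, since the exact pattern syntax accepted by IGNORE_PATTERNS is not documented here. Only the variable names and the comma-separated format come from the list above.

```python
import os
import subprocess

# Variable names taken from the configuration list above.
KNOWN_VARIABLES = {
    "EMBEDDING_MODEL",
    "FAST_STARTUP",
    "QUANTIZATION",
    "BINARY_EMBEDDINGS",
    "IGNORE_PATTERNS",
}


def build_env(overrides, ignore=()):
    """Return a copy of the current environment with the overrides applied.

    Unknown names are rejected so that a typo does not silently do nothing.
    """
    unknown = set(overrides) - KNOWN_VARIABLES
    if unknown:
        raise ValueError(f"unknown variables: {sorted(unknown)}")
    env = dict(os.environ)
    env.update(overrides)
    if ignore:
        # IGNORE_PATTERNS is documented as a comma-separated list.
        env["IGNORE_PATTERNS"] = ",".join(ignore)
    return env


def launch(overrides, ignore=()):
    """Run `files-db-mcp` in the current directory with the chosen settings."""
    return subprocess.run(
        ["files-db-mcp"], env=build_env(overrides, ignore), check=True
    )
```

For example, `launch({"FAST_STARTUP": "true"}, ignore=["node_modules", "dist"])` would start the service with fast startup enabled and two placeholder ignore entries.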

First-Time Startup

On first run, Files-DB-MCP downloads the embedding models, which may take several minutes depending on:

The size of the selected model (300-500MB for high-quality models)
Your internet connection speed

Subsequent startups will be much faster as models are cached in a persistent Docker volume. For faster initial startup, you can:

# Use a smaller, faster model (90MB)
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 files-db-mcp

# Or enable fast startup mode
FAST_STARTUP=true files-db-mcp

Model Caching

Files-DB-MCP automatically persists downloaded embedding models, so you only need to download them once:

Models are stored in a Docker volume called model_cache
This volume persists between container restarts
The cache is shared by all projects using Files-DB-MCP on your machine, so a model is not downloaded again for each project
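
To confirm that the cache volume exists, the Docker CLI can list volume names. The sketch below assumes the volume is either named exactly model_cache or carries a Docker Compose project prefix (Compose normally names volumes `<project>_<volume>`); the actual prefix on a given machine is not stated in this document, so the helper matches on the suffix. The function names are hypothetical.

```python
import subprocess


def matching_volumes(names, cache_name="model_cache"):
    """Keep volume names equal to `cache_name` or ending in `_<cache_name>`."""
    return [n for n in names if n == cache_name or n.endswith("_" + cache_name)]


def list_docker_volumes():
    """Ask the Docker CLI for all volume names, one per line."""
    out = subprocess.run(
        ["docker", "volume", "ls", "--format", "{{.Name}}"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]
```

Calling `matching_volumes(list_docker_volumes())` returns the candidate cache volumes; an empty list suggests the models have not been downloaded yet on this machine.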

Claude Code Integration

Add to your Claude Code configuration:

{
  "mcpServers": {
    "files-db-mcp": {
      "command": "python",
      "args": ["/path/to/src/claude_mcp_server.py", "--host", "localhost", "--port", "6333"]
    }
  }
}

For details, see Claude MCP Integration.
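
If the configuration is managed by a script, the entry above can be generated rather than typed by hand. This is a minimal sketch: `claude_mcp_entry` and `merge_into` are hypothetical helpers, the script path is left as the same placeholder used above, and where Claude Code stores its configuration file is not covered here.

```python
import json


def claude_mcp_entry(server_script, host="localhost", port=6333):
    """Build the `mcpServers` entry shown above for a given script path."""
    return {
        "mcpServers": {
            "files-db-mcp": {
                "command": "python",
                "args": [server_script, "--host", host, "--port", str(port)],
            }
        }
    }


def merge_into(config, entry):
    """Return a copy of `config` with the entry's servers added.

    Servers already present in `config` under other names are kept.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers.update(entry["mcpServers"])
    merged["mcpServers"] = servers
    return merged


# Placeholder path, as in the configuration above; replace with the real location.
print(json.dumps(claude_mcp_entry("/path/to/src/claude_mcp_server.py"), indent=2))
```

Merging instead of overwriting keeps any other MCP servers that are already configured.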

Documentation

Installation Guide - Detailed setup instructions
API Reference - Complete API documentation
Configuration Guide - Configuration options

Repository Structure

/src - Source code
/tests - Unit and integration tests
/docs - Documentation
/scripts - Utility scripts
/install - Installation scripts
/.docker - Docker configuration
/config - Configuration files
/ai-assist - AI assistance files

License

MIT License

Contributing

Contributions welcome! Please feel free to submit a pull request.
