Files-DB-MCP: Vector Search for Code Projects

A local vector database system that gives LLM coding agents fast, efficient search over software projects via the Model Context Protocol (MCP).

Features

Zero Configuration - Auto-detects project structure with sensible defaults
Real-Time Monitoring - Continuously watches for file changes
Vector Search - Semantic search for finding relevant code
MCP Interface - Compatible with Claude Code and other LLM tools
Open Source Models - Uses Hugging Face models for code embeddings
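
To make the "Vector Search" feature concrete, here is a minimal sketch of ranking by embedding similarity. It is not the project's implementation: the function names, the file names, and the 3-dimensional vectors are invented for illustration, and a real embedding model produces vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_snippets(query_vec, indexed, top_k=2):
    """Return the top_k entry names, most similar to the query first."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in indexed.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy "embeddings", made up purely for illustration.
INDEXED = {
    "auth/login.py": [0.9, 0.1, 0.0],
    "db/models.py": [0.1, 0.9, 0.1],
    "utils/strings.py": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    print(rank_snippets([1.0, 0.0, 0.0], INDEXED))
```

A query vector pointing mostly along the first axis ranks auth/login.py first, because that entry's vector points in nearly the same direction.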

Installation

Option 1: Clone and Setup (Recommended)

# Using SSH (recommended if you have SSH keys set up with GitHub)
git clone git@github.com:randomm/files-db-mcp.git ~/.files-db-mcp && bash ~/.files-db-mcp/install/setup.sh

# Using HTTPS (if you don't have SSH keys set up)
git clone https://github.com/randomm/files-db-mcp.git ~/.files-db-mcp && bash ~/.files-db-mcp/install/setup.sh

Option 2: Automated Installation Script

curl -fsSL https://raw.githubusercontent.com/randomm/files-db-mcp/main/install/install.sh | bash

Usage

After installation, run in any project directory:

files-db-mcp

The service will:

Detect your project files
Start indexing in the background
Begin responding to MCP search queries immediately

Requirements

Docker
Docker Compose

Configuration

Files-DB-MCP works without configuration, but you can customize it with environment variables:

EMBEDDING_MODEL - Change the embedding model (default: 'jinaai/jina-embeddings-v2-base-code' or project-specific model)
FAST_STARTUP - Set to 'true' to use a smaller model for faster startup (default: 'false')
QUANTIZATION - Enable/disable quantization (default: 'true')
BINARY_EMBEDDINGS - Enable/disable binary embeddings (default: 'false')
IGNORE_PATTERNS - Comma-separated list of files/dirs to ignore
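
As an illustration of how these variables and their documented defaults fit together, the sketch below reads them from the environment. The Settings class, the load_settings function, and the set of strings accepted as "true" are assumptions made for this example. The project's own parsing may differ.

```python
import os
from dataclasses import dataclass, field


def _as_bool(value: str) -> bool:
    # Assumption: which spellings count as "true" is a guess, not documented behavior.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass
class Settings:
    # Defaults mirror the values listed in the Configuration section.
    embedding_model: str = "jinaai/jina-embeddings-v2-base-code"
    fast_startup: bool = False
    quantization: bool = True
    binary_embeddings: bool = False
    ignore_patterns: list = field(default_factory=list)


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    defaults = Settings()
    # IGNORE_PATTERNS is comma-separated; drop blanks and surrounding whitespace.
    patterns = [p.strip() for p in env.get("IGNORE_PATTERNS", "").split(",") if p.strip()]
    return Settings(
        embedding_model=env.get("EMBEDDING_MODEL", defaults.embedding_model),
        fast_startup=_as_bool(env.get("FAST_STARTUP", "false")),
        quantization=_as_bool(env.get("QUANTIZATION", "true")),
        binary_embeddings=_as_bool(env.get("BINARY_EMBEDDINGS", "false")),
        ignore_patterns=patterns,
    )


if __name__ == "__main__":
    print(load_settings({"FAST_STARTUP": "true", "IGNORE_PATTERNS": "node_modules, dist"}))
```

Passing a plain dictionary instead of os.environ makes it easy to try different combinations without changing your shell environment.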

First-Time Startup

On first run, Files-DB-MCP downloads its embedding models, which may take several minutes depending on:

The size of the selected model (300-500MB for high-quality models)
Your internet connection speed

Subsequent startups will be much faster as models are cached in a persistent Docker volume. For faster initial startup, you can:

# Use a smaller, faster model (90MB)
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 files-db-mcp

# Or enable fast startup mode
FAST_STARTUP=true files-db-mcp

Model Caching

Files-DB-MCP automatically persists downloaded embedding models, so you only need to download them once:

Models are stored in a Docker volume called model_cache
This volume persists across container restarts
The cache is shared by all projects using Files-DB-MCP on your machine

Claude Code Integration

Add to your Claude Code configuration:

{
  "mcpServers": {
    "files-db-mcp": {
      "command": "python",
      "args": ["/path/to/src/claude_mcp_server.py", "--host", "localhost", "--port", "6333"]
    }
  }
}

For details, see Claude MCP Integration.
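
If you would rather generate the configuration entry than edit JSON by hand, a small helper along these lines may help. The build_mcp_config function is hypothetical and not part of Files-DB-MCP. The script path stays a placeholder that you must replace with the real location of claude_mcp_server.py, and the host and port defaults are copied from the snippet above.

```python
import json


def build_mcp_config(server_script, host="localhost", port=6333):
    """Return a dict matching the mcpServers entry shown above."""
    return {
        "mcpServers": {
            "files-db-mcp": {
                "command": "python",
                # Arguments are passed as strings, so the port is converted.
                "args": [server_script, "--host", host, "--port", str(port)],
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path: substitute the actual path on your machine.
    config = build_mcp_config("/path/to/src/claude_mcp_server.py")
    print(json.dumps(config, indent=2))
```

Running the script prints JSON that you can paste into your Claude Code configuration.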

Documentation

Installation Guide - Detailed setup instructions
API Reference - Complete API documentation
Configuration Guide - Configuration options

Repository Structure

/src - Source code
/tests - Unit and integration tests
/docs - Documentation
/scripts - Utility scripts
/install - Installation scripts
/.docker - Docker configuration
/config - Configuration files
/ai-assist - AI assistance files

License

MIT License

Contributing

Contributions welcome! Please feel free to submit a pull request.
