Semantic Search
Semantic search over markdown files. Find related notes by meaning, not just keywords. Detect duplicates before creating new notes.
Supports two server transports:
stdio MCP — For Claude Code integration (one process per session)
HTTP — Combined MCP-over-HTTP + REST on one port; one warm process shared by all clients
Features
Semantic search using sentence-transformers
Duplicate/similar note detection
Auto-updating index with file watcher
Multi-directory support
Inline tag extraction (#tag-name)
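As an illustration of the inline tag feature, the following Python sketch pulls #tag-name style tags out of a note body with a regular expression. The pattern and the function name extract_inline_tags are assumptions made for this example; the project's actual extraction rules (for instance, how it treats code blocks) may differ.

```python
import re

# Assumed tag syntax: '#' followed by a letter, then letters, digits, '_', '-' or '/'.
# The '#' must start the text or follow whitespace, so Markdown headings
# ("# Title", where '#' is followed by a space) and URL fragments are skipped.
TAG_PATTERN = re.compile(r"(?:^|(?<=\s))#([A-Za-z][\w/-]*)")


def extract_inline_tags(body: str) -> list[str]:
    """Return unique lowercase tags in order of first appearance, without '#'."""
    seen: dict[str, None] = {}
    for match in TAG_PATTERN.finditer(body):
        seen.setdefault(match.group(1).lower(), None)
    return list(seen)
```

For example, extract_inline_tags("Deploy notes #kubernetes #ops") yields the two tags kubernetes and ops.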
Install
CPU-only install — recommended for macOS (any Mac, Apple Silicon or Intel) and Linux/Windows without an NVIDIA GPU. Saves ~5GB of CUDA binaries. On macOS, Apple GPU (MPS) is still auto-detected and used via PyTorch's built-in MPS backend — the "CPU" label refers only to the absence of CUDA, not to the compute device at runtime.
uv tool install --index https://download.pytorch.org/whl/cpu \
git+https://github.com/bborbe/semantic-search
CUDA install: only for Linux/Windows with a dedicated NVIDIA GPU. Not applicable to macOS (NVIDIA CUDA is not supported on Mac).
uv tool install git+https://github.com/bborbe/semantic-search
Upgrade
uv tool upgrade semantic-search
Server Modes
stdio MCP (per-session Claude Code)
Spawns one process per Claude Code session. Simple, but each session loads its own ~400 MB–1 GB model copy.
claude mcp add -s project semantic-search \
--env CONTENT_PATH=/path/to/vault \
-- \
uvx --from git+https://github.com/bborbe/semantic-search semantic-search-mcp
Tools available:
search_related(query, top_k=5): Find semantically related notes
check_duplicates(file_path): Detect duplicate/similar notes
HTTP (shared across all clients)
Single long-running process serves MCP-over-HTTP at /mcp plus REST at /search, /duplicates, /health, /reindex. All Claude Code sessions and REST clients share one warm indexer.
CONTENT_PATH=/path/to/vault semantic-search-http --host 127.0.0.1 --port 8321
Point Claude Code at it via MCP config:
{
  "mcpServers": {
    "semantic-search": {
      "type": "http",
      "url": "http://127.0.0.1:8321/mcp"
    }
  }
}
REST endpoints:
| Endpoint | Method | Description |
| /mcp | POST | MCP-over-HTTP (Claude Code) |
| /search | GET | Semantic search |
| /duplicates | GET | Find duplicate notes |
| /health | GET | Health check with index stats |
| /reindex | GET/POST | Force index rebuild |
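The REST endpoints can also be called from a script. The sketch below uses only the Python standard library; the host, port, and the q and file query parameters come from the examples in this section, while the helper names build_url and get_json, and the expectation that responses are JSON, are assumptions of this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8321"  # host and port from the server example


def build_url(endpoint: str, **params: str) -> str:
    """Compose a request URL, percent-encoding any query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}{endpoint}" + (f"?{query}" if query else "")


def get_json(endpoint: str, **params: str):
    """Send a GET request and decode the body (assumes the server replies with JSON)."""
    with urlopen(build_url(endpoint, **params), timeout=10) as response:
        return json.load(response)


# Usage, with semantic-search-http already running:
#   get_json("/health")
#   get_json("/search", q="kubernetes deployment")
#   get_json("/duplicates", file="notes/my-note.md")
```

Keeping URL construction separate from the network call makes the encoding step easy to check without a running server.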
Example queries:
# Search
curl 'http://127.0.0.1:8321/search?q=kubernetes+deployment'
# Find duplicates
curl 'http://127.0.0.1:8321/duplicates?file=notes/my-note.md'
# Health check
curl 'http://127.0.0.1:8321/health'
Run in Background
For production-style usage, run semantic-search-http as a background service so every Claude Code session (and any REST client) shares one warm process.
| Platform | Guide |
| macOS (launchd) | |
| Linux (systemd) | |
Quick example (macOS):
launchctl load ~/Library/LaunchAgents/com.github.bborbe.semantic-search-http.plist
Quick example (Linux):
systemctl --user enable --now semantic-search-http.service
CLI Commands
One-shot commands without running a server:
# Search
CONTENT_PATH=/path/to/vault semantic-search search "kubernetes deployment"
# Find duplicates
CONTENT_PATH=/path/to/vault semantic-search duplicates path/to/note.md
Binaries
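| Binary | Purpose |
| semantic-search-http | Combined HTTP server: MCP at /mcp plus the REST endpoints |
| semantic-search-mcp | stdio MCP server, one per Claude Code session. Use when the HTTP service is not set up. |
| semantic-search | CLI only (one-shot commands, no server) |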
Configuration
Environment Variables
| Variable | Description | Default |
| CONTENT_PATH | Directory to index (comma-separated for multiple) | |
| | Logging level (DEBUG, INFO, WARNING, ERROR) | |
Multiple Directories
Index multiple directories by separating paths with commas:
CONTENT_PATH=/path/to/vault1,/path/to/vault2,/path/to/docs
All directories are indexed together and searched as one unified index.
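To make the comma-separated format concrete, here is a small Python sketch of how such a value could be split into directories. The function names are hypothetical, and trimming whitespace, skipping empty entries, and expanding ~ are assumptions of this example, not documented behaviour of the tool.

```python
import os
from pathlib import Path


def parse_content_paths(raw: str) -> list[Path]:
    """Split a comma-separated CONTENT_PATH value into directory paths."""
    paths: list[Path] = []
    for part in raw.split(","):
        part = part.strip()
        if part:  # ignore empty entries such as a trailing comma
            paths.append(Path(part).expanduser())
    return paths


def content_paths_from_env() -> list[Path]:
    """Read CONTENT_PATH from the environment; an unset variable gives an empty list."""
    return parse_content_paths(os.environ.get("CONTENT_PATH", ""))
```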
How It Works
The first run downloads a small embedding model (~90 MB) and indexes your markdown files (under 1 s for typical vaults). After that, a filesystem watcher updates the index automatically whenever files change.
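The core idea behind searching by meaning can be sketched in a few lines of Python: every note is represented by an embedding vector, and a query is answered by ranking notes by cosine similarity to the query vector. The toy three-dimensional vectors, file names, and function names below are stand-ins invented for this example; in the real tool the vectors come from a sentence-transformers model, and its internal ranking code may look different.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(query_vec: list[float],
                       index: dict[str, list[float]],
                       top_k: int = 5) -> list[tuple[str, float]]:
    """Return the top_k (path, score) pairs, most similar first."""
    scored = [(path, cosine_similarity(query_vec, vec)) for path, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy "embeddings" standing in for real model output.
TOY_INDEX = {
    "notes/k8s-deploy.md": [0.9, 0.1, 0.0],
    "notes/helm-charts.md": [0.8, 0.3, 0.1],
    "notes/sourdough.md": [0.0, 0.2, 0.9],
}
```

With the query vector [1.0, 0.0, 0.0], the two infrastructure notes rank ahead of the baking note. Duplicate detection can be framed the same way, by comparing one note's vector against all others and keeping the pairs above a similarity threshold.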
Indexed Content
Each markdown file is indexed with weighted components:
| Component | Weight | Notes |
| Filename | 3x | |
| Frontmatter | 3x | |
| Frontmatter tags | 2x | Merged with inline tags |
| Frontmatter | 2x | |
| Inline tags (#tag-name) | 2x | Extracted from body |
| First H1 heading | 2x | |
| Body content | 1x | First 500 words |
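The README does not say how the weights are applied. One common way to realise such weights, shown here purely as an assumption, is to repeat each component in the text that gets embedded, so a 3x component contributes three times as much as the body. The component keys and the function name are hypothetical, and only a subset of the rows from the table is modelled.

```python
# Weights follow the table; the keys are hypothetical labels for this sketch.
WEIGHTS = {"filename": 3, "tags": 2, "heading": 2, "body": 1}
BODY_WORD_LIMIT = 500  # "First 500 words" from the table


def build_weighted_text(components: dict[str, str]) -> str:
    """Repeat each component according to its weight and join with newlines."""
    parts: list[str] = []
    for name, weight in WEIGHTS.items():
        text = components.get(name, "").strip()
        if not text:
            continue  # missing components simply contribute nothing
        if name == "body":
            # Keep only the leading words of the body.
            text = " ".join(text.split()[:BODY_WORD_LIMIT])
        parts.extend([text] * weight)
    return "\n".join(parts)
```

Repetition is a blunt instrument; weighting separate per-component embeddings is another option, and the project may use either approach or something else entirely.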
Development
# Clone
git clone https://github.com/bborbe/semantic-search
cd semantic-search
# Install dev dependencies
make install
# Run checks
make check
# Run tests
make test
License
BSD 2-Clause License — see LICENSE.