Skip to main content
Glama
bborbe

Semantic Search MCP

by bborbe

Semantic Search

Semantic search over markdown files. Find related notes by meaning, not just keywords. Detect duplicates before creating new notes.

Supports two server modes:

  • MCP mode — For Claude Code integration

  • REST mode — For OpenClaw, scripts, and HTTP clients

Features

  • Semantic search using sentence-transformers

  • Duplicate/similar note detection

  • Auto-updating index with file watcher

  • Multi-directory support

  • Inline tag extraction (#tag-name)

Installation

# Install as a tool (creates ~/.local/bin/semantic-search-mcp) uv tool install git+https://github.com/bborbe/semantic-search

💡 No GPU? Use CPU-only PyTorch

The default install includes CUDA support (~7GB). If you don't have a dedicated GPU, install with CPU-only PyTorch to save ~5GB disk space:

uv tool install --index https://download.pytorch.org/whl/cpu \ git+https://github.com/bborbe/semantic-search

Performance is identical for typical vault sizes — embedding models run fine on CPU.

One-off usage

# Run directly with uvx (no install needed) uvx --from git+https://github.com/bborbe/semantic-search semantic-search-mcp serve

From PyPI (when published)

pip install semantic-search-mcp

Server Modes

MCP Mode (for Claude Code)

claude mcp add -s project semantic-search \ --env CONTENT_PATH=/path/to/vault \ -- \ uvx --from git+https://github.com/bborbe/semantic-search semantic-search-mcp serve

Tools available:

  • search_related(query, top_k=5) — Find semantically related notes

  • check_duplicates(file_path) — Detect duplicate/similar notes

REST Mode (for OpenClaw/HTTP)

# Start server CONTENT_PATH=/path/to/vault semantic-search-mcp serve --mode rest --port 8321 # Or with uvx CONTENT_PATH=/path/to/vault uvx --from git+https://github.com/bborbe/semantic-search \ semantic-search-mcp serve --mode rest --port 8321

Endpoints:

Endpoint

Method

Description

/search?q=...&top_k=5

GET

Semantic search

/duplicates?file=...&threshold=0.85

GET

Find duplicate notes

/health

GET

Health check with index stats

/reindex

GET/POST

Force index rebuild

Example queries:

# Search curl 'http://localhost:8321/search?q=kubernetes+deployment' # Find duplicates curl 'http://localhost:8321/duplicates?file=notes/my-note.md' # Health check curl 'http://localhost:8321/health'

CLI Commands

One-shot commands without running a server:

# Search CONTENT_PATH=/path/to/vault semantic-search-mcp search "kubernetes deployment" # Find duplicates CONTENT_PATH=/path/to/vault semantic-search-mcp duplicates path/to/note.md

Configuration

Environment Variables

Variable

Description

Default

CONTENT_PATH

Directory to index (comma-separated for multiple)

./content

LOG_LEVEL

Logging level (DEBUG, INFO, WARNING, ERROR)

INFO

Multiple Directories

Index multiple directories by separating paths with commas:

CONTENT_PATH=/path/to/vault1,/path/to/vault2,/path/to/docs

All directories are indexed together and searched as one unified index.

How It Works

First run downloads a small embedding model (~90MB) and indexes your markdown files (<1s for typical vaults). The index auto-updates when files change via filesystem watcher.

Indexed Content

Each markdown file is indexed with weighted components:

Component

Weight

Notes

Filename

3x

Frontmatter title

3x

Frontmatter tags

2x

Merged with inline tags

Frontmatter aliases

2x

Inline tags (#tag)

2x

Extracted from body

First H1 heading

2x

Body content

1x

First 500 words

Development

# Clone git clone https://github.com/bborbe/semantic-search cd semantic-search # Install dev dependencies make install # Run checks make check # Run tests make test

License

BSD 2-Clause License — see LICENSE.

-
security - not tested
A
license - permissive license
-
quality - not tested

Appeared in Searches

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/bborbe/semantic-search-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server