Semantic Search MCP Server

A local Model Context Protocol (MCP) server that enables AI agents to perform semantic search over codebases using natural language queries. The server converts queries into efficient text search patterns (grep/ripgrep) and verifies relevance before returning results.
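Conceptually, the flow is: generate text search patterns for the natural-language query, run them through ripgrep, then verify each hit before returning it. The sketch below only illustrates that flow and is not the project's implementation; `generate_patterns` and `is_relevant` are placeholders for the LLM-driven steps.

```python
import subprocess

def generate_patterns(query: str) -> list[str]:
    # Placeholder: in the real server an LLM proposes ripgrep patterns for the query.
    return query.split()

def is_relevant(query: str, path: str) -> bool:
    # Placeholder: in the real server an LLM checks whether the file actually answers the query.
    return True

def semantic_search(query: str, repo_path: str) -> list[str]:
    candidates: set[str] = set()
    for pattern in generate_patterns(query):
        # Collect files matching each pattern via ripgrep.
        out = subprocess.run(
            ["rg", "--files-with-matches", "-e", pattern, repo_path],
            capture_output=True, text=True,
        )
        candidates.update(out.stdout.splitlines())
    # Keep only candidates that pass the relevance check.
    return [path for path in sorted(candidates) if is_relevant(query, path)]
```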

Quick Setup

Installation

pip install -e .

Environment Variables

Set the following environment variables:

  • REPO_PATH - Path to the repository to search (defaults to current directory)

  • SEARCHER_TYPE - Searcher implementation to use (default: sgr_gemini_flash_lite)

API Keys (choose one based on your searcher type):

  • For Claude-based searchers: CLAUDE_API_KEY or ANTHROPIC_API_KEY

  • For Gemini-based searchers: GOOGLE_API_KEY, GEMINI_API_KEY, AI_STUDIO, or VERTEX_AI_API_KEY

  • For OpenAI-based searchers: OPENAI_API_KEY
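For a quick local test (for example, before running the evaluation), the same variables can also be set programmatically. This is only an illustration with placeholder values; exporting them in your shell or setting them in the MCP client configuration works just as well.

```python
import os

# Illustrative only: values are placeholders. Pick the API key variable that
# matches your searcher type (Gemini shown here for the default searcher).
os.environ["REPO_PATH"] = "/path/to/your/repo"
os.environ["SEARCHER_TYPE"] = "sgr_gemini_flash_lite"
os.environ["GEMINI_API_KEY"] = "your-gemini-api-key"
```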

Available Searchers

SGR (Schema-Guided Reasoning) searchers - Production-ready implementations:

  • sgr / sgr_gemini_flash_lite - Default, recommended (Gemini Flash Lite)

  • sgr_gemini_flash - SGR with Gemini Flash

  • sgr_gemini_pro - SGR with Gemini Pro

  • sgr_gpt4o - SGR with GPT-4o

  • sgr_gpt4o_mini - SGR with GPT-4o Mini

Note: Other searcher types (ripgrep_claude, agent_claude, agent_gemini_flash_lite, etc.) are experimental implementations from earlier development phases and are not recommended for production use.

Running the MCP Server

Important: The MCP server is not meant to be run directly in a terminal. It communicates over STDIO using the JSON-RPC protocol and must be launched by an IDE or MCP client.

Cursor Configuration

Add to your cursor-mcp-config.json:

{ "mcpServers": { "qure-semantic-search": { "command": "/path/to/.venv/bin/qure-semantic-search-mcp", "env": { "REPO_PATH": "/path/to/your/repo" } } } }

After configuring, restart Cursor. The server will be automatically launched when you use the semantic_search tool in Cursor's AI chat.

Note: If you see JSON parsing errors when running the command directly in a terminal, this is expected: the server requires an MCP client (such as Cursor) to communicate with it over the JSON-RPC protocol.
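If you want to exercise the server outside an IDE, a small script built on the official MCP Python SDK (`pip install mcp`) can act as the client. The sketch below is an illustration under assumptions: the `semantic_search` tool name comes from this README, but the `query` argument name is a guess; check the schema reported by `list_tools()` for the real parameter names.

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server the same way an IDE would: over STDIO with REPO_PATH set.
params = StdioServerParameters(
    command="/path/to/.venv/bin/qure-semantic-search-mcp",
    env={"REPO_PATH": "/path/to/your/repo"},
)

async def main():
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("tools:", [t.name for t in tools.tools])
            # "query" is an assumed parameter name; adjust to the actual tool schema.
            result = await session.call_tool(
                "semantic_search", {"query": "where are user sessions created?"}
            )
            print(result.content)

asyncio.run(main())
```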

Evaluation

Running Evaluation

Standard mode (single run per query):

python -m eval.run_eval

Stability mode (10 runs per query to measure consistency):

python -m eval.run_eval --stability

Stability mode with custom runs (e.g., 20 runs per query):

python -m eval.run_eval --stability --runs 20

Evaluate all searchers (compares different searcher implementations):

python -m eval.run_all_searchers --stability

Additional options:

  • --verbose / -v - Print detailed per-query statistics

  • --single-dataset - Use only main dataset (exclude easy dataset)

  • --output <path> - Export results to JSON file
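These flags can be combined; for example, a verbose 20-run stability evaluation that exports results (the output filename here is just an example):

python -m eval.run_eval --stability --runs 20 --verbose --output results.json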

Datasets

The evaluation uses two datasets:

  1. Main dataset (data/dataset.jsonl) - 12 challenging examples across different codebases (Django, Gin, CodeQL, QGIS, etc.) with non-trivial queries where simple keyword matching fails.

  2. Easy dataset (data/dataset_easy.jsonl) - 14 simpler examples designed for faster evaluation and testing. These queries are more straightforward but still require semantic understanding.

By default, both datasets are used together (26 queries total). Use --single-dataset to evaluate only the main dataset.
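Each line of the JSONL files describes one query together with its ground truth. The exact schema is defined in the repository; the field names below are purely hypothetical and only illustrate the kind of information the metrics imply (expected files and required substrings):

```python
import json

# Hypothetical example entry; field names are illustrative, not the real schema.
example = {
    "query": "Where does the framework validate form fields?",
    "expected_files": ["django/forms/fields.py"],
    "required_substrings": ["def clean("],
}
print(json.dumps(example))
```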

Metrics

For detailed metric definitions and mathematical proof of perfection, see METRICS_LOGIC.md.

Quick Summary:

  • Precision@K = TP / (TP + FP) - Fraction of returned results that are relevant

  • Recall@K = TP / (TP + FN) - Fraction of all relevant items that were returned

  • F1@K = Harmonic mean of Precision and Recall

  • File Discovery Rate = Files Found / Files Expected

  • Substring Coverage = Substrings Found / Substrings Required

The Logic Test: If all metrics score 1.0, the solution is mathematically perfect (see proof in METRICS_LOGIC.md).

See eval/metrics.py for detailed implementations.
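As a rough illustration of the definitions above (not the actual code in eval/metrics.py), the set-based metrics can be computed like this:

```python
def precision_recall_f1(returned: set[str], relevant: set[str]) -> tuple[float, float, float]:
    # TP = returned and relevant; FP = returned but irrelevant; FN = relevant but missed.
    tp = len(returned & relevant)
    precision = tp / len(returned) if returned else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def file_discovery_rate(found_files: set[str], expected_files: set[str]) -> float:
    # Fraction of expected files that the searcher actually found.
    return len(found_files & expected_files) / len(expected_files) if expected_files else 0.0

def substring_coverage(result_text: str, required_substrings: list[str]) -> float:
    # Fraction of required substrings present in the returned results.
    hits = sum(1 for s in required_substrings if s in result_text)
    return hits / len(required_substrings) if required_substrings else 0.0
```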

Performance Results

Evaluation results for the sgr_gemini_flash_lite searcher (10 runs per query, 26 queries total):

Overall Performance

| Metric | Value | Stability |
| --- | --- | --- |
| Precision@10 | 0.30 ± 0.38 | ⚠ High variance (CV=127%) |
| Recall@10 | 0.31 ± 0.41 | ⚠ High variance (CV=133%) |
| F1@10 | 0.29 ± 0.38 | ⚠ High variance (CV=130%) |
| Success Rate@10 | 0.40 ± 0.46 | ⚠ High variance (CV=114%) |
| File Discovery Rate | 0.61 ± 0.40 | ⚠ Moderate variance (CV=66%) |
| Substring Coverage | 0.35 ± 0.39 | ⚠ High variance (CV=111%) |
| Avg Latency | 20.6s ± 7.9s | Range: 9.6s - 38.3s |
| Stability Score | 73.9% | 16/26 stable queries (61.5%) |

Dataset Breakdown

Easy Dataset (14 examples)

  • Precision@10: 0.40 ± 0.44

  • Recall@10: 0.46 ± 0.49

  • F1@10: 0.42 ± 0.45

  • File Discovery Rate: 0.92 ± 0.13 ✓ (Good stability)

  • Avg Latency: 15.0s ± 4.8s

  • Stability Score: 85.9% ✓ (Good stability)

Main Dataset (12 examples)

  • Precision@10: 0.17 ± 0.25

  • Recall@10: 0.13 ± 0.18

  • F1@10: 0.14 ± 0.20

  • File Discovery Rate: 0.26 ± 0.30

  • Avg Latency: 27.2s ± 5.3s

  • Stability Score: 60.0% ⚠ (Moderate stability)

Notes

  • High variance in metrics is expected due to LLM non-determinism and the complexity of semantic search queries

  • File Discovery Rate shows better stability, especially on easier queries (0.92 ± 0.13 on the easy dataset)

  • Latency varies significantly (9-38s) depending on query complexity and codebase size

  • Results are evaluated on non-trivial queries where simple keyword matching fails

Project Structure

  • src/ - Core MCP server and searcher implementations

  • eval/ - Evaluation scripts and metrics

  • data/ - Evaluation dataset and test repositories

  • scripts/ - Utility scripts for testing and debugging
