Fast Embedding MCP SSE
Provides an OpenAI-compatible API for text embedding, similarity search, and document indexing using a fast static embedding model.
Fast Embedding MCP / SSE — Stable Static Embedding server
Serve RikkaBotan/stable-static-embedding-fast-retrieval-mrl-en-v2
over an OpenAI-compatible HTTP API and an MCP server (stdio).
The model is a ~16M-parameter English static embedding model: 512D native with Matryoshka (MRL) truncation to 256 / 128 / 64 / 32. It is fast (no attention) and tiny.
Install
This project uses uv for environment management. Install uv first if you don't have it (instructions):
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Then clone and sync. uv sync creates a .venv, installs the pinned
dependencies from uv.lock, and installs the project itself:
git clone https://github.com/Rikka-Botan/Fast-Embedding-MCP-SSE.git
cd Fast-Embedding-MCP-SSE
uv sync
uv picks a compatible Python (3.10+) automatically, so no manual venv or
activation is needed; prefix commands with uv run. The first server run
downloads the model from Hugging Face (~60 MB) and caches it.
HTTP API
uv run python -m sse_embedding.api # serves on http://0.0.0.0:8000
# or, equivalently: uv run sse-api
Configurable via SSE_API_HOST / SSE_API_PORT.
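As a rough illustration of how a client script might reuse the same settings, the sketch below builds a base URL from SSE_API_HOST and SSE_API_PORT. The helper name base_url_from_env is hypothetical, and the fallback values (port 8000, and mapping the wildcard bind address 0.0.0.0 to localhost for client use) are assumptions based on the default address shown above, not documented server behavior.

```python
import os


def base_url_from_env(env=None):
    """Build an OpenAI-style base URL from SSE_API_HOST / SSE_API_PORT.

    Hypothetical helper: the defaults mirror the address shown in this
    README and are an assumption, not the server's documented behavior.
    """
    env = os.environ if env is None else env
    host = env.get("SSE_API_HOST", "0.0.0.0")
    port = int(env.get("SSE_API_PORT", "8000"))
    # 0.0.0.0 is a bind address, not a destination; a client on the same
    # machine should connect to localhost instead.
    if host == "0.0.0.0":
        host = "localhost"
    return f"http://{host}:{port}/v1"


print(base_url_from_env({}))
print(base_url_from_env({"SSE_API_HOST": "127.0.0.1", "SSE_API_PORT": "9000"}))
```

The returned string can be passed as base_url to any OpenAI-compatible client.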
Endpoints
Method | Path | Purpose |
POST | /v1/embeddings | OpenAI-compatible embeddings (supports dimensions) |
POST | | Cosine similarity matrix between two text sets |
POST | | Rank documents against a query (stateless) |
POST | /index/add | Add documents to the in-memory index |
POST | /index/query | Query the in-memory index |
GET | | Index size |
POST | | Empty the index |
GET | | Health check |
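To make the similarity and ranking rows more concrete, here is a minimal pure-Python sketch of what a cosine similarity matrix and a stateless ranking compute. The function names (cosine, similarity_matrix, rank) and the toy 3-D vectors are illustrative assumptions; this is not the server's code, and it says nothing about the actual request or response format.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(rows, cols):
    """matrix[i][j] is the cosine similarity of rows[i] and cols[j]."""
    return [[cosine(r, c) for c in cols] for r in rows]


def rank(query_vec, doc_vecs, top_k=3):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy 3-D vectors standing in for real embeddings.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
query = [1.0, 0.1, 0.0]

print(similarity_matrix([query], docs))
print(rank(query, docs, top_k=2))
```

With real embeddings the vectors would come from the embeddings endpoint; the arithmetic stays the same.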
OpenAI-compatible example
Works with the OpenAI SDK by pointing base_url at this server:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
resp = client.embeddings.create(
    model="RikkaBotan/stable-static-embedding-fast-retrieval-mrl-en-v2",
    input=["hello world", "good morning"],
    dimensions=256,  # MRL truncation: 512/256/128/64/32
)
print(len(resp.data[0].embedding))  # 256
Or raw:
curl -X POST http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "hello world", "dimensions": 128}'
Search / index example
curl -X POST http://localhost:8000/index/add \
  -H "Content-Type: application/json" \
  -d '{"documents": ["The cat sat on the mat", "Paris is in France"]}'

curl -X POST http://localhost:8000/index/query \
  -H "Content-Type: application/json" \
  -d '{"query": "Where is Paris?", "top_k": 1}'
MCP server (stdio)
uv run python -m sse_embedding.mcp_server
# or, equivalently: uv run sse-mcp
Tools exposed: embed_text, similarity, search, index_add,
index_query, index_stats, index_clear.
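The index_add, index_query, index_stats, and index_clear tools suggest a simple add / query / stats / clear lifecycle. The sketch below shows one way such an in-memory index could behave. InMemoryIndex and toy_embed are hypothetical names, the return shapes are assumptions, and the letter-count embedder is only a stand-in for the real static embedding model.

```python
import math


def toy_embed(text):
    """Stand-in embedder: L2-normalized letter counts (26-D).

    Purely illustrative; the real server uses the static embedding model.
    """
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


class InMemoryIndex:
    """Hypothetical in-memory index mirroring the add/query/stats/clear tools."""

    def __init__(self, embed):
        self._embed = embed
        self._docs = []
        self._vecs = []

    def add(self, documents):
        """Embed and store documents; return the new index size."""
        for doc in documents:
            self._docs.append(doc)
            self._vecs.append(self._embed(doc))
        return len(self._docs)

    def query(self, text, top_k=5):
        """Return up to top_k (document, score) pairs, best match first."""
        q = self._embed(text)
        # Stored vectors are unit length, so the dot product is the cosine.
        scored = [(sum(a * b for a, b in zip(q, v)), d)
                  for v, d in zip(self._vecs, self._docs)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(doc, score) for score, doc in scored[:top_k]]

    def stats(self):
        return {"size": len(self._docs)}

    def clear(self):
        self._docs.clear()
        self._vecs.clear()


index = InMemoryIndex(toy_embed)
index.add(["The cat sat on the mat", "Paris is in France"])
print(index.query("Where is Paris?", top_k=1))
print(index.stats())
```

Because everything lives in process memory, the contents of such an index are lost when the process exits.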
Register with Claude Code
Requires the Claude Code CLI. If claude is not a recognized command, you are likely using the Claude Desktop app; use the Claude Desktop config below instead.
Run from the cloned project directory:
claude mcp add sse-embedding -- uv run python -m sse_embedding.mcp_server
To make the registration work from any directory, pass the project path to uv
with --directory:
claude mcp add sse-embedding -- uv run --directory /path/to/Fast-Embedding-MCP-SSE python -m sse_embedding.mcp_server
Register with Claude Desktop
Add to claude_desktop_config.json, replacing /path/to/... with the
absolute path where you cloned this repository. uv run resolves the project's
environment from the given directory.
macOS / Linux:
{
  "mcpServers": {
    "sse-embedding": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/Fast-Embedding-MCP-SSE", "python", "-m", "sse_embedding.mcp_server"]
    }
  }
}
Windows:
{
  "mcpServers": {
    "sse-embedding": {
      "command": "uv",
      "args": ["run", "--directory", "C:\\path\\to\\Fast-Embedding-MCP-SSE", "python", "-m", "sse_embedding.mcp_server"]
    }
  }
}
If Claude Desktop reports that uv was not found, replace "command": "uv"
with the absolute path to the uv executable (which uv on macOS/Linux,
(Get-Command uv).Source in PowerShell), or point command directly at the
.venv interpreter that uv sync created
(/path/to/Fast-Embedding-MCP-SSE/.venv/bin/python, or on Windows
C:\\path\\to\\Fast-Embedding-MCP-SSE\\.venv\\Scripts\\python.exe) with
"args": ["-m", "sse_embedding.mcp_server"].
Matryoshka dimensions
Valid dim / dimensions values are 512, 256, 128, 64, and 32. Lower
dimensions give smaller vectors that are cheaper to store and faster to
compare, with graceful quality degradation.
Truncation is applied to the full 512D vector and the result is renormalized,
so cosine similarity stays valid at any level.
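The truncate-then-renormalize step can be sketched in a few lines of pure Python. This is an illustration under assumptions, not the server's implementation: the function name truncate_and_renormalize is hypothetical, and it assumes that truncation keeps the leading components of the vector, which is the usual Matryoshka convention.

```python
import math

VALID_DIMS = (512, 256, 128, 64, 32)


def truncate_and_renormalize(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm."""
    if dim not in VALID_DIMS:
        raise ValueError(f"dim must be one of {VALID_DIMS}, got {dim}")
    if len(vec) < dim:
        raise ValueError("vector is shorter than the requested dimension")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # an all-zero prefix cannot be rescaled
    return [x / norm for x in head]


# Toy 512-D vector standing in for a real embedding.
full = [math.sin(i + 1) for i in range(512)]
small = truncate_and_renormalize(full, 64)
print(len(small), round(math.sqrt(sum(x * x for x in small)), 6))
```

Because the truncated vector is rescaled to unit length, a plain dot product between two vectors of the same dimension is again their cosine similarity.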
License
Apache-2.0