vector-mcp

by markdevshop

Python

Hybrid

# Vector Mcp

CLI or API | MCP | Agent

Version: 2.0.0

Overview

Vector Mcp is a production-grade Agent and Model Context Protocol (MCP) server that integrates retrieval-augmented generation (RAG) into AI agents. It supports multiple vector database technologies.

Key Features

Consolidated Action-Routed MCP Tools: Minimizes token overhead and eliminates tool bloat in LLM contexts by grouping methods into optimized, togglable tool modules.
Enterprise-Grade Security: Comprehensive support for Eunomia policies, OIDC token delegation, and granular execution context tracking.
Integrated Graph Agent: Built-in Pydantic AI agent supporting the Agent Control Protocol (ACP) and standard Web interfaces (AG-UI).
Native Telemetry & Tracing: Out-of-the-box OpenTelemetry exports and native Langfuse tracing.

CLI or API

This package wraps a RAG API that supports multiple vector database technologies. You can interact with it programmatically or via its integrated execution entrypoints.

Detailed instructions on how to use the underlying API wrappers, extended schema bindings, and developer SDK references are maintained in docs/index.md.

MCP

This server utilizes dynamic Action-Routed tools to optimize token overhead and maximize IDE compatibility.

Available MCP Tools

Auto-generated from the live MCP server — do not edit by hand.

Condensed action-routed tools (default — `MCP_TOOL_MODE=condensed`)

MCP Tool	Toggle Env Var	Description
`vector_collection_management`	`COLLECTION_MANAGEMENTTOOL`	Perform collection management operations.

Verbose 1:1 API-mapped tools (`MCP_TOOL_MODE=verbose` or `both`)

MCP Tool	Toggle Env Var	Description
`vector_add_documents`	`APITOOL`	Add documents.
`vector_create_collection`	`APITOOL`	Create a collection.
`vector_delete_collection`	`APITOOL`	Delete a collection.
`vector_lexical_search`	`APITOOL`	Perform lexical search.
`vector_list_collections`	`APITOOL`	List collections.
`vector_search`	`SEARCHTOOL`	Perform hybrid search.
`vector_semantic_search`	`APITOOL`	Perform semantic search.

1 action-routed tool(s) (default) · 7 verbose 1:1 tool(s). Each is enabled unless its <DOMAIN>TOOL toggle is set false; MCP_TOOL_MODE selects the surface (condensed default · verbose 1:1 · both). Auto-generated — do not edit.

Detailed tool schemas, parameter shapes, and validation constraints are preserved in docs/mcp.md.
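
The interaction between MCP_TOOL_MODE and the per-tool toggles can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not the server's actual code: the tool and toggle names are copied from the tables, while `exposed_tools`, `is_enabled`, and the case-insensitive handling of `false` are assumptions made for the example.

```python
import os

# Tool-to-toggle mappings copied from the tables above.
CONDENSED_TOOLS = {"vector_collection_management": "COLLECTION_MANAGEMENTTOOL"}
VERBOSE_TOOLS = {
    "vector_add_documents": "APITOOL",
    "vector_create_collection": "APITOOL",
    "vector_delete_collection": "APITOOL",
    "vector_lexical_search": "APITOOL",
    "vector_list_collections": "APITOOL",
    "vector_search": "SEARCHTOOL",
    "vector_semantic_search": "APITOOL",
}
VALID_MODES = {"condensed", "verbose", "both"}


def is_enabled(toggle, env):
    """A tool stays enabled unless its toggle is explicitly set to false.

    Case-insensitive matching is an assumption of this sketch.
    """
    return env.get(toggle, "true").strip().lower() != "false"


def exposed_tools(env=None):
    """Return the sorted tool names exposed for the given environment (hypothetical helper)."""
    env = os.environ if env is None else env
    mode = env.get("MCP_TOOL_MODE", "condensed").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported MCP_TOOL_MODE: {mode!r}")

    candidates = {}
    if mode in {"condensed", "both"}:
        candidates.update(CONDENSED_TOOLS)
    if mode in {"verbose", "both"}:
        candidates.update(VERBOSE_TOOLS)
    return sorted(name for name, toggle in candidates.items() if is_enabled(toggle, env))
```

For example, `exposed_tools({"MCP_TOOL_MODE": "verbose", "SEARCHTOOL": "false"})` returns the six `APITOOL`-gated tools and omits `vector_search`.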

Dynamic Tool Selection & Visibility

This MCP server supports dynamic toolset selection and visibility filtering at runtime. This lets you restrict the set of exposed tools so that tool definitions do not overflow the LLM's context window.

You can configure tool filtering via multiple input channels:

- CLI Arguments: Pass --tools or --toolsets (or their disabled counterparts --disabled-tools and --disabled-toolsets) during startup.
- Environment Variables: Define standard environment variables:
  - MCP_ENABLED_TOOLS / MCP_DISABLED_TOOLS
  - MCP_ENABLED_TAGS / MCP_DISABLED_TAGS
- HTTP SSE Request Headers: Pass custom headers during transport initialization:
  - x-mcp-enabled-tools / x-mcp-disabled-tools
  - x-mcp-enabled-tags / x-mcp-disabled-tags
- HTTP SSE Request Query Parameters: Append query parameters directly to your transport connection URL:
  - ?tools=tool1,tool2
  - ?tags=tag1

When query strings or parameters are supplied, an LLM-free Knowledge Graph resolution layer (using DynamicToolOrchestrator) matches query intents against known tool tags, names, or descriptions, with safe fallback and automated 24-hour background cache refreshing.
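
As a rough client-side illustration of the header and query-parameter channels, the sketch below builds a filtered connection URL and the matching header dictionary using only the Python standard library. The helper names `filtered_url` and `filter_headers` are hypothetical, the `search` tag is an invented example value, and the comma-separated format is taken from the `?tools=tool1,tool2` example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def filtered_url(base_url, tools=(), tags=()):
    """Append ?tools=... and ?tags=... to a transport URL (hypothetical helper)."""
    parts = urlsplit(base_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if tools:
        query.append(("tools", ",".join(tools)))
    if tags:
        query.append(("tags", ",".join(tags)))
    # safe="," keeps the comma-separated lists readable in the URL.
    return urlunsplit(parts._replace(query=urlencode(query, safe=",")))


def filter_headers(enabled_tools=(), disabled_tools=(), enabled_tags=(), disabled_tags=()):
    """Build the x-mcp-* headers listed above, skipping empty lists."""
    pairs = {
        "x-mcp-enabled-tools": enabled_tools,
        "x-mcp-disabled-tools": disabled_tools,
        "x-mcp-enabled-tags": enabled_tags,
        "x-mcp-disabled-tags": disabled_tags,
    }
    return {name: ",".join(values) for name, values in pairs.items() if values}


if __name__ == "__main__":
    print(filtered_url("http://localhost:8000/mcp", tools=["vector_search"], tags=["search"]))
    print(filter_headers(disabled_tools=["vector_delete_collection"]))
```

Pass the resulting URL or headers to whichever MCP client you use when it opens the transport.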

MCP Configuration Examples

Install the slim [mcp] extra. All examples below install vector-mcp[mcp] — the MCP-server extra that pulls only the FastMCP / FastAPI tooling (agent-utilities[mcp]). It deliberately excludes the heavy agent runtime (the epistemic-graph engine, pydantic-ai, dspy, llama-index, tree-sitter), so uvx/container installs are dramatically smaller and faster. Use the full [agent] extra only when you need the integrated Pydantic AI agent (see Installation).

stdio Transport (Recommended for local IDEs e.g., Cursor, Claude Desktop)

Configure your IDE's mcp.json to launch the MCP server via uvx:

{
  "mcpServers": {
    "vector-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "vector-mcp[mcp]",
        "vector-mcp"
      ],
      "env": {
        "VECTOR_URL": "your_vector_url_here",
        "EMBEDDING_MODEL_ID": "your_embedding_model_id_here",
        "CHUNK_SIZE": "your_chunk_size_here",
        "VECTOR_API_KEY": "your_vector_api_key_here"
      }
    }
  }
}

Streamable-HTTP Transport (Recommended for production deployments)

Configure your client's mcp.json to launch the Streamable-HTTP server via uvx with explicit host and port definition:

{
  "mcpServers": {
    "vector-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "vector-mcp[mcp]",
        "vector-mcp"
      ],
      "env": {
        "TRANSPORT": "streamable-http",
        "HOST": "0.0.0.0",
        "PORT": "8000",
        "VECTOR_URL": "your_vector_url_here",
        "EMBEDDING_MODEL_ID": "your_embedding_model_id_here",
        "CHUNK_SIZE": "your_chunk_size_here",
        "VECTOR_API_KEY": "your_vector_api_key_here"
      }
    }
  }
}

Alternatively, connect to a pre-deployed remote or local Streamable-HTTP instance:

{
  "mcpServers": {
    "vector-mcp": {
      "url": "http://localhost:8000/vector-mcp/mcp"
    }
  }
}
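
If you maintain several of these configurations, generating them from one place can reduce copy-paste drift. The following is a minimal sketch that reproduces the three JSON shapes shown above; the function names are hypothetical and nothing here is part of vector-mcp itself.

```python
import json


def stdio_entry(env, extra="mcp"):
    """Server entry that launches vector-mcp over stdio via uvx."""
    return {
        "command": "uvx",
        "args": ["--from", f"vector-mcp[{extra}]", "vector-mcp"],
        "env": dict(env),
    }


def http_entry(env, host="0.0.0.0", port=8000):
    """Same launch command, with the Streamable-HTTP transport variables added."""
    entry = stdio_entry(env)
    entry["env"] = {
        "TRANSPORT": "streamable-http",
        "HOST": host,
        "PORT": str(port),  # environment values must be strings
        **entry["env"],
    }
    return entry


def remote_entry(url):
    """Entry that points at an already running Streamable-HTTP instance."""
    return {"url": url}


def mcp_config(entry, name="vector-mcp"):
    """Wrap a server entry in the top-level mcpServers object."""
    return {"mcpServers": {name: entry}}


if __name__ == "__main__":
    env = {
        "VECTOR_URL": "your_vector_url_here",
        "EMBEDDING_MODEL_ID": "your_embedding_model_id_here",
        "CHUNK_SIZE": "your_chunk_size_here",
        "VECTOR_API_KEY": "your_vector_api_key_here",
    }
    print(json.dumps(mcp_config(http_entry(env)), indent=2))
```

Running the script prints a configuration equivalent to the Streamable-HTTP example above, ready to paste into mcp.json.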

Deploying the Streamable-HTTP server via Docker:

docker run -d \
  --name vector-mcp-mcp \
  -p 8000:8000 \
  -e TRANSPORT=streamable-http \
  -e PORT=8000 \
  -e VECTOR_URL="your_value" \
  -e EMBEDDING_MODEL_ID="your_value" \
  -e CHUNK_SIZE="your_value" \
  -e VECTOR_API_KEY="your_value" \
  blubird28/vector-mcp:mcp

The :mcp tag is the slim MCP-server image (built from docker/Dockerfile --target mcp, installing vector-mcp[mcp]). The default :latest tag is the full agent image (--target agent, vector-mcp[agent]) which also bundles the Pydantic AI agent and the epistemic-graph engine — use it when you run vector-agent (the agent), not just the MCP server. See Container images.

Additional Deployment Options

vector-mcp can also run as a local container (Docker / Podman / uv) or be consumed from a remote deployment. The Deployment guide has full, copy-paste mcp_config.json for all four transports — stdio, streamable-http, local container / uv, and remote URL:

Local container / uv — launch the server from mcp_config.json via uvx, docker run, or podman run, or point at a local streamable-http container by url.
Remote URL — connect to a server deployed behind Caddy at http://vector-mcp.arpa/mcp using the "url" key.

Environment Variables

Package environment variables

Variable	Example	Description
`HOST`	`0.0.0.0`
`PORT`	`8000`
`TRANSPORT`	`stdio`	options: stdio, streamable-http, sse
`ENABLE_OTEL`	`True`
`OTEL_EXPORTER_OTLP_ENDPOINT`	`http://localhost:8080/api/public/otel`
`OTEL_EXPORTER_OTLP_PUBLIC_KEY`	`pk-...`
`OTEL_EXPORTER_OTLP_SECRET_KEY`	`sk-...`
`OTEL_EXPORTER_OTLP_PROTOCOL`	`http/protobuf`
`EUNOMIA_TYPE`	`none`	options: none, embedded, remote
`EUNOMIA_POLICY_FILE`	`mcp_policies.json`
`EUNOMIA_REMOTE_URL`	`http://eunomia-server:8000`
`LLM_BASE_URL`	`http://localhost:8000/v1`	embedding/LLM API base url
`LLM_TOKEN`	—	bearer token for the embedding/LLM endpoint
`LLM_API_KEY`	—	alias accepted if LLM_TOKEN is unset
`LLM_SSL_VERIFY`	`False`	verify TLS for the embedding/LLM endpoint
`DOCUMENT_DIRECTORY`	`/documents`	default directory for ingested documents
`COLLECTION_MANAGEMENTTOOL`	`True`
`SEARCHTOOL`	`True`
`TEST_POSTGRES_CONNECTION_STRING`	`postgresql://postgres:password@localhost:5432/vectordb`
`TEST_MONGODB_HOST`	`localhost`
`TEST_MONGODB_PORT`	`27017`
`TEST_MONGODB_DB`	`vectordb`
`TEST_QDRANT_LOCATION`	`http://localhost:6333`
`TEST_COUCHBASE_CONNECTION`	`couchbase://localhost`
`TEST_COUCHBASE_USER`	`Administrator`
`TEST_COUCHBASE_PASSWORD`	`password`
`TEST_COUCHBASE_DB`	`vector_db`

Inherited agent-utilities variables (apply to every connector)

Variable	Example	Description
`MCP_TOOL_MODE`	`condensed`	Tool surface: `condensed`, `verbose`, or `both`
`MCP_ENABLED_TOOLS`	—	Comma-separated tool allow-list
`MCP_DISABLED_TOOLS`	—	Comma-separated tool deny-list
`MCP_ENABLED_TAGS`	—	Comma-separated tag allow-list
`MCP_DISABLED_TAGS`	—	Comma-separated tag deny-list
`MCP_CLIENT_AUTH`	—	Outbound MCP auth (`oidc-client-credentials` for fleet calls)
`OIDC_CLIENT_ID`	—	OIDC client id (service-account auth)
`OIDC_CLIENT_SECRET`	—	OIDC client secret (service-account auth)
`DEBUG`	`False`	Verbose logging
`PYTHONUNBUFFERED`	`1`	Unbuffered stdout (recommended in containers)
`MCP_URL`	`http://localhost:8000/mcp`	URL of the MCP server the agent connects to
`PROVIDER`	`openai`	LLM provider for the agent
`MODEL_ID`	`gpt-4o`	Model id for the agent
`ENABLE_WEB_UI`	`True`	Serve the AG-UI web interface

27 package + 14 inherited variable(s). Auto-generated from .env.example + the shared agent-utilities set — do not edit.

Every variable the server reads, grouped by purpose.

Connection & Credentials

Variable	Description	Default
`VECTOR_URL`	Base URL of the vector database / embedding endpoint	`http://localhost:8000`
`VECTOR_API_KEY`	API key for the vector database / embedding provider	—
`EMBEDDING_MODEL_ID`	Embedding model id used for indexing & search	`text-embedding-nomic-embed-text-v2-moe`
`CHUNK_SIZE`	Document chunk size for ingestion	`512`
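
To make the role of CHUNK_SIZE concrete, here is a deliberately simple chunking sketch. It assumes the size is measured in characters and that chunks do not overlap; the server may well count tokens or use a different splitting strategy, so treat `chunk_text` as a hypothetical stand-in rather than the actual ingestion logic.

```python
import os


def chunk_text(text, chunk_size=512):
    """Split text into consecutive pieces of at most chunk_size characters.

    Character-based, non-overlapping splitting is an assumption of this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [text[start:start + chunk_size] for start in range(0, len(text), chunk_size)]


if __name__ == "__main__":
    # Fall back to the documented default of 512 when CHUNK_SIZE is unset.
    size = int(os.environ.get("CHUNK_SIZE", "512"))
    pieces = chunk_text("lorem ipsum " * 200, size)
    print(f"{len(pieces)} chunks, largest has {max(len(p) for p in pieces)} characters")
```

Smaller chunks generally give more precise matches at the cost of more embeddings to store and search.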

MCP server / transport

Variable	Description	Default
`TRANSPORT`	`stdio`, `streamable-http`, or `sse`	`stdio`
`HOST`	Bind host (HTTP transports)	`0.0.0.0`
`PORT`	Bind port (HTTP transports)	`8000`
`MCP_TOOL_MODE`	Tool surface: `condensed`, `verbose`, or `both`	`condensed`
`MCP_ENABLED_TOOLS` / `MCP_DISABLED_TOOLS`	Comma-separated tool allow/deny list	—
`MCP_ENABLED_TAGS` / `MCP_DISABLED_TAGS`	Comma-separated tag allow/deny list	—
`PYTHONUNBUFFERED`	Unbuffered stdout (recommended in containers)	`1`
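
The transport variables and their defaults can be mirrored in a small settings loader, which is handy for validating a deployment's environment before starting the server. This is a sketch under stated assumptions: `ServerSettings` and `load_settings` are hypothetical names, and the validation rules are inferred from the table rather than taken from the server's code.

```python
import os
from dataclasses import dataclass

VALID_TRANSPORTS = ("stdio", "streamable-http", "sse")


@dataclass(frozen=True)
class ServerSettings:
    """Transport settings with the defaults documented in the table above."""
    transport: str = "stdio"
    host: str = "0.0.0.0"
    port: int = 8000


def load_settings(env=None):
    """Read TRANSPORT, HOST, and PORT from a mapping (os.environ by default)."""
    env = os.environ if env is None else env

    transport = env.get("TRANSPORT", "stdio")
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"TRANSPORT must be one of {VALID_TRANSPORTS}, got {transport!r}")

    port = int(env.get("PORT", "8000"))  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")

    return ServerSettings(transport=transport, host=env.get("HOST", "0.0.0.0"), port=port)
```

Calling `load_settings()` with no arguments reads the current process environment.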

Tool toggles

Each action-routed tool can be disabled individually via its toggle env var (set to false). The full list is in the Available MCP Tools table above.

Variable	Description	Default
`COLLECTION_MANAGEMENTTOOL`	Enable the collection-management tool	`True`
`SEARCHTOOL`	Enable the search tool	`True`

Telemetry & governance

Variable	Description	Default
`ENABLE_OTEL`	Enable OpenTelemetry export	`True`
`OTEL_EXPORTER_OTLP_ENDPOINT`	OTLP collector endpoint	—
`OTEL_EXPORTER_OTLP_PUBLIC_KEY` / `OTEL_EXPORTER_OTLP_SECRET_KEY`	OTLP auth keys	—
`OTEL_EXPORTER_OTLP_PROTOCOL`	OTLP protocol (e.g. `http/protobuf`)	—
`EUNOMIA_TYPE`	Authorization mode: `none`, `embedded`, `remote`	`none`
`EUNOMIA_POLICY_FILE`	Embedded policy file	`mcp_policies.json`
`EUNOMIA_REMOTE_URL`	Remote Eunomia server URL	—

Agent CLI (full `[agent]` runtime only)

Variable	Description	Default
`MCP_URL`	URL of the MCP server the agent connects to	`http://localhost:8000/mcp`
`PROVIDER`	LLM provider (e.g. `openai`)	`openai`
`MODEL_ID`	Model id (e.g. `gpt-4o`)	`gpt-4o`
`ENABLE_WEB_UI`	Serve the AG-UI web interface	`True`

See .env.example for a copy-paste starting point.

Agent

This repository features a fully integrated Pydantic AI Graph Agent. It communicates over the Agent Control Protocol (ACP) and interacts seamlessly with the Agent Web UI (AG-UI) and Terminal interface.

Running the Agent CLI

To start the interactive command-line agent:

# Set credentials
export VECTOR_URL="your_value"
export EMBEDDING_MODEL_ID="your_value"
export CHUNK_SIZE="your_value"
export VECTOR_API_KEY="your_value"

# Run the agent server
vector-agent --provider openai --model-id gpt-4o

Docker Compose Orchestration

The following docker/agent.compose.yml configures the Agent, Web UI, and Terminal Interface together:

version: '3.8'

services:
  vector-mcp-mcp:
    image: blubird28/vector-mcp:mcp
    container_name: vector-mcp-mcp
    hostname: vector-mcp-mcp
    restart: always
    env_file:
      - ../.env
    environment:
      - PYTHONUNBUFFERED=1
      - HOST=0.0.0.0
      - PORT=8000
      - TRANSPORT=streamable-http
    ports:
      - "8000:8000"
    healthcheck:
      test: ["CMD", "python3", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

  vector-mcp-agent:
    image: blubird28/vector-mcp:latest
    container_name: vector-mcp-agent
    hostname: vector-mcp-agent
    restart: always
    depends_on:
      - vector-mcp-mcp
    env_file:
      - ../.env
    command: [ "vector-agent" ]
    environment:
      - PYTHONUNBUFFERED=1
      - HOST=0.0.0.0
      - PORT=9023
      - MCP_URL=http://vector-mcp-mcp:8000/mcp
      - PROVIDER=${PROVIDER:-openai}
      - MODEL_ID=${MODEL_ID:-gpt-4o}
      - ENABLE_WEB_UI=True
      - ENABLE_OTEL=True
    ports:
      - "9023:9023"
    healthcheck:
      test: ["CMD", "python3", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:9023/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
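
The healthchecks above probe each container's /health endpoint from inside the container. From the host, an equivalent readiness wait could look like the sketch below. The function names are hypothetical, the retry count and interval simply mirror the compose values, and the `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=10.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as unhealthy.
        return False


def wait_until_healthy(url, retries=3, interval=30.0, probe=http_ok, sleep=time.sleep):
    """Probe the URL up to `retries` times, sleeping `interval` seconds between attempts."""
    for attempt in range(retries):
        if probe(url):
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False


if __name__ == "__main__":
    for service_url in ("http://localhost:8000/health", "http://localhost:9023/health"):
        state = "healthy" if wait_until_healthy(service_url, interval=5.0) else "unreachable"
        print(f"{service_url}: {state}")
```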

Detailed graph node architecture explanations, custom skill configurations, and agentic trace guides are available in docs/agent.md.

Security & Governance

vector-mcp is built directly on the agent-utilities core and supports its standard security features:

Access Control & Policy Enforcement

Eunomia Policies: Fine-grained, policy-driven tool authorization. Supports none, local embedded (mcp_policies.json), or centralized remote modes.
OIDC Token Delegation: Compliant with RFC 8693 token exchange, so the authenticated user's credentials propagate from Web UI / ACP → Agent → MCP.
Scoped Credentials: Each execution context is restricted to the identity of the specific caller.
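
For readers unfamiliar with RFC 8693, the sketch below assembles the form body of a token-exchange request. The parameter names and URN values come from the RFC; the helper name, the `vector-mcp` audience value, and the choice to request an access token are assumptions for illustration, and the actual exchange is handled by agent-utilities rather than by code like this.

```python
from urllib.parse import urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_form(subject_token, audience, scope=None):
    """Build the x-www-form-urlencoded body for a token-exchange request (hypothetical helper)."""
    fields = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,           # the calling user's token
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,                     # the downstream service, e.g. the MCP server
    }
    if scope:
        fields["scope"] = scope
    return urlencode(fields)


if __name__ == "__main__":
    print(token_exchange_form("user-access-token", audience="vector-mcp"))
```

The body would be POSTed to the identity provider's token endpoint together with the client's own credentials.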

Runtime Security Grid

Feature	Functionality	Enablement
Tool Guard	Sensitivity inspection with human-in-the-loop validation	Enabled by default
Prompt Injection Defense	Input scanning, repetition monitoring, and recursive loop blocks	Enabled by default
Context Safety Guard	Stuck-loop detectors and contextual overflow preemptive alerts	Enabled by default

Installation

Pick the extra that matches what you want to run:

Extra	Installs	Use when
`vector-mcp[mcp]`	Slim MCP server only (`agent-utilities[mcp]` — FastMCP/FastAPI)	You only run the MCP server (smallest install / image)
`vector-mcp[agent]`	Full agent runtime (`agent-utilities[agent,logfire]` — Pydantic AI + the epistemic-graph engine)	You run the integrated agent
`vector-mcp[all]`	Everything (`mcp` + all vector backends + `agent`)	Development / both surfaces

# MCP server only (recommended for tool hosting — slim deps)
uv pip install "vector-mcp[mcp]"

# Full agent runtime (Pydantic AI + epistemic-graph engine)
uv pip install "vector-mcp[agent]"

# Everything (development)
uv pip install "vector-mcp[all]"      # or: python -m pip install "vector-mcp[all]"

Container images (`:mcp` vs `:latest`)

One multi-stage docker/Dockerfile builds two right-sized images, selected by --target:

Image tag	Build target	Contents	Entrypoint
`blubird28/vector-mcp:mcp`	`--target mcp`	`vector-mcp[mcp]` — slim, no engine/`pydantic-ai`/`dspy`/`llama-index`/`tree-sitter`	`vector-mcp`
`blubird28/vector-mcp:latest`	`--target agent` (default)	`vector-mcp[agent]` — full agent runtime + epistemic-graph engine	`vector-agent`

docker build --target mcp   -t blubird28/vector-mcp:mcp    docker/   # slim MCP server
docker build --target agent -t blubird28/vector-mcp:latest docker/   # full agent

docker/mcp.compose.yml runs the slim :mcp server; docker/agent.compose.yml runs the agent (:latest) with a co-located :mcp sidecar.

Knowledge-graph database (`epistemic-graph`)

The full agent ([agent] / :latest) embeds the epistemic-graph engine (pulled in transitively via agent-utilities[agent]). For production — or to share one knowledge graph across multiple agents — run epistemic-graph as its own database container and point the agent at it instead of embedding it. Deployment recipes (single-node + Raft HA), connection config, and the full database architecture (with diagrams) are documented in the epistemic-graph deployment guide. The slim [mcp] server does not require the database.

Contribute

Contributions are welcome! Please ensure code quality by executing local checks before submitting pull requests:

Format code using ruff format .
Lint code using ruff check .
Validate type-safety with mypy .
Execute test suites using pytest

Deploy with `agent-os-genesis`

This package can be provisioned for you — skill-guided — by the agent-os-genesis universal skill (its single-package deploy mode): it picks your install method, seeds secrets to OpenBao/Vault (or .env), trusts your enterprise CA, registers the MCP server, and verifies it — the same machinery that stands up the whole Agent OS, narrowed to just this package. Ask your agent to "deploy vector-mcp with agent-os-genesis".

Install mode	Command
Bare-metal, prod (PyPI)	`uvx vector-mcp` · or `uv tool install vector-mcp`
Bare-metal, dev (editable)	`uv pip install -e ".[all]"` · or `pip install -e ".[all]"`
Container, prod	deploy `blubird28/vector-mcp:latest` via docker-compose / swarm / podman / podman-compose / kubernetes
Container, dev (editable)	deploy `docker/compose.dev.yml` (source-mounted at `/src`; edits live on restart)

Existing secrets are read and missing ones are seeded via vault_sync, so you are only prompted for what's missing.
