Skip to main content
Glama

Vox MCP

Multi-model AI gateway for MCP clients.

Why

MCP clients like Claude Code, Claude Desktop, and Cursor are locked to their host model. Vox gives them access to every other model — Gemini, GPT, Grok, DeepSeek, Kimi, or your local Ollama — through a single chat tool.

The design is deliberately minimal: prompts go to providers unmodified, responses come back unmodified. No system prompt injection. No response formatting. No behavioral directives. The only value Vox adds is routing and conversation memory — everything else is pure passthrough.

What it does

Send a prompt, optionally attach files or images, pick a model (or let the agent pick), and get back the model's raw response. Conversation threads persist in memory via continuation_id for multi-turn exchanges across any provider — start a thread with Gemini, continue it with GPT. Threads are shadow-persisted to disk as JSONL for durability and can be exported as Markdown.

3 tools:

Tool

Description

chat

Send prompts to any configured AI model with optional file/image context

listmodels

Show available models, aliases, and capabilities

dump_threads

Export conversation threads as JSON or Markdown

8 providers:

Provider

Env Variable

Example Models

Google Gemini

GEMINI_API_KEY

gemini-2.5-pro

OpenAI

OPENAI_API_KEY

gpt-5.1, gpt-5, o3, o4-mini

Anthropic

ANTHROPIC_API_KEY

claude-4-opus, claude-4-sonnet

xAI

XAI_API_KEY

grok-3, grok-3-fast

DeepSeek

DEEPSEEK_API_KEY

deepseek-v4-pro

Moonshot (Kimi)

MOONSHOT_API_KEY

kimi-k2.6

OpenRouter

OPENROUTER_API_KEY

Any OpenRouter model

Custom

CUSTOM_API_URL

Ollama, vLLM, LM Studio, etc.

Quick start

git clone https://github.com/linxule/vox-mcp.git
cd vox-mcp
cp .env.example .env
# Edit .env — add at least one API key
uv sync
uv run python server.py

MCP client configuration

Vox runs as a stdio MCP server. Each client needs to know how to launch it.

Replace /path/to/vox-mcp with the absolute path to your cloned repo.

Claude Code (CLI)

claude mcp add vox-mcp \
  -e GEMINI_API_KEY=your-key-here \
  -- uv run --directory /path/to/vox-mcp python server.py

Or add to .mcp.json in your project root:

{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}

Claude Desktop

Add to claude_desktop_config.json:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}

Cursor

Add to .cursor/mcp.json (project) or ~/.cursor/mcp.json (global):

{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}

Any MCP client

The canonical stdio configuration:

{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}

Tips:

  • Paths must be absolute

  • You only need one API key to start — add more providers later via .env

  • The .env file in the vox-mcp directory is loaded automatically, so API keys can go there instead of in the client config

  • Use VOX_FORCE_ENV_OVERRIDE=true in .env if client-passed env vars conflict with your .env values

Configuration

Copy .env.example to .env and configure:

  • API keys — at least one provider key is required

  • DEFAULT_MODELauto (default, agent picks) or a specific model name

  • Model restrictionsGOOGLE_ALLOWED_MODELS, OPENAI_ALLOWED_MODELS, etc.

  • CONVERSATION_TIMEOUT_HOURS — thread TTL (default: 24h)

  • MAX_CONVERSATION_TURNS — thread length limit (default: 100)

See .env.example for the full reference.

Development

uv sync
uv run python -c "import server"   # smoke test
uv run pytest                       # run tests

See CONTRIBUTING.md for code style, project structure, and how to add providers.

License

Apache 2.0 — see LICENSE and NOTICE.

Derived from pal-mcp-server by Beehive Innovations.

A
license - permissive license
-
quality - not tested
B
maintenance

Maintenance

Maintainers
Response time
6wRelease cycle
3Releases (12mo)

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/linxule/vox-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server