Venice MCP Server
Provides a tool to proxy JSON-RPC calls (e.g., eth_call, eth_blockNumber) to the Ethereum network.
Provides a tool to proxy JSON-RPC calls (e.g., eth_call, eth_blockNumber) to the Optimism network.
Provides a tool to proxy JSON-RPC calls (e.g., eth_call, eth_blockNumber) to the Polygon network.
Provides a tool to transcribe YouTube video URLs into text.
@veniceai/mcp-server
Model Context Protocol server for the Venice API - uncensored, private AI for any MCP host (Claude Desktop, Cursor, ChatGPT, LM Studio, Continue, LibreChat, Open WebUI, AnythingLLM, Jan, Le Chat).
Plug Venice's chat, image, video, audio, music, and character models into any agent in 30 seconds. 31 tools across all modalities, one config block.
Quick start
1. Get a key from venice.ai
See the API key guide for step-by-step instructions.
2. Add this to your MCP host config
Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json on macOS, %APPDATA%\Claude\claude_desktop_config.json on Windows), Cursor (~/.cursor/mcp.json), LM Studio, etc:
{
  "mcpServers": {
    "venice": {
      "command": "npx",
      "args": ["-y", "@veniceai/mcp-server@0.2.0"],
      "env": { "VENICE_API_KEY": "<your-venice-api-key>" }
    }
  }
}

3. Restart your MCP host
That's it. Type a prompt — your agent now has chat, image, video, music, TTS, ASR, and 25 more Venice tools.
What you get
31 tools spanning every Venice modality, 3 resources (venice://models, venice://styles, venice://voices) and 3 prompt templates (uncensored research, NSFW creative writing, image style explorer).
💬 Chat & embeddings
Tool | Description |
| OpenAI-compatible chat completion against Venice's uncensored LLM catalog (Claude, GPT-5, Llama, DeepSeek, Qwen, GLM, Kimi, Venice Uncensored, etc.). |
| OpenAI-compatible Responses API. Single-turn or multi-turn with tool support. |
| Compute embeddings for text input (OpenAI-compatible). |
| Chat with a Venice character by slug. |
🎨 Image
Tool | Description |
| Generate an image. Supports Flux 2 Pro/Max, Lustify SDXL, Anime (WAI), Qwen Image, GPT Image, Nano Banana Pro and others. |
| Edit an image with a prompt. Returns base64 PNG. |
| Edit multiple images together with a single prompt (multi-image composition / outpainting). |
| Upscale an image (1–4× scale). Returns base64 PNG. |
| Remove image background; returns a transparent PNG. |
| List image style presets available for |
🎬 Video
Tool | Description |
| Queue a video generation. Supports Sora 2, Veo 3.1, Kling, Wan, LTX 2, Seedance (incl. r2v video-to-video), Runway Gen-4, and others. Accepts image, video, audio, and reference image inputs depending on model. |
| Check status of a queued video job. Returns |
| Mark a completed video as downloaded; deletes server-side media. |
| Transcribe a YouTube video URL. |
| Get a price quote for a video generation BEFORE queuing. |
🔊 Audio (TTS / ASR)
Tool | Description |
| Convert text to speech. Supports cloned voices + emotion tags ( |
| Transcribe audio from a URL. |
| List built-in voices or clone a new voice from a sample audio URL. |
| Get a price quote for music generation BEFORE queuing. |
🎵 Music
Tool | Description |
| Queue music generation. Models: ace-step-15, elevenlabs-music, minimax-music-v2/v25/v26, stable-audio-25, mmaudio-v2, elevenlabs-sound-effects-v2. |
| Check status of a queued music job. |
| Mark a completed music job as downloaded. |
🌐 Web augment
Tool | Description |
| Search the web (Firecrawl-backed). Returns ranked results with snippets. |
| Scrape one URL into markdown text. |
| Extract text from a document URL (PDF, DOCX, EPUB, PPTX, XLSX, …). |
📚 Catalog
Tool | Description |
| List the live model catalog with capabilities and prices. |
| List public Venice characters. |
⛓️ Crypto
Tool | Description |
| Proxy a JSON-RPC call to a supported blockchain network ( |
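For orientation, here is a minimal sketch of the JSON-RPC 2.0 layer that such a proxy forwards, using eth_blockNumber as the example method. It only shows the request body and how to decode the hex result; how the Venice tool wraps this payload in its own arguments is not covered here, and the helper names `buildRpcRequest` and `hexToNumber` are illustrative, not part of the server.

```typescript
// Shape of a JSON-RPC 2.0 request as accepted by Ethereum-style nodes.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: unknown[];
}

// Illustrative helper: build a request body for a method such as "eth_blockNumber".
function buildRpcRequest(method: string, params: unknown[] = [], id = 1): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method, params };
}

// Ethereum nodes return quantities as 0x-prefixed hex strings.
function hexToNumber(hex: string): number {
  if (!/^0x[0-9a-fA-F]+$/.test(hex)) {
    throw new Error(`not a hex quantity: ${hex}`);
  }
  return Number.parseInt(hex.slice(2), 16);
}

const blockNumberRequest = buildRpcRequest("eth_blockNumber");
console.log(JSON.stringify(blockNumberRequest));
// {"jsonrpc":"2.0","id":1,"method":"eth_blockNumber","params":[]}

console.log(hexToNumber("0x10")); // 16
```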
💳 x402 wallet helpers
Optional — only needed if you authenticate with a wallet via x402 instead of an API key. See x402 — pay with a wallet.
Tool | Description |
venice_x402_balance | Check the prepaid x402 credit balance for a wallet address. |
venice_x402_top_up_info | Fetch top-up requirements (network, USDC token address, receiver wallet, minimum amount). |
venice_x402_transactions | List recent x402 top-up + debit transactions for a wallet. |
Configuration
Env var | Default | Notes |
VENICE_API_KEY | (none) | Your Venice API key. The simplest setup. |
|
| |
|
| |
|
| |
|
| |
|
| Set to |
|
| |
VENICE_SIWX_TOKEN | (none) | x402 wallet-mode auth token; see the x402 section. |
|
| HTTP-mode listener. |
|
| HTTP-mode bind address. Set to |
VENICE_MCP_AUTH_TOKEN | (none) | Bearer token required by
VENICE_MCP_ALLOW_UNAUTHENTICATED_HTTP | | Emergency escape hatch for unauthenticated exposed HTTP mode. Use only behind a trusted authenticated proxy. |
|
| Maximum active Streamable HTTP sessions. |
|
| Idle Streamable HTTP session lifetime before cleanup. |
Self-hosting (Streamable HTTP)
/mcp is a credential-backed tool execution endpoint: callers can spend the configured Venice API key or x402 balance. When HTTP mode binds outside loopback, startup fails unless VENICE_MCP_AUTH_TOKEN is set, or VENICE_MCP_ALLOW_UNAUTHENTICATED_HTTP=1 is explicitly set behind a trusted authenticated proxy.
docker run -p 3333:3333 \
  -e VENICE_API_KEY=<your-venice-api-key> \
  -e VENICE_MCP_AUTH_TOKEN=<choose-a-long-random-token> \
  ghcr.io/veniceai/venice-mcp-server:latest
# server at http://localhost:3333/mcp

Clients should send Authorization: Bearer <choose-a-long-random-token> with HTTP MCP requests. HTTP clients should create new sessions without an mcp-session-id header and then reuse the server-issued session ID; unknown or malformed caller-provided session IDs are rejected. For reproducible production installs, pin the npm package version as shown in the examples instead of using an unversioned latest install path.
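The client-side header rules can be sketched as a small helper. This is an illustration under assumptions, not code from this package: `buildMcpHeaders` is a hypothetical name, and the Accept value follows the MCP Streamable HTTP transport convention of accepting both JSON and event-stream responses.

```typescript
// Hypothetical helper: build headers for a Streamable HTTP request to /mcp.
function buildMcpHeaders(authToken: string, sessionId?: string): Record<string, string> {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${authToken}`,
    "Content-Type": "application/json",
    // Streamable HTTP clients accept both plain JSON and SSE responses.
    Accept: "application/json, text/event-stream",
  };
  // Only echo back an ID the server issued; a new session sends no ID at all.
  if (sessionId !== undefined) {
    headers["mcp-session-id"] = sessionId;
  }
  return headers;
}

// First request of a session: no mcp-session-id header.
const firstRequestHeaders = buildMcpHeaders("<choose-a-long-random-token>");
console.log("mcp-session-id" in firstRequestHeaders); // false

// Later requests: reuse the ID returned by the server (placeholder value here).
const laterRequestHeaders = buildMcpHeaders("<choose-a-long-random-token>", "server-issued-id");
console.log(laterRequestHeaders["mcp-session-id"]); // server-issued-id
```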
Or run from source — see Development below.
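The startup rule for HTTP mode described at the top of this section can be read as a simple guard. The sketch below is an interpretation of that rule, not the server's actual source; the function name `assertSafeHttpBind` and the exact set of hosts treated as loopback are assumptions.

```typescript
interface HttpEnv {
  VENICE_MCP_AUTH_TOKEN?: string;
  VENICE_MCP_ALLOW_UNAUTHENTICATED_HTTP?: string;
}

// Assumption: these are the bind addresses treated as loopback.
const LOOPBACK_HOSTS = new Set(["127.0.0.1", "::1", "localhost"]);

// Throws when HTTP mode would expose /mcp without any authentication.
function assertSafeHttpBind(host: string, env: HttpEnv): void {
  // Loopback binds are reachable only from the local machine.
  if (LOOPBACK_HOSTS.has(host)) return;
  // A non-blank bearer token protects an exposed listener.
  const token = env.VENICE_MCP_AUTH_TOKEN;
  if (token !== undefined && token.trim() !== "") return;
  // Explicit opt-out, intended only behind a trusted authenticated proxy.
  if (env.VENICE_MCP_ALLOW_UNAUTHENTICATED_HTTP === "1") return;
  throw new Error(
    `refusing to bind ${host}: set VENICE_MCP_AUTH_TOKEN or VENICE_MCP_ALLOW_UNAUTHENTICATED_HTTP=1`
  );
}

assertSafeHttpBind("127.0.0.1", {}); // ok: loopback
assertSafeHttpBind("0.0.0.0", { VENICE_MCP_AUTH_TOKEN: "a-long-random-token" }); // ok: token set
```

Calling `assertSafeHttpBind("0.0.0.0", {})` throws, which mirrors the documented startup failure.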
x402 — pay with a wallet, no account required
Skip this section if you're using VENICE_API_KEY. Everything below is optional and only matters if you specifically want to pay with a crypto wallet instead of a Venice account.
Venice supports authenticating with a SIWE-signed wallet token (a.k.a. SIWX) backed by prepaid USDC credit on Base mainnet, in addition to the normal API key flow. This lets you use Venice with no email, phone, or KYC — your wallet is the only identity.
Two-line config
{
  "mcpServers": {
    "venice": {
      "command": "npx",
      "args": ["-y", "@veniceai/mcp-server@0.2.0"],
      "env": { "VENICE_SIWX_TOKEN": "<base64 SIWE payload>" }
    }
  }
}

The MCP server forwards VENICE_SIWX_TOKEN as the X-Sign-In-With-X header on every Venice API call.
How it works
ONE-TIME SETUP (per wallet)
  Sign a SIWE message → produces a SIWX token (base64 JSON)
  Set VENICE_SIWX_TOKEN in this MCP server's env

TOP UP (when balance is low)
  POST /api/v1/x402/top-up (no payment header) → 402 + payment requirements
  Sign a USDC EIP-3009 transferWithAuthorization in your wallet
  POST /api/v1/x402/top-up with X-402-Payment: <signed> → Venice settles via
    Coinbase CDP facilitator and credits your prepaid balance

EVERY INFERENCE CALL
  MCP server sends X-Sign-In-With-X: <SIWX token>
  Venice → wallet → credit account → debits and runs inference

This MCP server never sees your private key. SIWE signing and USDC authorization happen in your wallet (MetaMask, Coinbase Wallet, viem script, etc.); the server is purely a header forwarder.
The helper tools venice_x402_balance, venice_x402_top_up_info, and venice_x402_transactions make balance + top-up flow inspectable from inside the agent.
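The two-step top-up exchange can be sketched as plain request descriptors. This is a hedged illustration only: the full URL is assumed by joining the api.venice.ai host with the /api/v1/x402/top-up path, the request body is omitted because its shape is not documented here, and `probeRequest`, `paymentRequest`, and `nextStep` are hypothetical names. Signing itself happens in your wallet and is out of scope.

```typescript
interface TopUpRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
}

// Assumption: host and path combined from the values shown in this README.
const TOP_UP_URL = "https://api.venice.ai/api/v1/x402/top-up";

// Step 1: POST with no payment header; the expected reply is 402 + payment requirements.
function probeRequest(): TopUpRequest {
  return { method: "POST", url: TOP_UP_URL, headers: {} };
}

// Step 2: POST again, carrying the wallet-signed EIP-3009 authorization.
function paymentRequest(signedPayment: string): TopUpRequest {
  return { method: "POST", url: TOP_UP_URL, headers: { "X-402-Payment": signedPayment } };
}

// Decide what to do with a response status from either step.
function nextStep(status: number): "sign-and-retry" | "done" | "error" {
  if (status === 402) return "sign-and-retry";
  if (status >= 200 && status < 300) return "done";
  return "error";
}

console.log(nextStep(402)); // sign-and-retry
console.log(paymentRequest("<signed>").headers["X-402-Payment"]); // <signed>
```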
Why prepaid instead of per-call?
⚡ Latency — once topped up, calls are sub-100ms (no on-chain settlement per call)
🧮 Throughput — Coinbase CDP facilitator settles top-ups in batches
🔒 Privacy — wallet ↔ credit account is the only identity link; no email/phone/KYC
🪙 DIEM shortcut — wallets linked to a Venice user with DIEM staked consume from staking balance, no USDC needed
💸 Min top-up $5 (anti-dust). Minimum balance to inference is $0.10.
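The two thresholds in the last bullet translate into simple pre-flight checks. The sketch below assumes amounts are tracked as integer micro-dollars (USDC uses 6 decimals); how Venice represents balances internally is not stated here, and the constant and function names are illustrative. It ignores the DIEM staking path.

```typescript
// 1 USD = 1_000_000 micro-dollars, matching USDC's 6 decimals.
const MICRO_PER_USD = 1_000_000;

const MIN_TOP_UP_MICRO = 5 * MICRO_PER_USD;             // $5 minimum top-up
const MIN_INFERENCE_BALANCE_MICRO = MICRO_PER_USD / 10; // $0.10 minimum balance

// True when a top-up amount meets the anti-dust minimum.
function isValidTopUp(amountMicro: number): boolean {
  return Number.isInteger(amountMicro) && amountMicro >= MIN_TOP_UP_MICRO;
}

// True when the prepaid balance is high enough to run inference.
function canRunInference(balanceMicro: number): boolean {
  return balanceMicro >= MIN_INFERENCE_BALANCE_MICRO;
}

console.log(isValidTopUp(4_990_000));  // false: below $5
console.log(isValidTopUp(10_000_000)); // true: $10
console.log(canRunInference(99_999));  // false: just under $0.10
console.log(canRunInference(100_000)); // true: exactly $0.10
```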
Per-call HTTP 402 — not supported
Venice rejects X-402-Payment on inference routes. The header is only accepted on /api/v1/x402/top-up. This is by design — Venice settles top-ups in batches via the Coinbase CDP facilitator, then debits a fast off-chain credit account on inference. If you need per-call settlement semantics, you'll need a separate proxy that pays the credit account on demand.
Auth-mode coverage notes
Some Venice endpoints don't accept both auth modes:
Tool | API key | x402 | Notes |
| ✓ | ✗ | Characters endpoint is API-key only |
| ✗ | ✓ | Wallet-bound by design |
| ✗ | ✓ | Wallet-bound by design |
| ✓ | ✓ | Auth-free; same 402 response in both modes |
Hybrid
If both VENICE_API_KEY and VENICE_SIWX_TOKEN are set, the API key wins. SIWX is used only when the key is absent.
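The precedence rule can be sketched as a small resolver that picks the outgoing auth header. This is an illustration of the documented behavior, not the server's source; `resolveAuth` and the `AuthMode` type are hypothetical names, and treating a blank value as missing is an assumption taken from the FAQ wording.

```typescript
interface AuthEnv {
  VENICE_API_KEY?: string;
  VENICE_SIWX_TOKEN?: string;
}

type AuthMode = "api-key" | "siwx" | "none";

interface ResolvedAuth {
  mode: AuthMode;
  headers: Record<string, string>;
}

// API key wins; SIWX is used only when the key is absent or blank.
function resolveAuth(env: AuthEnv): ResolvedAuth {
  const apiKey = (env.VENICE_API_KEY ?? "").trim();
  if (apiKey !== "") {
    return { mode: "api-key", headers: { Authorization: `Bearer ${apiKey}` } };
  }
  const siwx = (env.VENICE_SIWX_TOKEN ?? "").trim();
  if (siwx !== "") {
    return { mode: "siwx", headers: { "X-Sign-In-With-X": siwx } };
  }
  // No credentials: requests go out without an auth header.
  return { mode: "none", headers: {} };
}

console.log(resolveAuth({ VENICE_API_KEY: "key", VENICE_SIWX_TOKEN: "token" }).mode); // api-key
console.log(resolveAuth({ VENICE_API_KEY: "  ", VENICE_SIWX_TOKEN: "token" }).mode);  // siwx
console.log(resolveAuth({}).mode);                                                    // none
```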
Architecture
┌──────────────────────┐   stdio OR              ┌────────────────────────┐
│ MCP host             │   Streamable HTTP       │ @veniceai/mcp-server   │
│ (Claude / Cursor /   ├────────────────────────▶│ - 31 tools             │
│ ChatGPT / etc.)      │                         │ - 3 resources          │
└──────────────────────┘                         │ - 3 prompts            │
                                                 │ - header forwarder     │
                                                 └────────────┬───────────┘
                                                              │ HTTPS
                                                              │ Authorization: Bearer ***
                                                              │ OR
                                                              │ X-Sign-In-With-X: <SIWX>
                                                              ▼
                                                 ┌────────────────────────┐
                                                 │ Venice API             │
                                                 │ api.venice.ai          │
                                                 └────────────────────────┘

Tool reference (endpoints + auth modes)
Inference (API key OR x402 wallet)
Catalog & quotes (auth-free)
Characters (API key only)
x402 wallet helpers (SIWX only)
Development
npm install
npm run build
npm test                  # full suite (71 tests across 10 suites, ~3s)
npm run test:unit         # unit tests only
npm run test:integration  # spawns dist/cli.js + a mock Venice over real stdio JSON-RPC
npm start                 # stdio mode
npm run start:http        # http mode on :3333

Test layout
test/
├── config.test.ts          # env parsing, defaults, header precedence
├── format.test.ts          # 402 formatter cases
├── venice-client.test.ts   # HTTP client + real mock Venice
├── tools.test.ts           # 31 tool registry + endpoint+method+body mappings
├── integration.test.ts     # end-to-end JSON-RPC over stdio against a mock Venice
└── helpers/
    ├── stub-client.ts          # in-process VeniceClient stub
    └── mock-venice-server.ts   # real http.Server fake of Venice for integration tests

The integration suite spawns the compiled CLI and speaks JSON-RPC on its stdin/stdout, exercising initialize → tools/list → tools/call → resources/list → resources/read against a real HTTP mock Venice in three auth scenarios (API key only, SIWX only, no auth).
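To make the message sequence concrete, here is a sketch of the JSON-RPC requests such a test would write to the server's stdin. It is not taken from the test suite: the `frame` helper, the client name, the protocol version string, and the empty tool arguments are assumptions for illustration. MCP's stdio transport sends one JSON message per line.

```typescript
interface McpRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// stdio transport: each message is a single line of JSON followed by a newline.
function frame(message: McpRequest): string {
  return JSON.stringify(message) + "\n";
}

// The sequence named above. A real client also sends a
// "notifications/initialized" notification after the initialize response.
const sequence: McpRequest[] = [
  {
    jsonrpc: "2.0",
    id: 1,
    method: "initialize",
    params: {
      protocolVersion: "2025-03-26", // assumed version string
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.0.0" }, // hypothetical client
    },
  },
  { jsonrpc: "2.0", id: 2, method: "tools/list" },
  {
    jsonrpc: "2.0",
    id: 3,
    method: "tools/call",
    // Tool name from this README; its real arguments are not shown here.
    params: { name: "venice_x402_balance", arguments: {} },
  },
  { jsonrpc: "2.0", id: 4, method: "resources/list" },
  { jsonrpc: "2.0", id: 5, method: "resources/read", params: { uri: "venice://models" } },
];

const framedMessages = sequence.map(frame);
process.stdout.write(framedMessages.join(""));
```

In a real harness the framed strings would be written to the spawned CLI's stdin rather than to this process's stdout.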
End-to-end with live Venice + Base mainnet
test/e2e/ is a phased harness against the real Venice API and real Base mainnet — not a mock. It generates a throwaway wallet, signs SIWE + EIP-3009 payloads with viem, and drives the MCP server via JSON-RPC over stdio. The wallet is persisted at .e2e-wallet.json (chmod 600, gitignored — never commit).
Phase | npm script | Cost | What it tests |
|
| free | Generate / reload wallet, print address + balance |
|
| free | SIWX → MCP |
|
| $5 USDC + gas | Sign EIP-3009 → POST |
|
| ~$0.001 / call | SIWX → MCP |
|
| free | Read on-chain USDC + Venice prepaid via |
|
| free |
|
# Comprehensive: all 31 tools × both auth modes, side-by-side report
VENICE_API_KEY=<your-venice-api-key> npm run test:e2e:all-tools

FAQ
Do I have to deal with crypto?
No. The simple path is VENICE_API_KEY + a normal Venice account. x402 is an option for users who want a wallet-only flow.
Where does the wallet's private key live?
Not in this server. You sign the SIWE message and USDC top-up authorizations in your own wallet (MetaMask, Coinbase Wallet, viem-script, etc.). The server only sees the resulting SIWX token and never sees a private key.
Minimum top-up?
$5 USD (anti-dust). Minimum balance to call inference is $0.10. Default suggested top-up is $10.
Privacy guarantees?
No email, phone, or KYC if you go the SIWX path. The wallet ↔ credit account mapping is the only identity link. The MCP server itself does not log prompts or responses. Combine with X-Venice-TEE-Required: 1 (passed through by your client) to also run inference inside Intel TDX + NVIDIA NRAS confidential compute.
DIEM staking?
If your wallet is linked to a Venice user with DIEM staked, calls consume from the staking balance instead of USDC credits, so no top-up is needed.
Getting 402 errors even though I have an API key?
The most common cause is that VENICE_API_KEY isn't being forwarded to the MCP server process. Most MCP hosts (Claude Desktop, Cursor, Codex, etc.) only pass environment variables that are explicitly listed in the "env" block of your MCP config — system-level env vars are not automatically inherited. Make sure your config looks like this:
{
  "mcpServers": {
    "venice": {
      "command": "npx",
      "args": ["-y", "@veniceai/mcp-server@0.2.0"],
      "env": { "VENICE_API_KEY": "<your-venice-api-key>" }
    }
  }
}

If the key is missing or blank, the server falls back to x402 mode and returns a 402 payment challenge.
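A quick way to rule out this cause is to check the host config programmatically. The sketch below assumes the config has the `mcpServers` shape shown above and that the server entry is named "venice"; `hasVeniceKey` is a hypothetical helper, not part of this package.

```typescript
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface McpHostConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// True only when the named server entry lists a non-blank VENICE_API_KEY in its env block.
function hasVeniceKey(config: McpHostConfig, serverName = "venice"): boolean {
  const entry = config.mcpServers[serverName];
  const key = entry?.env?.["VENICE_API_KEY"];
  return typeof key === "string" && key.trim() !== "";
}

const goodConfig: McpHostConfig = {
  mcpServers: {
    venice: {
      command: "npx",
      args: ["-y", "@veniceai/mcp-server@0.2.0"],
      env: { VENICE_API_KEY: "<your-venice-api-key>" },
    },
  },
};

// Same entry without an env block: the key would not reach the server process.
const missingEnvConfig: McpHostConfig = {
  mcpServers: { venice: { command: "npx", args: ["-y", "@veniceai/mcp-server@0.2.0"] } },
};

console.log(hasVeniceKey(goodConfig));       // true
console.log(hasVeniceKey(missingEnvConfig)); // false
```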
Disclaimer
Community-maintained. Provided as-is, with no warranty or SLA from Venice AI. Use at your own risk.
License
MIT