watsonx MCP Server

specs.html•21.1 KiB

<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>Technical Specifications - watsonx MCP Server</title> <link rel="preconnect" href="https://fonts.googleapis.com"> <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin> <link href="https://fonts.googleapis.com/css2?family=IBM+Plex+Sans:wght@400;500;600;700&family=IBM+Plex+Mono:wght@400;500&display=swap" rel="stylesheet"> <style> :root { --ibm-blue: #0f62fe; --carbon-gray-100: #161616; --carbon-gray-90: #262626; --carbon-gray-80: #393939; --carbon-gray-70: #525252; --carbon-gray-50: #8d8d8d; --carbon-gray-30: #c6c6c6; --carbon-gray-10: #f4f4f4; --purple: #8a3ffc; --teal: #009d9a; --green: #24a148; } * { margin: 0; padding: 0; box-sizing: border-box; } body { font-family: 'IBM Plex Sans', sans-serif; background: var(--carbon-gray-100); color: var(--carbon-gray-10); line-height: 1.7; } .container { max-width: 900px; margin: 0 auto; padding: 4rem 2rem; } nav { background: var(--carbon-gray-90); border-bottom: 1px solid var(--carbon-gray-80); padding: 1rem 2rem; position: sticky; top: 0; z-index: 100; } nav a { color: var(--carbon-gray-30); text-decoration: none; margin-right: 2rem; font-size: 0.875rem; } nav a:hover, nav a.active { color: var(--ibm-blue); } h1 { font-size: 2.5rem; margin-bottom: 0.5rem; } h1 span { color: var(--ibm-blue); } .tagline { color: var(--carbon-gray-50); font-size: 1.125rem; margin-bottom: 3rem; } h2 { font-size: 1.5rem; margin: 3rem 0 1.5rem; padding-bottom: 0.5rem; border-bottom: 1px solid var(--carbon-gray-80); } h3 { font-size: 1.125rem; margin: 2rem 0 1rem; color: var(--ibm-blue); } p { margin-bottom: 1rem; } code { font-family: 'IBM Plex Mono', monospace; background: var(--carbon-gray-90); padding: 0.2em 0.4em; border-radius: 3px; font-size: 0.875em; } pre { background: var(--carbon-gray-90); border: 1px solid var(--carbon-gray-80); padding: 1.5rem; overflow-x: auto; margin: 1rem 0; 
font-family: 'IBM Plex Mono', monospace; font-size: 0.875rem; line-height: 1.6; } table { width: 100%; border-collapse: collapse; margin: 1rem 0; } th, td { text-align: left; padding: 0.75rem 1rem; border-bottom: 1px solid var(--carbon-gray-80); } th { background: var(--carbon-gray-90); font-weight: 600; } tr:hover { background: var(--carbon-gray-90); } .spec-grid { display: grid; grid-template-columns: repeat(auto-fit, minmax(250px, 1fr)); gap: 1rem; margin: 1rem 0; } .spec-card { background: var(--carbon-gray-90); border: 1px solid var(--carbon-gray-80); padding: 1.5rem; } .spec-card h4 { font-size: 0.875rem; color: var(--carbon-gray-50); margin-bottom: 0.5rem; text-transform: uppercase; letter-spacing: 0.05em; } .spec-card .value { font-size: 1.25rem; font-weight: 600; } .badge { display: inline-block; background: var(--carbon-gray-80); padding: 0.25rem 0.5rem; border-radius: 2px; font-size: 0.75rem; margin-right: 0.5rem; margin-bottom: 0.5rem; } .badge.blue { background: var(--ibm-blue); } .badge.green { background: var(--green); } .badge.purple { background: var(--purple); } ul { margin: 1rem 0; padding-left: 1.5rem; } li { margin-bottom: 0.5rem; } .callout { background: var(--carbon-gray-90); border-left: 4px solid var(--ibm-blue); padding: 1rem 1.5rem; margin: 1.5rem 0; } .callout.warning { border-color: #f1c21b; } a { color: var(--ibm-blue); } a:hover { text-decoration: none; } </style> </head> <body> <nav> <a href="index.html">Home</a> <a href="specs.html" class="active">Specifications</a> <a href="https://github.com/PurpleSquirrelMedia/watsonx-mcp-server">GitHub</a> </nav> <div class="container"> <h1>Technical <span>Specifications</span></h1> <p class="tagline">Complete documentation for the watsonx MCP Server integration</p> <h2>System Overview</h2> <div class="spec-grid"> <div class="spec-card"> <h4>Protocol</h4> <div class="value">MCP 1.0</div> <p style="font-size: 0.875rem; color: var(--carbon-gray-50); margin-top: 0.5rem;">Model Context Protocol</p> 
</div>
<div class="spec-card"> <h4>Transport</h4> <div class="value">stdio</div> <p style="font-size: 0.875rem; color: var(--carbon-gray-50); margin-top: 0.5rem;">Standard I/O streams</p> </div>
<div class="spec-card"> <h4>Runtime</h4> <div class="value">Node.js 18+</div> <p style="font-size: 0.875rem; color: var(--carbon-gray-50); margin-top: 0.5rem;">ES Modules</p> </div>
<div class="spec-card"> <h4>Region</h4> <div class="value">us-south</div> <p style="font-size: 0.875rem; color: var(--carbon-gray-50); margin-top: 0.5rem;">Dallas, TX</p> </div>
</div>
<h2>Architecture</h2>
<p>The watsonx MCP Server implements a <strong>two-agent architecture</strong> where Claude (Opus 4.5) acts as the primary reasoning agent and can delegate specific tasks to IBM watsonx.ai foundation models.</p>
<pre>
┌─────────────────────────────────────────────────────────────────┐
│                          CLIENT LAYER                           │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                Claude Code CLI / Desktop                │    │
│  │                    (Claude Opus 4.5)                    │    │
│  └───────────────────────┬─────────────────────────────────┘    │
└──────────────────────────│──────────────────────────────────────┘
                           │ MCP Protocol (JSON-RPC over stdio)
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                        MCP SERVER LAYER                         │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │              watsonx-mcp-server (Node.js)               │    │
│  │   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   │    │
│  │   │  generate   │   │    chat     │   │ embeddings  │   │    │
│  │   └─────────────┘   └─────────────┘   └─────────────┘   │    │
│  └───────────────────────┬─────────────────────────────────┘    │
└──────────────────────────│──────────────────────────────────────┘
                           │ HTTPS + IAM Authentication
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                         IBM CLOUD LAYER                         │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │         watsonx.ai (us-south.ml.cloud.ibm.com)          │    │
│  │  ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐  │    │
│  │  │ Granite │   │ Llama 3 │   │ Mistral │   │  Slate  │  │    │
│  │  └─────────┘   └─────────┘   └─────────┘   └─────────┘  │    │
│  └─────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────┘
</pre>
<h2>Tool Specifications</h2>
<h3>watsonx_generate</h3>
<p>Generate text completions using foundation models.</p>
<table>
<tr> <th>Parameter</th> <th>Type</th> <th>Required</th> <th>Default</th> <th>Description</th> </tr>
<tr> <td><code>prompt</code></td> <td>string</td> <td>Yes</td> <td>-</td> <td>Input text prompt</td> </tr>
<tr> <td><code>model_id</code></td> <td>string</td> <td>No</td> <td>ibm/granite-13b-chat-v2</td> <td>Foundation model identifier</td> </tr>
<tr> <td><code>max_new_tokens</code></td> <td>number</td> <td>No</td> <td>500</td> <td>Maximum tokens to generate</td> </tr>
<tr> <td><code>temperature</code></td> <td>number</td> <td>No</td> <td>0.7</td> <td>Sampling temperature (0-2)</td> </tr>
<tr> <td><code>top_p</code></td> <td>number</td> <td>No</td> <td>1.0</td> <td>Nucleus sampling probability</td> </tr>
<tr> <td><code>top_k</code></td> <td>number</td> <td>No</td> <td>50</td> <td>Top-k sampling</td> </tr>
</table>
<h3>watsonx_chat</h3>
<p>Multi-turn conversation with chat models.</p>
<table>
<tr> <th>Parameter</th> <th>Type</th> <th>Required</th> <th>Description</th> </tr>
<tr> <td><code>messages</code></td> <td>array</td> <td>Yes</td> <td>Array of {role, content} objects</td> </tr>
<tr> <td><code>model_id</code></td> <td>string</td> <td>No</td> <td>Chat model to use</td> </tr>
<tr> <td><code>max_new_tokens</code></td> <td>number</td> <td>No</td> <td>Maximum response length</td> </tr>
<tr> <td><code>temperature</code></td> <td>number</td> <td>No</td> <td>Response randomness</td> </tr>
</table>
<p>Message roles:</p>
<ul>
<li><code>system</code> - System instructions</li>
<li><code>user</code> - User input</li>
<li><code>assistant</code> - Model response</li>
</ul>
<h3>watsonx_embeddings</h3>
<p>Generate vector embeddings for semantic search and RAG.</p>
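<p>As a rough illustration of the semantic-search step, the sketch below ranks documents by cosine similarity to a query vector. It assumes the embeddings are available as plain arrays of numbers, one per input text. The toy 3-dimensional vectors and the helper names <code>cosineSimilarity</code> and <code>rankBySimilarity</code> are made up for this example and are not part of the server.</p>
<pre>
```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (normA * normB);
}

// Score every document against the query vector and sort best match first.
function rankBySimilarity(queryVector, documents) {
  return documents
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Hypothetical toy vectors standing in for real embeddings (not model output).
const documents = [
  { id: "invoice-policy", vector: [0.9, 0.1, 0.0] },
  { id: "travel-policy", vector: [0.1, 0.9, 0.1] },
  { id: "security-policy", vector: [0.0, 0.2, 0.9] },
];
const queryVector = [0.8, 0.2, 0.1];

const ranked = rankBySimilarity(queryVector, documents);
console.log(ranked.map((r) => r.id));
```
</pre>
<p>With real embeddings, only the vectors change and the ranking logic stays the same. A vector database would typically replace the in-memory sort once the document set grows.</p>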
<table>
<tr> <th>Parameter</th> <th>Type</th> <th>Required</th> <th>Description</th> </tr>
<tr> <td><code>texts</code></td> <td>array[string]</td> <td>Yes</td> <td>Texts to embed</td> </tr>
<tr> <td><code>model_id</code></td> <td>string</td> <td>No</td> <td>Embedding model (default: <code>ibm/slate-125m-english-rtrvr</code>)</td> </tr>
</table>
<h3>watsonx_list_models</h3>
<p>List all available foundation models.</p>
<p>No parameters required. Returns an array of model objects, each with <code>id</code>, <code>name</code>, <code>provider</code>, and <code>tasks</code> fields.</p>
<h2>Available Models</h2>
<h3>Text Generation Models</h3>
<table>
<tr> <th>Model ID</th> <th>Provider</th> <th>Parameters</th> <th>Use Case</th> </tr>
<tr> <td><code>ibm/granite-13b-chat-v2</code></td> <td>IBM</td> <td>13B</td> <td>General chat, instruction following</td> </tr>
<tr> <td><code>ibm/granite-3-8b-instruct</code></td> <td>IBM</td> <td>8B</td> <td>Latest Granite, fast inference</td> </tr>
<tr> <td><code>meta-llama/llama-3-70b-instruct</code></td> <td>Meta</td> <td>70B</td> <td>Complex reasoning, high quality</td> </tr>
<tr> <td><code>meta-llama/llama-3-8b-instruct</code></td> <td>Meta</td> <td>8B</td> <td>Fast, efficient generation</td> </tr>
<tr> <td><code>mistralai/mistral-large</code></td> <td>Mistral AI</td> <td>-</td> <td>Multilingual, reasoning</td> </tr>
<tr> <td><code>mistralai/mixtral-8x7b-instruct-v01</code></td> <td>Mistral AI</td> <td>8x7B MoE</td> <td>Efficient, high quality</td> </tr>
</table>
<h3>Embedding Models</h3>
<table>
<tr> <th>Model ID</th> <th>Dimensions</th> <th>Use Case</th> </tr>
<tr> <td><code>ibm/slate-125m-english-rtrvr</code></td> <td>768</td> <td>English semantic search, RAG</td> </tr>
<tr> <td><code>ibm/slate-30m-english-rtrvr</code></td> <td>384</td> <td>Lightweight embedding</td> </tr>
</table>
<h2>Configuration</h2>
<h3>Environment Variables</h3>
<table>
<tr> <th>Variable</th> <th>Required</th> <th>Description</th> </tr>
<tr> <td><code>WATSONX_API_KEY</code></td> <td>Yes</td> <td>IBM Cloud API key</td> </tr>
<tr>
<td><code>WATSONX_URL</code></td> <td>No</td> <td>Service URL (default: us-south)</td> </tr>
<tr> <td><code>WATSONX_PROJECT_ID</code></td> <td>No</td> <td>Project ID for scoped operations</td> </tr>
</table>
<h3>Claude Code Configuration</h3>
<pre>
// ~/.claude.json
{
  "mcpServers": {
    "watsonx": {
      "type": "stdio",
      "command": "node",
      "args": ["/path/to/watsonx-mcp-server/index.js"],
      "env": {
        "WATSONX_API_KEY": "your-api-key",
        "WATSONX_URL": "https://us-south.ml.cloud.ibm.com"
      }
    }
  }
}
</pre>
<h2>Dependencies</h2>
<table>
<tr> <th>Package</th> <th>Version</th> <th>Purpose</th> </tr>
<tr> <td><code>@modelcontextprotocol/sdk</code></td> <td>^1.24.3</td> <td>MCP server framework</td> </tr>
<tr> <td><code>@ibm-cloud/watsonx-ai</code></td> <td>^1.7.5</td> <td>IBM watsonx.ai SDK</td> </tr>
<tr> <td><code>ibm-cloud-sdk-core</code></td> <td>^5.4.5</td> <td>IAM authentication</td> </tr>
</table>
<h2>Use Cases</h2>
<div class="callout">
<strong>When to delegate to watsonx:</strong>
<ul style="margin-top: 0.5rem;">
<li>IBM-specific model capabilities (Granite enterprise features)</li>
<li>Batch inference on large datasets</li>
<li>Embedding generation for RAG pipelines</li>
<li>Cost optimization with smaller models</li>
<li>Regulatory compliance requiring IBM infrastructure</li>
</ul>
</div>
<h3>Example: RAG Pipeline</h3>
<pre>
// 1. Generate embeddings with watsonx
User: "Embed these documents for search"
Claude: [calls watsonx_embeddings with document texts]

// 2. Store in vector database
// 3. Query with user question embedding
// 4. Claude synthesizes final answer
</pre>
<h3>Example: Multi-Model Reasoning</h3>
<pre>
// Claude handles complex reasoning
// Delegates specific tasks to specialized models

User: "Analyze this legal document"
Claude (Opus 4.5):
  → Understands context, plans analysis
  → Calls watsonx_generate with Granite for compliance check
  → Synthesizes results with own analysis
</pre>
<h2>Free Tier Limits</h2>
<div class="callout warning">
<strong>watsonx.ai Lite Plan:</strong>
<ul style="margin-top: 0.5rem;">
<li>Limited inference tokens per month</li>
<li>Access to select foundation models</li>
<li>Single user/project</li>
<li>No SLA guarantees</li>
</ul>
</div>
<h2>Error Handling</h2>
<p>The server returns structured error messages:</p>
<pre>
{
  "content": [{
    "type": "text",
    "text": "Error calling watsonx.ai: [error message]"
  }]
}
</pre>
<p>Common errors:</p>
<ul>
<li><code>401 Unauthorized</code> - Invalid API key</li>
<li><code>403 Forbidden</code> - Insufficient permissions</li>
<li><code>429 Too Many Requests</code> - Rate limit exceeded</li>
<li><code>Model not found</code> - Invalid model_id</li>
</ul>
<h2>Security</h2>
<ul>
<li>API keys stored in environment variables, not code</li>
<li>IAM token-based authentication</li>
<li>All traffic over HTTPS</li>
<li>No data persistence in MCP server</li>
</ul>
<hr style="margin: 3rem 0; border: none; border-top: 1px solid var(--carbon-gray-80);">
<p style="text-align: center; color: var(--carbon-gray-50);"> <a href="https://github.com/PurpleSquirrelMedia/watsonx-mcp-server">View Source</a> | <a href="https://www.ibm.com/watsonx">IBM watsonx.ai</a> | <a href="https://modelcontextprotocol.io">MCP Protocol</a> </p>
</div>
</body>
</html>
