What can you do with this server?

The IBM watsonx.data MCP Server enables AI assistants to interact with IBM watsonx.data lakehouses, providing tools across platform management, engine operations, catalog management, query execution, Spark applications, and data ingestion.

Platform Management
* Retrieve instance details including status, version, region, enabled features, and console URL.

Engine Operations
* List available Presto, Prestissimo, and Spark engines.
* Create new Presto, Prestissimo, or Spark engines with custom configurations (including autoscaling).
* Pause, resume, and restart Presto, Prestissimo, and Spark engines.
* Scale engines by adjusting node counts and types.
* Update engine configuration, display name, description, and tags.

Catalog & Schema Management
* List schemas within a catalog and tables within a schema.
* Describe tables with detailed column definitions, types, and metadata.
* Create new schemas in object storage catalogs (Presto/Prestissimo only).
* Add columns to existing tables, rename columns, and rename tables.

Query Execution
* Execute SELECT, INSERT, and UPDATE queries.
* Explain query execution plans (logical, distributed, validate, IO) without running them.
* Analyze queries with detailed execution statistics and performance data.

Spark Application Management
* Submit JAR, Python, or R Spark applications with configurable resources and credentials.
* List, monitor status of, and stop Spark applications.

Data Ingestion
* Create ingestion jobs from object storage (CSV, Parquet, JSON, ORC, Avro) into lakehouse tables, with configurable write modes, delimiters, encoding, and Spark resources.
* List, check status of, and cancel ingestion jobs.

Which integrations are available for this server?

Provides tools for managing Presto engines, executing queries, and exploring data in IBM watsonx.data lakehouses.

How do I use IBM watsonx.data MCP Server?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@IBM watsonx.data MCP Server show tables in the finance catalog".

That's it! The server will respond to your query, and you can continue using it as needed.

IBM watsonx.data MCP Server

Official

by IBM


Python

Remote


Overview

The IBM watsonx.data MCP Server enables AI assistants to interact seamlessly with IBM watsonx.data lakehouses using natural language. It provides specialized tools across 6 categories for comprehensive lakehouse operations:

Platform Management: Instance status and configuration
Engine Operations: Manage and monitor Presto and Spark engines
Catalog Management: Browse schemas, tables, and metadata; modify table structures
Query Execution: Run SELECT, INSERT, UPDATE queries with query plan analysis
Spark Applications: Submit, monitor, and manage Spark jobs
Data Ingestion: Load data from object storage into lakehouse tables

The server currently supports two transports: stdio, where the assistant launches the server as a local subprocess, and streamable HTTP. For details on transport options, including implementation guidelines and security best practices, refer to the MCP Transports Specification.

Note: IBM watsonx.data also provides a hosted remote MCP server that requires no installation. For details on using the remote server, see the Remote Querying Documentation.


Supported Features

Core Capabilities

Multiple Tools organized into 6 functional categories (see TOOLS.md)
Platform Tools: Instance details and status
Engine Tools: Lifecycle management for Presto and Spark engines
Catalog Tools: Schema and table discovery, metadata operations, DDL operations
Query Tools: SELECT, INSERT, UPDATE execution with query plan analysis
Spark Application Tools: Submit and manage Spark applications
Ingestion Tools: Data loading from object storage (CSV, Parquet, JSON)

Security & Authentication

IBM Cloud IAM authentication with automatic token refresh
Read and write operations with appropriate access controls

Transport & Integration

Current: stdio transport, streamable HTTP
Compatible with Claude Desktop, IBM Bob, and other MCP-enabled AI assistants

Architecture Overview

flowchart LR
    User --> Assistant[AI Assistant]
    Assistant -->|stdio/JSON-RPC| Server[watsonx.data MCP Server]
    Server -->|IAM Auth + API Calls| WX[watsonx.data Service]
    WX --> Engines[Presto & Spark Engines]
    Engines --> Lakehouse[Lakehouse Storage]

    style Server fill:#f3e5f5,stroke:#4a148c
    style WX fill:#e0f2f1,stroke:#00695c
    style Engines fill:#fff3e0,stroke:#ef6c00

Query Execution Flow

sequenceDiagram
    participant A as AI Assistant
    participant S as MCP Server
    participant I as IBM Cloud IAM
    participant W as watsonx.data API
    participant E as Presto/Spark Engines

    A->>S: Natural-language request (MCP)
    S->>I: Request IAM token
    I-->>S: IAM access token
    S->>W: API request (catalog, SQL, schema...)
    W->>E: Query execution / metadata ops
    E-->>W: Results
    W-->>S: Response
    S-->>A: Structured MCP result
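
The token step in the flow above, combined with the "automatic token refresh" listed under Security & Authentication, can be pictured as a small cache that fetches a new token shortly before the old one expires. The following is a minimal sketch under stated assumptions, not the server's actual implementation: `TokenCache`, `fetch_token`, and the 60-second `refresh_margin` are hypothetical names and values chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical callable: returns (access_token, lifetime_in_seconds).
# In a real client this would call IBM Cloud IAM; here it is injected.
FetchToken = Callable[[], Tuple[str, float]]


@dataclass
class TokenCache:
    fetch_token: FetchToken
    refresh_margin: float = 60.0  # assumed value: refresh 60 s before expiry
    clock: Callable[[], float] = time.monotonic
    _token: Optional[str] = None
    _expires_at: float = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one when it is close to expiry."""
        now = self.clock()
        if self._token is None or now >= self._expires_at - self.refresh_margin:
            token, lifetime = self.fetch_token()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Injecting `fetch_token` and `clock` keeps the sketch free of network calls, so the refresh logic can be exercised with a fake token source.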

Getting Started

1. Prerequisites

Before installation, ensure you have:

Python 3.11 or higher (Download)
uv package manager (Install)
IBM Cloud account (Create Account)
watsonx.data instance (Provision Instance) and (Setup)
IBM Cloud API key (Create API Key)
Gather Instance details:
- Base URL: Obtain from your watsonx.data instance:
  - Option 1: Copy the hostname from your browser's address bar when accessing the instance, then append /lakehouse/api
  - Option 2: Navigate to instance details → Data Access Service (DAS) endpoint
  - Example format: https://us-south.lakehouse.cloud.ibm.com/lakehouse/api
- Instance CRN (e.g., crn:v1:bluemix:public:lakehouse:us-south:a/...)
- IAM API Key with access to watsonx.data instance, catalog and engines
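
Before moving on to installation, it can help to sanity-check the three values gathered above. The sketch below is illustrative only: the helper names `base_url_from_hostname` and `check_settings` are hypothetical, and the checks are based solely on the example formats shown in this list (a base URL ending in `/lakehouse/api` and an instance ID starting with `crn:`), not on rules enforced by the server.

```python
from urllib.parse import urlparse

# Environment variable names as used in the configuration examples of this README.
REQUIRED_VARS = (
    "WATSONX_DATA_BASE_URL",
    "WATSONX_DATA_API_KEY",
    "WATSONX_DATA_INSTANCE_ID",
)


def base_url_from_hostname(hostname: str) -> str:
    """Option 1 above: take the console hostname and append /lakehouse/api."""
    host = hostname.strip()
    # Tolerate a pasted browser URL by dropping the scheme and any path.
    if "://" in host:
        host = urlparse(host).netloc
    host = host.split("/", 1)[0]
    return f"https://{host}/lakehouse/api"


def check_settings(env: dict) -> list:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    base_url = env.get("WATSONX_DATA_BASE_URL", "")
    if base_url and not base_url.rstrip("/").endswith("/lakehouse/api"):
        problems.append("WATSONX_DATA_BASE_URL should end with /lakehouse/api")
    crn = env.get("WATSONX_DATA_INSTANCE_ID", "")
    if crn and not crn.startswith("crn:"):
        problems.append("WATSONX_DATA_INSTANCE_ID should be a CRN starting with 'crn:'")
    return problems
```

Calling `check_settings(dict(os.environ))` after exporting the variables gives a quick list of anything missing or malformed.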

2. Installation

Option 1: Using pip / pipx

pipx install ibm-watsonxdata-mcp-server

If pipx is not installed, you can install the MCP server using pip:

pip install --user ibm-watsonxdata-mcp-server

Option 2: Development Setup

# Clone repository
git clone https://github.com/IBM/ibm-watsonxdata-mcp-server.git
cd ibm-watsonxdata-mcp-server

# Install dependencies
uv sync

# Copy example configuration
cp examples/.env.example .env

# Edit with your credentials
export WATSONX_DATA_BASE_URL=https://us-south.lakehouse.cloud.ibm.com/lakehouse/api
export WATSONX_DATA_API_KEY=your_ibm_cloud_api_key_here
export WATSONX_DATA_INSTANCE_ID=crn:v1:bluemix:public:lakehouse:us-south:a/...

# Verify installation
uv run ibm-watsonxdata-mcp-server --transport stdio

3. Configure your AI Assistants

Integration with Claude Desktop

Find your Claude Desktop configuration file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

Add this to claude_desktop_config.json:

Option 1: Using pip/pipx install

First, find the full path to the installed command:

# macOS/Linux
which ibm-watsonxdata-mcp-server

# Windows (PowerShell)
where.exe ibm-watsonxdata-mcp-server

Common installation paths:

macOS/Linux with pipx or pip --user: ~/.local/bin/ibm-watsonxdata-mcp-server
Windows with pipx: %USERPROFILE%\.local\bin\ibm-watsonxdata-mcp-server.exe
System-wide install: /usr/local/bin/ibm-watsonxdata-mcp-server
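
As a cross-platform alternative to `which` and `where.exe`, the same lookup can be done from Python with the standard library. This is a small sketch; the function name `find_server_command` is hypothetical, and the lookup only succeeds if the install directory is on your `PATH`.

```python
import shutil
from typing import Optional


def find_server_command(name: str = "ibm-watsonxdata-mcp-server") -> Optional[str]:
    """Return the path of the command as found on PATH, or None if it is not found."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_server_command()
    print(path or "not found on PATH; check the common installation paths above")
```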

Then use the full path in your config:

{
  "mcpServers": {
    "IBM watsonx.data MCP Server": {
      "command": "/path/from/which/command/ibm-watsonxdata-mcp-server",
      "args": ["--transport", "stdio"],
      "env": {
        "WATSONX_DATA_BASE_URL": "https://us-south.lakehouse.cloud.ibm.com/lakehouse/api",
        "WATSONX_DATA_API_KEY": "your_api_key_here",
        "WATSONX_DATA_INSTANCE_ID": "crn:v1:bluemix:public:lakehouse:us-south:a/..."
      }
    }
  }
}

Option 2: Using development installation

{
  "mcpServers": {
    "IBM watsonx.data MCP Server": {
      "command": "/absolute/path/to/uv",
      "args": [
        "--directory",
        "/absolute/path/to/ibm-watsonxdata-mcp-server",
        "run",
        "ibm-watsonxdata-mcp-server"
      ],
      "env": {
        "WATSONX_DATA_BASE_URL": "https://us-south.lakehouse.cloud.ibm.com/lakehouse/api",
        "WATSONX_DATA_API_KEY": "your_api_key_here",
        "WATSONX_DATA_INSTANCE_ID": "crn:v1:bluemix:public:lakehouse:us-south:a/..."
      }
    }
  }
}
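
If claude_desktop_config.json already lists other servers, editing it by hand risks overwriting them. The sketch below shows one way to merge the entry programmatically; it assumes only the `mcpServers` layout shown in the examples above, and the helper name `add_server_entry` is hypothetical.

```python
import json
from copy import deepcopy

SERVER_NAME = "IBM watsonx.data MCP Server"


def add_server_entry(config: dict, command: str, args: list, env: dict) -> dict:
    """Return a copy of config with the watsonx.data entry added or replaced."""
    merged = deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers[SERVER_NAME] = {"command": command, "args": list(args), "env": dict(env)}
    return merged


if __name__ == "__main__":
    # Example input: a config that already contains another (made-up) server.
    existing = {"mcpServers": {"other-server": {"command": "/usr/bin/other"}}}
    updated = add_server_entry(
        existing,
        command="/path/from/which/command/ibm-watsonxdata-mcp-server",
        args=["--transport", "stdio"],
        env={
            "WATSONX_DATA_BASE_URL": "https://us-south.lakehouse.cloud.ibm.com/lakehouse/api",
            "WATSONX_DATA_API_KEY": "your_api_key_here",
            "WATSONX_DATA_INSTANCE_ID": "crn:v1:bluemix:public:lakehouse:us-south:a/...",
        },
    )
    print(json.dumps(updated, indent=2))
```

The function returns a new dictionary instead of mutating its input, so the original file contents can be kept as a backup before writing the result.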

Integration with IBM Bob

Find your mcp_settings.json configuration file: ~/Library/Application Support/IBM Bob/User/globalStorage/ibm.bob-code/settings/mcp_settings.json

The path differs between versions. To find the exact path, go to Views and More Actions... -> MCP Servers -> Edit MCP.

Option 1: Using pip/pipx install

First, find the full path:

which ibm-watsonxdata-mcp-server

Then use that path in your config:

{
  "mcpServers": {
    "IBM watsonx.data MCP Server": {
      "command": "/path/from/which/command/ibm-watsonxdata-mcp-server",
      "args": ["--transport", "stdio"],
      "env": {
        "WATSONX_DATA_BASE_URL": "https://us-south.lakehouse.cloud.ibm.com/lakehouse/api",
        "WATSONX_DATA_API_KEY": "your_api_key_here",
        "WATSONX_DATA_INSTANCE_ID": "crn:v1:bluemix:public:lakehouse:us-south:a/..."
      }
    }
  }
}

Option 2: Using development installation

{
  "mcpServers": {
    "IBM watsonx.data MCP Server": {
      "command": "/absolute/path/to/uv",
      "args": [
        "--directory",
        "/absolute/path/to/ibm-watsonxdata-mcp-server",
        "run",
        "ibm-watsonxdata-mcp-server"
      ],
      "env": {
        "WATSONX_DATA_BASE_URL": "https://us-south.lakehouse.cloud.ibm.com/lakehouse/api",
        "WATSONX_DATA_API_KEY": "your_api_key_here",
        "WATSONX_DATA_INSTANCE_ID": "crn:v1:bluemix:public:lakehouse:us-south:a/..."
      }
    }
  }
}

Running with Streamable HTTP Transport

For HTTP-based integrations, bind to 127.0.0.1 (localhost only) unless you explicitly need network access:

uv run python -m lakehouse_mcp --transport streamable-http --host 127.0.0.1 --port 9000

Security note: The streamable-http transport has no built-in authentication layer. The server uses your WATSONX_DATA_API_KEY to authenticate all requests to IBM Cloud on your behalf. If you bind to 0.0.0.0 (all interfaces), any client that can reach that port can invoke all tools under your IAM identity without supplying any credentials. Only use --host 0.0.0.0 in a trusted network environment and place a reverse proxy (e.g. nginx) with authentication in front of the MCP endpoint. For single-user local use, always prefer --host 127.0.0.1 or the default stdio transport.
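
A launcher script can guard against accidentally exposing the endpoint by checking the bind address before starting the server. The following is a minimal sketch using the standard `ipaddress` module; `is_loopback_host` and `bind_warning` are hypothetical helpers, not part of the server.

```python
import ipaddress
from typing import Optional


def is_loopback_host(host: str) -> bool:
    """True if host is 'localhost' or a loopback IP literal such as 127.0.0.1 or ::1."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): treat it as non-loopback.
        return False


def bind_warning(host: str) -> Optional[str]:
    """Return a warning message for non-loopback binds, or None when the bind is local."""
    if is_loopback_host(host):
        return None
    return (
        f"Binding to {host} exposes every tool under your IAM identity; "
        "put an authenticating reverse proxy in front of the endpoint."
    )
```

For example, `bind_warning("0.0.0.0")` returns a message, while `bind_warning("127.0.0.1")` returns `None`.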

Available Tools

Quick Reference

| Goal | Tool to Use | Example |
| --- | --- | --- |
| Check connectivity | `get_instance_details` | "What's my instance status?" |
| Find engines | `list_engines` | "List all engines" |
| Discover data | `list_schemas` | "What schemas exist?" |
| Find tables | `list_tables` | "Show tables in sales_db" |
| Understand schema | `describe_table` | "Describe customers table" |
| Query data | `execute_select` | "SELECT * FROM tpch.tiny.customer LIMIT 10" |

See TOOLS.md for currently supported tools and advanced usage patterns.
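
Under the hood, an MCP client turns each of these requests into a JSON-RPC `tools/call` message. The sketch below builds such a message for `execute_select`. The message envelope follows the MCP specification, but the argument key `sql` is an assumption made for illustration; check TOOLS.md for the parameter names each tool actually accepts.

```python
import json
from itertools import count

# Simple incrementing request ids; any unique id per request would do.
_request_ids = count(1)


def tools_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single-line JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # "sql" is a hypothetical argument name, not confirmed by this README.
    print(tools_call_request(
        "execute_select",
        {"sql": "SELECT * FROM tpch.tiny.customer LIMIT 10"},
    ))
```

In normal use the AI assistant constructs these messages for you; the sketch is only meant to show what travels over the stdio or HTTP transport.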

Example Usage

Once configured with Claude Desktop, you can interact with your watsonx.data instance using natural language:

1. Check Instance Status

You ask:

What watsonx.data instance am I connected to?

Claude responds with:

You're connected to watsonx.data instance:
- Instance ID: crn:v1:bluemix:public:lakehouse:us-south:a/abc123...
- Region: us-south
- Version: 2.0.0
- Status: active
- Serverless Spark: enabled
- Presto: enabled

2. Discover Available Data

You ask:

What schemas are available in tpch catalog?

Claude responds:

Available Schemas:
- `tiny` - TPC-H benchmark data (smallest scale)
- `sf1` through `sf100000` - Various scale factors (1, 10, 100, 300, 1000, 3000, 10000, 30000, 100000)
- `information_schema` - System schema

**Tables in the "tiny" schema:**
1. customer
2. lineitem
3. nation
4. orders
5. part
6. partsupp
7. region
8. supplier

These are the standard TPC-H benchmark tables. Would you like me to:
- Describe the schema of one of these tables?
- Run a sample query on one of the tables?
- Explore tables from a different scale factor (sf1, sf100, etc.)?

3. Explore Table Structure

You ask:

Show me the schema for the customer table in tiny schema

Claude responds:

Here's the schema for the **customer** table in the **tpch.tiny** schema:

Columns (8 total):

1. `c_custkey` - bigint (nullable)
2. `c_name` - varchar(25) (nullable)
3. `c_address` - varchar(40) (nullable)
4. `c_nationkey` - bigint (nullable)
5. `c_phone` - varchar(15) (nullable)
6. `c_acctbal` - double (nullable)
7. `c_mktsegment` - varchar(10) (nullable)
8. `c_comment` - varchar(117) (nullable)

Table Properties:
- Type: TABLE
- Primary Keys: None
- Partitions: None

This is a standard TPC-H customer table with customer ID, name, address, nation reference, phone, account balance, market segment, and comments. All columns are nullable.

Would you like me to run a query on this table or explore other tables?

4. Query Data

You ask:

Show me the top 10 customers by account balance

Claude executes:

SELECT c_custkey, c_name, c_address, c_phone, c_acctbal, c_mktsegment FROM customer ORDER BY c_acctbal DESC LIMIT 10

And presents the results in a formatted table:

Here are the top 10 customers by account balance:

| Customer ID | Customer Name | Account Balance |
| --- | --- | --- |
| 213 | Customer#000000213 | $9,987.71 |
| 45 | Customer#000000045 | $9,983.38 |
| 1106 | Customer#000001106 | $9,977.62 |
| 200 | Customer#000000200 | $9,967.60 |
| 140 | Customer#000000140 | $9,963.15 |
| 381 | Customer#000000381 | $9,931.71 |
| 43 | Customer#000000043 | $9,904.28 |
| 100 | Customer#000000100 | $9,889.89 |
| 780 | Customer#000000780 | $9,874.12 |
| 518 | Customer#000000518 | $9,871.66 |

The query executed in 42.7 seconds and returned 10 rows. Customer#000000213 has the highest account balance at $9,987.71.

5. Multi-Step Analysis

You ask:

I need to analyze customer data. What's available and what does it look like?

Claude:

Let me check what's in your watsonx.data instance
I'll explore the tpch catalog
Here's the customer table schema
Runs a query to get customer data
Runs a few more queries based on the columns in the customer table
Looks at the related tables
Presents a summary

See TOOLS.md for the complete list of tools, more usage patterns, and advanced examples.

Development

Setting Up Development Environment

Clone the repository

git clone https://github.com/IBM/ibm-watsonxdata-mcp-server.git
cd ibm-watsonxdata-mcp-server

Install dependencies (including dev dependencies)
```
uv sync --extra dev
```

Set up environment variables

cp examples/.env.example .env
# Edit .env with your credentials
export WATSONX_DATA_BASE_URL=https://us-south.lakehouse.cloud.ibm.com/lakehouse/api
export WATSONX_DATA_API_KEY=your_ibm_cloud_api_key_here
export WATSONX_DATA_INSTANCE_ID=crn:v1:bluemix:public:lakehouse:us-south:a/...

Running Tests

Run the full test suite with coverage:

uv run pytest

Run tests with verbose output:

uv run pytest -v

Run specific test file:

uv run pytest tests/test_client.py

Run tests with coverage report:

uv run pytest --cov=lakehouse_mcp --cov-report=html

View coverage report:

open htmlcov/index.html  # macOS
xdg-open htmlcov/index.html  # Linux
start htmlcov/index.html  # Windows

Code Quality

Run linting and formatting:

uv run ruff check .
uv run ruff format .

Run type checking:

uv run mypy src/

Run pre-commit hooks:

uv run pre-commit run --all-files

Troubleshooting

See TROUBLESHOOTING.md for common issues, diagnostics, and solutions.

Useful Links

IBM watsonx.data Docs: https://cloud.ibm.com/docs/watsonxdata?topic=watsonxdata-getting-started
IBM Cloud API Keys: https://cloud.ibm.com/iam/apikeys
MCP Specification: https://modelcontextprotocol.io/
