IBM watsonx.data MCP Server
OfficialServer Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| WATSONX_DATA_API_KEY | Yes | IBM Cloud API key for authentication | |
| WATSONX_DATA_BASE_URL | Yes | The base URL of the watsonx.data instance, e.g., https://us-south.lakehouse.cloud.ibm.com/lakehouse/api | |
| WATSONX_DATA_INSTANCE_ID | Yes | The CRN (Cloud Resource Name) of the watsonx.data instance |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {
"listChanged": true
} |
| logging | {} |
| prompts | {
"listChanged": false
} |
| resources | {
"subscribe": false,
"listChanged": false
} |
| extensions | {
"io.modelcontextprotocol/ui": {}
} |
| experimental | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| add_columnsA | Add one or more columns to a table in a watsonx.data schema. Note: Data types are typically specified in lowercase (e.g., "varchar", "int", "decimal"). While the API pattern allows both cases, lowercase is recommended for compatibility. Args: catalog_name: Catalog containing the table (e.g., "iceberg_data") schema_name: Schema containing the table table_name: Table to add columns to columns: List of column definitions, each with: name (required), type (required, lowercase recommended), comment (optional), extra (optional), precision (optional), scale (optional) engine_id: Engine ID to use for the operation (from list_engines) Returns: Dict with: - columns: List of added column details - total_count: Number of columns added |
| create_schemaA | Create a new schema in a watsonx.data catalog. IMPORTANT RESTRICTIONS: - Engine must be a Presto or Prestissimo engine (Spark engines are not supported) - Catalog must be an object storage catalog (Iceberg or Hive format) - Datasource catalogs (e.g., from external databases) are NOT supported for schema creation Args: catalog_id: Catalog identifier - must be an object storage catalog (e.g., "iceberg_data", "hive_data") schema_name: Name for the new schema engine_id: Engine ID to use for the operation - must be a Presto or Prestissimo engine (from list_engines) custom_path: Path within bucket where schema will be created (REQUIRED, must be at least 1 character). If unsure, use the schema_name as the custom_path. storage_name: Storage/bucket name for the schema (REQUIRED for object storage catalogs) Returns: Dict with: - name: Name of the created schema - catalog_name: Parent catalog name - custom_path: Custom path used - storage_name: Storage name if specified Raises: Error if: - Engine is not Presto or Prestissimo type - Catalog is a datasource catalog (not object storage) - custom_path is empty or storage_name is not provided |
| describe_tableA | Get detailed schema and metadata for a watsonx.data table. Args: catalog_name: Catalog containing the table (e.g., "iceberg_data", "tpch") schema_name: Schema containing the table (from list_schemas) table_name: Table to describe (from list_tables) engine_id: Engine ID for metadata retrieval (from list_engines) Returns: Dict with: - name: Table name - catalog_name, schema_name: Echo of inputs - columns: List of column objects with: - name: Column name (required) - type: SQL data type (required) - comment: Optional comment/description - extra: Optional extra attributes (e.g., AUTO_INCREMENT) - length: Optional length for VARCHAR/CHAR types - precision: Optional precision for DECIMAL types - scale: Optional scale for DECIMAL types - column_count: Total number of columns - engine_id: Echo of input |
| list_schemasA | List database schemas in a watsonx.data catalog. Args: catalog_name: Catalog to list schemas from (e.g., "iceberg_data", "hive_data", "tpch") engine_id: Engine ID for metadata queries (from list_engines) Returns: Dict with: - schemas: List of schema objects with schema_name, catalog_name - total_count: Number of schemas found - catalog_name, engine_id: Echo of inputs |
| list_tablesA | List tables in a watsonx.data schema. Args: catalog_name: Catalog containing the schema (e.g., "iceberg_data", "hive_data") schema_name: Schema/database containing tables (from list_schemas) engine_id: Engine ID for metadata queries (from list_engines) Returns: Dict with: - tables: List of table names (strings) - total_count: Number of tables in schema - catalog_name, schema_name, engine_id: Echo of inputs |
| rename_columnA | Rename a column in a table in a watsonx.data schema. Args: catalog_name: Catalog containing the table (e.g., "iceberg_data") schema_name: Schema containing the table table_name: Table containing the column column_name: Current column name new_column_name: New name for the column engine_id: Engine ID to use for the operation (from list_engines) Returns: Dict with column details including the new name |
| rename_tableA | Rename a table in a watsonx.data schema. Args: catalog_name: Catalog containing the table (e.g., "iceberg_data") schema_name: Schema containing the table table_name: Current table name new_table_name: New name for the table engine_id: Engine ID to use for the operation (from list_engines) Returns: Dict with table details including the new name |
| create_presto_engineA | Create a new Presto engine in watsonx.data. EXAMPLE PAYLOAD: { "origin": "native", "display_name": "My-Presto-Engine", "description": "Presto engine with autoscaling", "tags": [], "associated_catalogs": [], "configuration": { "size_config": "custom", "coordinator": { "node_type": "starter", "quantity": 1 }, "worker": { "node_type": "starter", "quantity": 1 }, "autoscaling_enabled": true, "autoscaling_config": { "type": "cpu", "target": 40, "min_worker_quantity": 1, "max_worker_quantity": 18, "query_termination_grace_period_min": 1, "scale_in_stabilization_window_min": 5, "scaling_step_size": 1 } } } Args: origin: (required) "native" display_name: (required) Display name for the engine configuration: (required) Engine configuration with required fields: - size_config: (required) "custom" (recommended) or predefined options (may be supported) - coordinator: (required) {"node_type": typically "starter" or "cache_optimized", "quantity": 1} - worker: (required) {"node_type": typically "starter" or "cache_optimized", "quantity": 1-18 recommended} - autoscaling_enabled: (optional) boolean to enable autoscaling - autoscaling_config: (required if autoscaling_enabled is true) autoscaling configuration object (see AUTOSCALING section) associated_catalogs: (optional) List of catalog names to associate description: (optional) Engine description 50 characters max engine_id: (optional) Custom engine ID (must match pattern: presto-0 through presto-1000) tags: (optional) Tags for the engine AUTOSCALING (OPTIONAL): To enable autoscaling, include these fields in the configuration:
PREDEFINED SIZE CONFIGS: If using predefined configs, exact node types and quantities must match:
Returns: Dict with created engine details including engine_id |
| create_prestissimo_engineA | Create a new Prestissimo engine in watsonx.data. EXAMPLE PAYLOAD: { "origin": "native", "display_name": "My-Prestissimo-Engine", "description": "Prestissimo engine for fast queries", "tags": [], "associated_catalogs": [], "configuration": { "size_config": "custom", "coordinator": { "node_type": "starter", "quantity": 1 }, "worker": { "node_type": "starter", "quantity": 1 } } } Args: origin: "native" display_name: (required) Display name for the engine configuration: (required) Engine configuration with required fields: - size_config: (required) "custom" (recommended) or predefined options (may be supported) - coordinator: (required) {"node_type": typically "starter" or "cache_optimized", "quantity": 1} - worker: (required) {"node_type": typically "starter" or "cache_optimized", "quantity": 1-18 recommended} associated_catalogs: (optional) List of catalog names to associate description: (optional) Engine description 50 characters max engine_id: (optional) Custom engine ID (must match pattern: prestissimo-0 to prestissimo-1000) tags: (optional) Tags for the engine CUSTOM SIZE CONFIG (RECOMMENDED):
PREDEFINED SIZE CONFIGS: If using predefined configs, exact node types and quantities must match:
Returns: Dict with created engine details including engine_id |
| create_spark_engineB | Create a new Spark engine in watsonx.data. Args: origin: Engine origin - "native", "external", or "discover" display_name: Display name for the engine storage_name: Storage/bucket name for engine_home (REQUIRED) associated_catalogs: List of catalog IDs to associate description: Engine description default_version: Spark version (default: "3.5") default_config: Additional engine configuration tags: Tags for the engine Returns: Dict with created engine details including engine_id |
| list_enginesA | List available Presto, Prestissimo, and Spark compute engines in watsonx.data. Args: engine_type: Optional filter - "presto", "prestissimo", "spark", or None for all engines Returns: Dict with: - engines: List of engine objects with engine_id, display_name, type, status, size (if applicable), created_on, created_by, associated_catalogs - summary: Counts by type and status (total_count, presto_count, prestissimo_count, spark_count, by_status) |
| pause_presto_engineB | Pause a running Presto engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with pause operation status and engine state transition |
| pause_prestissimo_engineB | Pause a running Prestissimo engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with pause operation status and engine state transition |
| pause_spark_engineA | Pause a running Spark engine in watsonx.data (SAAS only). Args: engine_id: Engine identifier force: Force pause even if applications are running (default: False) Returns: Dict with pause operation status |
| restart_presto_engineC | Restart a Presto engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with engine_id, status, message, response |
| restart_prestissimo_engineC | Restart a Prestissimo engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with engine_id, status, message, response |
| resume_presto_engineC | Resume a paused Presto engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with resume operation status and engine state transition |
| resume_prestissimo_engineB | Resume a paused Prestissimo engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with resume operation status and engine state transition |
| resume_spark_engineB | Resume a paused Spark engine in watsonx.data (SAAS only). Args: engine_id: Engine identifier Returns: Dict with resume operation status |
| scale_presto_engineA | Scale a Presto engine by adjusting coordinator and worker node counts in watsonx.data. RECOMMENDED NODE TYPES: "starter" or "cache_optimized" (other types may be available) SCALING CAPABILITIES:
API REQUIREMENT: Must provide BOTH coordinator AND worker configurations together. Args: engine_id: Engine identifier coordinator_node_type: Typically "starter" or "cache_optimized". Can be different from current type. coordinator_quantity: Number of coordinator nodes (must be 1 for Presto) worker_node_type: Typically "starter" or "cache_optimized". Can be different from coordinator_node_type. worker_quantity: Number of worker nodes (1-18 recommended, up to 50 may be supported) Returns: Dict with scaling operation status and new node configuration |
| scale_prestissimo_engineA | Scale a Prestissimo engine by adjusting coordinator and worker node counts in watsonx.data. RECOMMENDED NODE TYPES: "starter" or "cache_optimized" (other types may be available) SCALING CAPABILITIES:
API REQUIREMENT: Must provide BOTH coordinator AND worker configurations together. Args: engine_id: Engine identifier coordinator_node_type: Typically "starter" or "cache_optimized". Can be different from current type. coordinator_quantity: Number of coordinator nodes (must be 1 for Prestissimo) worker_node_type: Typically "starter" or "cache_optimized". Can be different from coordinator_node_type. worker_quantity: Number of worker nodes (1-18 recommended, up to 50 may be supported) Returns: Dict with scaling operation status and new node configuration |
| scale_spark_engineA | Scale a Spark engine by adjusting node count in watsonx.data (SAAS only). Args: engine_id: Engine identifier number_of_nodes: Target node count (1-1000) Returns: Dict with scaling operation status (202 Accepted - asynchronous operation) |
| update_presto_engineB | Update Presto engine configuration in watsonx.data. Args: engine_id: Engine identifier description: Updated description display_name: Updated display name engine_properties: Engine configuration properties engine_restart: Set to "force" to trigger restart after update remove_engine_properties: Properties to remove tags: Updated tags Returns: Dict with updated engine configuration |
| update_prestissimo_engineA | Update Prestissimo engine configuration in watsonx.data. IMPORTANT: Due to a known API limitation, the Args: engine_id: Engine identifier. description: Updated description (1-50 characters). Must be accompanied by at least one other updatable field in the same request. display_name: Updated display name. properties: Engine configuration properties. restart_type: Set to "force" to trigger a restart after the update. remove_engine_properties: Properties to remove. tags: Updated tags. Returns: Dict with updated engine configuration |
| update_spark_engineA | Update Spark engine configuration in watsonx.data. Args: engine_id: Engine identifier description: Updated description display_name: Updated display name configuration: Engine configuration (default_config, default_version, engine_home) tags: Updated tags Returns: Dict with updated engine configuration Note: Spark engines do NOT support engine_restart parameter. Configuration changes may require manual restart. |
| cancel_ingestion_jobC | Cancel a data ingestion job. Args: job_id: Job identifier Returns: Dict with cancellation status |
| create_ingestion_jobA | Create a data ingestion job to load data into watsonx.data. ") escape_character: CSV escape character (default: "") header: Whether CSV has header row (default: true) encoding: File encoding (default: "UTF-8") driver_memory: Spark driver memory (default: "2G") driver_cores: Spark driver cores (default: 1) executor_memory: Spark executor memory (default: "2G") executor_cores: Spark executor cores (default: 1) num_executors: Number of Spark executors (default: 1) |
| get_ingestion_jobB | Get detailed status of a data ingestion job. Args: job_id: Job identifier Returns: Dict with detailed job status, configuration, and execution details |
| list_ingestion_jobsA | List data ingestion jobs on watsonx.data. Args: start: Offset for pagination (default: 0, 0-based) limit: Number of jobs per page (default: 10, max: 100, -1 for all) Returns: Dict with ingestion_jobs list containing job details |
| get_instance_detailsA | Get watsonx.data instance information including status, version, region, and enabled features. Returns: Dict with instance details: - instance_id: Instance CRN identifier - region: IBM Cloud region (e.g., "us-south") - status: Instance status ("active", "provisioning", "inactive", "failed") - version: watsonx.data version - account_type: Account type ("TRIAL", "ENTERPRISE", "LITE", "STANDARD") - serverless_spark_enabled: Whether Spark is available - public_endpoints_enabled: Public internet access enabled - console_url: Web console URL |
| execute_insertA | Execute INSERT queries against watsonx.data. Args: sql: SQL INSERT query to execute (must start with INSERT) catalog_name: Target catalog (e.g., "iceberg_data", "hive_data") schema_name: Default schema for unqualified table names engine_id: Engine to run query on (from list_engines, must be running) Returns: Dict with: - query_id: Unique query identifier - rows_inserted: Number of rows inserted (if available) - execution_time_ms: Query duration in milliseconds - status: Query execution status - catalog_name, schema_name: Echo of inputs |
| execute_selectA | Execute read-only SELECT queries against watsonx.data. Args: sql: SQL SELECT query to execute (must start with SELECT) catalog_name: Target catalog (e.g., "iceberg_data", "tpch") schema_name: Default schema for unqualified table names engine_id: Engine to run query on (from list_engines, must be running) limit: Max rows to return (default: 500 if no LIMIT in query). Note: Using high limits will consume more tokens. Returns: Dict with: - query_id: Unique query identifier - columns: List of {name, type} objects - rows: List of row data (list of lists) - row_count: Number of rows returned - execution_time_ms: Query duration in milliseconds - catalog_name, schema_name: Echo of inputs |
| execute_updateA | Execute UPDATE queries against watsonx.data. Args: sql: SQL UPDATE query to execute (must start with UPDATE) catalog_name: Target catalog (e.g., "iceberg_data", "hive_data") schema_name: Default schema for unqualified table names engine_id: Engine to run query on (from list_engines, must be running) Returns: Dict with: - query_id: Unique query identifier - rows_updated: Number of rows updated (if available) - execution_time_ms: Query duration in milliseconds - status: Query execution status - catalog_name, schema_name: Echo of inputs Note: Iceberg table updates require at least format version 2 and update mode must be merge-on-read. Use ALTER TABLE to configure: SET PROPERTIES 'format-version' = '2', 'write.update.mode' = 'merge-on-read' |
| explain_analyze_queryA | Get detailed query analysis with execution statistics in watsonx.data. Args: engine_id: Presto or Prestissimo engine identifier statement: SQL query to analyze. If query fails, consider using fully qualified table names (catalog.schema.table) engine_type: Engine type - "presto" or "prestissimo" (default: "presto") verbose: Include detailed statistics Returns: Dict with engine_id, engine_type, statement, analysis, and full response |
| explain_queryA | Get query execution plan without running the query in watsonx.data. Args: engine_id: Presto or Prestissimo engine identifier statement: SQL query to explain. If query fails, consider using fully qualified table names (catalog.schema.table) engine_type: Engine type - "presto" or "prestissimo" (default: "presto") format: Output format - "json" or "text" type: Explain type - "logical", "distributed", "validate", or "io" Returns: Dict with engine_id, engine_type, statement, plan, and full response |
| get_spark_application_statusA | Get detailed status of a Spark application. Args: engine_id: Spark engine identifier application_id: Application identifier Returns: Dict with detailed application status including: - application_id: Application identifier - state: Current state (e.g., "running", "finished", "failed") - start_time: Application start timestamp - end_time: Application end timestamp (if completed) - spark_version: Spark version used - application_details: Configuration and runtime details |
| list_spark_applicationsB | List Spark applications on a Spark engine. Args: engine_id: (required) Spark engine identifier state: (optional) Filter by application state (e.g., ["running", "finished", "failed"]) limit: (optional) Maximum number of applications to return (1-1000). Recommended to use smaller values (e.g., 10-50) to avoid exhausting tokens. Returns: Dict with applications list containing application details |
| stop_spark_applicationA | Stop and remove a Spark application. This will terminate a running application and remove it from the engine's history. Args: engine_id: Spark engine identifier application_id: Application identifier to stop Returns: Dict with operation status |
| submit_spark_applicationA | Submit a Spark application for execution on a Spark engine. Args: engine_id: Spark engine identifier application: Application file path (JAR, Python, R file) arguments: Application arguments array conf: Spark configuration properties (e.g., {"spark.executor.memory": "2g"}) env: Environment variables name: Application name (will be added to conf as spark.app.name) job_endpoint: External job endpoint service_instance_id: Service instance ID - "iae" or "emr" type: Engine type - "spark" or "gluten" context_type: Context type - "project", "git_project", or "space" volumes: Volume mounts (watsonx.data software only). List of dicts with: - name: volume name - mount_path: path in spark cluster (e.g., "/mount/path") - source_sub_path: path in volume to mount (e.g., "/source/path") - read_only: boolean flag Returns: Dict with application_id, state, and submission details Examples: Minimal configuration for IBM Cloud Object Storage using cos:// protocol: |
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
No resources | |
Latest Blog Posts
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/IBM/ibm-watsonxdata-mcp-server'
If you have feedback or need assistance with the MCP directory API, please join our Discord server