Schema | IBM watsonx.data MCP Server

IBM watsonx.data MCP Server

Official

by IBM

Overview Schema Related Servers Score Discussions

Python

Remote

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description
`WATSONX_DATA_API_KEY`	Yes	IBM Cloud API key for authentication
`WATSONX_DATA_BASE_URL`	Yes	The base URL of the watsonx.data instance, e.g., https://us-south.lakehouse.cloud.ibm.com/lakehouse/api
`WATSONX_DATA_INSTANCE_ID`	Yes	The CRN (Cloud Resource Name) of the watsonx.data instance

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }
`logging`	{}
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`extensions`	{ "io.modelcontextprotocol/ui": {} }
`experimental`	{}

Tools

Functions exposed to the LLM to take actions

Name	Description
add_columnsA	Add one or more columns to a table in a watsonx.data schema. Note: Data types are typically specified in lowercase (e.g., "varchar", "int", "decimal"). While the API pattern allows both cases, lowercase is recommended for compatibility. Args: catalog_name: Catalog containing the table (e.g., "iceberg_data") schema_name: Schema containing the table table_name: Table to add columns to columns: List of column definitions, each with: name (required), type (required, lowercase recommended), comment (optional), extra (optional), precision (optional), scale (optional) engine_id: Engine ID to use for the operation (from list_engines) Returns: Dict with: - columns: List of added column details - total_count: Number of columns added
create_schemaA	Create a new schema in a watsonx.data catalog. IMPORTANT RESTRICTIONS: - Engine must be a Presto or Prestissimo engine (Spark engines are not supported) - Catalog must be an object storage catalog (Iceberg or Hive format) - Datasource catalogs (e.g., from external databases) are NOT supported for schema creation Args: catalog_id: Catalog identifier - must be an object storage catalog (e.g., "iceberg_data", "hive_data") schema_name: Name for the new schema engine_id: Engine ID to use for the operation - must be a Presto or Prestissimo engine (from list_engines) custom_path: Path within bucket where schema will be created (REQUIRED, must be at least 1 character). If unsure, use the schema_name as the custom_path. storage_name: Storage/bucket name for the schema (REQUIRED for object storage catalogs) Returns: Dict with: - name: Name of the created schema - catalog_name: Parent catalog name - custom_path: Custom path used - storage_name: Storage name if specified Raises: Error if: - Engine is not Presto or Prestissimo type - Catalog is a datasource catalog (not object storage) - custom_path is empty or storage_name is not provided
describe_tableA	Get detailed schema and metadata for a watsonx.data table. Args: catalog_name: Catalog containing the table (e.g., "iceberg_data", "tpch") schema_name: Schema containing the table (from list_schemas) table_name: Table to describe (from list_tables) engine_id: Engine ID for metadata retrieval (from list_engines) Returns: Dict with: - name: Table name - catalog_name, schema_name: Echo of inputs - columns: List of column objects with: - name: Column name (required) - type: SQL data type (required) - comment: Optional comment/description - extra: Optional extra attributes (e.g., AUTO_INCREMENT) - length: Optional length for VARCHAR/CHAR types - precision: Optional precision for DECIMAL types - scale: Optional scale for DECIMAL types - column_count: Total number of columns - engine_id: Echo of input
list_schemasA	List database schemas in a watsonx.data catalog. Args: catalog_name: Catalog to list schemas from (e.g., "iceberg_data", "hive_data", "tpch") engine_id: Engine ID for metadata queries (from list_engines) Returns: Dict with: - schemas: List of schema objects with schema_name, catalog_name - total_count: Number of schemas found - catalog_name, engine_id: Echo of inputs
list_tablesA	List tables in a watsonx.data schema. Args: catalog_name: Catalog containing the schema (e.g., "iceberg_data", "hive_data") schema_name: Schema/database containing tables (from list_schemas) engine_id: Engine ID for metadata queries (from list_engines) Returns: Dict with: - tables: List of table names (strings) - total_count: Number of tables in schema - catalog_name, schema_name, engine_id: Echo of inputs
rename_columnA	Rename a column in a table in a watsonx.data schema. Args: catalog_name: Catalog containing the table (e.g., "iceberg_data") schema_name: Schema containing the table table_name: Table containing the column column_name: Current column name new_column_name: New name for the column engine_id: Engine ID to use for the operation (from list_engines) Returns: Dict with column details including the new name
rename_tableA	Rename a table in a watsonx.data schema. Args: catalog_name: Catalog containing the table (e.g., "iceberg_data") schema_name: Schema containing the table table_name: Current table name new_table_name: New name for the table engine_id: Engine ID to use for the operation (from list_engines) Returns: Dict with table details including the new name
create_presto_engineA	Create a new Presto engine in watsonx.data. EXAMPLE PAYLOAD: { "origin": "native", "display_name": "My-Presto-Engine", "description": "Presto engine with autoscaling", "tags": [], "associated_catalogs": [], "configuration": { "size_config": "custom", "coordinator": { "node_type": "starter", "quantity": 1 }, "worker": { "node_type": "starter", "quantity": 1 }, "autoscaling_enabled": true, "autoscaling_config": { "type": "cpu", "target": 40, "min_worker_quantity": 1, "max_worker_quantity": 18, "query_termination_grace_period_min": 1, "scale_in_stabilization_window_min": 5, "scaling_step_size": 1 } } } Args: origin: (required) "native" display_name: (required) Display name for the engine configuration: (required) Engine configuration with required fields: - size_config: (required) "custom" (recommended) or predefined options (may be supported) - coordinator: (required) {"node_type": typically "starter" or "cache_optimized", "quantity": 1} - worker: (required) {"node_type": typically "starter" or "cache_optimized", "quantity": 1-18 recommended} - autoscaling_enabled: (optional) boolean to enable autoscaling - autoscaling_config: (required if autoscaling_enabled is true) autoscaling configuration object (see AUTOSCALING section) associated_catalogs: (optional) List of catalog names to associate description: (optional) Engine description 50 characters max engine_id: (optional) Custom engine ID (must match pattern: presto-0 through presto-1000) tags: (optional) Tags for the engine AUTOSCALING (OPTIONAL): To enable autoscaling, include these fields in the configuration: autoscaling_enabled: true (boolean) autoscaling_config: { "type": "cpu" or "memory", "target": 1-100 (target utilization percentage, e.g., 40), "min_worker_quantity": 1-18 (minimum workers), "max_worker_quantity": 1-18 (maximum workers), "query_termination_grace_period_min": 1-120 (grace period before terminating queries), "scale_in_stabilization_window_min": 5-60 (stabilization window for scale-in), "scaling_step_size": 1-18 (nodes to add/remove per scaling action) } PREDEFINED SIZE CONFIGS: If using predefined configs, exact node types and quantities must match: starter: 1 coordinator + 1 worker (both bx2.48x192) small: 1 coordinator + 3 workers (both ox2.16x128) medium: 1 coordinator + 6 workers (both ox2.16x128) large: 1 coordinator + 12 workers (both ox2.16x128) Returns: Dict with created engine details including engine_id
create_prestissimo_engineA	Create a new Prestissimo engine in watsonx.data. EXAMPLE PAYLOAD: { "origin": "native", "display_name": "My-Prestissimo-Engine", "description": "Prestissimo engine for fast queries", "tags": [], "associated_catalogs": [], "configuration": { "size_config": "custom", "coordinator": { "node_type": "starter", "quantity": 1 }, "worker": { "node_type": "starter", "quantity": 1 } } } Args: origin: "native" display_name: (required) Display name for the engine configuration: (required) Engine configuration with required fields: - size_config: (required) "custom" (recommended) or predefined options (may be supported) - coordinator: (required) {"node_type": typically "starter" or "cache_optimized", "quantity": 1} - worker: (required) {"node_type": typically "starter" or "cache_optimized", "quantity": 1-18 recommended} associated_catalogs: (optional) List of catalog names to associate description: (optional) Engine description 50 characters max engine_id: (optional) Custom engine ID (must match pattern: prestissimo-0 to prestissimo-1000) tags: (optional) Tags for the engine CUSTOM SIZE CONFIG (RECOMMENDED): Coordinator: 1 node (always), node_type: "starter" or "cache_optimized" Worker: 1-18 nodes (recommended), node_type: "starter" or "cache_optimized" Node types do NOT need to match (e.g., starter coordinator + cache_optimized worker is allowed) PREDEFINED SIZE CONFIGS: If using predefined configs, exact node types and quantities must match: starter: 1 coordinator + 1 worker (both bx2.48x192) small: 1 coordinator + 3 workers (both ox2.16x128) medium: 1 coordinator + 6 workers (both ox2.16x128) large: 1 coordinator + 12 workers (both ox2.16x128) Returns: Dict with created engine details including engine_id
create_spark_engineB	Create a new Spark engine in watsonx.data. Args: origin: Engine origin - "native", "external", or "discover" display_name: Display name for the engine storage_name: Storage/bucket name for engine_home (REQUIRED) associated_catalogs: List of catalog IDs to associate description: Engine description default_version: Spark version (default: "3.5") default_config: Additional engine configuration tags: Tags for the engine Returns: Dict with created engine details including engine_id
list_enginesA	List available Presto, Prestissimo, and Spark compute engines in watsonx.data. Args: engine_type: Optional filter - "presto", "prestissimo", "spark", or None for all engines Returns: Dict with: - engines: List of engine objects with engine_id, display_name, type, status, size (if applicable), created_on, created_by, associated_catalogs - summary: Counts by type and status (total_count, presto_count, prestissimo_count, spark_count, by_status)
pause_presto_engineB	Pause a running Presto engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with pause operation status and engine state transition
pause_prestissimo_engineB	Pause a running Prestissimo engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with pause operation status and engine state transition
pause_spark_engineA	Pause a running Spark engine in watsonx.data (SAAS only). Args: engine_id: Engine identifier force: Force pause even if applications are running (default: False) Returns: Dict with pause operation status
restart_presto_engineC	Restart a Presto engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with engine_id, status, message, response
restart_prestissimo_engineC	Restart a Prestissimo engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with engine_id, status, message, response
resume_presto_engineC	Resume a paused Presto engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with resume operation status and engine state transition
resume_prestissimo_engineB	Resume a paused Prestissimo engine in watsonx.data. Args: engine_id: Engine identifier Returns: Dict with resume operation status and engine state transition
resume_spark_engineB	Resume a paused Spark engine in watsonx.data (SAAS only). Args: engine_id: Engine identifier Returns: Dict with resume operation status
scale_presto_engineA	Scale a Presto engine by adjusting coordinator and worker node counts in watsonx.data. RECOMMENDED NODE TYPES: "starter" or "cache_optimized" (other types may be available) SCALING CAPABILITIES: Coordinator quantity: Always 1 (cannot be changed) Worker quantity: 1-18 (recommended), up to 50 may be supported Node types CAN be changed during scaling (e.g., from "starter" to "cache_optimized") Coordinator and worker do NOT need to match node types API REQUIREMENT: Must provide BOTH coordinator AND worker configurations together. Args: engine_id: Engine identifier coordinator_node_type: Typically "starter" or "cache_optimized". Can be different from current type. coordinator_quantity: Number of coordinator nodes (must be 1 for Presto) worker_node_type: Typically "starter" or "cache_optimized". Can be different from coordinator_node_type. worker_quantity: Number of worker nodes (1-18 recommended, up to 50 may be supported) Returns: Dict with scaling operation status and new node configuration
scale_prestissimo_engineA	Scale a Prestissimo engine by adjusting coordinator and worker node counts in watsonx.data. RECOMMENDED NODE TYPES: "starter" or "cache_optimized" (other types may be available) SCALING CAPABILITIES: Coordinator quantity: Always 1 (cannot be changed) Worker quantity: 1-18 (recommended), up to 50 may be supported Node types CAN be changed during scaling (e.g., from "starter" to "cache_optimized") Coordinator and worker do NOT need to match node types API REQUIREMENT: Must provide BOTH coordinator AND worker configurations together. Args: engine_id: Engine identifier coordinator_node_type: Typically "starter" or "cache_optimized". Can be different from current type. coordinator_quantity: Number of coordinator nodes (must be 1 for Prestissimo) worker_node_type: Typically "starter" or "cache_optimized". Can be different from coordinator_node_type. worker_quantity: Number of worker nodes (1-18 recommended, up to 50 may be supported) Returns: Dict with scaling operation status and new node configuration
scale_spark_engineA	Scale a Spark engine by adjusting node count in watsonx.data (SAAS only). Args: engine_id: Engine identifier number_of_nodes: Target node count (1-1000) Returns: Dict with scaling operation status (202 Accepted - asynchronous operation)
update_presto_engineB	Update Presto engine configuration in watsonx.data. Args: engine_id: Engine identifier description: Updated description display_name: Updated display name engine_properties: Engine configuration properties engine_restart: Set to "force" to trigger restart after update remove_engine_properties: Properties to remove tags: Updated tags Returns: Dict with updated engine configuration
update_prestissimo_engineA	Update Prestissimo engine configuration in watsonx.data. IMPORTANT: Due to a known API limitation, the `description` field cannot be updated on its own. If you want to update the description, you must include at least one additional field in the same call (e.g., `display_name`, `tags`, or `properties`). Calls that modify only `description` will fail. Args: engine_id: Engine identifier. description: Updated description (1-50 characters). Must be accompanied by at least one other updatable field in the same request. display_name: Updated display name. properties: Engine configuration properties. restart_type: Set to "force" to trigger a restart after the update. remove_engine_properties: Properties to remove. tags: Updated tags. Returns: Dict with updated engine configuration
update_spark_engineA	Update Spark engine configuration in watsonx.data. Args: engine_id: Engine identifier description: Updated description display_name: Updated display name configuration: Engine configuration (default_config, default_version, engine_home) tags: Updated tags Returns: Dict with updated engine configuration Note: Spark engines do NOT support engine_restart parameter. Configuration changes may require manual restart.
cancel_ingestion_jobC	Cancel a data ingestion job. Args: job_id: Job identifier Returns: Dict with cancellation status
create_ingestion_jobA	Create a data ingestion job to load data into watsonx.data. Args: job_id: Unique job identifier (e.g., "ingestion-1234567890") catalog: Target catalog name schema: Target schema name table: Target table name file_paths: Source file path (e.g., "s3://bucket-name/file.csv") file_type: Source file type - "csv", "parquet", "json", "orc", "avro" (default: "csv") bucket_name: S3 bucket name (extracted from file_paths if not provided) bucket_type: Bucket type - "amazon_s3", "aws_s3", "minio", "ibm_cos", "ibm_ceph", "adls_gen1", "adls_gen2", "google_cs", "ibm_storage_scale", "ozone" (default: "ibm_cos") write_mode: Write mode - "append", "overwrite" (default: "append") engine_id: Spark engine ID to use for ingestion field_delimiter: CSV field delimiter (default: ",") line_delimiter: CSV line delimiter (default: " ") escape_character: CSV escape character (default: "") header: Whether CSV has header row (default: true) encoding: File encoding (default: "UTF-8") driver_memory: Spark driver memory (default: "2G") driver_cores: Spark driver cores (default: 1) executor_memory: Spark executor memory (default: "2G") executor_cores: Spark executor cores (default: 1) num_executors: Number of Spark executors (default: 1) `Returns: Dict with job_id, status, and creation details`
get_ingestion_jobB	Get detailed status of a data ingestion job. Args: job_id: Job identifier Returns: Dict with detailed job status, configuration, and execution details
list_ingestion_jobsA	List data ingestion jobs on watsonx.data. Args: start: Offset for pagination (default: 0, 0-based) limit: Number of jobs per page (default: 10, max: 100, -1 for all) Returns: Dict with ingestion_jobs list containing job details
get_instance_detailsA	Get watsonx.data instance information including status, version, region, and enabled features. Returns: Dict with instance details: - instance_id: Instance CRN identifier - region: IBM Cloud region (e.g., "us-south") - status: Instance status ("active", "provisioning", "inactive", "failed") - version: watsonx.data version - account_type: Account type ("TRIAL", "ENTERPRISE", "LITE", "STANDARD") - serverless_spark_enabled: Whether Spark is available - public_endpoints_enabled: Public internet access enabled - console_url: Web console URL
execute_insertA	Execute INSERT queries against watsonx.data. Args: sql: SQL INSERT query to execute (must start with INSERT) catalog_name: Target catalog (e.g., "iceberg_data", "hive_data") schema_name: Default schema for unqualified table names engine_id: Engine to run query on (from list_engines, must be running) Returns: Dict with: - query_id: Unique query identifier - rows_inserted: Number of rows inserted (if available) - execution_time_ms: Query duration in milliseconds - status: Query execution status - catalog_name, schema_name: Echo of inputs
execute_selectA	Execute read-only SELECT queries against watsonx.data. Args: sql: SQL SELECT query to execute (must start with SELECT) catalog_name: Target catalog (e.g., "iceberg_data", "tpch") schema_name: Default schema for unqualified table names engine_id: Engine to run query on (from list_engines, must be running) limit: Max rows to return (default: 500 if no LIMIT in query). Note: Using high limits will consume more tokens. Returns: Dict with: - query_id: Unique query identifier - columns: List of {name, type} objects - rows: List of row data (list of lists) - row_count: Number of rows returned - execution_time_ms: Query duration in milliseconds - catalog_name, schema_name: Echo of inputs
execute_updateA	Execute UPDATE queries against watsonx.data. Args: sql: SQL UPDATE query to execute (must start with UPDATE) catalog_name: Target catalog (e.g., "iceberg_data", "hive_data") schema_name: Default schema for unqualified table names engine_id: Engine to run query on (from list_engines, must be running) Returns: Dict with: - query_id: Unique query identifier - rows_updated: Number of rows updated (if available) - execution_time_ms: Query duration in milliseconds - status: Query execution status - catalog_name, schema_name: Echo of inputs Note: Iceberg table updates require at least format version 2 and update mode must be merge-on-read. Use ALTER TABLE to configure: SET PROPERTIES 'format-version' = '2', 'write.update.mode' = 'merge-on-read'
explain_analyze_queryA	Get detailed query analysis with execution statistics in watsonx.data. Args: engine_id: Presto or Prestissimo engine identifier statement: SQL query to analyze. If query fails, consider using fully qualified table names (catalog.schema.table) engine_type: Engine type - "presto" or "prestissimo" (default: "presto") verbose: Include detailed statistics Returns: Dict with engine_id, engine_type, statement, analysis, and full response
explain_queryA	Get query execution plan without running the query in watsonx.data. Args: engine_id: Presto or Prestissimo engine identifier statement: SQL query to explain. If query fails, consider using fully qualified table names (catalog.schema.table) engine_type: Engine type - "presto" or "prestissimo" (default: "presto") format: Output format - "json" or "text" type: Explain type - "logical", "distributed", "validate", or "io" Returns: Dict with engine_id, engine_type, statement, plan, and full response
get_spark_application_statusA	Get detailed status of a Spark application. Args: engine_id: Spark engine identifier application_id: Application identifier Returns: Dict with detailed application status including: - application_id: Application identifier - state: Current state (e.g., "running", "finished", "failed") - start_time: Application start timestamp - end_time: Application end timestamp (if completed) - spark_version: Spark version used - application_details: Configuration and runtime details
list_spark_applicationsB	List Spark applications on a Spark engine. Args: engine_id: (required) Spark engine identifier state: (optional) Filter by application state (e.g., ["running", "finished", "failed"]) limit: (optional) Maximum number of applications to return (1-1000). Recommended to use smaller values (e.g., 10-50) to avoid exhausting tokens. Returns: Dict with applications list containing application details
stop_spark_applicationA	Stop and remove a Spark application. This will terminate a running application and remove it from the engine's history. Args: engine_id: Spark engine identifier application_id: Application identifier to stop Returns: Dict with operation status
submit_spark_applicationA	Submit a Spark application for execution on a Spark engine. Args: engine_id: Spark engine identifier application: Application file path (JAR, Python, R file) arguments: Application arguments array conf: Spark configuration properties (e.g., {"spark.executor.memory": "2g"}) env: Environment variables name: Application name (will be added to conf as spark.app.name) job_endpoint: External job endpoint service_instance_id: Service instance ID - "iae" or "emr" type: Engine type - "spark" or "gluten" context_type: Context type - "project", "git_project", or "space" volumes: Volume mounts (watsonx.data software only). List of dicts with: - name: volume name - mount_path: path in spark cluster (e.g., "/mount/path") - source_sub_path: path in volume to mount (e.g., "/source/path") - read_only: boolean flag Returns: Dict with application_id, state, and submission details Examples: Minimal configuration for IBM Cloud Object Storage using cos:// protocol: { "engine_id": "spark398", "application": "cos://bucket.instance/app.py", "arguments": ["cos://bucket.instance/data.csv"], "conf": { "spark.hadoop.fs.cos.instance.endpoint": "s3.direct.us-east.cloud-object-storage.appdomain.cloud", "spark.hadoop.fs.cos.instance.access.key": "your-access-key", "spark.hadoop.fs.cos.instance.secret.key": "your-secret-key" } } Minimal configuration for IBM Cloud Object Storage using s3a:// protocol: { "engine_id": "spark398", "application": "s3a://bucket/app.py", "arguments": ["s3a://bucket/data.csv"], "conf": { "spark.hadoop.fs.s3a.bucket.bucket.access.key": "your-access-key", "spark.hadoop.fs.s3a.bucket.bucket.secret.key": "your-secret-key", "spark.hadoop.fs.s3a.bucket.bucket.aws.credentials.provider": "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider" } } Optional conf parameters (uses engine defaults if not specified): - spark.app.name: Custom application name - ae.spark.driver.log.level / ae.spark.executor.log.level: Log levels - spark.driver.cores / spark.driver.memory: Driver resources - spark.executor.cores / spark.executor.memory: Executor resources

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

Server Configuration
Capabilities
Tools
Prompts
Resources

Latest Blog Posts

Who's Calling? MCP Hosts Are an Identity Blind Spot (And the Spec Knows It)
By Om-Shree-0709 on July 25, 2026.
mcp
Agent Identity
OAuth 2.1
Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/IBM/ibm-watsonxdata-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server