Cloudera Data Visualization MCP Server

by kevintalbert

Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default
CDV_API_KEY | Yes | CDV API key |
CDV_BASE_URL | Yes | Base URL of your CDV deployment |
MCP_TRANSPORT | No | Transport protocol: stdio (default), http, or sse | stdio
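
A minimal sketch of how a launcher script might check the variables listed above before starting the server. The function name `load_cdv_config` and the layout of the returned dictionary are illustrative assumptions, not part of the server; only the variable names, their required status, and the `stdio` default come from this page.

```python
import os

# Transport values named in the MCP_TRANSPORT description.
ALLOWED_TRANSPORTS = {"stdio", "http", "sse"}


def load_cdv_config(env=None):
    """Collect the documented variables from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in ("CDV_API_KEY", "CDV_BASE_URL") if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    transport = env.get("MCP_TRANSPORT", "stdio")  # documented default
    if transport not in ALLOWED_TRANSPORTS:
        raise ValueError(f"unsupported MCP_TRANSPORT: {transport!r}")
    return {
        "api_key": env["CDV_API_KEY"],
        "base_url": env["CDV_BASE_URL"],
        "transport": transport,
    }
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise without touching the real environment.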

Capabilities

Features and capabilities supported by this server

Capability | Details
tools
{
  "listChanged": true
}
logging
{}
prompts
{
  "listChanged": false
}
resources
{
  "subscribe": false,
  "listChanged": false
}
extensions
{
  "io.modelcontextprotocol/ui": {}
}
experimental
{}

Tools

Functions exposed to the LLM to take actions

Name | Description
list_groups (A)

List all groups defined in CDV.

get_group (A)

Get a single CDV group by its numeric ID.

create_group (A)

Create a new CDV group.

body fields:

  • name (str, required): group name

  • users (list[{id: int}], optional): list of user IDs to add
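
As an illustration of the body described above, the following sketch builds a create_group payload. The helper name `build_group_body` is hypothetical; only the `name` and `users` fields and the `{id: int}` entry shape are taken from the description.

```python
def build_group_body(name, user_ids=()):
    """Return a create_group body; `users` is included only when IDs are given."""
    if not name:
        raise ValueError("name is required")
    body = {"name": name}
    if user_ids:
        # Each entry follows the documented {id: int} shape.
        body["users"] = [{"id": int(uid)} for uid in user_ids]
    return body
```

For example, `build_group_body("analysts", [3, 7])` yields a body with two user entries, while omitting the IDs leaves `users` out entirely.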

update_group (A)

Update an existing CDV group by its numeric ID. Provide fields to change (e.g. name, users).

delete_group (B)

Delete a CDV group by its numeric ID.

list_users (B)

List all users defined in CDV.

get_user (A)

Get a single CDV user by their numeric ID.

update_user (C)

Update an existing CDV user by their numeric ID.

edit_user_profile (B)

Edit the profile of a CDV user identified by username (e.g. update email, name).

list_roles (B)

List all roles defined in CDV.

get_role (B)

Get a single CDV role by its numeric ID.

create_role (B)

Create a new CDV role. body fields: name, permissions (list).

update_role (C)

Update an existing CDV role by its numeric ID.

delete_role (B)

Delete a CDV role by its numeric ID.

list_segments (B)

List all row-level security segments defined in CDV.

get_segment (A)

Get a single CDV segment by its numeric ID.

create_segment (B)

Create a new CDV row-level security segment. body fields: name, filter_definition.

update_segment (C)

Update an existing CDV segment by its numeric ID.

delete_segment (B)

Delete a CDV segment by its numeric ID.

list_filter_associations (A)

List all filter associations (segment-to-user/group mappings) defined in CDV.

get_filter_association (B)

Get a single CDV filter association by its numeric ID.

create_filter_association (C)

Create a new CDV filter association linking a segment to users or groups.

update_filter_association (C)

Update an existing CDV filter association by its numeric ID.

delete_filter_association (B)

Delete a CDV filter association by its numeric ID.

list_workspaces (A)

STEP 4 of every visualization workflow — get the workspace_id required by create_smart_visual.

CDV workspaces organize visuals and dashboards (similar to folders). The workspace_id returned here is a REQUIRED parameter for create_smart_visual() and create_dashboard(). Without it, visual creation will fail.

Call this before creating any visual. Choose the workspace that matches the project context (e.g. a dedicated project workspace or "Public").

get_workspace (A)

Get a single CDV workspace by its numeric ID.

create_workspace (B)

Create a new CDV workspace.

body fields: name (str), desc (str), editable (bool), perms (list[str]), acl (list of [entry_type, permission, name] triplets).

update_workspace (C)

Update an existing CDV workspace by its numeric ID.

delete_workspace (B)

Delete a CDV workspace by its numeric ID.

list_datasets (A)

STEP 2 of the workflow — list all datasets (named tables/views) available in CDV.

In CDV: Connection → Dataset → Visual → Dashboard.

A dataset is a named pointer to a specific table or SQL query within a connection. Visuals and dashboards are built on datasets (using their numeric dataset_id).

AFTER calling this tool, you will have dataset names and IDs, but NOT column names. ALWAYS follow up with: query_dataapi(dataconnection_id=<dc_id>, query="SELECT * FROM <schema.table> LIMIT 3") to discover the exact column names before creating any visual.

Workflow reminder:

  1. list_connections() → find dc_id (connection ID for SQL queries)

  2. list_datasets() → find dataset_id and table name (THIS TOOL)

  3. query_dataapi(...) → discover column names and sample data ← DO THIS NEXT

  4. list_workspaces() → find workspace_id for visual creation

  5. create_smart_visual(...)

  6. create_dashboard(...)

Only call create_dataset() if no suitable dataset exists AND the user explicitly confirms they want a new one. Always confirm with the user before creating anything.
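
The column-discovery follow-up described above can be sketched as a small helper that prepares the arguments for query_dataapi. The helper name `column_discovery_call` is hypothetical; the argument names `dataconnection_id` and `query`, the `schema.table` form, and the `LIMIT 3` sample size come from the description.

```python
def column_discovery_call(dc_id, table, sample_rows=3):
    """Return keyword arguments for the query_dataapi column-discovery step."""
    if "." not in table:
        # The description uses schema.table names, so reject bare table names.
        raise ValueError("expected schema.table, got " + repr(table))
    return {
        "dataconnection_id": int(dc_id),
        "query": f"SELECT * FROM {table} LIMIT {int(sample_rows)}",
    }
```

The returned dictionary can be passed to whatever client invokes the tool; how that client is called is outside the scope of this sketch.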

get_dataset (C)

Get a single CDV dataset by its numeric ID.

create_dataset (A)

Create a new CDV dataset backed by an existing data connection.

A dataset points to a specific table or SQL query within a connection (dc_id). Visuals and dashboards are built on top of datasets.

IMPORTANT: Do NOT call this without first calling list_connections() to identify the right connection (dc_id) and list_datasets() to confirm no suitable dataset already exists. Always get explicit user confirmation before creating a new dataset.

body fields: dc_id (int), name (str), type (str), detail (str, e.g. schema.table), description (str), info (object), lvname (str), settings (object).

update_dataset (C)

Update an existing CDV dataset by its numeric ID.

delete_dataset (B)

Delete a CDV dataset by its numeric ID.

list_visuals (A)

List CDV visuals (dashboards). Optionally filter by dataset_id or workspace_id.

IMPORTANT — CDV API LIMITATION: This endpoint only returns dashboard-type visuals (type="dashboard"). Standalone chart visuals created via create_smart_visual() are NOT included in this listing even when they exist in the workspace.

To work with standalone chart visuals:

  • Use get_visual(object_id) with the id returned when the visual was created.

  • Use create_dashboard() to group multiple chart visuals into a visible dashboard.

After creating chart visuals with create_smart_visual(), always call create_dashboard() to combine them into a single navigable dashboard so the user can see them in the CDV UI.

get_visual (A)

Get a single CDV visual by its numeric ID.

create_visual (A)

Create a CDV visual using the raw admin API.

Required body fields: title (str), type (str), dataset_id (int), workspace_id (int). Optional: description (str), data (object — visual spec), perm (list[str]).
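
A client could check a create_visual body against the field lists above before sending it. This is a sketch under that assumption; `check_visual_body` is a hypothetical helper, and the server may apply different or additional validation.

```python
REQUIRED_FIELDS = ("title", "type", "dataset_id", "workspace_id")
OPTIONAL_FIELDS = ("description", "data", "perm")


def check_visual_body(body):
    """Return (missing, unknown) field names for a create_visual body."""
    missing = [name for name in REQUIRED_FIELDS if name not in body]
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    unknown = sorted(set(body) - known)
    return missing, unknown
```

An empty pair of lists means the body contains every required field and nothing outside the documented set.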

update_visual (C)

Update an existing CDV visual by its numeric ID.

delete_visual (A)

Delete a CDV visual or dashboard by its numeric ID.

WARNING — CASCADE DELETION: Deleting a dashboard (type="dashboard") also permanently deletes ALL chart visuals linked to it as widgets. If you need to preserve the chart visuals, note their IDs before deleting the dashboard.

create_dashboard (A)

Create a CDV dashboard that groups one or more chart visuals into a single visible view.

Use this tool after create_smart_visual() to make charts visible in the CDV workspace UI. Chart visuals created via the API are standalone artifacts; they only appear in the CDV workspace when placed inside a dashboard.

Parameters:

  • visual_ids: list of visual IDs to include (in display order, left-to-right, top-to-bottom). Record these IDs, because deleting the dashboard also deletes all linked visuals.

  • workspace_id: the workspace where the dashboard will be created (from list_workspaces()).

  • dataset_id (optional): the primary dataset for the dashboard (for global filter context). Use the dataset_id shared by most of the included visuals.

  • description (optional): short description shown in the workspace.

Visuals are automatically tiled in a 2-column grid. Odd trailing visuals span full width.
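
The tiling rule can be pictured with a short sketch. It only illustrates the stated behaviour (rows of two, with an odd trailing visual alone on its row); the function name `tile_two_columns` and the list-of-rows representation are assumptions, and the server's actual layout format is not documented here.

```python
def tile_two_columns(visual_ids):
    """Group IDs into rows of two; a trailing odd ID gets a row of its own."""
    ids = list(visual_ids)
    # A one-element row stands for a visual that spans the full width.
    return [ids[i:i + 2] for i in range(0, len(ids), 2)]
```

With the IDs from the example workflow, `tile_two_columns([131, 132, 133])` places 131 and 132 side by side and 133 alone on the second row.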

WARNING: Deleting a dashboard (via delete_visual) permanently deletes all linked chart visuals. Always save the visual IDs before deleting a dashboard.

Returns the new dashboard's id and url.

Example workflow:

  1. call list_workspaces() → choose workspace_id

  2. call list_datasets() → confirm dataset_id with the user

  3. call create_smart_visual() × N → collect visual_ids

  4. call create_dashboard(title="My Dashboard", workspace_id=4, visual_ids=[131, 132, 133], dataset_id=12)

create_smart_visual (A)

Create a CDV chart visual that reliably renders data without SQL errors.

This tool only exposes visual types and column configurations that are confirmed to work through the CDV API. Unsupported patterns are rejected with a clear error and guidance on what to use instead.

SUPPORTED visual_type values:

  • "trellis-bars": best for one measure vs. one dimension (horizontal bars).

  • "trellis-groupedbars": one SUM measure grouped by a color dimension.

  • "pie": one SUM measure broken down by one dimension.

NOT SUPPORTED via this tool (use CDV's interactive builder instead):

  • trellis-lines, trellis-areas (time-series; timestamp dimensions fail via API)

  • scatter, histogram, boxplot (require CDV builder shelf configuration)

  • table, crosstab (use query_dataapi for tabular results)

  • All other chart types

Each entry in columns must have:

  • column_name (str, required): exact column name in the dataset.

  • aggregate_function (str, optional): sum | avg | min | max. Columns WITH this field are measures; columns WITHOUT are dimensions. IMPORTANT: "count" is not supported; use "sum" on a numeric column instead.

  • shelf (str, optional): explicit shelf override. Use "color_shelf" to add a grouping/color dimension to bar or pie charts.

COLUMN COMPATIBILITY RULES (enforced, CDV-side constraints):

  • At least one measure (column with aggregate_function) is required.

  • Column names that start with avg_, sum_, min_, max_, count_ CANNOT be used as aggregate targets (CDV's tokenizer confuses them with functions). Example: avg(avg_lead_time_days) fails. Use a different column.

  • Date/timestamp columns (names with "date", "time", "year", etc.) should NOT be used as dimensions — CDV fails to generate valid SQL for them. Use CDV's builder for time-series charts.

FILTERS ARE NOT SUPPORTED: CDV's filter SQL generation always produces bracket notation in the WHERE clause that Impala cannot parse. After creating the visual, open it in CDV's interactive builder to add filters manually.

EXAMPLES:

Total spend by supplier (bar chart):

    columns=[
        {"column_name": "supplier_name"},
        {"column_name": "total_price", "aggregate_function": "sum"},
    ]

Supplier spend grouped by priority (grouped bar):

    visual_type="trellis-groupedbars",
    columns=[
        {"column_name": "supplier_name"},
        {"column_name": "priority_code", "shelf": "color_shelf"},
        {"column_name": "total_price", "aggregate_function": "sum"},
    ]

Spend breakdown by item (pie chart):

    visual_type="pie",
    columns=[
        {"column_name": "item_description"},
        {"column_name": "total_price", "aggregate_function": "sum"},
    ]

VISIBILITY: Chart visuals do NOT appear in the CDV workspace until you call create_dashboard() with their IDs. Always follow up with create_dashboard().

Returns the created visual's metadata including its id, visual_id, and url.
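
The compatibility rules for create_smart_visual can be approximated client-side before a call is made. The sketch below is an assumption about how such a pre-check might look, not the server's own validation: `validate_smart_visual` is a hypothetical name, and the date check covers only the three name hints spelled out above ("date", "time", "year"), since the full list is not given.

```python
SUPPORTED_TYPES = {"trellis-bars", "trellis-groupedbars", "pie"}
AGGREGATES = {"sum", "avg", "min", "max"}
BLOCKED_PREFIXES = ("avg_", "sum_", "min_", "max_", "count_")
# Only the hints named in the description; substring matching may over-flag
# names such as "update_count".
DATE_HINTS = ("date", "time", "year")


def validate_smart_visual(visual_type, columns):
    """Return a list of rule violations; an empty list means no rule was tripped."""
    problems = []
    if visual_type not in SUPPORTED_TYPES:
        problems.append(f"unsupported visual_type: {visual_type!r}")
    if not any("aggregate_function" in col for col in columns):
        problems.append("at least one measure (aggregate_function) is required")
    for col in columns:
        name = col["column_name"]
        aggregate = col.get("aggregate_function")
        if aggregate is None:
            # Dimension: reject names that look like date/timestamp columns.
            if any(hint in name.lower() for hint in DATE_HINTS):
                problems.append(f"date/timestamp column used as dimension: {name}")
            continue
        if aggregate not in AGGREGATES:
            problems.append(f"unsupported aggregate_function: {aggregate!r}")
        if name.lower().startswith(BLOCKED_PREFIXES):
            problems.append(f"aggregate target starts with a function-like prefix: {name}")
    return problems
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.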

list_connections (A)

STEP 1 of every data workflow — list all CDV data connections.

A data connection is the top-level link to an external database (Impala, Hive, etc.). The connection's numeric ID (dc_id / dataconnection_id) is used when:

  • Querying raw SQL via query_dataapi(dataconnection_id=<dc_id>, query="SELECT ...")

  • Creating a new dataset with create_dataset(body={"dc_id": <dc_id>, ...})

WORKFLOW — always run this FIRST, then:

  1. list_connections() → identify the right connection and its ID

  2. list_datasets() → find existing datasets built on that connection

  3. query_dataapi(...) → explore table columns and sample data

  4. list_workspaces() → get workspace_id for visual creation

  5. create_smart_visual(...) → create charts

  6. create_dashboard(...) → make charts visible in CDV

Never create a new connection unless the user explicitly requests it and no existing connection points to their data source.
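
The ordering of the six steps can be expressed as a small client-side guard that refuses a step whose predecessors have not run. This is a sketch only: the `WorkflowGuard` class is hypothetical and not part of the server, which documents the order but is not stated to enforce it.

```python
WORKFLOW_ORDER = [
    "list_connections",
    "list_datasets",
    "query_dataapi",
    "list_workspaces",
    "create_smart_visual",
    "create_dashboard",
]


class WorkflowGuard:
    """Track which steps have run and refuse a step whose predecessors are missing."""

    def __init__(self):
        self.done = set()

    def record(self, tool_name):
        position = WORKFLOW_ORDER.index(tool_name)
        missing = [step for step in WORKFLOW_ORDER[:position] if step not in self.done]
        if missing:
            raise RuntimeError(f"{tool_name} called before: {', '.join(missing)}")
        self.done.add(tool_name)
```

Calling `record("create_smart_visual")` immediately after `record("list_connections")` raises an error naming the skipped steps, which mirrors the "always run this FIRST, then" guidance.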

get_connection (B)

Get a single CDV data connection by its numeric ID.

create_connection (A)

Create a new CDV data connection.

IMPORTANT: Do NOT call this without first calling list_connections() and confirming with the user that no existing connection points to their target data source.

body fields: name (str), type (str), connection_info (object with host, port, etc.).

update_connection (C)

Update an existing CDV data connection by its numeric ID.

delete_connection (B)

Delete a CDV data connection by its numeric ID.

export_connection (C)

Export a CDV data connection definition by its numeric ID.

export_migration (A)

Export all CDV visual artifacts (dashboards, datasets, connections) as a migration bundle.

import_migration (C)

Import a CDV migration bundle.

query_dataapi (A)

STEP 3 & 5 of the workflow — explore columns, answer filtered questions, get raw data.

This is the MOST IMPORTANT tool for data exploration and filtered analysis. It runs arbitrary SQL against the database and returns structured results.

Returns: {"columns": [...], "rows": [{col: val}, ...]}
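
A result of this shape can be flattened into a header and row tuples for display. The sketch assumes that `columns` is a list of column-name strings matching the keys of each row dictionary, which the description suggests but does not state; `rows_as_table` is a hypothetical helper.

```python
def rows_as_table(result):
    """Turn {"columns": [...], "rows": [{col: val}, ...]} into a header plus row tuples."""
    columns = list(result["columns"])
    # Missing keys become None so every tuple has one entry per column.
    table = [tuple(row.get(col) for col in columns) for row in result["rows"]]
    return columns, table
```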

══ USE THIS TOOL FOR ══════════════════════════════════════════════════════

  1. COLUMN DISCOVERY (Step 3): always do this before creating any visual.
     query_dataapi(dataconnection_id=10, query="SELECT * FROM schema.table_name LIMIT 3")
     → Reveals exact column names, data types, and sample values.
     → Column names are case-sensitive; use them EXACTLY as returned here.

  2. FILTERED QUESTIONS: when the user asks about a specific subset of data, such as "Show shipping codes for Mock Vendor X", "What is Turbine Oil's price trend?", or "Which priority-1 orders are overdue?"
     → create_smart_visual() CANNOT apply filters (CDV API limitation).
     → Use this tool with a WHERE clause instead, then present the results as a table.

  3. COUNT / FREQUENCY QUESTIONS: when the user wants counts, such as "What are the most common shipping codes?"
     → create_smart_visual() does NOT support COUNT aggregation.
     → Use this tool: query="SELECT col, COUNT(*) as cnt FROM ... GROUP BY col ORDER BY cnt DESC"

  4. TIME-SERIES QUERIES: price trends, monthly patterns, etc.
     → Time-based CDV visuals are blocked via the API.
     → Use this tool to fetch the trend data, then describe it or format it for Plotly.

  5. HEATMAPS / CROSS-TABS: e.g. "spend by destination and shipping code".
     → CDV has no heatmap type via the API.
     → Use this tool to get the pivot data, then format it for plotly.graph_objects.Heatmap.
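
For the heatmap case, the rows returned by a grouped query need to be pivoted into a two-dimensional grid. The sketch below shows one way to do that in plain Python; `pivot_for_heatmap` is a hypothetical helper, and the row keys in any call are whatever column names the query actually returned.

```python
def pivot_for_heatmap(rows, x_key, y_key, value_key):
    """Return (x_labels, y_labels, z) where z[i][j] is the value for y_labels[i], x_labels[j]."""
    x_labels = sorted({row[x_key] for row in rows})
    y_labels = sorted({row[y_key] for row in rows})
    lookup = {(row[y_key], row[x_key]): row[value_key] for row in rows}
    # Combinations absent from the query result are left as None.
    z = [[lookup.get((y, x)) for x in x_labels] for y in y_labels]
    return x_labels, y_labels, z
```

The three return values line up with the `x`, `y`, and `z` arguments of `plotly.graph_objects.Heatmap`, so they can be passed straight through when Plotly is available.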

══ HOW TO USE ═════════════════════════════════════════════════════════════

Connection-based SQL (RECOMMENDED — most flexible): query_dataapi(dataconnection_id=<id_from_list_connections>, query="SELECT col1, SUM(col2) FROM schema.table WHERE col3='val' GROUP BY col1 ORDER BY 2 DESC LIMIT 20")

Important SQL notes:

  • Table names use schema.table format (e.g. logistics.procurement_transactions).

  • SQL reserved words (date, time, year, etc.) must be backtick-quoted: ✓ SELECT `date`, `time` FROM ... NOT: SELECT date, time FROM ...

  • Use standard Impala/Hive SQL syntax.

Dataset-based query (simpler, but less flexible): query_dataapi(dataset=<id_from_list_datasets>, dimensions="col1,col2", aggregates="SUM(col3) as total", limit=20)
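
The backtick-quoting note can be applied mechanically when a SELECT list is assembled. The following is a sketch with hypothetical helper names; the `RESERVED` set holds only the three words named in the note and is not a complete Impala or Hive reserved-word list.

```python
# Only the words called out in the SQL notes; extend as needed.
RESERVED = {"date", "time", "year"}


def quote_identifier(name):
    """Wrap a column name in backticks when it collides with a listed reserved word."""
    return f"`{name}`" if name.lower() in RESERVED else name


def select_clause(columns):
    """Build the SELECT list with reserved names quoted."""
    return "SELECT " + ", ".join(quote_identifier(col) for col in columns)
```

For instance, `select_clause(["date", "time", "supplier_name"])` quotes the first two names and leaves the third untouched.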

query_enhanced_data_api (B)

Query data via the CDV Enhanced Data API (/api/data).

version: '0' for legacy API (provide dataset, limit, dimensions, aggregates), '1' for enhanced API (provide dsreq — a JSON-formatted dataset request string).
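
The split between the two versions could be checked before a call, as in this sketch. `enhanced_api_args` is a hypothetical helper; it assumes that the legacy fields are the only ones accepted with version '0' and that `dsreq` is mandatory with version '1', and it makes no claim about what a valid `dsreq` document contains beyond being parseable JSON.

```python
import json

LEGACY_FIELDS = {"dataset", "limit", "dimensions", "aggregates"}


def enhanced_api_args(version, **fields):
    """Check that the supplied fields match the chosen API version."""
    if version == "0":
        allowed = LEGACY_FIELDS
    elif version == "1":
        allowed = {"dsreq"}
        if "dsreq" not in fields:
            raise ValueError("version '1' needs a dsreq string")
        json.loads(fields["dsreq"])  # raises ValueError if dsreq is not valid JSON
    else:
        raise ValueError(f"version must be '0' or '1', got {version!r}")
    unexpected = sorted(set(fields) - allowed)
    if unexpected:
        raise ValueError("unexpected fields for this version: " + ", ".join(unexpected))
    return {"version": version, **fields}
```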

run_job (B)

Trigger a rerun of one or more CDV scheduled jobs. Provide exactly one of schedule_ids (comma-separated IDs) or schedule_names (comma-separated names).
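
The "exactly one of" rule is easy to get wrong, so a caller may want to enforce it up front. A minimal sketch, assuming the caller holds the IDs or names as Python lists; `run_job_args` is a hypothetical helper that joins them into the comma-separated form the tool expects.

```python
def run_job_args(schedule_ids=None, schedule_names=None):
    """Return run_job arguments, insisting on exactly one of the two selectors."""
    if (schedule_ids is None) == (schedule_names is None):
        raise ValueError("provide exactly one of schedule_ids or schedule_names")
    if schedule_ids is not None:
        return {"schedule_ids": ",".join(str(i) for i in schedule_ids)}
    return {"schedule_names": ",".join(schedule_names)}
```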

run_extract (B)

Run one or more CDV data extract jobs by their comma-separated IDs.

create_extract (C)

Create a CDV data extract job.

get_gc_monitor (A)

Check whether GC monitoring is currently enabled on the CDV server.

set_gc_monitor (B)

Enable or disable GC monitoring on the CDV server.

get_gc_stats (A)

Retrieve the current GC debug flags from the CDV server.

set_gc_stats (B)

Set GC debug flags on the CDV server. Requires sys_viewlogs permission.

get_log_levels (A)

Retrieve the current log level for the root logger on the CDV server.

set_log_level (A)

Set the log level for the root logger. Requires sys_viewlogs permission. level must be one of: CRITICAL, DEBUG, ERROR, FATAL, INFO, WARN, WARNING.
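
A caller might normalize and check the level before invoking the tool. This sketch assumes that upper-casing user input is acceptable; `normalize_level` is a hypothetical helper, and the set of names is copied from the list above.

```python
LOG_LEVELS = {"CRITICAL", "DEBUG", "ERROR", "FATAL", "INFO", "WARN", "WARNING"}


def normalize_level(level):
    """Upper-case the input and confirm it is one of the documented level names."""
    candidate = str(level).strip().upper()
    if candidate not in LOG_LEVELS:
        raise ValueError(f"unsupported log level: {level!r}")
    return candidate
```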

get_logger_level (B)

Retrieve the current log level for a specific logger by name.

set_logger_level (C)

Set the log level for a specific named logger. Requires sys_viewlogs permission. level must be one of: CRITICAL, DEBUG, ERROR, FATAL, INFO, WARN, WARNING.

get_toggle_cprofile (A)

Check whether cProfile profiling is enabled for a CDV server function. Defaults to sqlrun.views_jsonselect / jsonselect_parallel if not specified.

toggle_cprofile (B)

Toggle cProfile profiling on/off for a CDV server function. Requires sys_viewlogs permission. If profiling is already enabled, calling this disables it. strip_dirnames, print_callees, print_callers: 'yes' or 'no'.

reset_dataset_cache (B)

Reset the query result cache for a specific CDV dataset. Requires ds_manage permission.

reset_dataconnection_cache (A)

Reset the query result cache for a specific CDV data connection. Requires ds_manage permission.

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kevintalbert/CDV-MCP-Server'
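
The same request can be made from Python with the standard library. This is a sketch that assumes the endpoint returns a JSON body; the response fields are not described on this page, and the helper names `server_url` and `fetch_server` are illustrative.

```python
import json
import urllib.request

API_ROOT = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory URL for one server, mirroring the curl command."""
    return f"{API_ROOT}/{owner}/{name}"


def fetch_server(owner, name, timeout=10):
    """GET the server record and decode it as JSON (assumes a JSON response body)."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("kevintalbert", "CDV-MCP-Server")` issues the same GET request as the curl example.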

If you have feedback or need assistance with the MCP directory API, please join our Discord server.