Databricks MCP Server

by samhavens
  • Linux
  • Apple

Server Configuration

Describes the environment variables required to run the server.

Name             | Required | Default | Description
DATABRICKS_HOST  | Yes      | —       | Your Databricks instance URL (e.g., https://your-databricks-instance.azuredatabricks.net)
DATABRICKS_TOKEN | Yes      | —       | Your Databricks personal access token
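
Both values can be checked before wiring them into an MCP client by calling the Databricks REST API directly with the same credentials. The sketch below is plain Python using the requests library; it assumes the standard /api/2.0/clusters/list endpoint and bearer-token authentication, and is not part of the server itself.

    import os
    import requests

    # The same two variables the MCP server reads at startup.
    host = os.environ["DATABRICKS_HOST"]    # e.g. https://your-databricks-instance.azuredatabricks.net
    token = os.environ["DATABRICKS_TOKEN"]  # personal access token

    # Lightweight sanity check: list clusters with the credentials the server will use.
    resp = requests.get(
        f"{host.rstrip('/')}/api/2.0/clusters/list",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    print(f"Credentials OK; {len(resp.json().get('clusters', []))} clusters visible")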

Schema

Prompts

Interactive templates invoked by user choice


No prompts

Resources

Contextual data attached and managed by the client


No resources

Tools

Functions exposed to the LLM to take actions

list_clusters

List all Databricks clusters

create_cluster

Create a new Databricks cluster

terminate_cluster

Terminate a Databricks cluster

get_cluster

Get information about a specific Databricks cluster

start_cluster

Start a terminated Databricks cluster
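
A typical cluster workflow chains these tools together. The calls below are a hypothetical sketch in the same Python-style notation used by the examples further down; the cluster_id parameter name and the sample ID are assumptions based on the Databricks Clusters API, not documented here.

    # Hypothetical tool-call sketch; cluster_id and its value are illustrative.
    clusters = list_clusters()                            # all clusters in the workspace
    info = get_cluster(cluster_id="0401-123456-abcd123")  # details for one cluster

    # Lifecycle: start a terminated cluster, do some work, then terminate it.
    start_cluster(cluster_id="0401-123456-abcd123")
    # ... run jobs or notebooks against the cluster ...
    terminate_cluster(cluster_id="0401-123456-abcd123")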

list_jobs

List Databricks jobs with pagination and filtering.

Args:
  limit: Number of jobs to return (default: 25, keeps response under token limits)
  offset: Starting position for pagination (default: 0, use pagination_info.next_offset for the next page)
  created_by: Filter by creator email (e.g. 'user@company.com'), case-insensitive, optional
  include_run_status: Include latest run status and duration (default: true, set false for a faster response)

Returns:
  JSON with a jobs array and pagination_info. Each job includes latest_run with state, duration_minutes, etc. Use pagination_info.next_offset for the next page; the total number of jobs is in pagination_info.total_jobs.
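
Putting the pagination fields together, a client can walk the full job list page by page. This is a sketch in the same call notation as the examples below; the result is assumed to be parsed JSON, and the job_id field name is an assumption (only jobs, latest_run, and pagination_info are documented above).

    # Page through jobs 25 at a time, filtered by creator.
    offset = 0
    while offset is not None:
        page = list_jobs(
            limit=25,
            offset=offset,
            created_by='user@company.com',   # optional, case-insensitive
            include_run_status=True,
        )
        for job in page['jobs']:
            print(job['job_id'], job.get('latest_run', {}).get('state'))
        # next_offset is assumed to be absent/None once the last page is reached.
        offset = page['pagination_info'].get('next_offset')
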
list_job_runs

List recent job runs with detailed status and duration information.

Args:
  job_id: Specific job ID to list runs for (optional, omit to see runs across all jobs)
  limit: Number of runs to return (default: 10, most recent first)

Returns:
  JSON with a runs array. Each run includes state (RUNNING/SUCCESS/FAILED), result_state, duration_minutes for completed runs, and current_duration_minutes for running jobs.
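
For example, the documented fields can be used to report run health for a single job. The call below is a sketch: the job_id value is illustrative, while the field names come from the description above.

    # Most recent runs for a single job.
    runs = list_job_runs(job_id=123456789, limit=10)
    for run in runs['runs']:
        if run['state'] == 'RUNNING':
            print('running for', run['current_duration_minutes'], 'min')
        else:
            print(run['result_state'], 'finished in', run['duration_minutes'], 'min')
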
run_job

Run a Databricks job

list_notebooks

List notebooks in a workspace directory

export_notebook

Export a notebook from the workspace

list_files

List files and directories in DBFS

execute_sql

Execute a SQL statement and wait for completion (blocking)

execute_sql_nonblocking

Start SQL statement execution and return immediately with statement_id (non-blocking)

get_sql_status

Get the status and results of a SQL statement by statement_id
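
For long-running queries, the non-blocking pair is meant to be used together: start the statement, then poll until it finishes. The sketch below follows that flow; the statement parameter name and the status values (SUCCEEDED/FAILED/CANCELED) are assumptions borrowed from the Databricks SQL Statement Execution API, since only the tool names and the statement_id handoff are documented above.

    import time

    # Quick queries can simply use the blocking variant:
    # result = execute_sql(statement="SELECT 1")

    # Long-running queries: start, then poll by statement_id.
    started = execute_sql_nonblocking(statement="SELECT COUNT(*) FROM samples.nyctaxi.trips")
    statement_id = started['statement_id']

    while True:
        status = get_sql_status(statement_id=statement_id)
        if status['status'] in ('SUCCEEDED', 'FAILED', 'CANCELED'):
            break
        time.sleep(5)

    print(status)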

create_notebook

Create a new notebook in the Databricks workspace

create_job

Create a new Databricks job to run a notebook (uses serverless by default)

upload_file_to_volume

Upload a local file to a Databricks Unity Catalog volume.

Args:
  local_file_path: Path to local file (e.g. './data/products.json')
  volume_path: Full volume path (e.g. '/Volumes/catalog/schema/volume/file.json')
  overwrite: Whether to overwrite an existing file (default: False)

Returns:
  JSON with upload results including success status, file size in MB, and upload time.

Example:
  # Upload a large dataset to a volume
  result = upload_file_to_volume(
      local_file_path='./stark_export/products_full.json',
      volume_path='/Volumes/kbqa/stark_mas_eval/stark_raw_data/products_full.json',
      overwrite=True
  )

Note: Handles large files (multi-GB) with progress tracking and proper error handling. Well suited to uploading extracted datasets to Unity Catalog volumes for processing.
upload_file_to_dbfs

Upload a local file to the Databricks File System (DBFS).

Args:
  local_file_path: Path to local file (e.g. './data/notebook.py')
  dbfs_path: DBFS path (e.g. '/tmp/uploaded/notebook.py')
  overwrite: Whether to overwrite an existing file (default: True)

Returns:
  JSON with upload results including success status, file size, and upload time.

Example:
  # Upload a script to DBFS
  result = upload_file_to_dbfs(
      local_file_path='./scripts/analysis.py',
      dbfs_path='/tmp/analysis.py',
      overwrite=True
  )

Note: For large files (>10MB), uses chunked upload with retry logic. DBFS is a good fit for temporary files, scripts, and smaller datasets.
list_volume_files

List files and directories in a Unity Catalog volume.

Args:
  volume_path: Volume path to list (e.g. '/Volumes/catalog/schema/volume/directory')

Returns:
  JSON with a directory listing including file names, sizes, and modification times.

Example:
  # List files in a volume directory
  files = list_volume_files('/Volumes/kbqa/stark_mas_eval/stark_raw_data/')

Note: Returns detailed file information, including sizes, for managing large datasets.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/samhavens/databricks-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.