Provides tools for interacting with Databricks, enabling management of clusters, jobs, and notebooks, execution of SQL queries, and access to schema documentation for silver and gold data layers.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Databricks MCP Server query the gold layer to find all users who subscribed in the last month"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Databricks MCP Server
This is a fork of the databricks-mcp-server with bug fixes and HIMS-specific changes. It's not a GitHub fork because I wanted to keep this repo private.
With the enhancements in this package, you can run queries like this:
"Use the catalog resources to write me a sql query against databricks to find all users in the past month that went through the top of funnel and tell me whether they subscribed or didn't subscribe. When you're done, run the query and fix bugs."
With Claude Opus, the agent reads the schema resources, produces a largely correct query, runs it, fixes any errors, and reports the results.
HIMS-specific Resources
In addition to tools, the server exposes MCP resources that provide schema reference documentation for the Databricks data layers:
databricks_gold_schema_reference (databricks://schemas/gold-catalog-reference): Reference documentation for table schemas in the gold data layer (us_dpe_production_goldcatalog). Contains table names, column definitions, data types, and nullability for all gold-layer tables.
databricks_silver_schema_reference (databricks://schemas/silver-catalog-reference): Reference documentation for table schemas in the silver data layer (us_dpe_production_silvercatalog). Contains table names, column definitions, data types, and nullability for all silver-layer tables.
These resources allow LLMs to look up the exact schema of tables in the silver and gold catalogs so they can write accurate SQL queries and understand the data model without having to query INFORMATION_SCHEMA at runtime.
Available Tools
The Databricks MCP Server exposes the following tools:
list_clusters: List all Databricks clusters
create_cluster: Create a new Databricks cluster
terminate_cluster: Terminate a Databricks cluster
get_cluster: Get information about a specific Databricks cluster
start_cluster: Start a terminated Databricks cluster
list_jobs: List all Databricks jobs
run_job: Run a Databricks job
list_notebooks: List notebooks in a workspace directory
export_notebook: Export a notebook from the workspace
list_files: List files and directories in a DBFS path
execute_sql: Execute a SQL statement
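As a rough illustration of what a tool like execute_sql does under the hood, here is a sketch that calls the Databricks SQL Statement Execution API (POST /api/2.0/sql/statements). The environment variable names, the warehouse_id parameter, and the helper names are assumptions for this example; the server's actual implementation may differ.

```python
# Sketch (not the server's actual code) of an execute_sql-style helper
# built on the Databricks SQL Statement Execution API.
import json
import os
import urllib.request


def build_statement_payload(statement: str, warehouse_id: str) -> dict:
    """Build the request body for the SQL Statement Execution API."""
    return {
        "statement": statement,
        "warehouse_id": warehouse_id,
        "wait_timeout": "30s",  # wait up to 30s for the result inline
    }


def execute_sql(statement: str, warehouse_id: str) -> dict:
    """Run a SQL statement and return the parsed JSON API response."""
    host = os.environ["DATABRICKS_HOST"].rstrip("/")
    token = os.environ["DATABRICKS_TOKEN"]
    req = urllib.request.Request(
        f"{host}/api/2.0/sql/statements",
        data=json.dumps(build_statement_payload(statement, warehouse_id)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Build (but do not send) a payload, since this needs real credentials.
    payload = build_statement_payload("SELECT 1", "your-warehouse-id")
    print(payload["statement"])
```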
Installation
Prerequisites
Python 3.10 or higher
uv package manager (recommended for MCP servers)
Setup
Install uv if you don't have it already:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Restart your terminal after installation.
Clone the repository:

```bash
git clone https://github.com/JustTryAI/databricks-mcp-server.git
cd databricks-mcp-server
```

Set up the project with uv:

```bash
# Create and activate virtual environment
uv venv
source .venv/bin/activate

# Install dependencies in development mode
uv pip install -e .

# Install development dependencies
uv pip install -e ".[dev]"
```
Running the MCP Server
Cursor Integration
Add the following to your Cursor MCP config (~/.cursor/mcp.json):
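A minimal sketch of the config is shown below; the server name, directory path, entry-point name, and environment variable names are illustrative assumptions, so adjust them to match your clone and workspace.

```json
{
  "mcpServers": {
    "databricks-mcp-server": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/databricks-mcp-server",
        "databricks-mcp-server"
      ],
      "env": {
        "DATABRICKS_HOST": "https://your-workspace.cloud.databricks.com",
        "DATABRICKS_TOKEN": "your-personal-access-token"
      }
    }
  }
}
```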
Replace the --directory path with the absolute path to your cloned repository, and fill in your Databricks credentials.
Standalone
You can also run the server directly:
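For example, assuming the entry point is named databricks-mcp-server (check pyproject.toml in your clone for the exact name) and that credentials are passed via these environment variables:

```bash
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="your-personal-access-token"
uv run databricks-mcp-server
```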
Development
Code Standards
Python code follows PEP 8 style guide with a maximum line length of 100 characters
Use 4 spaces for indentation (no tabs)
Use double quotes for strings
All classes, methods, and functions should have Google-style docstrings
Type hints are required for all code except tests
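A small illustrative function (not from the codebase) that follows these standards, with type hints and a Google-style docstring:

```python
def convert_temperature(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit.

    Args:
        celsius: Temperature in degrees Celsius.

    Returns:
        Temperature in degrees Fahrenheit.
    """
    return celsius * 9 / 5 + 32
```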
License
This project is licensed under the MIT License - see the LICENSE file for details.