# Databricks MCP Server
This is a heavily modified fork of the [databricks-mcp-server](https://github.com/JustTryAI/databricks-mcp-server) with simplifications, bug fixes, and HIMS-specific changes. It's not a GitHub fork because I wanted to keep this repo private.
With the enhancements in this package, you can give an agent prompts like this and have it query Databricks:
> "Use the catalog resources to write me a sql query against databricks to find all users in the past month that went through the top of funnel and tell me whether they subscribed or didn't subscribe. When you're done, run the query and fix bugs."
With Claude Opus, the agent reads the schema resources, produces a roughly correct query, runs it, fixes any bugs, and reports the results.
## HIMS-specific Resources
The server exposes MCP resources that provide schema reference documentation for the Databricks data layers:
- **databricks_gold_schema_reference** (`databricks://schemas/gold-catalog-reference`): Reference documentation for table schemas in the gold data layer (`us_dpe_production_gold` catalog). Contains table names, column definitions, data types, and nullability for all gold-layer tables.
- **databricks_silver_schema_reference** (`databricks://schemas/silver-catalog-reference`): Reference documentation for table schemas in the silver data layer (`us_dpe_production_silver` catalog). Contains table names, column definitions, data types, and nullability for all silver-layer tables.
These resources allow LLMs to look up the exact schema of tables in the silver and gold catalogs so they can write accurate SQL queries and understand the data model without having to query `INFORMATION_SCHEMA` at runtime.
The server also provides this tool to execute SQL queries:
- **execute_sql**: Execute a SQL statement
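To make the resource and tool interaction concrete, here is a minimal sketch of the JSON-RPC messages an MCP client would send to read a schema resource and then call `execute_sql`. The `resources/read` and `tools/call` method names come from the MCP specification; the helper functions are illustrative, and the `statement` argument key is an assumption, since the tool's actual input schema is not documented here.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids.
_ids = count(1)

GOLD_SCHEMA_URI = "databricks://schemas/gold-catalog-reference"


def jsonrpc_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def read_resource_request(uri: str) -> dict:
    """MCP request that asks the server for the contents of a resource."""
    return jsonrpc_request("resources/read", {"uri": uri})


def call_tool_request(name: str, arguments: dict) -> dict:
    """MCP request that invokes a tool with the given arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


# Step 1: fetch the gold-layer schema reference.
schema_req = read_resource_request(GOLD_SCHEMA_URI)

# Step 2: run a query. ASSUMPTION: the argument key "statement" is
# hypothetical; check the tool's input schema for the real name.
sql_req = call_tool_request("execute_sql", {"statement": "SELECT 1"})

if __name__ == "__main__":
    print(json.dumps(schema_req, indent=2))
    print(json.dumps(sql_req, indent=2))
```

In practice an MCP client such as Cursor builds and sends these messages for you; the sketch only shows what travels over the wire.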
## Installation
### Prerequisites
- Python 3.10 or higher
- `uv` package manager (recommended for MCP servers)
### Setup
1. Install `uv` if you don't have it already:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
Restart your terminal after installation.
2. Clone the repository:
```bash
git clone https://github.com/JustTryAI/databricks-mcp-server.git
cd databricks-mcp-server
```
3. Set up the project with `uv`:
```bash
# Create and activate virtual environment
uv venv
source .venv/bin/activate
# Install dependencies in development mode
uv pip install -e .
# Install development dependencies
uv pip install -e ".[dev]"
```
## Running the MCP Server
### Cursor Integration
Add the following to your Cursor MCP config (`~/.cursor/mcp.json`):
```json
{
  "mcpServers": {
    "databricks-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/databricks-mcp-server",
        "python",
        "-m",
        "src.server.databricks_mcp_server"
      ],
      "env": {
        "DATABRICKS_HOST": "https://your-databricks-instance.cloud.databricks.com",
        "DATABRICKS_TOKEN": "your-personal-access-token",
        "DATABRICKS_WAREHOUSE_ID": "your-sql-warehouse-id"
      }
    }
  }
}
```
Replace the `--directory` path with the absolute path to your cloned repository, and fill in your Databricks credentials.
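If you would rather generate this entry than edit it by hand, the following is a small sketch that builds the same structure from a repository path and credentials. The function name `build_cursor_config` is hypothetical and not part of this package; it only mirrors the JSON shown above and prints the result rather than writing to `~/.cursor/mcp.json`.

```python
import json
import os


def build_cursor_config(repo_dir: str, host: str, token: str, warehouse_id: str) -> dict:
    """Return a Cursor MCP config dict mirroring the JSON entry above.

    Hypothetical helper for illustration; not shipped with this repo.
    """
    # Cursor needs an absolute path for --directory, so expand "~" and
    # make relative paths absolute.
    abs_repo_dir = os.path.abspath(os.path.expanduser(repo_dir))
    return {
        "mcpServers": {
            "databricks-mcp": {
                "command": "uv",
                "args": [
                    "run",
                    "--directory",
                    abs_repo_dir,
                    "python",
                    "-m",
                    "src.server.databricks_mcp_server",
                ],
                "env": {
                    "DATABRICKS_HOST": host,
                    "DATABRICKS_TOKEN": token,
                    "DATABRICKS_WAREHOUSE_ID": warehouse_id,
                },
            }
        }
    }


if __name__ == "__main__":
    config = build_cursor_config(
        "~/code/databricks-mcp-server",  # placeholder path
        "https://your-databricks-instance.cloud.databricks.com",
        "your-personal-access-token",
        "your-sql-warehouse-id",
    )
    print(json.dumps(config, indent=2))
```

If you already have other servers in `~/.cursor/mcp.json`, merge the `databricks-mcp` entry into the existing `mcpServers` object instead of replacing the file.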
### Standalone
You can also run the server directly:
```bash
export DATABRICKS_HOST=https://your-databricks-instance.cloud.databricks.com
export DATABRICKS_TOKEN=your-personal-access-token
export DATABRICKS_WAREHOUSE_ID=your-sql-warehouse-id
uv run python -m src.server.databricks_mcp_server
```
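A missing environment variable is an easy mistake to make before launching the server. The sketch below is one possible preflight check; it is not part of this package, and the names `REQUIRED_VARS` and `missing_vars` are illustrative. It only verifies that the three variables used above are set and non-empty, and does not validate the credentials themselves.

```python
import os
import sys
from typing import Mapping

# The three variables used in the standalone and Cursor setups above.
REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN", "DATABRICKS_WAREHOUSE_ID")


def missing_vars(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing), file=sys.stderr)
    else:
        print("All required Databricks variables are set.")
```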
## License
This project is licensed under the MIT License - see the LICENSE file for details.