Databricks MCP Server
Allows querying Databricks datasets using natural language, executing read-only SQL queries on Databricks SQL warehouses.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Databricks MCP Server What were the busiest pickup zones last month?"
That's it! The server will respond to your query, and you can continue using it as needed.
Databricks MCP Server — Natural-Language Analytics POC
A small Model Context Protocol server that lets an LLM client (e.g. Claude Desktop) answer business questions in natural language over a Databricks dataset — without writing SQL by hand.
It runs against the public samples.nyctaxi.trips dataset that ships with every
Databricks workspace, so it's reproducible by anyone.
What it exposes (the three MCP primitives)
| Primitive | Name | Purpose |
| --- | --- | --- |
| Tool | `run_query` | Executes a read-only SQL query against the Databricks SQL warehouse. |
| Resource | | Curated schema, metric definitions, and gotchas: the context layer that makes the generated SQL correct. |
| Prompts | | Ready-made business questions. |
Safety / governance
Two layers, on purpose:
- App-level guard (`is_read_only`): only a single `SELECT`/`WITH` statement is accepted; any write/DDL keyword (`INSERT`, `UPDATE`, `DROP`, ...) is rejected, and a `LIMIT 1000` is appended when missing.
- The real guarantee: connect with a Databricks token whose grants are read-only on the catalog. App guards reduce footguns; permissions are what actually protect the data. Never give an LLM a write-capable credential.
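The app-level guard could look roughly like the sketch below. This is not the repo's actual `is_read_only`; the keyword list beyond `INSERT`, `UPDATE`, and `DROP` and the helper name `with_default_limit` are assumptions made for illustration. The check is deliberately conservative: a query with a write keyword inside a string literal is rejected too.

```python
import re

# Hypothetical keyword list: the README only names INSERT, UPDATE, DROP, "...".
WRITE_KEYWORDS = (
    "INSERT", "UPDATE", "DELETE", "MERGE", "DROP",
    "CREATE", "ALTER", "TRUNCATE", "GRANT", "REVOKE",
)
_WRITE_RE = re.compile(r"\b(" + "|".join(WRITE_KEYWORDS) + r")\b", re.IGNORECASE)
_LIMIT_RE = re.compile(r"\bLIMIT\s+\d+\b", re.IGNORECASE)


def _normalize(query: str) -> str:
    """Trim whitespace and any trailing semicolons."""
    return query.strip().rstrip(";").strip()


def is_read_only(query: str) -> bool:
    """Accept only a single SELECT/WITH statement with no write/DDL keyword."""
    stripped = _normalize(query)
    if not stripped or ";" in stripped:  # empty, or more than one statement
        return False
    first_word = stripped.split(None, 1)[0].upper()
    if first_word not in ("SELECT", "WITH"):
        return False
    return _WRITE_RE.search(stripped) is None


def with_default_limit(query: str, limit: int = 1000) -> str:
    """Append a LIMIT clause when the query does not already contain one."""
    stripped = _normalize(query)
    if _LIMIT_RE.search(stripped):
        return stripped
    return f"{stripped} LIMIT {limit}"
```

A caller would reject the request when `is_read_only(query)` is false and otherwise execute `with_default_limit(query)`. The read-only token remains the actual protection; this only catches obvious mistakes early.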
Architecture
```
Claude Desktop ──stdio──► MCP server (this repo) ──Databricks SQL connector──► samples.nyctaxi.trips
   (client)               tool · resource · prompts                                (read-only)
```

`run_query` doesn't open the connection in-process. It shells out to
`query_runner.py` (`subprocess.run(..., stdin=subprocess.DEVNULL, capture_output=True)`).
See the note below for why.
Implementation note: why run_query uses a subprocess
Both points were reproduced and verified on Windows + the FastMCP stdio
transport (Claude Desktop and the MCP Inspector). Symptom in both: the tool call
hangs and the client returns MCP error -32001: Request timed out at ~60s, even
though the same query runs in ~4s with the connector directly.
1. `sql.connect()` stalls ~60s when called inside the server process. From a clean child process it connects in ~2s; inside the FastMCP process it blocks until the client's request times out. It stalls on the event-loop thread and on a worker thread, so it's a process-level interaction with the connector, not just the event loop being blocked. Running the query in a child process avoids it. (Disabling telemetry / `use_cloud_fetch` does not help.)
2. `stdin=subprocess.DEVNULL` is required on the child. A stdio MCP server's own stdin is the JSON-RPC pipe from the client. A child started with the default `stdin=None` inherits that pipe handle and hangs until the client gives up (~60s). Detaching stdin makes it return at query speed. `capture_output=True` already detaches stdout/stderr; stdin is the one that's easy to miss, so piping the query out to a subprocess without it does not fix the hang.
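The child-process call described above can be sketched as follows. The helper name `run_in_child`, the timeout value, and the error handling are assumptions; how the repo passes the SQL text to `query_runner.py` is not stated here, so the sketch takes an arbitrary `argv`.

```python
import subprocess


def run_in_child(argv: list[str], timeout: float = 60.0) -> str:
    """Run argv in a child process that is detached from the parent's stdin."""
    result = subprocess.run(
        argv,
        stdin=subprocess.DEVNULL,  # do not inherit the client's JSON-RPC pipe
        capture_output=True,       # stdout/stderr go to pipes we read back
        text=True,
        timeout=timeout,
        check=False,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"child exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout
```

With stdin detached, a child that reads stdin sees end-of-file immediately instead of waiting on the server's pipe. A quick way to confirm is `run_in_child([sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"])`, which returns without blocking.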
Gotcha: don't launch the Inspector from Git Bash on Windows. MSYS2 rewrites the POSIX-looking `DATABRICKS_HTTP_PATH` (`/sql/1.0/warehouses/…` → `C:/Program Files/Git/sql/1.0/warehouses/…`), so the server gets a 404, not a timeout. Use PowerShell or `cmd`. Claude Desktop passes env vars directly and is unaffected.
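One way to surface this rewrite early is a startup check on the variable. This is a hedged sketch, not something the repo is known to do; `check_http_path` is a hypothetical helper, and the `/sql/` prefix test is an assumption based on the path format shown above.

```python
def check_http_path(value: str) -> str:
    """Return the path unchanged if it still looks like a warehouse HTTP path."""
    # Assumption: a valid value starts with "/sql/", as in /sql/1.0/warehouses/...
    if not value.startswith("/sql/"):
        raise ValueError(
            f"DATABRICKS_HTTP_PATH looks rewritten by the shell: {value!r}. "
            "Expected a value starting with '/sql/'; "
            "try PowerShell or cmd instead of Git Bash."
        )
    return value
```

Calling `check_http_path(os.environ["DATABRICKS_HTTP_PATH"])` before connecting would turn the confusing 404 into an explicit error message.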
Run it
Prereqs: Python 3.11+, uv, a Databricks workspace
with a running SQL Warehouse and the samples catalog.
Windows / PowerShell (recommended on Windows; see the Git Bash gotcha above):

```powershell
cd "C:\path\to\databricks-mcp"
uv sync   # first time only

# from SQL Warehouses -> Connection details, plus a personal access token.
# These live only in THIS PowerShell window (nothing is written to disk):
$env:DATABRICKS_HOST = "dbc-xxxxxxxx-xxxx.cloud.databricks.com"
$env:DATABRICKS_HTTP_PATH = "/sql/1.0/warehouses/xxxxxxxxxxxxxxxx"
$env:DATABRICKS_TOKEN = "dapixxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# launch the browser inspector, then run a query from its UI:
npx @modelcontextprotocol/inspector uv run server.py
```

macOS / Linux:

```bash
uv sync
export DATABRICKS_HOST="adb-....azuredatabricks.net"
export DATABRICKS_HTTP_PATH="/sql/1.0/warehouses/...."
export DATABRICKS_TOKEN="dapi...."
npx @modelcontextprotocol/inspector uv run server.py
```

Connect to Claude Desktop
You can reach the config file in two ways:
- Via the UI (recommended): in Claude Desktop go to Settings → Developer → Edit Config. This opens (and creates, if missing) `claude_desktop_config.json` in the right folder.
- By path: edit it directly at `%APPDATA%\Claude\claude_desktop_config.json` (Windows) or `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS).
Copy the contents of claude_desktop_config.example.json into that file,
fill in your real values, and restart Claude Desktop. Then ask things like:
"What were the busiest pickup zones, and how does monthly revenue trend?"
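For orientation, the config entry probably has roughly the shape produced by the sketch below. The contents of `claude_desktop_config.example.json` are not reproduced here, so treat this as an assumption: the server label `databricks`, the use of `uv --directory`, and the function name are illustrative, and the repo's example file takes precedence.

```python
import json


def build_claude_desktop_config(
    project_dir: str, host: str, http_path: str, token: str
) -> dict:
    """Build an mcpServers entry that launches server.py through uv."""
    return {
        "mcpServers": {
            "databricks": {  # hypothetical label; any name works
                "command": "uv",
                "args": ["--directory", project_dir, "run", "server.py"],
                "env": {
                    "DATABRICKS_HOST": host,
                    "DATABRICKS_HTTP_PATH": http_path,
                    "DATABRICKS_TOKEN": token,
                },
            }
        }
    }


# Placeholder values only; substitute your own before use.
print(json.dumps(
    build_claude_desktop_config(
        "C:/path/to/databricks-mcp",
        "dbc-xxxxxxxx-xxxx.cloud.databricks.com",
        "/sql/1.0/warehouses/xxxxxxxxxxxxxxxx",
        "dapixxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    ),
    indent=2,
))
```

The printed JSON is what would go into `claude_desktop_config.json`. Because the token sits in that file in plain text, a read-only credential is the safer choice.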
Notes
- `samples.nyctaxi.trips` is a public Databricks dataset; no private data is used.
- Secrets live in env vars / the Claude Desktop config, both git-ignored.