Databricks MCP Server - Working Version
A fixed version of the Databricks MCP Server that works properly with Claude Code and other MCP clients.
🔧 What Was Fixed
This is a working fork of the original Databricks MCP server that fixes critical issues preventing it from working with Claude Code and other MCP clients.
Original Repository: https://github.com/JustTryAI/databricks-mcp-server
The Problems
- Asyncio event loop conflict: The original server called `asyncio.run()` inside MCP tool functions, causing `asyncio.run() cannot be called from a running event loop` errors when used with Claude Code (which already runs in an async context)
- Command spawning issues: Claude Code's MCP client can only spawn single executables, not commands with arguments like `databricks-mcp start`
- SQL API issues: Byte limit too high (100MB vs the 25MB maximum), no API endpoint fallback for different Databricks workspace configurations
The Solutions
- Fixed async patterns: Created `simple_databricks_mcp_server.py`, which follows the working iPython MCP pattern - all tools now use `async def` with `await` instead of `asyncio.run()`
- Simplified CLI: Modified the CLI to default to starting the server when no command is provided, eliminating the need for wrapper scripts
- SQL API improvements:
  - Reduced `byte_limit` from 100MB to 25MB (Databricks maximum allowed)
  - Added API endpoint fallback: tries `/statements` first, then `/statements/execute`
  - Better error logging when SQL APIs fail
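As a rough sketch of the fallback idea only - the server's actual endpoint paths, payload fields, and error handling may differ, and the warehouse ID below is a placeholder:

```python
import os
import requests

HOST = os.environ["DATABRICKS_HOST"].rstrip("/")
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

def submit_statement(payload: dict) -> dict:
    """Try the primary SQL statements endpoint, then fall back to the alternate path."""
    last_error = None
    for path in ("/api/2.0/sql/statements", "/api/2.0/sql/statements/execute"):
        resp = requests.post(f"{HOST}{path}", headers=HEADERS, json=payload, timeout=60)
        if resp.status_code == 404:
            last_error = f"{path} not found"  # endpoint unavailable on this workspace, try the next
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError(f"No compatible SQL statement endpoint found ({last_error})")

result = submit_statement({
    "statement": "SELECT 1",
    "warehouse_id": "<your-warehouse-id>",     # placeholder
    "byte_limit": 25 * 1024 * 1024,            # 25MB, the Databricks maximum
})
print(result.get("status"))
```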
🚀 Quick Start for Claude Code Users
1. Install directly from GitHub, or clone and install locally
2. Configure credentials
3. Add to Claude Code
4. Test it works
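For orientation only, one plausible command sequence is shown below - the GitHub install path, the `claude mcp add` syntax, and the credential values are assumptions, so defer to the Installation section further down for the project's own steps:

```bash
# Install straight from GitHub (pip or uv)
pip install git+https://github.com/samhavens/databricks-mcp-server.git

# Configure credentials
export DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
export DATABRICKS_TOKEN=your-personal-access-token

# Register the server with Claude Code as a single executable, no arguments
claude mcp add databricks databricks-mcp

# Quick sanity check
databricks-mcp list-tools
```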
Why no arguments needed?
The CLI now defaults to starting the server when no command is provided, making it compatible with Claude Code's MCP client (which can only spawn single executables without arguments).
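A minimal sketch of that default-to-start behavior; the helper functions are hypothetical placeholders, not the server's actual entry point:

```python
import sys

def start_server() -> None:
    """Hypothetical placeholder for launching the MCP stdio server."""
    print("starting MCP server on stdio...")

def print_tools() -> None:
    """Hypothetical placeholder for listing registered tools."""
    print("listing tools...")

def main() -> None:
    # With no subcommand (how Claude Code spawns the executable), default to "start".
    command = sys.argv[1] if len(sys.argv) > 1 else "start"
    if command == "start":
        start_server()
    elif command == "list-tools":
        print_tools()
    else:
        raise SystemExit(f"Unknown command: {command}")

if __name__ == "__main__":
    main()
```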
About This MCP Server
A Model Context Protocol (MCP) server for Databricks that provides access to Databricks functionality via the MCP protocol. This allows LLM-powered tools to interact with Databricks clusters, jobs, notebooks, and more.
Features
MCP Protocol Support: Implements the MCP protocol to allow LLMs to interact with Databricks
Databricks API Integration: Provides access to Databricks REST API functionality
Tool Registration: Exposes Databricks functionality as MCP tools
Async Support: Built with asyncio for efficient operation
Available Tools
The Databricks MCP Server exposes 20 comprehensive tools across all major Databricks functionality areas:
Cluster Management (5 tools)
list_clusters: List all Databricks clusters with status and configuration details
create_cluster: Create a new Databricks cluster with specified configuration
terminate_cluster: Terminate a Databricks cluster
get_cluster: Get detailed information about a specific Databricks cluster
start_cluster: Start a terminated Databricks cluster
Job Management (4 tools)
list_jobs: List Databricks jobs with advanced pagination, creator filtering, and run status tracking
list_job_runs: List recent job runs with detailed execution status, duration, and result information
run_job: Execute a Databricks job with optional parameters
create_job: Create a new job to run a notebook (supports serverless compute by default)
Notebook Management (3 tools)
list_notebooks: List notebooks in a workspace directory with metadata
export_notebook: Export a notebook from the workspace in various formats (Jupyter, Python, etc.)
create_notebook: Create a new notebook in the workspace with specified content and language
File System (4 tools)
list_files: List files and directories in DBFS paths with size and modification details
upload_file_to_volume: Upload files to Unity Catalog volumes with progress tracking and large file support
upload_file_to_dbfs: Upload files to DBFS with chunked upload for large files
list_volume_files: List files and directories in Unity Catalog volumes with detailed metadata
SQL Execution (3 tools)
execute_sql: Execute a SQL statement and wait for completion (blocking) - perfect for quick queries
execute_sql_nonblocking: Start SQL execution and return immediately with statement_id for long-running queries
get_sql_status: Monitor and retrieve results of non-blocking SQL executions by statement_id
Enhanced Features
Advanced Job Management
Pagination support: `list_jobs` includes pagination with configurable limits and offsets
Creator filtering: Filter jobs by creator email (case-insensitive)
Run status integration: Automatically includes latest run status and execution duration
Duration calculations: Real-time tracking of job execution times
Unity Catalog Integration
Volume operations: Full support for Unity Catalog volumes using Databricks SDK
Large file handling: Optimized upload with progress tracking for multi-GB files
Path validation: Automatic validation of volume paths and permissions
Non-blocking SQL Execution
Asynchronous execution: Start long-running SQL queries without blocking
Status monitoring: Real-time status tracking with detailed error reporting
Result retrieval: Fetch results when queries complete successfully
Key Features
Serverless Compute Support
The `create_job` tool supports serverless compute by default, eliminating the need for cluster management.
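For illustration, omitting any cluster specification from a Jobs API payload is what lets a task run on serverless compute (where serverless jobs are enabled for the workspace). The sketch below calls the raw REST API with placeholder names; it is not the tool's actual internals:

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"].rstrip("/")
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

# No new_cluster / existing_cluster_id on the task: the workspace schedules it
# on serverless jobs compute where serverless is enabled.
payload = {
    "name": "nightly-report",  # placeholder job name
    "tasks": [
        {
            "task_key": "run_notebook",
            "notebook_task": {"notebook_path": "/Users/you@example.com/nightly_report"},
        }
    ],
}

resp = requests.post(f"{host}/api/2.1/jobs/create", headers=headers, json=payload, timeout=60)
resp.raise_for_status()
print("Created job:", resp.json()["job_id"])
```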
Benefits of serverless:
No cluster creation permissions required
Auto-scaling compute resources
Cost-efficient - pay only for execution time
Faster job startup
Installation
Prerequisites
Python 3.10 or higher
`uv` package manager (recommended for MCP servers)
Setup
1. Install `uv` if you don't have it already:

```bash
# MacOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (in PowerShell)
irm https://astral.sh/uv/install.ps1 | iex
```

Restart your terminal after installation.

2. Clone the repository:

```bash
git clone https://github.com/samhavens/databricks-mcp-server.git
cd databricks-mcp-server
```

3. Set up the project with `uv`:

```bash
# Create and activate virtual environment
uv venv

# On Windows
.\.venv\Scripts\activate

# On Linux/Mac
source .venv/bin/activate

# Install dependencies in development mode
uv pip install -e .

# Install development dependencies
uv pip install -e ".[dev]"
```

4. Set up environment variables:

```bash
# Windows
set DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
set DATABRICKS_TOKEN=your-personal-access-token

# Linux/Mac
export DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
export DATABRICKS_TOKEN=your-personal-access-token
```

You can also create a `.env` file based on the `.env.example` template.
Usage with Claude Code
The MCP server is automatically started by Claude Code when needed. No manual server startup is required.
After installation and configuration:
Start using Databricks tools in Claude Code:
```
> list all databricks clusters
> create a job to run my notebook
> execute SQL: SHOW CATALOGS
```

Check available tools:

```bash
databricks-mcp list-tools
```
Querying Databricks Resources
You can test the MCP server tools directly or use them through Claude Code once installed.
Project Structure
Development
Linting
The project includes optional linting tools for code quality:
Testing
The project uses pytest for testing with async support. Tests are automatically configured to run with pytest-asyncio.
Test Status: ✅ 12 passed, 5 skipped (intentionally disabled)
Test Types:
Unit tests (`test_clusters.py`): Test API functions with mocks
Integration tests (`test_direct.py`, `test_tools.py`): Test MCP tools directly (requires Databricks credentials)
Validation tests (`test_validation.py`): Test import and schema validation
Note: Integration tests will show errors if Databricks credentials are not configured, but this is expected behavior.
Documentation
API documentation is generated using Sphinx and can be found in the `docs/api` directory
All code includes Google-style docstrings
See the `examples/` directory for usage examples
Examples
Volume Upload Operations
Upload a local file to Unity Catalog volume:
Upload to DBFS for temporary processing:
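Since the concrete snippets are not reproduced here, these illustrative Claude Code prompts (volume names and paths are placeholders) show how the two upload tools are typically driven:

```
> upload ./data/report.csv to the Unity Catalog volume /Volumes/main/default/landing/
> upload ./tmp/scratch.json to DBFS at dbfs:/tmp/scratch.json
```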
Non-blocking SQL Execution
Start long-running query and monitor progress:
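An illustrative prompt sequence for the non-blocking flow (table name and statement ID are placeholders) - `execute_sql_nonblocking` returns a `statement_id`, which `get_sql_status` then polls:

```
> run this SQL without blocking: SELECT COUNT(*) FROM main.sales.transactions
> check the status and results of statement 01ef-example-statement-id
```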
Advanced Job Management
List jobs with filtering and pagination:
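For example, a prompt like the following (the creator email is a placeholder) exercises pagination, creator filtering, and run status in a single `list_jobs` call:

```
> list databricks jobs created by jane@example.com, 20 per page, including their latest run status
```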
For more examples, check the `examples/` directory.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Ensure your code follows the project's coding standards
Add tests for any new functionality
Update documentation as necessary
Verify all tests pass before submitting
🔍 Technical Details
The key fix was changing from:
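A minimal sketch of the problematic pattern, assuming a FastMCP-style server object named `mcp` and a helper module `cluster_api` (names are illustrative, not the exact source):

```python
import asyncio
import json

@mcp.tool()  # `mcp` is the server's tool registry (name assumed)
def list_clusters() -> str:
    # Anti-pattern: asyncio.run() raises when an event loop is already running,
    # and the tools are invoked from an already-running loop under Claude Code.
    result = asyncio.run(cluster_api.list_clusters())
    return json.dumps(result)
```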
To:
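And roughly this afterwards (same assumed names):

```python
import json

@mcp.tool()
async def list_clusters() -> str:
    # Async tool: await the coroutine directly on the already-running loop.
    result = await cluster_api.list_clusters()
    return json.dumps(result)
```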
This pattern was applied to all 20 MCP tools in the server.
🏗️ Implementation Architecture
SDK vs REST API Approach
The MCP server uses a hybrid implementation approach optimized for reliability and performance:
Databricks SDK (Preferred)
Used for: Volume operations, authentication, and core workspace interactions
Benefits: Automatic authentication, better error handling, type safety
Tools using SDK: `upload_file_to_volume`, `list_volume_files`, authentication layer
Authentication: Automatically discovers credentials from environment, CLI config, or instance metadata
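As a rough sketch of why the SDK path is preferred (paths below are placeholders; the server's own wrappers add progress tracking and validation on top of these calls):

```python
from databricks.sdk import WorkspaceClient

# WorkspaceClient() discovers credentials from the environment
# (DATABRICKS_HOST / DATABRICKS_TOKEN), CLI config, or instance metadata.
w = WorkspaceClient()

# List a Unity Catalog volume directory with metadata.
for entry in w.files.list_directory_contents("/Volumes/main/default/landing"):
    print(entry.path, entry.file_size, entry.last_modified)

# Upload a local file into the volume.
with open("report.csv", "rb") as f:
    w.files.upload("/Volumes/main/default/landing/report.csv", f, overwrite=True)
```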
REST API (Legacy)
Used for: SQL execution, some job operations
Benefits: Direct control over API calls, established patterns
Tools using REST: `execute_sql`, `execute_sql_nonblocking`, `get_sql_status`
Authentication: Uses manual token-based authentication
Migration Status
✅ Volume operations: Migrated to SDK (fixes 404 errors from REST)
🔄 In progress: Additional tools being evaluated for SDK migration
📝 Future: Plan to migrate remaining tools for consistency
Recommendation: New tools should use the Databricks SDK for better maintainability and error handling.
📝 Original Repository
Based on: https://github.com/JustTryAI/databricks-mcp-server
🐛 Issues Fixed
✅ `asyncio.run() cannot be called from a running event loop`
✅ `spawn databricks-mcp start ENOENT` (command with arguments not supported)
✅ MCP server connection failures with Claude Code
✅ Proper async/await patterns for MCP tools
✅ SQL execution byte limit issues (100MB → 25MB)
✅ SQL API endpoint compatibility across different Databricks workspaces
✅ Better error handling and logging for SQL operations
License
This project is licensed under the MIT License - see the LICENSE file for details.