Databricks MCP Server - Working Version
A fixed version of the Databricks MCP Server that works properly with Claude Code and other MCP clients.
🔧 What Was Fixed
This is a working fork of the original Databricks MCP server that fixes critical issues preventing it from working with Claude Code and other MCP clients.
Original Repository: https://github.com/JustTryAI/databricks-mcp-server
The Problems
- Asyncio event loop conflict: the original server called `asyncio.run()` inside MCP tool functions, causing `asyncio.run() cannot be called from a running event loop` errors when used with Claude Code (which already runs in an async context)
- Command spawning issues: Claude Code's MCP client can only spawn single executables, not commands with arguments like `databricks-mcp start`
- SQL API issues: byte limit too high (100MB vs the 25MB maximum), and no API endpoint fallback for different Databricks workspace configurations
The Solutions
- Fixed async patterns: Created `simple_databricks_mcp_server.py`, which follows the working iPython MCP pattern: all tools now use `async def` with `await` instead of `asyncio.run()`
- Simplified CLI: Modified the CLI to default to starting the server when no command is provided, eliminating the need for wrapper scripts
- SQL API improvements:
  - Reduced `byte_limit` from 100MB to 25MB (the Databricks maximum)
  - Added an API endpoint fallback: tries `/statements` first, then `/statements/execute` (see the sketch below)
  - Better error logging when SQL APIs fail
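A minimal sketch of that fallback logic (hypothetical helper; the full endpoint prefixes and the standard `DATABRICKS_HOST`/`DATABRICKS_TOKEN` environment variables are assumptions here, and the real implementation lives in the server's SQL module):

```python
import os
import requests

HOST = os.environ["DATABRICKS_HOST"].rstrip("/")
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

def execute_statement(payload: dict) -> dict:
    """Try /statements first, then fall back to /statements/execute."""
    for path in ("/api/2.0/sql/statements", "/api/2.0/sql/statements/execute"):
        resp = requests.post(f"{HOST}{path}", headers=HEADERS, json=payload)
        if resp.status_code == 404:
            # This workspace doesn't expose this endpoint; try the next one.
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("No working SQL statement endpoint found")

payload = {
    "warehouse_id": "<warehouse-id>",   # placeholder
    "statement": "SELECT 1",
    "byte_limit": 25 * 1024 * 1024,     # 25MB, the Databricks maximum
}
```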
🚀 Quick Start for Claude Code Users
- Install directly from GitHub, or clone and install locally
- Configure credentials
- Add to Claude Code
- Test that it works
Why no arguments needed?
The CLI now defaults to starting the server when no command is provided, making it compatible with Claude Code's MCP client (which can only spawn single executables without arguments).
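A sketch of that default behavior (illustrative only; the real CLI entry point differs):

```python
import sys

def start_server() -> None:
    """Stand-in for the real entry point that serves MCP over stdio."""
    print("starting MCP server on stdio...")

def main() -> None:
    # With no subcommand (which is how Claude Code spawns the bare
    # executable), fall back to "start" instead of printing usage.
    command = sys.argv[1] if len(sys.argv) > 1 else "start"
    if command == "start":
        start_server()
    else:
        sys.exit(f"Unknown command: {command}")

if __name__ == "__main__":
    main()
```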
About This MCP Server
A Model Context Protocol (MCP) server for Databricks that provides access to Databricks functionality via the MCP protocol. This allows LLM-powered tools to interact with Databricks clusters, jobs, notebooks, and more.
Features
- MCP Protocol Support: Implements the MCP protocol to allow LLMs to interact with Databricks
- Databricks API Integration: Provides access to Databricks REST API functionality
- Tool Registration: Exposes Databricks functionality as MCP tools
- Async Support: Built with asyncio for efficient operation (see the registration sketch below)
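For illustration, registering an async Databricks tool with the MCP Python SDK's `FastMCP` looks roughly like this (the tool body is simplified; the real tools await Databricks API calls):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("databricks")

@mcp.tool()
async def list_clusters() -> str:
    """List all Databricks clusters."""
    # The real implementation awaits a Databricks API call here.
    return "[]"

if __name__ == "__main__":
    mcp.run()  # serves the MCP protocol over stdio by default
```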
Available Tools
The Databricks MCP Server exposes 20 comprehensive tools across all major Databricks functionality areas:
Cluster Management (5 tools)
- list_clusters: List all Databricks clusters with status and configuration details
- create_cluster: Create a new Databricks cluster with specified configuration
- terminate_cluster: Terminate a Databricks cluster
- get_cluster: Get detailed information about a specific Databricks cluster
- start_cluster: Start a terminated Databricks cluster
Job Management (4 tools)
- list_jobs: List Databricks jobs with advanced pagination, creator filtering, and run status tracking
- list_job_runs: List recent job runs with detailed execution status, duration, and result information
- run_job: Execute a Databricks job with optional parameters
- create_job: Create a new job to run a notebook (supports serverless compute by default)
Notebook Management (3 tools)
- list_notebooks: List notebooks in a workspace directory with metadata
- export_notebook: Export a notebook from the workspace in various formats (Jupyter, Python, etc.)
- create_notebook: Create a new notebook in the workspace with specified content and language
File System (4 tools)
- list_files: List files and directories in DBFS paths with size and modification details
- upload_file_to_volume: Upload files to Unity Catalog volumes with progress tracking and large file support
- upload_file_to_dbfs: Upload files to DBFS with chunked upload for large files
- list_volume_files: List files and directories in Unity Catalog volumes with detailed metadata
SQL Execution (3 tools)
- execute_sql: Execute SQL statement and wait for completion (blocking) - perfect for quick queries
- execute_sql_nonblocking: Start SQL execution and return immediately with statement_id for long-running queries
- get_sql_status: Monitor and retrieve results of non-blocking SQL executions by statement_id
Enhanced Features
Advanced Job Management
- Pagination support: `list_jobs` includes pagination with configurable limits and offsets
- Creator filtering: Filter jobs by creator email (case-insensitive)
- Run status integration: Automatically includes latest run status and execution duration
- Duration calculations: Real-time tracking of job execution times
Unity Catalog Integration
- Volume operations: Full support for Unity Catalog volumes using Databricks SDK
- Large file handling: Optimized upload with progress tracking for multi-GB files
- Path validation: Automatic validation of volume paths and permissions
Non-blocking SQL Execution
- Asynchronous execution: Start long-running SQL queries without blocking
- Status monitoring: Real-time status tracking with detailed error reporting
- Result retrieval: Fetch results when queries complete successfully
Key Features
Serverless Compute Support
The `create_job` tool supports serverless compute by default, eliminating the need for cluster management:
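For example, a Jobs API payload that omits any cluster specification runs on serverless jobs compute. A sketch of the kind of request `create_job` issues (names, paths, and env-var handling are placeholders):

```python
import os
import requests

HOST = os.environ["DATABRICKS_HOST"].rstrip("/")
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

job_spec = {
    "name": "nightly-report",  # placeholder
    "tasks": [{
        "task_key": "main",
        "notebook_task": {"notebook_path": "/Users/me@example.com/report"},
        # No new_cluster or existing_cluster_id: the task runs on
        # serverless jobs compute.
    }],
}
resp = requests.post(f"{HOST}/api/2.1/jobs/create", headers=HEADERS, json=job_spec)
print(resp.json()["job_id"])
```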
Benefits of serverless:
- No cluster creation permissions required
- Auto-scaling compute resources
- Cost-efficient - pay only for execution time
- Faster job startup
Installation
Prerequisites
- Python 3.10 or higher
- `uv` package manager (recommended for MCP servers)
Setup
- Install `uv` if you don't have it already (restart your terminal after installation)
- Clone the repository
- Set up the project with `uv`
- Set up environment variables; you can also create a `.env` file based on the `.env.example` template
Usage with Claude Code
The MCP server is automatically started by Claude Code when needed. No manual server startup is required.
After installation and configuration:
- Start using Databricks tools in Claude Code
- Check available tools
Querying Databricks Resources
You can test the MCP server tools directly or use them through Claude Code once installed.
Project Structure
Development
Linting
The project includes optional linting tools for code quality.
Testing
The project uses pytest for testing with async support; tests are automatically configured to run with pytest-asyncio (a minimal example follows the list below).
Test Status: ✅ 12 passed, 5 skipped (intentionally disabled)
Test Types:
- Unit tests (`test_clusters.py`): Test API functions with mocks
- Integration tests (`test_direct.py`, `test_tools.py`): Test MCP tools directly (requires Databricks credentials)
- Validation tests (`test_validation.py`): Test import and schema validation
Note: Integration tests will show errors if Databricks credentials are not configured, but this is expected behavior.
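A minimal unit test in that style might look like this (hypothetical test, not one of the suites above; requires pytest-asyncio):

```python
import pytest
from unittest.mock import AsyncMock

@pytest.mark.asyncio  # provided by pytest-asyncio
async def test_cluster_listing_with_mocked_client():
    # Mock the API layer so the test needs no Databricks credentials.
    client = AsyncMock()
    client.list_clusters.return_value = {"clusters": [{"cluster_id": "abc-123"}]}
    result = await client.list_clusters()
    assert result["clusters"][0]["cluster_id"] == "abc-123"
```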
Documentation
- API documentation is generated using Sphinx and can be found in the `docs/api` directory
- All code includes Google-style docstrings
- See the `examples/` directory for usage examples
Examples
Volume Upload Operations
Upload a local file to Unity Catalog volume:
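A sketch of the equivalent operation with the Databricks SDK's Files API, which the volume tools build on (paths are placeholders):

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up credentials from the environment

# Upload a local file into a Unity Catalog volume (placeholder paths).
with open("report.csv", "rb") as f:
    w.files.upload("/Volumes/main/default/my_volume/report.csv", f, overwrite=True)
```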
Upload to DBFS for temporary processing:
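And the DBFS counterpart (again a sketch with placeholder paths; the SDK streams the file in blocks under the hood):

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

with open("data.parquet", "rb") as f:
    w.dbfs.upload("/tmp/data.parquet", f, overwrite=True)
```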
Non-blocking SQL Execution
Start long-running query and monitor progress:
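Roughly the same flow expressed against the Databricks SDK's Statement Execution API (warehouse ID and query are placeholders):

```python
import time

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.sql import StatementState

w = WorkspaceClient()

# wait_timeout="0s" returns immediately with a statement_id.
resp = w.statement_execution.execute_statement(
    warehouse_id="<warehouse-id>",  # placeholder
    statement="SELECT count(*) FROM samples.nyctaxi.trips",
    wait_timeout="0s",
)

# Poll until the statement leaves the PENDING/RUNNING states.
while True:
    status = w.statement_execution.get_statement(resp.statement_id)
    if status.status.state not in (StatementState.PENDING, StatementState.RUNNING):
        break
    time.sleep(2)

if status.status.state == StatementState.SUCCEEDED:
    print(status.result.data_array)
```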
Advanced Job Management
List jobs with filtering and pagination:
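A sketch of equivalent listing and filtering with the Databricks SDK (the MCP tool applies the creator filter server-side; the email is a placeholder):

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

creator = "me@example.com"  # placeholder
for job in w.jobs.list(limit=20):  # the SDK paginates transparently
    if (job.creator_user_name or "").lower() == creator:
        print(job.job_id, job.settings.name)
```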
For more examples, check the `examples/` directory.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Ensure your code follows the project's coding standards
- Add tests for any new functionality
- Update documentation as necessary
- Verify all tests pass before submitting
🔍 Technical Details
The key fix was changing from a blocking pattern to a native async pattern. A simplified before/after (illustrative; the real tool bodies call the Databricks API):
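```python
import asyncio
import json

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("databricks")

async def list_clusters_api() -> dict:
    """Stand-in for the real Databricks API call."""
    return {"clusters": []}

# Before (broken): a sync tool wrapping the coroutine with asyncio.run().
# Under Claude Code's already-running event loop this raises
# "asyncio.run() cannot be called from a running event loop".
@mcp.tool()
def list_clusters_old() -> str:
    return json.dumps(asyncio.run(list_clusters_api()))

# After (fixed): the tool is itself a coroutine and awaits the call directly.
@mcp.tool()
async def list_clusters() -> str:
    return json.dumps(await list_clusters_api())
```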
This pattern was applied to all 20 MCP tools in the server.
🏗️ Implementation Architecture
SDK vs REST API Approach
The MCP server uses a hybrid implementation approach optimized for reliability and performance:
Databricks SDK (Preferred)
Used for: Volume operations, authentication, and core workspace interactions
- Benefits: Automatic authentication, better error handling, type safety
- Tools using SDK: `upload_file_to_volume`, `list_volume_files`, and the authentication layer
- Authentication: Automatically discovers credentials from environment variables, CLI config, or instance metadata (see the example below)
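For instance, constructing the client with no arguments triggers the SDK's unified credential discovery:

```python
from databricks.sdk import WorkspaceClient

# No arguments: the SDK resolves credentials from DATABRICKS_HOST /
# DATABRICKS_TOKEN, the Databricks CLI config file, or cloud instance
# metadata, whichever is available.
w = WorkspaceClient()
print(w.current_user.me().user_name)  # quick authentication sanity check
```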
REST API (Legacy)
Used for: SQL execution, some job operations
- Benefits: Direct control over API calls, established patterns
- Tools using REST: `execute_sql`, `execute_sql_nonblocking`, `get_sql_status`
- Authentication: Uses manual token-based authentication
Migration Status
- ✅ Volume operations: Migrated to SDK (fixes 404 errors from REST)
- 🔄 In progress: Additional tools being evaluated for SDK migration
- 📝 Future: Plan to migrate remaining tools for consistency
Recommendation: New tools should use the Databricks SDK for better maintainability and error handling.
📝 Original Repository
Based on: https://github.com/JustTryAI/databricks-mcp-server
🐛 Issues Fixed
- ✅ `asyncio.run() cannot be called from a running event loop`
- ✅ `spawn databricks-mcp start ENOENT` (commands with arguments not supported)
- ✅ MCP server connection failures with Claude Code
- ✅ Proper async/await patterns for MCP tools
- ✅ SQL execution byte limit issues (100MB → 25MB)
- ✅ SQL API endpoint compatibility across different Databricks workspaces
- ✅ Better error handling and logging for SQL operations
License
This project is licensed under the MIT License - see the LICENSE file for details.