# New Features Documentation

This document describes the two new features added to the Databricks MCP server.

## Feature 1: Warehouse ID Environment Variable

### Overview

Previously, the `execute_sql` tool required a `warehouse_id` parameter every time. Now you can set a default warehouse ID as an environment variable, making the parameter optional.

### Configuration

Set the environment variable:

```bash
export DATABRICKS_WAREHOUSE_ID=sql_warehouse_12345
```

### Usage

```python
# Without warehouse_id (uses environment variable)
await session.call_tool("execute_sql", {
    "statement": "SELECT * FROM my_table LIMIT 10"
})

# With explicit warehouse_id (overrides environment variable)
await session.call_tool("execute_sql", {
    "statement": "SELECT * FROM my_table LIMIT 10",
    "warehouse_id": "sql_warehouse_specific"
})
```

### Implementation Details

- Added `DATABRICKS_WAREHOUSE_ID` to `src/core/config.py`
- Modified `src/api/sql.py` to use environment variable as fallback
- Updated MCP tool description to indicate optional parameter
- Maintains backward compatibility

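The fallback order can be pictured with the sketch below. This is a minimal illustration under assumptions, not the actual code in `src/api/sql.py`: the helper name `resolve_warehouse_id` and its signature are hypothetical. It only shows the resolution behavior described above: an explicit `warehouse_id` wins, otherwise `DATABRICKS_WAREHOUSE_ID` is used, and a clear error is raised if neither is available.

```python
import os
from typing import Optional


def resolve_warehouse_id(warehouse_id: Optional[str] = None) -> str:
    """Hypothetical helper illustrating the warehouse_id fallback order."""
    # 1. An explicit argument always overrides the environment variable.
    resolved = warehouse_id or os.environ.get("DATABRICKS_WAREHOUSE_ID")
    # 2. If neither source is available, fail with a clear, actionable message.
    if not resolved:
        raise ValueError(
            "No warehouse_id was provided and DATABRICKS_WAREHOUSE_ID is not set. "
            "Pass warehouse_id explicitly or export the environment variable."
        )
    return resolved
```

A check along these lines is what produces the error case listed under "Error Handling" below when neither source is configured.
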
## Feature 2: Workspace File Content Retrieval

### Overview

Two new tools have been added to retrieve content and metadata from Databricks workspace files (JSON files, notebooks, scripts, etc.).

### New Tools

#### `get_workspace_file_content`

Retrieves the actual content of a workspace file.

**Parameters:**

- `workspace_path` (required): Path to the file in workspace (e.g., `/Users/user@domain.com/file.json`)
- `format` (optional): Export format - SOURCE, HTML, JUPYTER, DBC (default: SOURCE)

**Example:**

```python
await session.call_tool("get_workspace_file_content", {
    "workspace_path": "/Users/huytl@remitano.com/azasend_schema/azasend_overview.json"
})
```

#### `get_workspace_file_info`

Gets metadata about a workspace file without downloading content.

**Parameters:**

- `workspace_path` (required): Path to the file in workspace

**Example:**

```python
await session.call_tool("get_workspace_file_info", {
    "workspace_path": "/Users/user@domain.com/config/settings.json"
})
```

### Response Format

#### File Content Response

```json
{
  "content": "base64_encoded_content",
  "decoded_content": "actual_text_content",
  "content_type": "json|text|binary",
  "path": "/Users/user@domain.com/file.json",
  "language": "JSON",
  "object_type": "FILE"
}
```

#### File Info Response

```json
{
  "path": "/Users/user@domain.com/file.json",
  "object_type": "FILE",
  "language": "JSON",
  "created_at": 1640995200000,
  "modified_at": 1640995200000,
  "size": 1024
}
```

### Features

- **Automatic decoding**: Base64 content is automatically decoded to text
- **JSON detection**: JSON files are automatically detected and marked
- **Error handling**: Graceful handling of encoding issues and missing files
- **Multiple formats**: Supports SOURCE, HTML, JUPYTER, DBC formats
- **Content type detection**: Automatically detects JSON, text, or binary content

### API Endpoint Used

Both tools use the Databricks Workspace API:

- Endpoint: `/api/2.0/workspace/export`
- Method: GET
- Parameters: `path`, `format`

### Implementation Details

- Added `export_workspace_file()` and `get_workspace_file_info()` to `src/api/notebooks.py`
- Added two new MCP tools to `src/server/databricks_mcp_server.py`
- Handles encoding issues gracefully with fallback strategies
- Uses existing authentication and error handling infrastructure

## Testing

### Test Scripts

Two test scripts are provided:

1. **`test_new_features.py`**: Comprehensive test of both features
2. **`test_example_file.py`**: Quick test with the specific example path

### Running Tests

```bash
# Set environment variables
export DATABRICKS_HOST=https://your-workspace.databricks.com
export DATABRICKS_TOKEN=dapi_your_token
export DATABRICKS_WAREHOUSE_ID=sql_warehouse_12345

# Run comprehensive test
python test_new_features.py

# Test with example file
python test_example_file.py
```

## Backward Compatibility

Both features maintain full backward compatibility:

- Existing `execute_sql` calls with `warehouse_id` parameter continue to work
- No changes to existing tools or APIs
- Optional parameters remain optional

## Error Handling

### Warehouse ID Feature

- If no `warehouse_id` provided and no environment variable set: clear error message
- Invalid warehouse ID: Databricks API error is returned

### Workspace File Feature

- File not found: clear error message with file path
- Access denied: Databricks API error is returned
- Encoding issues: graceful fallback with warnings
- Binary files: detected and handled appropriately

## Configuration Examples

### Environment Variables (.env file)

```bash
# Required
DATABRICKS_HOST=https://your-workspace.databricks.com
DATABRICKS_TOKEN=dapi_your_token_here

# Optional - enables default warehouse for SQL
DATABRICKS_WAREHOUSE_ID=sql_warehouse_12345

# Optional - other settings
LOG_LEVEL=INFO
DEBUG=false
```

### Cursor MCP Configuration

```json
{
  "mcpServers": {
    "databricks-mcp-local": {
      "command": "/path/to/start_mcp_server.sh",
      "env": {
        "DATABRICKS_HOST": "https://your-workspace.databricks.com",
        "DATABRICKS_TOKEN": "dapi_your_token",
        "DATABRICKS_WAREHOUSE_ID": "sql_warehouse_12345"
      }
    }
  }
}
```

## Use Cases

### SQL Execution

- **Data Analysis**: Quick SQL queries without specifying warehouse each time
- **Automated Reports**: Scripts that run SQL without hardcoded warehouse IDs
- **Multi-environment**: Different warehouses for dev/staging/prod via env vars

### Workspace Files

- **Configuration Management**: Retrieve JSON config files from workspace
- **Code Review**: Get content of Python/R scripts for analysis
- **Notebook Export**: Export notebooks in different formats
- **File Inspection**: Check file metadata before downloading large files

## Security Considerations

- Environment variables are not logged in debug output
- File content is transmitted securely via existing Databricks authentication
- No additional authentication required beyond existing Databricks token
- Large files are handled efficiently without memory issues

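As a closing reference for the "API Endpoint Used" section, the sketch below shows a direct call to the Workspace export endpoint together with the kind of base64 decoding and JSON/text/binary detection described under "Features". It is an illustrative standalone script, not the server's `export_workspace_file()` implementation; the helper name, the detection heuristic, and the returned dictionary shape are assumptions made for this example.

```python
import base64
import json
import os

import requests

HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://your-workspace.databricks.com
TOKEN = os.environ["DATABRICKS_TOKEN"]


def fetch_workspace_file(workspace_path: str, fmt: str = "SOURCE") -> dict:
    """Illustrative call to GET /api/2.0/workspace/export (not the server's code)."""
    resp = requests.get(
        f"{HOST}/api/2.0/workspace/export",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"path": workspace_path, "format": fmt},
        timeout=30,
    )
    resp.raise_for_status()
    raw = resp.json()["content"]  # base64-encoded file content

    # Automatic decoding with a fallback for binary data.
    try:
        decoded = base64.b64decode(raw).decode("utf-8")
    except UnicodeDecodeError:
        return {"content": raw, "content_type": "binary", "path": workspace_path}

    # Simple JSON detection, mirroring the content-type detection described above.
    try:
        json.loads(decoded)
        content_type = "json"
    except json.JSONDecodeError:
        content_type = "text"

    return {
        "content": raw,
        "decoded_content": decoded,
        "content_type": content_type,
        "path": workspace_path,
    }


# Example: inspect a JSON config file stored in the workspace.
# print(fetch_workspace_file("/Users/user@domain.com/config/settings.json"))
```

The returned dictionary deliberately mirrors a subset of the "File Content Response" fields shown earlier.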
