upload_file_to_volume
Transfer local files to Databricks Unity Catalog volumes. Supports large files and optional overwrite, with file-size and upload-time logging and structured error handling.
Instructions
Upload a local file to a Databricks Unity Catalog volume.

Args:
- local_file_path: Path to the local file (e.g. './data/products.json')
- volume_path: Full volume path (e.g. '/Volumes/catalog/schema/volume/file.json')
- overwrite: Whether to overwrite an existing file (default: False)

Returns:
JSON with upload results, including success status, file size in MB, and upload time.
Example:

```python
# Upload a large dataset to a volume
result = upload_file_to_volume(
    local_file_path='./stark_export/products_full.json',
    volume_path='/Volumes/kbqa/stark_mas_eval/stark_raw_data/products_full.json',
    overwrite=True
)
```
Note: Handles large files (multi-GB), logging file size and elapsed upload time, with structured error handling.
Well suited for uploading extracted datasets to Unity Catalog volumes for downstream processing.
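The returned JSON mirrors the handler's result dict (see the Implementation Reference below). A minimal sketch of inspecting it; the payload values are illustrative, but the field names are taken from the handler:

```python
import json

# Illustrative success payload; field names match the handler's return dict.
result = (
    '{"success": true, "file_size_mb": 512.0, "upload_time_seconds": 42.7, '
    '"volume_path": "/Volumes/catalog/schema/volume/file.json", '
    '"file_size_bytes": 536870912}'
)

payload = json.loads(result)
if payload["success"]:
    print(f"Uploaded {payload['file_size_mb']} MB to {payload['volume_path']} "
          f"in {payload['upload_time_seconds']}s")
else:
    # On failure, the payload carries an 'error' message and 'failed_after_seconds'.
    print(f"Upload failed: {payload['error']}")
```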
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| local_file_path | Yes | Path to the local file to upload (e.g. './data/products.json') | |
| volume_path | Yes | Full volume path (e.g. '/Volumes/catalog/schema/volume/file.json') | |
| overwrite | No | Whether to overwrite an existing file | False |
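For reference, a minimal sketch of the equivalent JSON Schema, assuming it is generated from the tool's Python type hints; the types and required list below are inferred from the signature, not taken from the source:

```python
# Hypothetical input schema inferred from the tool's type hints;
# the actual generated schema may differ in titles and metadata.
input_schema = {
    "type": "object",
    "properties": {
        "local_file_path": {"type": "string"},
        "volume_path": {"type": "string"},
        "overwrite": {"type": "boolean", "default": False},
    },
    "required": ["local_file_path", "volume_path"],
}
```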
Implementation Reference
- src/api/volumes.py:28-96 (handler): Core implementation of the upload_file_to_volume tool, using the Databricks SDK to upload a file to a Unity Catalog volume, with error handling, logging, and result formatting.

```python
async def upload_file_to_volume(
    local_file_path: str,
    volume_path: str,
    overwrite: bool = False
) -> Dict[str, Any]:
    """
    Upload a local file to a Databricks Unity Catalog volume.

    Args:
        local_file_path: Path to local file to upload
        volume_path: Full volume path (e.g. '/Volumes/catalog/schema/volume/file.json')
        overwrite: Whether to overwrite existing file (default: False)

    Returns:
        Dict containing upload results with success status, file size, and timing

    Raises:
        FileNotFoundError: If local file doesn't exist
    """
    start_time = time.time()

    if not os.path.exists(local_file_path):
        raise FileNotFoundError(f"Local file not found: {local_file_path}")

    # Get file size
    file_size = os.path.getsize(local_file_path)
    file_size_mb = file_size / (1024 * 1024)

    logger.info(f"Uploading {file_size_mb:.1f}MB from {local_file_path} to {volume_path}")

    try:
        # Use Databricks SDK for upload
        w = _get_workspace_client()

        # Read file content
        with open(local_file_path, 'rb') as f:
            file_content = f.read()

        # Upload using SDK - handles authentication, chunking, retries automatically
        w.files.upload(
            file_path=volume_path,
            contents=file_content,
            overwrite=overwrite
        )

        end_time = time.time()
        upload_time = end_time - start_time

        return {
            "success": True,
            "file_size_mb": round(file_size_mb, 1),
            "upload_time_seconds": round(upload_time, 1),
            "volume_path": volume_path,
            "file_size_bytes": file_size
        }

    except Exception as e:
        logger.error(f"Error uploading file to volume: {str(e)}")
        end_time = time.time()
        upload_time = end_time - start_time

        return {
            "success": False,
            "error": str(e),
            "file_size_mb": round(file_size_mb, 1),
            "failed_after_seconds": round(upload_time, 1),
            "volume_path": volume_path
        }
```
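A minimal sketch of calling this handler directly, assuming the module is importable as src.api.volumes (per the file reference above) and that Databricks credentials are already configured in the environment:

```python
import asyncio

# Import path assumed from the file reference above.
from src.api.volumes import upload_file_to_volume

async def main() -> None:
    # Paths are illustrative; the target volume must already exist in Unity Catalog.
    result = await upload_file_to_volume(
        local_file_path='./data/products.json',
        volume_path='/Volumes/catalog/schema/volume/products.json',
        overwrite=False,
    )
    print(result)  # dict with success status, file size, and timing

asyncio.run(main())
```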
- src/server/simple_databricks_mcp_server.py:413-456 (registration): MCP tool registration with the @mcp.tool() decorator that wraps the volumes API implementation and serializes results to JSON for the MCP protocol.

```python
@mcp.tool()
async def upload_file_to_volume(
    local_file_path: str,
    volume_path: str,
    overwrite: bool = False
) -> str:
    """
    Upload a local file to a Databricks Unity Catalog volume.

    Args:
        local_file_path: Path to local file (e.g. './data/products.json')
        volume_path: Full volume path (e.g. '/Volumes/catalog/schema/volume/file.json')
        overwrite: Whether to overwrite existing file (default: False)

    Returns:
        JSON with upload results including success status, file size in MB, and upload time.

    Example:
        # Upload large dataset to volume
        result = upload_file_to_volume(
            local_file_path='./stark_export/products_full.json',
            volume_path='/Volumes/kbqa/stark_mas_eval/stark_raw_data/products_full.json',
            overwrite=True
        )

    Note: Handles large files (multi-GB) with progress tracking and proper error handling.
    Perfect for uploading extracted datasets to Unity Catalog volumes for processing.
    """
    logger.info(f"Uploading file from {local_file_path} to volume: {volume_path}")

    try:
        result = await volumes.upload_file_to_volume(
            local_file_path=local_file_path,
            volume_path=volume_path,
            overwrite=overwrite
        )
        return json.dumps(result)

    except Exception as e:
        logger.error(f"Error uploading file to volume: {str(e)}")
        return json.dumps({
            "success": False,
            "error": str(e),
            "volume_path": volume_path
        })
```
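For context, a minimal sketch of the server scaffolding this registration sits in, assuming the FastMCP-style API from the MCP Python SDK; the server name and import paths are assumptions, not taken from the source:

```python
import json
import logging

from mcp.server.fastmcp import FastMCP  # FastMCP-style server API (assumed)

from src.api import volumes  # module containing the handler shown earlier

logger = logging.getLogger(__name__)
mcp = FastMCP("databricks")  # server name is illustrative

# @mcp.tool() registrations such as upload_file_to_volume are defined here.

if __name__ == "__main__":
    mcp.run()
```

Returning a JSON string rather than a raw dict keeps the tool's output a plain MCP text payload, which is why the registration wraps the handler's dict in json.dumps and serializes errors the same way.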