Glama

download_kaggle_dataset

Download files from a specific Kaggle dataset by providing the dataset reference and optional download path. Simplifies data retrieval for analysis and projects.

Instructions

Downloads files for a specific Kaggle dataset.

Args:
  • dataset_ref: The reference of the dataset (e.g., 'username/dataset-slug').
  • download_path: Optional. The path to download the files to. Defaults to '<project_root>/datasets/<dataset_slug>'.
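
As a rough illustration of how these arguments reach the tool, the sketch below builds an MCP `tools/call` request by hand. The helper name `build_tool_call`, the `id` value, and the dataset reference are placeholders; a real client library would normally construct and send this message for you.

```python
import json


def build_tool_call(dataset_ref, download_path=None, request_id=1):
    """Build a JSON-RPC 2.0 request that invokes download_kaggle_dataset."""
    arguments = {"dataset_ref": dataset_ref}
    # download_path is optional; omit it to get the tool's default location.
    if download_path is not None:
        arguments["download_path"] = download_path
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # placeholder request id
        "method": "tools/call",
        "params": {"name": "download_kaggle_dataset", "arguments": arguments},
    }


if __name__ == "__main__":
    # 'username/dataset-slug' is the placeholder reference from the docstring.
    print(json.dumps(build_tool_call("username/dataset-slug"), indent=2))
```

Leaving `download_path` out of `arguments` lets the tool fall back to its documented default under the project's `datasets` directory.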

Input Schema

Name           Required  Description  Default
dataset_ref    Yes       -            -
download_path  No        -            -
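
The table corresponds to a small JSON Schema. The dictionary below is a hand-written approximation derived from the function signature, not the exact schema the server emits, and `validate_arguments` is a hypothetical helper that only checks required keys and basic types.

```python
# Approximate input schema, assumed from the signature
# (dataset_ref: str, download_path: str | None = None).
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "dataset_ref": {"type": ["string"]},
        # str | None: either a string or null, defaulting to null.
        "download_path": {"type": ["string", "null"], "default": None},
    },
    "required": ["dataset_ref"],
}


def validate_arguments(arguments):
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    for name in INPUT_SCHEMA["required"]:
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, spec in INPUT_SCHEMA["properties"].items():
        if name not in arguments:
            continue
        value = arguments[name]
        allowed = spec["type"]
        ok = (isinstance(value, str) and "string" in allowed) or (
            value is None and "null" in allowed
        )
        if not ok:
            problems.append(f"{name} has the wrong type")
    return problems
```

For example, `validate_arguments({"download_path": "data/raw"})` reports that `dataset_ref` is missing, while `{"dataset_ref": "username/dataset-slug"}` alone passes.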

Implementation Reference

  • The handler function decorated with @mcp.tool(), which registers and implements the 'download_kaggle_dataset' tool. It uses the Kaggle API to download the specified dataset to a determined path, handling directory creation, path resolution, and errors.
    @mcp.tool()
    async def download_kaggle_dataset(dataset_ref: str, download_path: str | None = None) -> str:
        """Downloads files for a specific Kaggle dataset.

        Args:
            dataset_ref: The reference of the dataset (e.g., 'username/dataset-slug').
            download_path: Optional. The path to download the files to.
                Defaults to '<project_root>/datasets/<dataset_slug>'.
        """
        if not api:
            # Return an informative error if API is not available
            return json.dumps({"error": "Kaggle API not authenticated or available."})

        print(f"Attempting to download dataset: {dataset_ref}")

        # Determine absolute download path based on script location
        # Use Path.cwd() if run via script entry point, or __file__ if run directly
        try:
            project_root = Path(__file__).parent.parent.resolve()  # NEW: this is the parent of src/, i.e., the project root
        except NameError:  # __file__ might not be defined when run via entry point
            project_root = Path.cwd()  # NEW: Assume cwd is project root if __file__ is not defined

        if not download_path:
            try:
                dataset_slug = dataset_ref.split('/')[1]
            except IndexError:
                return f"Error: Invalid dataset_ref format '{dataset_ref}'. Expected 'username/dataset-slug'."
            # Construct absolute path relative to project root
            download_path_obj = project_root / "datasets" / dataset_slug  # NEW
        else:
            # If a path is provided, resolve it relative to project root
            download_path_obj = project_root / Path(download_path)  # NEW

        # Ensure it's fully resolved
        download_path_obj = download_path_obj.resolve()

        # Ensure download directory exists (using the Path object)
        try:
            download_path_obj.mkdir(parents=True, exist_ok=True)
            print(f"Ensured download directory exists: {download_path_obj}")  # Will print absolute path
        except OSError as e:
            return f"Error creating download directory '{download_path_obj}': {e}"

        try:
            print(f"Calling api.dataset_download_files for {dataset_ref} to path {str(download_path_obj)}")
            # Pass the path as a string to the Kaggle API
            api.dataset_download_files(dataset_ref, path=str(download_path_obj), unzip=True, quiet=False)
            return f"Successfully downloaded and unzipped dataset '{dataset_ref}' to '{str(download_path_obj)}'."  # Show absolute path
        except Exception as e:
            # Log the error potentially
            print(f"Error downloading dataset '{dataset_ref}': {e}")
            # Check for 404 Not Found
            if "404" in str(e):
                return f"Error: Dataset '{dataset_ref}' not found or access denied."
            # Check for other specific Kaggle errors if needed
            return f"Error downloading dataset '{dataset_ref}': {str(e)}"
  • src/server.py:63-63 (registration)
    The @mcp.tool() decorator registers the download_kaggle_dataset function as an MCP tool.
    @mcp.tool()
  • Type hints and docstring define the input schema: dataset_ref (str, required) and download_path (str, optional). The tool returns a str.
    async def download_kaggle_dataset(dataset_ref: str, download_path: str | None = None) -> str:
        """Downloads files for a specific Kaggle dataset.

        Args:
            dataset_ref: The reference of the dataset (e.g., 'username/dataset-slug').
            download_path: Optional. The path to download the files to.
                Defaults to '<project_root>/datasets/<dataset_slug>'.
        """
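  • Initialization and authentication of the KaggleApi instance that the tool reads from the enclosing scope. If authentication fails, api stays None and the tool returns an error instead of downloading.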
    api = None  # Initialize api as None first
    try:
        api = KaggleApi()
        api.authenticate()
        print("Kaggle API Authenticated Successfully.")
    except Exception as e:
        print(f"Error authenticating Kaggle API: {e}")
        # api remains None if authentication fails
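
The path handling in the handler above can be isolated into a pure function, which makes its rules easier to see and test. The sketch below is an approximation rather than the server's code: `resolve_download_path` is a hypothetical helper, it uses `PurePosixPath` so nothing touches the filesystem, and it raises `ValueError` where the handler returns an error string.

```python
from pathlib import PurePosixPath


def resolve_download_path(project_root, dataset_ref, download_path=None):
    """Mirror the handler's choice of download directory (illustrative only)."""
    root = PurePosixPath(project_root)
    if not download_path:
        parts = dataset_ref.split("/")
        if len(parts) < 2:
            # The handler returns an error string here; the sketch raises instead.
            raise ValueError(
                f"Invalid dataset_ref format '{dataset_ref}'. "
                "Expected 'username/dataset-slug'."
            )
        # Default location: <project_root>/datasets/<dataset_slug>
        return root / "datasets" / parts[1]
    # A provided path is joined onto the project root.
    return root / PurePosixPath(download_path)


if __name__ == "__main__":
    # '/proj' is a made-up project root used only for demonstration.
    print(resolve_download_path("/proj", "username/dataset-slug"))
    print(resolve_download_path("/proj", "username/dataset-slug", "data/raw"))
    print(resolve_download_path("/proj", "username/dataset-slug", "/tmp/out"))
```

One consequence worth knowing: joining onto an absolute `download_path` discards the left-hand side in `pathlib`, so an absolute path is used as given and is not placed under the project root.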

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/arrismo/kaggle-mcp'
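
The same request can be sketched in Python with the standard library. This assumes the endpoint returns JSON, which the page does not state, and `build_request` and `fetch_server_info` are illustrative names rather than part of any Glama client.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/arrismo/kaggle-mcp"


def build_request(url=SERVER_URL):
    """Prepare a GET request equivalent to the curl command."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url=SERVER_URL, timeout=10.0):
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be inspected without sending anything.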

If you have feedback or need assistance with the MCP directory API, please join our Discord server.