download_kaggle_dataset

Download files from a specific Kaggle dataset by providing the dataset reference and optional download path. Simplifies data retrieval for analysis and projects.

Instructions

Downloads files for a specific Kaggle dataset.

Args:
    dataset_ref: The reference of the dataset (e.g., 'username/dataset-slug').
    download_path: Optional. The path to download the files to. Defaults to '<project_root>/datasets/<dataset_slug>'.
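A minimal sketch of how the default path described above could be derived from `dataset_ref`. The helper name `default_download_path` and its `project_root` argument are hypothetical, introduced only for illustration, and the validation here is slightly stricter than a plain split on '/'.

```python
from pathlib import Path


def default_download_path(dataset_ref: str, project_root: Path) -> Path:
    """Return '<project_root>/datasets/<dataset_slug>' for a 'username/dataset-slug' reference.

    Hypothetical helper for illustration; not part of the server's API.
    """
    parts = dataset_ref.split("/")
    # Assumption: require exactly two non-empty parts (username and slug).
    if len(parts) != 2 or not all(parts):
        raise ValueError(
            f"Invalid dataset_ref format '{dataset_ref}'. Expected 'username/dataset-slug'."
        )
    dataset_slug = parts[1]
    return project_root / "datasets" / dataset_slug


# "/tmp/project" is a placeholder project root, not a path the server uses.
example = default_download_path("username/dataset-slug", Path("/tmp/project"))
print(example)
```

Raising `ValueError` is a choice made for this sketch; a tool handler may prefer to return an error string to the caller instead.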

Input Schema

Name           Required  Description  Default
dataset_ref    Yes
download_path  No
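To show how these two inputs might be supplied, here is a sketch that builds an MCP `tools/call` request (JSON-RPC 2.0) for this tool. The helper name `build_tool_call` and the `request_id` value are hypothetical; in practice an MCP client SDK would normally assemble this message for you.

```python
import json
from typing import Optional


def build_tool_call(dataset_ref: str, download_path: Optional[str] = None, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request for download_kaggle_dataset.

    Hypothetical helper for illustration only.
    """
    arguments = {"dataset_ref": dataset_ref}
    # download_path is optional, so include it only when the caller supplies one.
    if download_path is not None:
        arguments["download_path"] = download_path
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "download_kaggle_dataset", "arguments": arguments},
    }


request = build_tool_call("username/dataset-slug")
print(json.dumps(request, indent=2))
```

Omitting `download_path` leaves the server to apply its documented default location.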

Implementation Reference

  • The handler function decorated with @mcp.tool(), which registers and implements the 'download_kaggle_dataset' tool. It uses the Kaggle API to download the specified dataset to a determined path, handling directory creation, path resolution, and errors.
    @mcp.tool()
    async def download_kaggle_dataset(dataset_ref: str, download_path: str | None = None) -> str:
        """Downloads files for a specific Kaggle dataset.

        Args:
            dataset_ref: The reference of the dataset (e.g., 'username/dataset-slug').
            download_path: Optional. The path to download the files to.
                Defaults to '<project_root>/datasets/<dataset_slug>'.
        """
        if not api:
            # Return an informative error if API is not available
            return json.dumps({"error": "Kaggle API not authenticated or available."})

        print(f"Attempting to download dataset: {dataset_ref}")

        # Determine absolute download path based on script location
        # Use Path.cwd() if run via script entry point, or __file__ if run directly
        try:
            project_root = Path(__file__).parent.parent.resolve()  # NEW: this is the parent of src/, i.e., the project root
        except NameError:
            # __file__ might not be defined when run via entry point
            project_root = Path.cwd()  # NEW: Assume cwd is project root if __file__ is not defined

        if not download_path:
            try:
                dataset_slug = dataset_ref.split('/')[1]
            except IndexError:
                return f"Error: Invalid dataset_ref format '{dataset_ref}'. Expected 'username/dataset-slug'."
            # Construct absolute path relative to project root
            download_path_obj = project_root / "datasets" / dataset_slug  # NEW
        else:
            # If a path is provided, resolve it relative to project root
            download_path_obj = project_root / Path(download_path)  # NEW

        # Ensure it's fully resolved
        download_path_obj = download_path_obj.resolve()

        # Ensure download directory exists (using the Path object)
        try:
            download_path_obj.mkdir(parents=True, exist_ok=True)
            print(f"Ensured download directory exists: {download_path_obj}")  # Will print absolute path
        except OSError as e:
            return f"Error creating download directory '{download_path_obj}': {e}"

        try:
            print(f"Calling api.dataset_download_files for {dataset_ref} to path {str(download_path_obj)}")
            # Pass the path as a string to the Kaggle API
            api.dataset_download_files(dataset_ref, path=str(download_path_obj), unzip=True, quiet=False)
            return f"Successfully downloaded and unzipped dataset '{dataset_ref}' to '{str(download_path_obj)}'."  # Show absolute path
        except Exception as e:
            # Log the error potentially
            print(f"Error downloading dataset '{dataset_ref}': {e}")
            # Check for 404 Not Found
            if "404" in str(e):
                return f"Error: Dataset '{dataset_ref}' not found or access denied."
            # Check for other specific Kaggle errors if needed
            return f"Error downloading dataset '{dataset_ref}': {str(e)}"
  • src/server.py:63-63 (registration)
    The @mcp.tool() decorator registers the download_kaggle_dataset function as an MCP tool.
    @mcp.tool()
  • Type hints and docstring define the input schema: dataset_ref (str, required) and download_path (str | None, optional). The tool returns a str.
    async def download_kaggle_dataset(dataset_ref: str, download_path: str | None = None) -> str:
        """Downloads files for a specific Kaggle dataset.

        Args:
            dataset_ref: The reference of the dataset (e.g., 'username/dataset-slug').
            download_path: Optional. The path to download the files to.
                Defaults to '<project_root>/datasets/<dataset_slug>'.
        """
  • Initialization and authentication of the KaggleApi instance used by the tool via closure.
    api = None  # Initialize api as None first
    try:
        api = KaggleApi()
        api.authenticate()
        print("Kaggle API Authenticated Successfully.")
    except Exception as e:
        print(f"Error authenticating Kaggle API: {e}")
        # api remains None if authentication fails
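Because the tool returns an error whenever authentication failed at startup, it can help to check for credentials before launching the server. The sketch below assumes the commonly documented Kaggle credential locations: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a `kaggle.json` file in `~/.kaggle` (or in the directory named by `KAGGLE_CONFIG_DIR`). The helper name `find_kaggle_credentials` is hypothetical, and the exact lookup order may differ in your installed version of the Kaggle client.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_kaggle_credentials(env: Mapping[str, str], home: Path) -> Optional[str]:
    """Return a short description of where credentials were found, or None.

    Hypothetical pre-flight check; it does not call the Kaggle API.
    """
    # Assumption: both variables must be set for environment-based credentials.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return "environment variables"
    # Assumption: KAGGLE_CONFIG_DIR, when set, replaces the default ~/.kaggle directory.
    config_dir = Path(env["KAGGLE_CONFIG_DIR"]) if env.get("KAGGLE_CONFIG_DIR") else home / ".kaggle"
    credentials_file = config_dir / "kaggle.json"
    if credentials_file.is_file():
        return str(credentials_file)
    return None


source = find_kaggle_credentials(os.environ, Path.home())
print(f"Kaggle credentials: {source or 'not found'}")
```

This only reports whether credentials appear to be present; whether they are valid is still decided by `api.authenticate()`.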

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/arrismo/kaggle-mcp'
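The same request can be sketched in Python with only the standard library. The helper names `build_server_request` and `fetch_server_info` are hypothetical, and decoding the body as JSON is an assumption about the response format rather than something stated here.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(owner: str, name: str) -> urllib.request.Request:
    """Build the GET request for one server entry (hypothetical helper)."""
    return urllib.request.Request(f"{API_BASE}/{owner}/{name}", method="GET")


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming the endpoint returns JSON."""
    with urllib.request.urlopen(build_server_request(owner, name), timeout=timeout) as response:
        return json.load(response)


# Building the request does not touch the network; fetch_server_info does.
request = build_server_request("arrismo", "kaggle-mcp")
print(request.get_method(), request.full_url)
```

Calling `fetch_server_info("arrismo", "kaggle-mcp")` performs the actual HTTP request and requires network access.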

If you have feedback or need assistance with the MCP directory API, please join our Discord server.