list_dataset_resources
Discover and access all files within a dataset from France's open data platform. View file details like format, size, and type to identify data for analysis.
Instructions
List all resources (files) in a dataset with their metadata.
Returns information about each resource including ID, title, format, size, and type. This is a key step before querying data from resources.
Typical workflow:
1. Use search_datasets to find datasets
2. Use list_dataset_resources to see what files are in a dataset
3. Use get_resource_info to check if a resource is available via the Tabular API
4. Use query_resource_data (for Tabular API resources) or download_and_parse_resource (for large or unsupported files)
Args:
- dataset_id: The ID of the dataset to list resources from (obtained from search_datasets or get_dataset_info)

Returns:
- Formatted text listing all resources with their metadata, including resource IDs for data queries
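Since the tool returns formatted text rather than structured data, a client often needs to pull the resource IDs back out of that text. The sketch below (the helper name `extract_resource_ids` and the sample output are mine, not from the source) assumes each resource line follows the `Resource ID: <id>` pattern the handler emits:

```python
import re


def extract_resource_ids(tool_output: str) -> list[str]:
    """Pull resource IDs out of the tool's formatted text output.

    Assumes each resource appears on a line of the form
    '   Resource ID: <id>', as produced by the handler.
    """
    return re.findall(r"Resource ID:\s*(\S+)", tool_output)


# Hypothetical output in the shape the handler produces
sample = (
    "Resources in dataset: Example\n"
    "Dataset ID: abc123\n"
    "Total resources: 2\n\n"
    "1. data.csv\n"
    "   Resource ID: res-001\n"
    "   Format: csv\n"
    "2. notes.pdf\n"
    "   Resource ID: res-002\n"
)
print(extract_resource_ids(sample))  # ['res-001', 'res-002']
```

The extracted IDs can then be passed to get_resource_info or query_resource_data.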
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dataset_id | Yes | The ID of the dataset to list resources from | — |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | Formatted text listing all resources with their metadata | — |
Implementation Reference
- tools/list_dataset_resources.py:13-99 (handler) — The main asynchronous tool handler. It fetches the dataset's resource list and metadata via datagouv_api_client.get_resources_for_dataset, then retrieves details for each resource and formats human-readable output with the ID, title, format, human-readable size, MIME type, type, and URL.
```python
async def list_dataset_resources(dataset_id: str) -> str:
    """
    List all resources (files) in a dataset with their metadata.

    Returns information about each resource including ID, title, format,
    size, and type. This is a key step before querying data from resources.

    Typical workflow:
    1. Use search_datasets to find datasets
    2. Use list_dataset_resources to see what files are in a dataset
    3. Use get_resource_info to check if a resource is available via Tabular API
    4. Use query_resource_data (for Tabular API) or download_and_parse_resource
       (for large/unsupported files)

    Args:
        dataset_id: The ID of the dataset to list resources from
            (obtained from search_datasets or get_dataset_info)

    Returns:
        Formatted text listing all resources with their metadata,
        including resource IDs for data queries
    """
    try:
        result = await datagouv_api_client.get_resources_for_dataset(dataset_id)
        dataset = result.get("dataset", {})
        resources = result.get("resources", [])

        if not dataset.get("id"):
            return f"Error: Dataset with ID '{dataset_id}' not found."

        dataset_title = dataset.get("title", "Unknown")
        content_parts = [
            f"Resources in dataset: {dataset_title}",
            f"Dataset ID: {dataset_id}",
            f"Total resources: {len(resources)}\n",
        ]

        if not resources:
            content_parts.append("This dataset has no resources.")
            return "\n".join(content_parts)

        # Get detailed info for each resource
        async with httpx.AsyncClient() as session:
            for i, (resource_id, resource_title) in enumerate(resources, 1):
                content_parts.append(f"{i}. {resource_title or 'Untitled'}")
                content_parts.append(f"   Resource ID: {resource_id}")
                try:
                    resource_data = await datagouv_api_client.get_resource_details(
                        resource_id, session=session
                    )
                    resource = resource_data.get("resource", {})

                    if resource.get("format"):
                        content_parts.append(f"   Format: {resource.get('format')}")
                    if resource.get("filesize"):
                        size = resource.get("filesize")
                        if isinstance(size, int):
                            # Format size in human-readable format
                            if size < 1024:
                                size_str = f"{size} B"
                            elif size < 1024 * 1024:
                                size_str = f"{size / 1024:.1f} KB"
                            elif size < 1024 * 1024 * 1024:
                                size_str = f"{size / (1024 * 1024):.1f} MB"
                            else:
                                size_str = f"{size / (1024 * 1024 * 1024):.1f} GB"
                            content_parts.append(f"   Size: {size_str}")
                    if resource.get("mime"):
                        content_parts.append(f"   MIME type: {resource.get('mime')}")
                    if resource.get("type"):
                        content_parts.append(f"   Type: {resource.get('type')}")
                    if resource.get("url"):
                        content_parts.append(f"   URL: {resource.get('url')}")
                except Exception as e:  # noqa: BLE001
                    logger.warning(
                        f"Could not fetch details for resource {resource_id}: {e}"
                    )
                content_parts.append("")

        return "\n".join(content_parts)
    except httpx.HTTPStatusError as e:
        return f"Error: HTTP {e.response.status_code} - {str(e)}"
    except Exception as e:  # noqa: BLE001
        return f"Error: {str(e)}"
```

- tools/list_dataset_resources.py:11-12 (registration) — Local registration function that wraps the handler with the @mcp.tool() decorator.
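The size-formatting branch inside the handler is a self-contained piece of logic. This sketch factors it into a pure helper with the same power-of-1024 thresholds (the name `human_size` is mine, not from the source):

```python
def human_size(size: int) -> str:
    """Format a byte count the same way the handler does:
    B, KB, MB, GB at successive powers of 1024, one decimal place."""
    if size < 1024:
        return f"{size} B"
    if size < 1024 * 1024:
        return f"{size / 1024:.1f} KB"
    if size < 1024 * 1024 * 1024:
        return f"{size / (1024 * 1024):.1f} MB"
    return f"{size / (1024 * 1024 * 1024):.1f} GB"


print(human_size(512))        # 512 B
print(human_size(2048))       # 2.0 KB
print(human_size(5_000_000))  # 4.8 MB
```

Extracting this kind of formatting into a helper would also make the long handler easier to test in isolation.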
```python
def register_list_dataset_resources_tool(mcp: FastMCP) -> None:
    @mcp.tool()
```

- tools/__init__.py:9-23 (registration) — Central tool registration: imports the register function and calls it within register_tools(mcp), which sets up all tools.
```python
from tools.list_dataset_resources import register_list_dataset_resources_tool
from tools.query_resource_data import register_query_resource_data_tool
from tools.search_datasets import register_search_datasets_tool


def register_tools(mcp: FastMCP) -> None:
    """Register all MCP tools with the provided FastMCP instance."""
    register_search_datasets_tool(mcp)
    register_query_resource_data_tool(mcp)
    register_get_dataset_info_tool(mcp)
    register_list_dataset_resources_tool(mcp)
    register_get_resource_info_tool(mcp)
    register_download_and_parse_resource_tool(mcp)
    register_get_metrics_tool(mcp)
```

- Key helper utility invoked by the handler to fetch the dataset's resource list (ID and title tuples) and basic dataset metadata from the data.gouv.fr API v1 endpoint.
```python
async def get_resources_for_dataset(
    dataset_id: str, session: httpx.AsyncClient | None = None
) -> dict[str, Any]:
    """
    Get all resources for a given dataset.

    Returns:
        dict with 'dataset' metadata and 'resources' list of
        resource IDs and titles
    """
    own = session is None
    if own:
        session = httpx.AsyncClient()
    try:
        ds = await get_dataset_metadata(dataset_id, session=session)
        base_url: str = env_config.get_base_url("datagouv_api")

        # Fetch resources from API v1
        url = f"{base_url}1/datasets/{dataset_id}/"
        data = await _fetch_json(session, url)

        resources: list[dict[str, Any]] = data.get("resources", [])
        res_list: list[tuple[str, str]] = [
            (res.get("id"), res.get("title", "") or res.get("name", ""))
            for res in resources
            if res.get("id")
        ]
        return {"dataset": ds, "resources": res_list}
    finally:
        if own and session:
            await session.aclose()
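The list comprehension in get_resources_for_dataset can be exercised without any network access by feeding it a stub API v1 payload. This sketch (the function name `extract_resources` and the sample payload are mine) mirrors the comprehension: entries without an `id` are dropped, and an empty `title` falls back to `name`:

```python
from typing import Any


def extract_resources(data: dict[str, Any]) -> list[tuple[str, str]]:
    """Mirror the helper's comprehension: keep (id, title-or-name)
    pairs, skipping any resource entry that has no id."""
    resources: list[dict[str, Any]] = data.get("resources", [])
    return [
        (res.get("id"), res.get("title", "") or res.get("name", ""))
        for res in resources
        if res.get("id")
    ]


# Stub payload shaped like the data.gouv.fr API v1 dataset response
payload = {
    "resources": [
        {"id": "r1", "title": "population.csv"},
        {"id": "r2", "title": "", "name": "fallback-name"},
        {"title": "no-id-entry"},  # dropped: no id
    ]
}
print(extract_resources(payload))
# [('r1', 'population.csv'), ('r2', 'fallback-name')]
```

Filtering out id-less entries up front means the handler can safely enumerate `(resource_id, resource_title)` pairs without re-checking for missing IDs.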