get_resource_info
Retrieve metadata for a data file to determine the best query method. Check format, size, and Tabular API availability to choose between direct querying and downloading.
Instructions
Get detailed information about a specific resource (file).
Returns comprehensive metadata including format, size, MIME type, URL, and associated dataset information. Also checks if the resource is available via the Tabular API (data.gouv.fr's API for parsing tabular files without downloading them).
Use this tool to determine which data querying tool to use:
- If available via Tabular API: use query_resource_data (faster, no download needed)
- If not available or too large: use download_and_parse_resource
Typical workflow:
1. Use list_dataset_resources to find resources in a dataset
2. Use get_resource_info to check resource details and Tabular API availability
3. Use query_resource_data or download_and_parse_resource based on availability
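The decision rule and steps 2 and 3 of the workflow can be sketched as follows. This is a minimal illustration, not the server's own client code: `call_tool` is a hypothetical coroutine standing in for whatever MCP client invokes tools by name, and passing only `resource_id` to the two follow-up tools is an assumption (their full parameter lists are not shown here). The "too large" case is left out because no size threshold is documented.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical: any coroutine that calls an MCP tool by name and returns its text output.
CallTool = Callable[[str, dict], Awaitable[str]]

# Status line that get_resource_info appends when the Tabular API profile check returns 200.
TABULAR_OK = "✅ Available via Tabular API"


def choose_query_tool(resource_info: str) -> str:
    """Pick the follow-up tool name from the text returned by get_resource_info."""
    for line in resource_info.splitlines():
        if line.startswith(TABULAR_OK):
            return "query_resource_data"
    # "Not available" and "Could not check" both fall back to downloading.
    return "download_and_parse_resource"


async def query_resource(call_tool: CallTool, resource_id: str) -> str:
    """Inspect a resource (step 2), then query it with the matching tool (step 3)."""
    info = await call_tool("get_resource_info", {"resource_id": resource_id})
    tool_name = choose_query_tool(info)
    # Assumption: both follow-up tools accept a resource_id argument.
    return await call_tool(tool_name, {"resource_id": resource_id})
```

Matching on the status line keeps the sketch independent of the other metadata fields, at the cost of being tied to the exact wording of the tool's output.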
Args: resource_id: The ID of the resource to get information about (obtained from list_dataset_resources)
Returns: Formatted text with detailed resource information, including Tabular API availability status
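Because the tool returns formatted text rather than structured data, a caller that needs individual fields has to parse the `Label: value` lines. The sketch below shows one way to do that, assuming the labels emitted by the handler listed under Implementation Reference (`Resource ID`, `Format`, `Size`, and so on). `parse_resource_info` and the `tabular_api` key are hypothetical names introduced for illustration, and the sample text is made up.

```python
# Labels the handler writes before ": " in its output lines.
KNOWN_LABELS = {
    "Resource Information", "Resource ID", "Format", "Size", "MIME type",
    "Type", "URL", "Description", "Dataset ID", "Dataset",
}


def parse_resource_info(text: str) -> dict:
    """Turn get_resource_info's text output into a dict of known fields."""
    if text.startswith("Error:"):
        # The tool reports failures as plain "Error: ..." strings.
        raise ValueError(text)
    lines = text.splitlines()
    parsed = {}
    for line in lines:
        label, sep, value = line.partition(": ")
        if sep and label in KNOWN_LABELS:
            # Keep the first occurrence so a multi-line description
            # cannot overwrite an earlier field.
            parsed.setdefault(label, value)
    # The availability status is a bare line starting with a check mark.
    parsed["tabular_api"] = any(line.startswith("✅") for line in lines)
    return parsed


sample = "\n".join([
    "Resource Information: Example file",
    "",
    "Resource ID: 1234",
    "Format: csv",
    "Size: 1.5 KB",
    "",
    "URL: https://example.org/data.csv",
    "",
    "✅ Available via Tabular API (can be queried)",
])
info = parse_resource_info(sample)
print(info["Format"], info["tabular_api"])
```

Splitting on the first `": "` leaves URLs intact, since `https://` contains a colon but no colon followed by a space.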
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| resource_id | Yes | The ID of the resource to get information about (obtained from list_dataset_resources) | |
Implementation Reference
- tools/get_resource_info.py:9-116 (handler) The main handler function that executes the tool logic: fetches resource details from the data.gouv.fr API, formats metadata (title, size, format, etc.), retrieves dataset info, and checks availability via the Tabular API.

  ```python
  async def get_resource_info(resource_id: str) -> str:
      """
      Get detailed information about a specific resource (file).

      Returns comprehensive metadata including format, size, MIME type, URL,
      and associated dataset information. Also checks if the resource is
      available via the Tabular API (data.gouv.fr's API for parsing tabular
      files without downloading them).

      Use this tool to determine which data querying tool to use:
      - If available via Tabular API: use query_resource_data (faster, no download needed)
      - If not available or too large: use download_and_parse_resource

      Typical workflow:
      1. Use list_dataset_resources to find resources in a dataset
      2. Use get_resource_info to check resource details and Tabular API availability
      3. Use query_resource_data or download_and_parse_resource based on availability

      Args:
          resource_id: The ID of the resource to get information about
              (obtained from list_dataset_resources)

      Returns:
          Formatted text with detailed resource information, including
          Tabular API availability status
      """
      try:
          # Get full resource data from API v2
          resource_data = await datagouv_api_client.get_resource_details(resource_id)
          resource = resource_data.get("resource", {})
          if not resource.get("id"):
              return f"Error: Resource with ID '{resource_id}' not found."

          resource_title = resource.get("title") or resource.get("name") or "Unknown"

          content_parts = [
              f"Resource Information: {resource_title}",
              "",
              f"Resource ID: {resource_id}",
          ]

          if resource.get("format"):
              content_parts.append(f"Format: {resource.get('format')}")

          if resource.get("filesize"):
              size = resource.get("filesize")
              if isinstance(size, int):
                  # Format size in human-readable format
                  if size < 1024:
                      size_str = f"{size} B"
                  elif size < 1024 * 1024:
                      size_str = f"{size / 1024:.1f} KB"
                  elif size < 1024 * 1024 * 1024:
                      size_str = f"{size / (1024 * 1024):.1f} MB"
                  else:
                      size_str = f"{size / (1024 * 1024 * 1024):.1f} GB"
                  content_parts.append(f"Size: {size_str}")

          if resource.get("mime"):
              content_parts.append(f"MIME type: {resource.get('mime')}")

          if resource.get("type"):
              content_parts.append(f"Type: {resource.get('type')}")

          if resource.get("url"):
              content_parts.append("")
              content_parts.append(f"URL: {resource.get('url')}")

          if resource.get("description"):
              content_parts.append("")
              content_parts.append(f"Description: {resource.get('description')}")

          # Dataset information
          dataset_id = resource_data.get("dataset_id")
          if dataset_id:
              content_parts.append("")
              content_parts.append(f"Dataset ID: {dataset_id}")
              try:
                  dataset_meta = await datagouv_api_client.get_dataset_metadata(
                      str(dataset_id)
                  )
                  if dataset_meta.get("title"):
                      content_parts.append(f"Dataset: {dataset_meta.get('title')}")
              except Exception:  # noqa: BLE001
                  pass

          # Check if resource is available via Tabular API
          content_parts.append("")
          try:
              # Try to get profile to check if it's tabular
              profile_url = f"{env_config.get_base_url('tabular_api')}resources/{resource_id}/profile/"
              async with httpx.AsyncClient() as session:
                  resp = await session.get(profile_url, timeout=10.0)
                  if resp.status_code == 200:
                      content_parts.append(
                          "✅ Available via Tabular API (can be queried)"
                      )
                  else:
                      content_parts.append(
                          "⚠️ Not available via Tabular API (may not be tabular data)"
                      )
          except Exception:  # noqa: BLE001
              content_parts.append("⚠️ Could not check Tabular API availability")

          return "\n".join(content_parts)

      except httpx.HTTPStatusError as e:
          return f"Error: HTTP {e.response.status_code} - {str(e)}"
      except Exception as e:  # noqa: BLE001
          return f"Error: {str(e)}"
  ```
- tools/get_resource_info.py:7-117 (registration) The registration function that defines the tool using the @mcp.tool() decorator within FastMCP.

  ```python
  def register_get_resource_info_tool(mcp: FastMCP) -> None:
      @mcp.tool()
      async def get_resource_info(resource_id: str) -> str:
          # Docstring and body are identical to the handler entry above
          # (tools/get_resource_info.py:9-116).
          ...
  ```
- tools/__init__.py:14-23 (registration) Top-level registration function that calls register_get_resource_info_tool(mcp) to register the tool among others.

  ```python
  def register_tools(mcp: FastMCP) -> None:
      """Register all MCP tools with the provided FastMCP instance."""
      register_search_datasets_tool(mcp)
      register_query_resource_data_tool(mcp)
      register_get_dataset_info_tool(mcp)
      register_list_dataset_resources_tool(mcp)
      register_get_resource_info_tool(mcp)
      register_download_and_parse_resource_tool(mcp)
      register_get_metrics_tool(mcp)
  ```
- tools/get_resource_info.py:9-32 (schema) Input schema: resource_id (str); Output: str with formatted resource info. Detailed docstring describes usage and workflow.

  ```python
  async def get_resource_info(resource_id: str) -> str:
      """
      Get detailed information about a specific resource (file).

      Returns comprehensive metadata including format, size, MIME type, URL,
      and associated dataset information. Also checks if the resource is
      available via the Tabular API (data.gouv.fr's API for parsing tabular
      files without downloading them).

      Use this tool to determine which data querying tool to use:
      - If available via Tabular API: use query_resource_data (faster, no download needed)
      - If not available or too large: use download_and_parse_resource

      Typical workflow:
      1. Use list_dataset_resources to find resources in a dataset
      2. Use get_resource_info to check resource details and Tabular API availability
      3. Use query_resource_data or download_and_parse_resource based on availability

      Args:
          resource_id: The ID of the resource to get information about
              (obtained from list_dataset_resources)

      Returns:
          Formatted text with detailed resource information, including
          Tabular API availability status
      """
  ```
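The handler's size formatting (the branch that runs when `filesize` is an integer) can be isolated as a small helper to make its thresholds easier to see. This is a sketch that mirrors the inline logic of the handler; `format_size` is a hypothetical name and does not exist in the referenced files.

```python
def format_size(size: int) -> str:
    """Render a byte count using the same 1024-based thresholds as the handler."""
    if size < 1024:
        # Plain bytes are shown without a decimal part.
        return f"{size} B"
    if size < 1024 * 1024:
        return f"{size / 1024:.1f} KB"
    if size < 1024 * 1024 * 1024:
        return f"{size / (1024 * 1024):.1f} MB"
    return f"{size / (1024 * 1024 * 1024):.1f} GB"


for n in (512, 1536, 5 * 1024 * 1024, 3 * 1024 ** 3):
    print(n, "->", format_size(n))
```

Note that the handler only formats integer sizes and skips the `Size:` line when `filesize` is missing or zero, so a caller should not rely on that line always being present.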