Server Configuration

Describes the environment variables the server reads. Both are optional and fall back to the defaults listed.

Name | Required | Description | Default
MCP_PORT | No | Port for the MCP HTTP server | 8000
DATAGOUV_ENV | No | Controls which data.gouv.fr environment to use: 'prod' for https://www.data.gouv.fr or 'demo' for https://demo.data.gouv.fr | prod
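
As an illustration of how these two variables and their defaults could be resolved, here is a minimal sketch. The names `ServerConfig`, `load_config`, and `BASE_URLS` are hypothetical and not taken from the server's code; only the variable names, defaults, and URLs come from the table above. Rejecting unknown `DATAGOUV_ENV` values is an assumption of this sketch.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# URLs as listed in the configuration table; the mapping itself is illustrative.
BASE_URLS = {
    "prod": "https://www.data.gouv.fr",
    "demo": "https://demo.data.gouv.fr",
}


@dataclass(frozen=True)
class ServerConfig:
    port: int
    env: str
    base_url: str


def load_config(environ: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Read MCP_PORT and DATAGOUV_ENV, applying the documented defaults."""
    environ = os.environ if environ is None else environ
    port = int(environ.get("MCP_PORT", "8000"))
    env = environ.get("DATAGOUV_ENV", "prod").strip().lower()
    if env not in BASE_URLS:
        # Assumption: unknown values are rejected rather than silently ignored.
        raise ValueError(f"DATAGOUV_ENV must be 'prod' or 'demo', got {env!r}")
    return ServerConfig(port=port, env=env, base_url=BASE_URLS[env])
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check without touching the real environment.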

Tools

Functions exposed to the LLM to take actions

Name | Description
search_datasets

Search for datasets on data.gouv.fr by keywords.

This is typically the first step in exploring data.gouv.fr. Returns a list of datasets matching the search query with their metadata, including title, description, organization, tags, and resource count.

After finding relevant datasets, use get_dataset_info to get more details, or list_dataset_resources to see what files are available in a dataset.

Args:

  • query: Search query string (searches in title, description, and tags)

  • page: Page number (default: 1)

  • page_size: Number of results per page (default: 20, max: 100)

Returns: Formatted text with dataset information, including dataset IDs for further queries
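
A client reaches this tool through an MCP `tools/call` request. The sketch below builds such a JSON-RPC payload for `search_datasets` using the argument names and limits listed above. The helper `build_search_call` is hypothetical, and rejecting out-of-range values on the client side is an assumption: the listing does not say whether the server clamps or rejects them.

```python
def build_search_call(query: str, page: int = 1, page_size: int = 20,
                      request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' payload for search_datasets."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if page < 1:
        raise ValueError("page starts at 1")
    if not 1 <= page_size <= 100:
        # Documented bounds: default 20, max 100.
        raise ValueError("page_size must be between 1 and 100")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search_datasets",
            "arguments": {"query": query, "page": page, "page_size": page_size},
        },
    }
```

The dataset IDs in the returned text are what `get_dataset_info` and `list_dataset_resources` expect as `dataset_id`.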

query_resource_data

Query data from a specific resource (file) via the Tabular API.

The Tabular API is data.gouv.fr's API for parsing and querying the content of resources (files) on the platform. It allows you to access structured data from tabular files (CSV, XLSX, etc.) without downloading the entire file. This tool fetches rows from a specific resource using this API.

Each call retrieves up to 200 rows (the maximum allowed by the API).

Note: The Tabular API has size limits: CSV files over 100 MB and XLSX files over 12.5 MB are not supported. For larger files or unsupported formats, use download_and_parse_resource. Use get_resource_info to check whether a resource is available via the Tabular API.

Recommended workflow:

  1. Use search_datasets to find the appropriate dataset

  2. Use list_dataset_resources to see available resources (files) in the dataset

  3. (Optional) Use get_resource_info to verify Tabular API availability

  4. Use query_resource_data with the chosen resource_id to fetch data

  5. If the answer is not in the first page, use query_resource_data with page=2, page=3, etc.

Args:

  • question: The question or description of what data you're looking for (for context)

  • resource_id: Resource ID (use list_dataset_resources to find resource IDs)

  • page: Page number to retrieve (default: 1). Use this to navigate through large datasets. Each page contains up to 200 rows.

Returns: Formatted text with the data found from the resource, including pagination info
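
Because each page holds at most 200 rows, the page to request for a given row follows from simple integer arithmetic. The sketch below shows that calculation; the function names are hypothetical, and it assumes rows are counted from 0 while pages are counted from 1, as the `page` argument above suggests.

```python
ROWS_PER_PAGE = 200  # maximum rows per call, as stated for the Tabular API


def page_for_row(row_index: int) -> int:
    """Return the 1-based page that contains the given 0-based row index."""
    if row_index < 0:
        raise ValueError("row_index must be non-negative")
    return row_index // ROWS_PER_PAGE + 1


def pages_needed(total_rows: int) -> int:
    """Return how many pages are required to read total_rows rows."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    # Ceiling division without floating point.
    return (total_rows + ROWS_PER_PAGE - 1) // ROWS_PER_PAGE
```

For example, a resource with 1,000 rows spans 5 pages, and row 450 sits on page 3.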

get_dataset_info

Get detailed information about a specific dataset.

Returns comprehensive metadata including title, description, organization, tags, resource count, creation/update dates, license, and other details. Use this after finding a dataset with search_datasets to get more context before exploring its resources.

Typical workflow:

  1. Use search_datasets to find datasets of interest

  2. Use get_dataset_info to get detailed information about a specific dataset

  3. Use list_dataset_resources to see what files are available in the dataset

Args: dataset_id: The ID of the dataset to get information about (obtained from search_datasets)

Returns: Formatted text with detailed dataset information

list_dataset_resources

List all resources (files) in a dataset with their metadata.

Returns information about each resource including ID, title, format, size, and type. This is a key step before querying data from resources.

Typical workflow:

  1. Use search_datasets to find datasets

  2. Use list_dataset_resources to see what files are in a dataset

  3. Use get_resource_info to check if a resource is available via Tabular API

  4. Use query_resource_data (for Tabular API) or download_and_parse_resource (for large/unsupported files)

Args: dataset_id: The ID of the dataset to list resources from (obtained from search_datasets or get_dataset_info)

Returns: Formatted text listing all resources with their metadata, including resource IDs for data queries

get_resource_info

Get detailed information about a specific resource (file).

Returns comprehensive metadata including format, size, MIME type, URL, and associated dataset information. Also checks if the resource is available via the Tabular API (data.gouv.fr's API for parsing tabular files without downloading them).

Use this tool to determine which data querying tool to use:

  • If available via Tabular API: use query_resource_data (faster, no download needed)

  • If not available or too large: use download_and_parse_resource

Typical workflow:

  1. Use list_dataset_resources to find resources in a dataset

  2. Use get_resource_info to check resource details and Tabular API availability

  3. Use query_resource_data or download_and_parse_resource based on availability

Args: resource_id: The ID of the resource to get information about (obtained from list_dataset_resources)

Returns: Formatted text with detailed resource information, including Tabular API availability status
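
The choice between the two querying tools can be sketched as a small decision function. This is a hypothetical helper, not the server's logic: the authoritative signal is the availability status reported by get_resource_info, and the fallback heuristic only knows the two size limits documented on this page (CSV up to 100 MB, XLSX up to 12.5 MB). Other formats the Tabular API may accept are conservatively routed to the download tool.

```python
from typing import Optional

# Size limits in MB as documented for the Tabular API.
TABULAR_LIMITS_MB = {"csv": 100.0, "xlsx": 12.5}


def choose_tool(fmt: str, size_mb: Optional[float],
                tabular_available: Optional[bool] = None) -> str:
    """Pick a querying tool name for a resource."""
    # Prefer the status reported by get_resource_info when it is known.
    if tabular_available is not None:
        return ("query_resource_data" if tabular_available
                else "download_and_parse_resource")
    # Fallback heuristic based only on format and size.
    limit = TABULAR_LIMITS_MB.get(fmt.strip().lower())
    if limit is not None and size_mb is not None and size_mb <= limit:
        return "query_resource_data"
    return "download_and_parse_resource"
```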

download_and_parse_resource

Download and parse a resource that is not accessible via Tabular API.

The Tabular API is data.gouv.fr's API for parsing tabular files (CSV, XLSX, etc.) without downloading them. However, it has limitations. This tool downloads and parses resources directly when the Tabular API cannot be used.

This tool is useful for:

  • Files larger than Tabular API limits (CSV > 100 MB, XLSX > 12.5 MB)

  • Formats not supported by Tabular API (JSON, XML, etc.)

  • Files with external URLs

For smaller tabular files, prefer using query_resource_data with the Tabular API as it's faster and more efficient. Use get_resource_info to check if a resource is available via Tabular API before choosing which tool to use.

Typical workflow:

  1. Use list_dataset_resources to find resources in a dataset

  2. Use get_resource_info to check Tabular API availability

  3. If not available via Tabular API, use download_and_parse_resource

  4. If available via Tabular API, use query_resource_data instead

Supported formats: CSV, CSV.GZ, JSON, JSONL, XLSX (if openpyxl is installed)

Args:

  • resource_id: The ID of the resource to download and parse

  • max_rows: Maximum number of rows to return (default: 1000)

  • max_size_mb: Maximum file size to download in MB (default: 500)

Returns: Formatted text with the parsed data
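
To make the effect of `max_rows` concrete, here is a minimal standard-library sketch that parses already-downloaded bytes in some of the listed formats and stops after `max_rows` records. It is not the server's implementation: the function name `parse_bytes` is hypothetical, it assumes UTF-8 text and comma-delimited CSV, and it leaves out XLSX (which needs openpyxl) as well as the download step and the `max_size_mb` check.

```python
import csv
import gzip
import io
import json
from itertools import islice


def parse_bytes(data: bytes, fmt: str, max_rows: int = 1000) -> list:
    """Parse CSV, CSV.GZ, JSON, or JSONL bytes into at most max_rows records."""
    fmt = fmt.strip().lower()
    if fmt == "csv.gz":
        data = gzip.decompress(data)
        fmt = "csv"
    text = data.decode("utf-8-sig")  # tolerate a leading byte-order mark
    if fmt == "csv":
        reader = csv.DictReader(io.StringIO(text, newline=""))
        return list(islice(reader, max_rows))
    if fmt == "jsonl":
        lines = (line for line in text.splitlines() if line.strip())
        return [json.loads(line) for line in islice(lines, max_rows)]
    if fmt == "json":
        payload = json.loads(text)
        rows = payload if isinstance(payload, list) else [payload]
        return rows[:max_rows]
    raise ValueError(f"unsupported format: {fmt!r}")
```

CSV values come back as strings keyed by the header row; type inference and delimiter detection are deliberately left out of this sketch.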

get_metrics

Get metrics (visits, downloads) for a dataset and/or a resource.

Returns monthly statistics including visits and downloads, sorted by month in descending order (most recent first). This tool is useful for analyzing the popularity and usage of datasets and resources, but is optional in the data exploration workflow.

Typical use cases:

  • Analyze which datasets/resources are most popular

  • Track usage trends over time

  • Understand data consumption patterns

Note: This is separate from the main data querying workflow. Use this after exploring datasets/resources if you need usage statistics.

Args:

  • dataset_id: Optional dataset ID to get metrics for (obtained from search_datasets or get_dataset_info)

  • resource_id: Optional resource ID to get metrics for (obtained from list_dataset_resources or get_resource_info)

  • limit: Maximum number of monthly records to return (default: 12, max: 100)

Returns: Formatted text with monthly metrics for the dataset and/or resource

Note: At least one of dataset_id or resource_id must be provided. This tool only works with the production environment (DATAGOUV_ENV=prod). The Metrics API does not have a demo/preprod environment.
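
The constraints in the notes above (at least one ID, a bounded `limit`, production only) and the most-recent-first ordering can be sketched as follows. Both helpers are hypothetical, and the record field name `month` with `YYYY-MM` strings is an assumption made for illustration; the actual shape of the Metrics API records is not described on this page.

```python
from typing import Optional


def build_metrics_args(dataset_id: Optional[str] = None,
                       resource_id: Optional[str] = None,
                       limit: int = 12, env: str = "prod") -> dict:
    """Validate and assemble arguments for a get_metrics call."""
    if env != "prod":
        raise ValueError("metrics are only available with DATAGOUV_ENV=prod")
    if dataset_id is None and resource_id is None:
        raise ValueError("provide dataset_id, resource_id, or both")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    args = {"limit": limit}
    if dataset_id is not None:
        args["dataset_id"] = dataset_id
    if resource_id is not None:
        args["resource_id"] = resource_id
    return args


def latest_first(records: list, limit: int = 12) -> list:
    """Sort monthly records newest first and keep at most `limit` of them."""
    # Assumed field: 'month' as 'YYYY-MM', which sorts correctly as text.
    return sorted(records, key=lambda r: r["month"], reverse=True)[:limit]
```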

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/bolinocroustibat/datagouv-mcp'
