# data.gouv.fr MCP Server
[CircleCI](https://circleci.com/gh/datagouv/datagouv-mcp)
[MIT License](https://opensource.org/licenses/MIT)
Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from [data.gouv.fr](https://www.data.gouv.fr), the French national Open Data platform, directly through conversation.
## 🤔 What is this?
The data.gouv.fr MCP server is a tool that allows AI chatbots (like Claude, Gemini, or Cursor) to interact with datasets from [data.gouv.fr](https://www.data.gouv.fr). Instead of manually browsing the website, you can simply ask questions like "Quels jeux de données sont disponibles sur les prix de l'immobilier ?" ("Which datasets are available on real-estate prices?") or "Montre-moi les dernières données de population pour Paris" ("Show me the latest population data for Paris") and get instant answers. A hosted endpoint is available at `https://mcp.data.gouv.fr/mcp`, and you can also run the server locally if you prefer.
The server is built using the [official Python SDK for MCP servers and clients](https://github.com/modelcontextprotocol/python-sdk) and uses the Streamable HTTP transport protocol.
## 🌐 Connect your chatbot to the MCP server
Use the hosted endpoint `https://mcp.data.gouv.fr/mcp` (recommended). If you self-host, swap in your own URL.
How you configure the MCP server depends on your client; use the matching format below:
### ChatGPT
ChatGPT can connect via “Connectors” (beta, paid plans only: Plus/Pro/Team/Enterprise depending on workspace enablement). Create a custom connector and set the URL to `https://mcp.data.gouv.fr/mcp` (no API key needed, tools are read-only). If the connector option is missing in your account, it has not yet been enabled by OpenAI.
### Claude Desktop
Add the following to your Claude Desktop configuration file (typically `~/Library/Application Support/Claude/claude_desktop_config.json` on macOS, or `%APPDATA%\Claude\claude_desktop_config.json` on Windows):
```json
{
  "mcpServers": {
    "datagouv": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://mcp.data.gouv.fr/mcp"
      ]
    }
  }
}
```
### Claude Code
Use the `claude mcp` command to add the MCP server:
```shell
claude mcp add --transport http datagouv https://mcp.data.gouv.fr/mcp
```
### Gemini CLI
Add the following to your `~/.gemini/settings.json` file:
```json
{
  "mcpServers": {
    "datagouv": {
      "transport": "http",
      "httpUrl": "https://mcp.data.gouv.fr/mcp"
    }
  }
}
```
### Mistral Vibe CLI
Edit your Vibe config (default `~/.vibe/config.toml`) and add the MCP server:
```toml
[[mcp_servers]]
name = "datagouv"
transport = "streamable-http"
url = "https://mcp.data.gouv.fr/mcp"
```
See the full Vibe MCP options in the official docs: [MCP server configuration](https://github.com/mistralai/mistral-vibe?tab=readme-ov-file#mcp-server-configuration).
### AnythingLLM
1. Locate the `anythingllm_mcp_servers.json` file in your AnythingLLM storage plugins directory:
- **Mac**: `~/Library/Application Support/anythingllm-desktop/storage/plugins/anythingllm_mcp_servers.json`
- **Linux**: `~/.config/anythingllm-desktop/storage/plugins/anythingllm_mcp_servers.json`
- **Windows**: `C:\Users\<username>\AppData\Roaming\anythingllm-desktop\storage\plugins\anythingllm_mcp_servers.json`
2. Add the following configuration:
```json
{
  "mcpServers": {
    "datagouv": {
      "type": "streamable",
      "url": "https://mcp.data.gouv.fr/mcp"
    }
  }
}
```
For more details, see the [AnythingLLM MCP documentation](https://docs.anythingllm.com/mcp-compatibility/overview).
### VS Code
Add the following to your VS Code `settings.json`:
```json
{
  "servers": {
    "datagouv": {
      "url": "https://mcp.data.gouv.fr/mcp",
      "type": "http"
    }
  }
}
```
### Cursor
Cursor supports MCP servers through its settings. To configure the server:
1. Open Cursor Settings
2. Search for "MCP" or "Model Context Protocol"
3. Add a new MCP server with the following configuration:
```json
{
  "mcpServers": {
    "datagouv": {
      "url": "https://mcp.data.gouv.fr/mcp",
      "transport": "http"
    }
  }
}
```
### Windsurf
Add the following to your `~/.codeium/mcp_config.json`:
```json
{
  "mcpServers": {
    "datagouv": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.data.gouv.fr/mcp"
      ]
    }
  }
}
```
**Note:**
- The hosted endpoint is `https://mcp.data.gouv.fr/mcp`. If you run the server yourself, replace it with your own URL (see “Run locally” below for the default local endpoint).
- This MCP server only exposes read-only tools for now, so no API key is required.
## 🖥️ Run locally
### 1. Run the MCP server
Before starting, clone this repository and move into it:
```shell
git clone git@github.com:datagouv/datagouv-mcp.git
cd datagouv-mcp
```
Docker is required for the recommended setup. Install it via [Docker Desktop](https://www.docker.com/products/docker-desktop/) or any compatible Docker Engine before continuing.
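You can quickly verify that Docker and the Compose plugin are available:
```shell
docker --version
docker compose version
```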
#### 🐳 With Docker (Recommended)
```shell
# With default settings (port 8000, prod environment)
docker compose up -d
# With custom environment variables
MCP_PORT=8007 DATAGOUV_ENV=demo docker compose up -d
# Stop
docker compose down
```
**Environment variables:**
- `MCP_PORT`: port for the MCP HTTP server (defaults to `8000` when unset).
- `DATAGOUV_ENV`: `prod` (default) or `demo`. This controls which data.gouv.fr environment the server reads data from (https://www.data.gouv.fr or https://demo.data.gouv.fr). By default the MCP server talks to the production data.gouv.fr; set `DATAGOUV_ENV=demo` only if you specifically need the demo environment.
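Once the container is up, you can check that the server responds on its health endpoint (a quick sketch assuming the default port; adjust if you set `MCP_PORT`):
```shell
curl http://127.0.0.1:8000/health
# Expected output: {"status":"ok","timestamp":"..."}
```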
#### ⚙️ Manual Installation
You will need [uv](https://github.com/astral-sh/uv) to install dependencies and run the server.
1. **Install dependencies**
```shell
uv sync
```
2. **Prepare the environment file**
Copy the [example environment file](.env.example) to create your own `.env` file:
```shell
cp .env.example .env
```
Then optionally edit `.env` and set the variables that matter for your run:
```
MCP_PORT=8007 # (defaults to 8000 when unset)
DATAGOUV_ENV=prod # Allowed values: demo | prod (defaults to prod when unset)
```
Load the variables with your preferred method, e.g.:
```shell
set -a && source .env && set +a
```
3. **Start the HTTP MCP server**
```shell
uv run main.py
```
### 2. Connect your chatbot to the local MCP server
Follow the steps in [Connect your chatbot to the MCP server](#-connect-your-chatbot-to-the-mcp-server) and simply swap the hosted URL for your local endpoint (default: `http://127.0.0.1:${MCP_PORT:-8000}/mcp`).
## 🚚 Transport support
This MCP server uses FastMCP and implements the **Streamable HTTP transport only**.
**STDIO and SSE are not supported**.
## 📋 Available Endpoints
**Streamable HTTP transport (standards-compliant):**
- `POST /mcp` - JSON-RPC messages (client → server)
- `GET /health` - Simple JSON health probe (`{"status":"ok","timestamp":"..."}`)
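For a low-level smoke test against a local server, you can send the standard MCP `initialize` request directly with curl. This is a sketch of the handshake that MCP clients normally perform for you: the Streamable HTTP transport expects both `application/json` and `text/event-stream` in the `Accept` header, and the `protocolVersion` below is one published MCP revision that your client may negotiate differently.
```shell
# Sketch of the MCP initialize handshake (clients normally do this for you)
curl -sS -X POST "http://127.0.0.1:8000/mcp" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
      "protocolVersion": "2025-03-26",
      "capabilities": {},
      "clientInfo": {"name": "curl-smoke-test", "version": "0.0.1"}
    }
  }'
```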
## 🛠️ Available Tools
The MCP server provides the following tools to interact with data.gouv.fr datasets (an example invocation follows the list):
- **`search_datasets`** - Search for datasets by keywords. Returns datasets with metadata (title, description, organization, tags, resource count).
Parameters: `query` (required), `page` (optional, default: 1), `page_size` (optional, default: 20, max: 100)
- **`get_dataset_info`** - Get detailed information about a specific dataset (metadata, organization, tags, dates, license, etc.).
Parameters: `dataset_id` (required)
- **`list_dataset_resources`** - List all resources (files) in a dataset with their metadata (format, size, type, URL).
Parameters: `dataset_id` (required)
- **`get_resource_info`** - Get detailed information about a specific resource (format, size, MIME type, URL, dataset association, Tabular API availability).
Parameters: `resource_id` (required)
- **`query_resource_data`** - Query data from a specific resource via the Tabular API. Fetches rows from a resource to answer questions.
Parameters: `question` (required), `resource_id` (required), `page` (optional, default: 1)
Note: Each call retrieves up to 200 rows (the maximum allowed by the Tabular API). Recommended workflow: 1) use `search_datasets` to find the dataset, 2) use `list_dataset_resources` to see available resources, 3) use `query_resource_data` with the chosen resource ID. If the answer is not on the first page, use `page=2`, `page=3`, etc. to navigate through large datasets. Works for CSV/XLS resources within Tabular API size limits (CSV ≤ 100 MB, XLSX ≤ 12.5 MB).
- **`download_and_parse_resource`** - Download and parse a resource that is not accessible via Tabular API (files too large, formats not supported, external URLs).
Parameters: `resource_id` (required), `max_rows` (optional, default: 1000), `max_size_mb` (optional, default: 500)
Supported formats: CSV, CSV.GZ, JSON, JSONL. Useful for files exceeding Tabular API limits or formats not supported by Tabular API.
- **`get_metrics`** - Get metrics (visits, downloads) for a dataset and/or a resource.
Parameters: `dataset_id` (optional), `resource_id` (optional), `limit` (optional, default: 12, max: 100)
Returns monthly statistics including visits and downloads, sorted by month in descending order (most recent first). At least one of `dataset_id` or `resource_id` must be provided. **Note:** This tool only works with the production environment (`DATAGOUV_ENV=prod`). The Metrics API does not have a demo/preprod environment.
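As an illustration, here is one way to call `search_datasets` on a locally running server using the MCP Inspector's CLI mode (see [Interactive Testing with MCP Inspector](#-interactive-testing-with-mcp-inspector) below). Flag names follow the Inspector's documented CLI usage and may differ across versions; treat this as a sketch:
```shell
# Sketch: invoke the search_datasets tool via the Inspector CLI
# (flags per the Inspector CLI docs; adjust to your Inspector version)
npx @modelcontextprotocol/inspector --cli "http://127.0.0.1:8000/mcp" \
  --transport http \
  --method tools/call \
  --tool-name search_datasets \
  --tool-arg "query=prix immobilier" \
  --tool-arg page_size=5
```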
## 🧪 Tests
### ✅ Automated Tests with pytest
Run the tests with pytest (these cover helper modules; the MCP server wiring is best exercised via the MCP Inspector):
```shell
# Run all tests
uv run pytest
# Run with verbose output
uv run pytest -v
# Run specific test file
uv run pytest tests/test_tabular_api.py
# Run with custom resource ID
RESOURCE_ID=3b6b2281-b9d9-4959-ae9d-c2c166dff118 uv run pytest tests/test_tabular_api.py
# Run with prod environment
DATAGOUV_ENV=prod uv run pytest
```
### 🔍 Interactive Testing with MCP Inspector
Use the official [MCP Inspector](https://modelcontextprotocol.io/docs/tools/inspector) to interactively test the server tools and resources.
Prerequisites:
- Node.js with `npx` available
Steps:
1. Start the MCP server (see above)
2. In another terminal, launch the inspector:
```shell
npx @modelcontextprotocol/inspector --http-url "http://127.0.0.1:${MCP_PORT:-8000}/mcp"
```
Adjust the URL if you exposed the server on another host/port.
## 🤝 Contributing
### 🧹 Code Linting and Formatting
This project follows PEP 8 style guidelines using [Ruff](https://astral.sh/ruff/) for linting and formatting.
**Before submitting contributions, either run these commands manually or [install the pre-commit hook](#-pre-commit-hooks).**
```shell
# Lint and sort imports, and format code
uv run ruff check --select I --fix && uv run ruff format
```
### 🔗 Pre-commit Hooks
This repository uses [pre-commit](https://pre-commit.com/) hooks which lint and format code before each commit. Installing them is strongly recommended so the checks run automatically.
**Install pre-commit hooks:**
```shell
uv run pre-commit install
```
The pre-commit hooks automatically:
- Check YAML syntax
- Fix end-of-file issues
- Remove trailing whitespace
- Check for large files
- Run Ruff linting and formatting
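You can also run all hooks against the entire repository on demand (standard pre-commit usage):
```shell
uv run pre-commit run --all-files
```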
### 🏷️ Releases and versioning
The release process uses the [`tag_version.sh`](tag_version.sh) script to create git tags, GitHub releases and update [CHANGELOG.md](CHANGELOG.md) automatically. Package version numbers are automatically derived from git tags using [setuptools_scm](https://github.com/pypa/setuptools_scm), so no manual version updates are needed in `pyproject.toml`.
**Prerequisites**: [GitHub CLI](https://cli.github.com/) must be installed and authenticated, and you must be on the main branch with a clean working directory.
```shell
# Create a new release
./tag_version.sh <version>
# Example
./tag_version.sh 2.5.0
# Dry run to see what would happen
./tag_version.sh 2.5.0 --dry-run
```
The script automatically:
- Extracts commits since the last tag and formats them for CHANGELOG.md
- Identifies breaking changes (commits with `!:` in the subject)
- Creates a git tag and pushes it to the remote repository
- Creates a GitHub release with the changelog content
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.