ontario-data-mcp

by sprine

IMPORTANT


Beta: This project is under active development. The data structure and tool interfaces may change. LLM-generated analysis may contain errors. Always verify critical findings against the returned source data.

This is an MCP server for discovering, downloading, querying, and analyzing datasets from Ontario's Open Data portals. You can ask questions of the data in plain English (or Spanish, Chinese, French, and other languages).

It currently supports the Ontario, Toronto, Ottawa, Waterloo, Kitchener, and Region of Waterloo portals, and uses a shared DuckDB cache for fast SQL queries, statistical analysis, and geospatial operations.

Contributing

Contributions welcome! To get started, see Installation below.

Found a bug? Have an idea? Discovered something interesting? Open an issue here: https://github.com/sprine/ontario-data-mcp/issues

Features

  • find - search across supported Ontario open data portals

  • download - retrieve and cache datasets

  • query - run SQL, statistical, and geospatial analysis via DuckDB

  • validate - verify that data claims are supported by query results

  • A shared DuckDB cache for high-performance analytics

Architecture

flowchart TD
    Client["AI Client<br/>(Claude Code · VS Code · etc.)"]

    subgraph Server["ontario-data-mcp (FastMCP)"]
        direction TB

        subgraph Tools["MCP Tools"]
            direction LR
            T1["Discovery"]
            T2["Metadata"]
            T3["Retrieval"]
            T4["Querying"]
            T5["Geospatial"]
            T6["Quality & Validation"]
        end

        PC["Portal Clients<br/>CKANClient · ArcGISHubClient"]
        Cache[("DuckDB Cache<br/>~/.cache/ontario-data/")]

        Tools -->|"fan out to all portals"| PC
        T3 & T5 -->|"download → store"| Cache
        T4 & T6 -->|"SQL queries"| Cache
    end

    subgraph Portals["Open Data Portals"]
        direction LR
        CKAN["Ontario · Toronto<br/>CKAN API"]
        ArcGIS["Ottawa · Waterloo · Kitchener<br/>Region of Waterloo<br/>ArcGIS Hub"]
    end

    Client <-->|"MCP Protocol"| Tools
    PC -->|"CKAN 2.8"| CKAN
    PC -->|"OGC Records / Hub v3"| ArcGIS

Data flow: Discovery and metadata tools fan out to all portals in parallel. Retrieval tools download data and store it in a local DuckDB cache. Querying and quality tools run fast SQL locally against the cache — no repeated API calls.
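
The parallel fan-out described above can be sketched with asyncio. This is a minimal illustration, not the server's actual code: `search_portal`, `fan_out_search`, and the canned catalogue are hypothetical stand-ins for the real portal clients and their HTTP calls.

```python
import asyncio

# Hypothetical stand-in for a portal client; returns canned results
# instead of calling a real CKAN or ArcGIS Hub API.
async def search_portal(portal: str, query: str) -> list[str]:
    await asyncio.sleep(0)  # placeholder for an HTTP request
    catalogue = {
        "ontario": ["air-quality", "school-enrolment"],
        "toronto": ["air-quality-monitors"],
        "ottawa": ["cycling-network"],
    }
    # Prefix each hit with its portal, mirroring IDs like toronto:abc123.
    return [f"{portal}:{name}" for name in catalogue.get(portal, []) if query in name]

async def fan_out_search(query: str, portals: list[str]) -> list[str]:
    # Start one search per portal and wait for all of them concurrently.
    per_portal = await asyncio.gather(*(search_portal(p, query) for p in portals))
    # Flatten the per-portal lists into one result list (portal order is preserved).
    return [hit for hits in per_portal for hit in hits]

results = asyncio.run(fan_out_search("air", ["ontario", "toronto", "ottawa"]))
print(results)
```

Because `asyncio.gather` returns results in the order the coroutines were passed, the merged list stays grouped by portal even though the requests run concurrently.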

Installation

With Claude Code

claude mcp add ontario-data -- uvx ontario-data-mcp

To auto-approve all tool calls (no confirmation prompts), add this to your Claude Code settings:

{
  "permissions": {
    "allow": ["mcp:ontario-data:*"]
  }
}

Tools are annotated as read-only or destructive per the MCP spec. Download tools populate the local cache but are read-only (no remote mutations). Destructive tools (cache_manage, refresh_cache) only modify local cached data.

With VS Code

Add to .vscode/mcp.json:

{
  "mcpServers": {
    "ontario-data": {
      "command": "uvx",
      "args": ["ontario-data-mcp"]
    }
  }
}

From source

git clone https://github.com/sprine/ontario-data-mcp
cd ontario-data-mcp
uv sync
uv run ontario-data-mcp

To connect from source to Claude Code:

Note: MCP subprocesses don't inherit your shell's PATH, so you must use the absolute path to uv (find it with which uv).

claude mcp add ontario-data -- /absolute/path/to/uv run --directory /path/to/ontario-data-mcp ontario-data-mcp
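
If you would rather resolve the absolute path programmatically, the command above can be assembled as in the sketch below. `build_add_command` is a hypothetical helper written for this example; it only builds the argument list and does not run anything.

```python
import shlex
import shutil
from typing import Optional

def build_add_command(repo_dir: str, uv_path: Optional[str] = None) -> list[str]:
    # Resolve uv to an absolute path, since the MCP subprocess
    # will not see your shell's PATH.
    uv = uv_path or shutil.which("uv")
    if uv is None:
        raise FileNotFoundError("uv was not found on PATH")
    return [
        "claude", "mcp", "add", "ontario-data", "--",
        uv, "run", "--directory", repo_dir, "ontario-data-mcp",
    ]

# Example with an explicit uv location (illustrative paths only).
command = build_add_command("/path/to/ontario-data-mcp", uv_path="/usr/local/bin/uv")
print(shlex.join(command))
```

Omitting `uv_path` makes the helper fall back to `shutil.which("uv")`, the Python equivalent of `which uv`.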

Supported Portals

All searches fan out to every portal by default — no need to select a portal. Dataset and resource IDs are prefixed with their portal (e.g. toronto:abc123).

| Portal | Platform | Datasets |
| --- | --- | --- |
| ontario | CKAN | ~5,700 |
| toronto | CKAN | ~533 |
| ottawa | ArcGIS Hub | ~665 |
| waterloo | ArcGIS Hub | ~129 |
| kitchener | ArcGIS Hub | ~219 |
| region-waterloo | ArcGIS Hub | ~125 |
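
As an illustration of the prefixed ID convention, a client could split an ID into its portal and local parts as sketched below. `split_prefixed_id` is a hypothetical helper and not part of the server's API; the portal names are taken from the list of supported portals.

```python
# Portal prefixes from the supported portals listed above.
KNOWN_PORTALS = {
    "ontario", "toronto", "ottawa",
    "waterloo", "kitchener", "region-waterloo",
}

def split_prefixed_id(prefixed_id: str) -> tuple[str, str]:
    # Split on the first colon only, so a local ID may itself contain colons.
    portal, sep, local_id = prefixed_id.partition(":")
    if not sep or not local_id:
        raise ValueError(f"expected '<portal>:<id>', got {prefixed_id!r}")
    if portal not in KNOWN_PORTALS:
        raise ValueError(f"unknown portal {portal!r}")
    return portal, local_id

print(split_prefixed_id("toronto:abc123"))
```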

List of tools available to the AI agent

Discovery

| Tool | Description |
| --- | --- |
| search_datasets | Search for datasets across all portals (or narrow with portal=) |
| list_portals | List all available portals with platform type |
| list_organizations | List government ministries with dataset counts |
| list_topics | List all tags/topics in the catalogue |
| find_related_datasets | Find datasets related by tags and organization |

Metadata

| Tool | Description |
| --- | --- |
| get_dataset_info | Get full metadata for a dataset (use prefixed ID like toronto:abc123) |
| list_resources | List all files in a dataset with formats and sizes |
| get_resource_schema | Get column schema and sample values for a datastore resource |
| compare_datasets | Compare metadata side-by-side for multiple datasets (cross-portal) |

Retrieval

| Tool | Description |
| --- | --- |
| download_resource | Download a resource and cache it in DuckDB (use prefixed ID like toronto:abc123) |
| cache_info | Cache statistics plus a list of all cached datasets with staleness |
| cache_manage | Remove a single cached resource or clear the entire cache |
| refresh_cache | Re-download cached resources with latest data |

Querying

| Tool | Description |
| --- | --- |
| query_resource | Query a resource via CKAN Datastore API (remote) |
| sql_query | Run SQL against the CKAN Datastore (remote) |
| query_cached | Run SQL against locally cached data in DuckDB |
| preview_data | Quick preview of first N rows of a resource |
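
For orientation, an MCP client invokes any of these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request for query_cached. The argument name `sql` and the table name `my_cached_table` are assumptions for illustration; check the tool's published input schema for the real parameter names.

```python
import json

def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    # MCP tool invocations are JSON-RPC 2.0 requests with method "tools/call".
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Assumed argument name ("sql") and a hypothetical cached table name.
request = tool_call_request(
    1,
    "query_cached",
    {"sql": "SELECT COUNT(*) FROM my_cached_table"},
)
print(request)
```

In practice an AI client such as Claude Code constructs these requests for you; the payload is shown only to make the tool boundary concrete.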

Quality & Validation

| Tool | Description |
| --- | --- |
| check_freshness | Check if a dataset is current vs. its update schedule |
| profile_data | Statistical profile using DuckDB SUMMARIZE |
| validate_result | Validate that a data claim is supported by query results |
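
To make the idea behind check_freshness concrete, here is one way a "current vs. update schedule" test could work. This is a sketch under stated assumptions, not the tool's actual logic: the frequency labels, the `MAX_AGE` thresholds, and `is_current` are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from a declared update frequency to the maximum
# age a dataset may reach before it is considered stale.
MAX_AGE = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=31),
    "yearly": timedelta(days=366),
}

def is_current(last_modified: datetime, update_frequency: str, now: datetime) -> bool:
    # A dataset is current if it was modified within one update period.
    return now - last_modified <= MAX_AGE[update_frequency]

now = datetime(2025, 3, 1, tzinfo=timezone.utc)
print(is_current(datetime(2025, 2, 10, tzinfo=timezone.utc), "monthly", now))
print(is_current(datetime(2025, 2, 10, tzinfo=timezone.utc), "weekly", now))
```

The same last-modified date can be fresh under a monthly schedule and stale under a weekly one, which is why freshness is judged against the declared schedule rather than a fixed cutoff.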

Geospatial

| Tool | Description |
| --- | --- |
| load_geodata | Cache a geospatial resource (SHP, KML, GeoJSON) into DuckDB |
| spatial_query | Run spatial queries against cached geospatial data |
| list_geo_datasets | Find datasets containing geospatial resources |

MCP Resources

Resources the agent can read for context without calling a tool:

| URI | Description |
| --- | --- |
| ontario://cache/index | List of all locally cached datasets with freshness info |
| ontario://dataset/{dataset_id} | Full metadata for a specific dataset (supports prefixed IDs) |
| ontario://portal/stats | Overview statistics across all data portals |
| ontario://schema/{table_name} | Column schema, types, sample values, and type warnings for a cached table |
| ontario://guides/duckdb-sql | DuckDB SQL reference for Ontario open data analysis |
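
The templated URIs above take a dataset ID or table name in place of the braces. A minimal sketch of filling in a template and wrapping it in an MCP `resources/read` request follows; the helper names are hypothetical and the ID reuses the toronto:abc123 example from earlier.

```python
import json

def dataset_uri(dataset_id: str) -> str:
    # Fill the {dataset_id} placeholder of ontario://dataset/{dataset_id}.
    return f"ontario://dataset/{dataset_id}"

def schema_uri(table_name: str) -> str:
    # Fill the {table_name} placeholder of ontario://schema/{table_name}.
    return f"ontario://schema/{table_name}"

def read_resource_request(request_id: int, uri: str) -> str:
    # MCP resources are fetched with a JSON-RPC "resources/read" request.
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    })

uri = dataset_uri("toronto:abc123")
print(uri)
print(read_resource_request(2, uri))
```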

Prompts

Context-aware guided workflow prompts:

  • explore_topic — Guided exploration of a topic (fetches live catalogue context)

  • data_investigation — Deep dive into a specific dataset: schema, quality, statistics

  • compare_data — Side-by-side analysis of multiple datasets

Environment Variables

| Variable | Default | Purpose |
| --- | --- | --- |
| ONTARIO_DATA_CACHE_DIR | ~/.cache/ontario-data | DuckDB storage + log file location |
| ONTARIO_DATA_TIMEOUT | 30 | HTTP timeout in seconds |
| ONTARIO_DATA_RATE_LIMIT | 10 | Max CKAN requests per second |
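
As a rough picture of how these variables and defaults might be consumed, the sketch below reads them into a small settings object. The `Settings` class and `load_settings` function are hypothetical and are not the server's actual configuration code; only the variable names and defaults come from the list above.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

@dataclass(frozen=True)
class Settings:
    cache_dir: Path
    timeout_seconds: float
    rate_limit_per_second: float

def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    # Read from the real environment unless a mapping is supplied (handy for tests).
    env = os.environ if env is None else env
    return Settings(
        # Expand "~" so the default resolves inside the user's home directory.
        cache_dir=Path(env.get("ONTARIO_DATA_CACHE_DIR", "~/.cache/ontario-data")).expanduser(),
        timeout_seconds=float(env.get("ONTARIO_DATA_TIMEOUT", "30")),
        rate_limit_per_second=float(env.get("ONTARIO_DATA_RATE_LIMIT", "10")),
    )

# Override one value and fall back to the documented defaults for the rest.
settings = load_settings({"ONTARIO_DATA_TIMEOUT": "5"})
print(settings)
```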

Development

uv sync
uv run python -m pytest tests/ -v

License

MIT — see LICENSE for the software.

Data accessed through this tool is provided under the following open government licences:
