Schema | ontario-data-mcp

ontario-data-mcp

Overview Schema Related Servers Score Discussions

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description	Default
`ONTARIO_DATA_TIMEOUT`	No	HTTP timeout in seconds	30
`ONTARIO_DATA_CACHE_DIR`	No	DuckDB storage + log file location	~/.cache/ontario-data
`ONTARIO_DATA_RATE_LIMIT`	No	Max CKAN requests per second	10

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`extensions`	{ "io.modelcontextprotocol/ui": {} }
`experimental`	{}

Tools

Functions exposed to the LLM to take actions

Name	Description
search_datasetsA	Search for datasets across all open data portals (Ontario, Toronto, Ottawa). Results include portal-prefixed IDs (e.g. toronto:abc123). Use these IDs with get_dataset_info to view full metadata, then download_resource to cache locally. Use get_resource_schema to check column types before writing queries. Searches fan out to all portals by default. Narrow to one portal with the portal param. Args: query: Search terms (e.g. "covid cases", "housing prices", "school enrollment") organization: Filter by ministry/org (e.g. "health", "education") resource_format: Filter by file format (e.g. "CSV", "JSON", "SHP") update_frequency: Filter by frequency (e.g. "yearly", "monthly", "daily") sort_by: Sort order (default: relevance) limit: Max results per portal (1-50) portal: Narrow to one portal (e.g. "ontario", "toronto"). Default: all portals.
list_organizationsA	List government ministries and organizations with dataset counts across all portals. Args: include_counts: Include dataset counts per organization portal: Narrow to one portal. Default: all portals.
list_topicsA	List top tags/topics used across data portals, ranked by dataset count. Args: query: Optional filter to match tag names portal: Narrow to one portal. Default: all portals. limit: Max tags to return (default 100). Use a higher value to see more.
find_related_datasetsA	Find datasets related to a given dataset by shared tags and organization. Searches within the same portal as the source dataset only. Args: dataset_id: Prefixed dataset ID (e.g. "toronto:ttc-ridership") or bare ID limit: Max related datasets to return
list_portalsA	List all available data portals with their platform type and descriptions.
get_dataset_infoA	Get full metadata for a dataset including all resources. Args: dataset_id: Prefixed dataset ID (e.g. "toronto:ttc-ridership") or bare ID
list_resourcesA	List all resources (files) in a dataset with their formats and sizes. Args: dataset_id: Prefixed dataset ID (e.g. "toronto:ttc-ridership") or bare ID
get_resource_schemaA	Get the column schema and sample values for a datastore resource. Args: resource_id: Prefixed resource ID (e.g. "toronto:abc123") or bare ID sample_size: Number of sample rows to include
compare_datasetsA	Compare metadata side-by-side for multiple datasets (can be cross-portal). Args: dataset_ids: List of prefixed dataset IDs (e.g. ["toronto:abc", "ontario:def"]) to compare (2-5)
download_resourceA	Download a dataset resource and cache it locally in DuckDB for fast querying. Supports CSV, XLSX, JSON, and datastore-active resources. If already cached, returns staleness info so you can decide whether to refresh. Numeric columns stored as text are automatically cast to DOUBLE. Workflow: search_datasets → get_dataset_info → download_resource → query_cached. Args: resource_id: Prefixed resource ID (e.g. "toronto:abc123") or bare ID
cache_infoA	Get cache statistics and list all cached datasets. Returns size, table count, and details for every cached resource.
cache_manageA	Manage the local DuckDB cache: remove or clear cached data. Args: action: One of "remove" (single resource) or "clear" (all) resource_id: Required for "remove" action. Prefixed or bare ID accepted.
refresh_cacheA	Re-download cached resources to get the latest data. Args: resource_id: Specific resource to refresh (prefixed or bare ID), or omit to refresh all
query_resourceA	Query a resource via the CKAN Datastore API (remote, no download needed). Only works for resources with datastore_active=True. Args: resource_id: Prefixed resource ID (e.g. "toronto:abc123") or bare ID filters: Column filters as {column: value} pairs fields: List of columns to return (default: all) sort: Sort string (e.g. "date desc", "name asc") limit: Max rows (1-1000) offset: Row offset for pagination
sql_queryA	Run a SQL query against the CKAN Datastore (remote). NOTE: Prefer download_resource + query_cached for repeated queries — the remote API has rate limits (429 errors). Use this tool only for quick one-off queries on datastore-active resources. Use resource IDs as table names in double quotes. Example: SELECT "Column Name" FROM "resource-id-here" WHERE "Year" > 2020 LIMIT 10 Args: sql: SQL query string (read-only, SELECT only) portal: Portal to query (default: "ontario"). Required because SQL embeds resource IDs directly.
query_cachedA	Run a SQL query against locally cached data in DuckDB. Use table names from download_resource or cache_info. Supports full DuckDB SQL: aggregations, window functions, CTEs, JOINs across tables. Use SUM(quantity_col) not COUNT() when rows contain per-row counts (e.g. a "count" or "number_of" column). COUNT() counts rows, not quantities. Column names vary across resources in the same dataset — always DESCRIBE first. Use SELECT * RENAME ("old col" AS new_col) to normalize column names when joining across datasets. Values containing semicolons should be matched with LIKE patterns, not = equality. Quote table names with double quotes in SQL. Args: sql: SQL query (e.g. SELECT * FROM "ds_my_table_abc12345" LIMIT 10)
preview_dataA	Quick preview of the first N rows of a resource (fetched remotely). Args: resource_id: Prefixed resource ID (e.g. "toronto:abc123") or bare ID rows: Number of rows to preview (1-100)
check_freshnessA	Check if a dataset is current by comparing its update frequency to its last modification date. Args: dataset_id: Prefixed dataset ID (e.g. "toronto:ttc-ridership") or bare ID
profile_dataA	Statistical profile and quality check of a cached dataset. Uses DuckDB SUMMARIZE for column-level statistics (min, max, avg, std, nulls, unique counts). Also checks for duplicate rows. Args: resource_id: Resource ID (must be cached)
validate_resultA	Validate that a claim is supported by query results. Call this after making a data claim to verify it against the source. Re-executes the SQL, extracts numbers and terms from the claim, and checks them against the actual data. Args: sql: The SQL query that produced the data backing the claim claim: The natural-language claim to verify (e.g. "Toronto had 12,345 building permits in 2023")
load_geodataA	Download and cache a geospatial resource (SHP, KML, GeoJSON) into DuckDB with spatial support. Args: resource_id: Prefixed resource ID (e.g. "toronto:abc123") or bare ID force_refresh: Re-download even if cached
spatial_queryA	Run spatial queries against cached geospatial data. Args: resource_id: Resource ID (must be cached via load_geodata) operation: "contains_point", "within_bbox", or "within_radius" latitude: Latitude for point queries (-90 to 90) longitude: Longitude for point queries (-180 to 180) radius_km: Radius in kilometers (for within_radius, must be > 0) bbox: Bounding box as [min_lng, min_lat, max_lng, max_lat] (for within_bbox) limit: Max results
list_geo_datasetsA	Find all datasets that contain geospatial resources (SHP, KML, GeoJSON). Args: format_filter: Filter to specific format: "SHP", "KML", "GEOJSON", or None for all limit: Max results per portal portal: Narrow to one portal. Default: all portals.

Prompts

Interactive templates invoked by user choice

Name	Description
`explore_topic`	Guided exploration of a topic across all open data portals. Searches for datasets, summarizes what's available, and suggests deep dives.
`data_investigation`	Deep investigation of a specific dataset: schema, quality, statistics, insights.
`compare_data`	Side-by-side analysis of multiple datasets (comma-separated IDs, can be cross-portal).

Resources

Contextual data attached and managed by the client

Name	Description
`cache_index`	List of all locally cached datasets with freshness info.
`portal_stats`	Overview statistics across all data portals.
`duckdb_sql_guide`	DuckDB SQL reference for Ontario open data analysis.

Server Configuration
Capabilities
Tools
Prompts
Resources

Latest Blog Posts

Who's Calling? MCP Hosts Are an Identity Blind Spot (And the Spec Knows It)
By Om-Shree-0709 on July 25, 2026.
mcp
Agent Identity
OAuth 2.1
Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sprine/ontario-data-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server