CKAN MCP Server

by ondata

Search CKAN Datasets

ckan_package_search
Read-only, Idempotent

Search datasets on any CKAN open data portal using advanced Solr query syntax. Supports filters, facets, sorting, and pagination to find relevant data.

Instructions

Search for datasets (packages) on a CKAN server using Solr query syntax.

Supports full Solr search capabilities including filters, facets, and sorting. Use this to discover datasets matching specific criteria.

Note on parser behavior: Some CKAN portals use a restrictive default query parser that can break long OR queries. For those portals, this tool may force the query into 'text:(...)' based on per-portal config. You can override with 'query_parser' to force or disable this behavior per request.

Important - Date field semantics:

  • issued: publisher's content publish date when available (best proxy for "created/published")

  • modified: publisher's content update date when available

  • metadata_created: CKAN record creation timestamp (publish time on source portals, harvest time on aggregators; fallback for "created" if issued missing)

  • metadata_modified: CKAN record update timestamp (publish time on source portals, harvest time on aggregators; use for "updated/modified in last X")

Natural language mapping (important for tool callers):

  • "created"/"published" -> prefer issued; fallback to metadata_created

  • "updated"/"modified" -> prefer modified; fallback to metadata_modified

  • For "recent in last X", consider using content_recent (issued with metadata_created fallback)

Content-recent helper:

  • content_recent: if true, rewrites the query to use issued with a fallback to metadata_created when issued is missing.

  • content_recent_days: window for content_recent (default 30 days).
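The fallback described above can be pictured as a query rewrite. The following is only a sketch of what such a rewrite might look like, not the server's actual implementation: the function name `content_recent_query`, the `now` parameter, and the exact clause layout are assumptions made for illustration. It uses an ISO cutoff for issued and Solr date math for metadata_created.

```python
from datetime import datetime, timedelta, timezone


def content_recent_query(q="*:*", days=30, now=None):
    """Hypothetical rewrite: recent by issued, else by metadata_created."""
    if days <= 0:
        raise ValueError("days must be positive")
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Preferred clause: the publisher's issued date falls inside the window.
    issued = f"issued:[{cutoff} TO *]"
    # Fallback clause: only for records that carry no issued value at all.
    fallback = f"(metadata_created:[NOW-{days}DAYS TO *] AND -issued:*)"
    recency = f"({issued} OR {fallback})"
    # A match-all query needs no extra conjunction.
    if q.strip() in ("", "*:*"):
        return recency
    return f"({q}) AND {recency}"
```

Calling `content_recent_query("tags:acqua", days=180)` would wrap the user query in parentheses and AND it with the recency clause, so the fallback never widens the original search.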

Args:

  • server_url (string): Base URL of CKAN server (e.g., "https://dati.gov.it/opendata")

  • q (string): Search query using Solr syntax (default: "*:*" for all)

  • fq (string): Filter query (e.g., "organization:comune-palermo"). IMPORTANT: Solr fq syntax rules:

    1. OR inside a single field: use field:(val1 OR val2), NOT field:val1 OR field:val2. Wrong: fq=type:"A" OR type:"B" → silently ignored, returns entire catalog. Right: fq=type:("A" OR "B")

    2. CKAN extras fields are indexed as extras_fieldname, not fieldname. e.g. to filter on extra field "hvd_category" use fq=extras_hvd_category:""

  • rows (number): Number of results to return (default: 10, max: 1000)

  • start (number): Offset for pagination (default: 0)

  • page (number): Page number (1-based); alias for start. Overrides start if provided.

  • page_size (number): Results per page when using page (default: 10, max: 1000)

  • sort (string): Sort field and direction (e.g., "metadata_modified desc")

  • facet_field (array): Fields to facet on (e.g., ["organization", "tags"])

  • facet_limit (number): Max facet values per field (default: 50)

  • include_drafts (boolean): Include draft datasets (default: false)

  • query_parser ('default' | 'text'): Override search parser behavior

  • response_format ('markdown' | 'json'): Output format
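As an illustration of how these arguments interact, here is a minimal sketch that maps them onto the parameters of CKAN's package_search action. It assumes the tool forwards to the standard `/api/3/action/package_search` endpoint and that page simply replaces start with `(page - 1) * page_size`; the function names `search_params` and `search_url` are hypothetical, and the server's real handling may differ.

```python
import json
from urllib.parse import urlencode


def search_params(q="*:*", fq=None, rows=10, start=0, page=None,
                  page_size=10, sort=None, facet_field=None,
                  facet_limit=50, include_drafts=False):
    """Hypothetical mapping from tool arguments to package_search parameters."""
    if page is not None:
        if page < 1:
            raise ValueError("page is 1-based")
        # page overrides start; page_size takes the place of rows.
        rows, start = page_size, (page - 1) * page_size
    params = {"q": q, "rows": min(rows, 1000), "start": start}
    if fq:
        params["fq"] = fq
    if sort:
        params["sort"] = sort
    if facet_field:
        # CKAN expects facet.field as a JSON-encoded list of field names.
        params["facet.field"] = json.dumps(facet_field)
        params["facet.limit"] = facet_limit
    if include_drafts:
        params["include_drafts"] = "true"
    return params


def search_url(server_url, params):
    """Build a GET URL for the package_search action."""
    return f"{server_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"
```

For example, `search_params(page=3, page_size=10)` yields `start` 20 regardless of any start value passed alongside it.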

Returns: Search results with:

  • count: Number of results found

  • results: Array of dataset objects

  • facets: Facet counts (if facet_field specified)

  • search_facets: Detailed facet information
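To show how the facet fields might be consumed, the sketch below ranks the values of one facet. It assumes the JSON output keeps the shape CKAN's own package_search uses for search_facets (a `title` plus a list of `items` with `name`, `display_name`, and `count`); the tool's output may differ, and `top_facet_values` and the sample data are hypothetical.

```python
def top_facet_values(result, field, limit=5):
    """Return (display_name, count) pairs for one facet field, largest first."""
    items = result.get("search_facets", {}).get(field, {}).get("items", [])
    ranked = sorted(items, key=lambda item: item["count"], reverse=True)
    return [(item.get("display_name", item["name"]), item["count"])
            for item in ranked[:limit]]


# Hypothetical result, shaped like a CKAN package_search "result" object.
sample_result = {
    "count": 165,
    "results": [],
    "facets": {"organization": {"comune-palermo": 45, "regione-siciliana": 120}},
    "search_facets": {
        "organization": {
            "title": "organization",
            "items": [
                {"name": "comune-palermo", "display_name": "Comune di Palermo", "count": 45},
                {"name": "regione-siciliana", "display_name": "Regione Siciliana", "count": 120},
            ],
        }
    },
}
```

Combined with `{ facet_field: ["organization"], rows: 0 }`, a helper like this gives a quick overview of which publishers dominate a portal without fetching any dataset records.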

Query Syntax (parameter q):

Boolean operators:

  • AND / &&: "water AND climate"

  • OR / ||: "health OR sanità"

  • NOT / !: "data NOT personal"

  • +required -excluded: "+title:water -title:sea"

  • Grouping: "(title:water OR title:climate) AND tags:environment"

Wildcards:

  • *: "title:environment*" (matches environmental, environments, etc.)

  • Note: Left truncation (*water) not supported

Fuzzy search (edit distance):

  • ~: "title:rest~" or "title:rest~1" (finds "test", "best", "rest")

Proximity search (words within N positions):

  • "phrase"~N: 'title:"climate change"~5'

Range queries:

  • Inclusive [a TO b]: "num_resources:[5 TO 10]"

  • Exclusive {a TO b}: "num_resources:{0 TO 100}"

  • One side open: "metadata_modified:[2024-01-01T00:00:00Z TO *]"

Date math:

  • NOW-1YEAR, NOW-6MONTHS, NOW-7DAYS, NOW-1HOUR

  • NOW/DAY, NOW/MONTH (round down)

  • Combined: "metadata_modified:[NOW-2MONTHS TO NOW]"

  • Example: "metadata_created:[NOW-1YEAR TO *]"

  • IMPORTANT: NOW syntax works on the metadata_modified and metadata_created fields

  • For the 'modified' and 'issued' fields, NOW syntax is auto-converted to ISO dates

  • Manual ISO dates always work: "modified:[2026-01-15T00:00:00Z TO *]"
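The auto-conversion of NOW expressions for 'modified' and 'issued' can be sketched as a text substitution. This is an illustrative assumption about how such a conversion could work, not the server's code: `now_to_iso` is a hypothetical name, and the sketch only handles fixed-length units (hours and days), rejecting months, years, and rounding rather than guessing.

```python
import re
from datetime import datetime, timedelta, timezone

# NOW, optionally followed by "-<n><UNIT>" and an optional "/<UNIT>" rounding.
_NOW_EXPR = re.compile(r"\bNOW(?:-(\d+)([A-Z]+))?(/[A-Z]+)?")
_UNITS = {"HOUR": "hours", "HOURS": "hours", "DAY": "days", "DAYS": "days"}


def now_to_iso(query, now=None):
    """Replace NOW / NOW-<n>DAYS / NOW-<n>HOURS with ISO 8601 UTC timestamps."""
    now = now or datetime.now(timezone.utc)

    def replace(match):
        amount, unit, rounding = match.groups()
        if rounding:
            raise ValueError(f"rounding not handled in this sketch: {match.group(0)}")
        moment = now
        if amount:
            if unit not in _UNITS:
                raise ValueError(f"unit not handled in this sketch: {unit}")
            moment = now - timedelta(**{_UNITS[unit]: int(amount)})
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")

    return _NOW_EXPR.sub(replace, query)
```

With a fixed clock of 2026-02-14T12:00:00Z, `now_to_iso("modified:[NOW-30DAYS TO NOW]")` becomes `modified:[2026-01-15T12:00:00Z TO 2026-02-14T12:00:00Z]`.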

Field existence:

  • Exists: "field:*" or "field:[* TO *]"

  • Not exists: "NOT field:*" or "-field:*"

Boosting (relevance scoring):

  • Boost term: "title:water^2 OR notes:water" (title matches score higher)

  • Constant score: "title:water^=1.5"

Examples:

  • Search all: { q: "*:*" }

  • By tag: { q: "tags:sanità" }

  • Boolean: { q: "(title:water OR title:climate) AND NOT title:sea" }

  • Wildcard: { q: "title:environment*" }

  • Fuzzy: { q: "title:health~2" }

  • Proximity: { q: 'notes:"open data"~3' }

  • Date range: { q: "metadata_modified:[2024-01-01T00:00:00Z TO 2024-12-31T23:59:59Z]" }

  • Date math: { q: "metadata_modified:[NOW-6MONTHS TO *]" }

  • Date math (auto-converted): { q: "modified:[NOW-30DAYS TO NOW]" }

  • Published in 2025 (content date): { fq: "issued:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]" }

  • First appeared on portal in 2025: { fq: "metadata_created:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]" }

  • Recent content (issued w/ fallback): { q: "*:*", content_recent: true, content_recent_days: 180 }

  • Field exists: { q: "organization:* AND num_resources:[1 TO *]" }

  • Boosting: { q: "title:climate^2 OR notes:climate" }

  • Filter org: { fq: "organization:regione-siciliana" }

  • Filter extras field (correct): { fq: 'extras_hvd_category:"http://data.europa.eu/bna/c_ac64a52d"' }

  • Filter extras OR (correct): { fq: 'extras_hvd_category:("http://data.europa.eu/bna/c_ac64a52d" OR "http://data.europa.eu/bna/c_dd313021")' }

  • Get facets: { facet_field: ["organization"], rows: 0 }
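The two fq rules (group OR values under a single field, and prefix extras fields with extras_) lend themselves to a small helper. The sketch below is one way a caller might build such filters safely; `fq_any_of` and `quote_value` are hypothetical names and not part of the tool.

```python
def quote_value(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def fq_any_of(field, values, extra=False):
    """Build field:("a" OR "b") instead of the broken field:a OR field:b."""
    if not values:
        raise ValueError("at least one value is required")
    # CKAN indexes extras under extras_<fieldname>.
    name = f"extras_{field}" if extra and not field.startswith("extras_") else field
    quoted = [quote_value(v) for v in values]
    if len(quoted) == 1:
        return f"{name}:{quoted[0]}"
    return f"{name}:({' OR '.join(quoted)})"
```

Passing the two HVD category URIs from the examples with `extra=True` reproduces the "Filter extras OR (correct)" filter string shown above.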

Query language: Before searching a portal, check its locale via ckan_status_show (field: "Portal Locale" / locale_default). Translate query terms to the portal's language; searching in English on a non-English portal often returns 0 results. Examples: locale "it" → Italian terms; "uk_UA" → Ukrainian (Cyrillic); "fr_FR" → French. Exception: multilingual portals (e.g. data.europa.eu, open.canada.ca) accept English and native terms joined with OR.

Typical workflow: ckan_status_show (check locale) → ckan_package_search (query in portal's language) → ckan_package_show (get full metadata + resource IDs) → ckan_datastore_search (query tabular data)

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| q | No | Search query in Solr syntax | *:* |
| fq | No | Filter query in Solr syntax; applied after scoring, does not affect relevance. CKAN extras fields use prefix 'extras_' (e.g. extras_hvd_category). For OR on same field use field:(val1 OR val2), never field:val1 OR field:val2 (silently breaks). Examples: 'organization:comune-palermo', 'res_format:CSV', 'extras_hvd_category:("uri1" OR "uri2")'. | |
| page | No | Page number (1-based); alias for start. Overrides start if provided. | |
| rows | No | Number of results to return | |
| sort | No | Sort field and direction (e.g., 'metadata_modified desc') | |
| start | No | Offset for pagination | |
| page_size | No | Results per page when using page (default: 10) | |
| server_url | Yes | Base URL of the CKAN server | |
| facet_field | No | Fields to facet on | |
| facet_limit | No | Maximum facet values per field | |
| query_parser | No | Override search parser ('text' forces text:(...) on non-fielded queries) | |
| content_recent | No | Use issued date with fallback to metadata_created for recent content | |
| include_drafts | No | Include draft datasets | |
| response_format | No | Output format: 'markdown' for human-readable or 'json' for machine-readable | markdown |
| content_recent_days | No | Day window for content_recent (default 30) | |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, idempotentHint, openWorldHint, and non-destructive. Description adds crucial behavioral details: parser behavior on restrictive portals, date field semantics (issued vs modified vs metadata_created), natural language mapping, and content_recent helper. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Long but well-structured with sections, bullet points, and examples. Front-loaded with purpose. Every section earns its place given complexity (15 params, query language nuances). Minor redundancy in date examples could be trimmed, but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given high complexity (15 params, no output schema), the description is remarkably complete: covers query syntax, date semantics, parser behavior, locale guidance, workflow, and 18 examples. No gaps for intended use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant value: explains fq Solr syntax rules with examples, q query syntax with boolean/wildcard/fuzzy/date math, content_recent behavior, and query_parser override. Goes well beyond what schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Search for datasets (packages) on a CKAN server using Solr query syntax', with specific verb ('search'), resource ('datasets'), and scope. It distinguishes from siblings like ckan_package_show (single dataset) and ckan_list_resources (resources).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides extensive context: when to use (discover datasets), typical workflow (status_show → package_search → package_show → datastore_search), and locale guidance. Lacks explicit 'when not to use', but alternatives like ckan_package_show are implied.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ondata/ckan-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.