
# CKAN Analysis Components Examples

This directory contains examples demonstrating the analysis components of the CKAN MCP server.

## Files

### `analysis_demo.py` (Recommended)

A simplified, working demonstration of all three analysis components:

- **RelevanceScorer**: Scores datasets based on query relevance
- **UpdateFrequencyAnalyzer**: Categorizes dataset update patterns
- **SummaryBuilder**: Creates structured summaries of CKAN data

### `analysis.py`

A comprehensive demonstration with more detailed examples and real-world data access. This file uses some advanced features that may require troubleshooting.

## Running the Examples

### Prerequisites

Set up environment variables for Toronto Open Data access:

```bash
export CKAN_BASE_URL='https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action'
export CKAN_SITE_URL='https://ckan0.cf.opendata.inter.prod-toronto.ca'
```

### Run the Demo

```bash
# Activate virtual environment
source venv/bin/activate

# Run the simplified demo (recommended)
python examples/analysis_demo.py
```

## Analysis Components Overview

### 1. RelevanceScorer

The RelevanceScorer ranks datasets by how well they match a query, using weighted scoring:

**Scoring Components:**

- **Title matches**: 15 points (highest weight)
- **Description matches**: 7 points
- **Tag matches**: 5 points
- **Organization matches**: 3 points
- **Resource matches**: 2 points (lowest weight)

**Example Output:**

```
Query: 'traffic' -> Score: 29
  Title match: ✓ (+15)
  Description match: ✓ (+7)
  Tag match: ✓ (+5)
  Resource match: ✓ (+2)
```

### 2. UpdateFrequencyAnalyzer

The UpdateFrequencyAnalyzer categorizes dataset update patterns using:

**Methods:**

- **Explicit patterns**: Reads the `refresh_rate` field (daily, weekly, monthly, etc.)
- **Inferred patterns**: Analyzes `metadata_modified` timestamps against configurable thresholds

**Frequency Categories:**

Categories read from an explicit `refresh_rate` value:

- `DAILY` - Real-time or daily updates
- `WEEKLY` - Weekly updates
- `MONTHLY` - Monthly updates
- `QUARTERLY` - Quarterly updates
- `ANNUALLY` - Annual updates
- `IRREGULAR` - As-needed updates

Categories inferred from the `metadata_modified` timestamp:

- `FREQUENT` - Recent updates (within 14 days)
- `MONTHLY` - Updates within 45 days
- `QUARTERLY` - Updates within 120 days
- `INFREQUENT` - Older updates (120+ days)

**Thresholds (Customizable):**

- Frequent: 14 days
- Monthly: 45 days
- Quarterly: 120 days

### 3. SummaryBuilder

The SummaryBuilder creates structured, truncated summaries of CKAN data:

**Package Summary:**

- Truncated description (200 chars max)
- Key metadata (created, modified dates)
- Resource counts (total vs datastore-enabled)
- Dataset URL generation
- Organization and tag information

**Resource Summary:**

- Resource metadata (format, size, datastore status)
- Last modified information
- Datastore analysis (fields, record counts, sample data)

**Example Output:**

```
Package Summary:
--------------------
ID: traffic-volumes-toronto
Title: Traffic Volumes - Toronto Transportation
Description: This dataset contains traffic volume counts...
Organization: City of Toronto
Tags: ['transportation', 'traffic', 'real-time', 'api']
Resource Count: 2
Datastore Resources: 1
URL: https://ckan0.cf.opendata.inter.prod-toronto.ca/dataset/traffic-volumes-toronto
```

## Configuration

The analysis components use configurable weights and thresholds:

```python
# Relevance scoring weights
RelevanceWeights(
    title=15,         # Title match weight
    description=7,    # Description match weight
    tags=5,           # Tag match weight
    organization=3,   # Organization match weight
    resource=2        # Resource match weight
)

# Frequency analysis thresholds
FrequencyThresholds(
    frequent_days=14,    # Days to consider "frequent"
    monthly_days=45,     # Days to consider "monthly"
    quarterly_days=120   # Days to consider "quarterly"
)
```

## Usage in Practice

These analysis components are used throughout the CKAN MCP server to:

1. **Rank search results** - RelevanceScorer ensures the most relevant datasets appear first
2. **Provide update insights** - UpdateFrequencyAnalyzer helps users understand data freshness
3. **Generate summaries** - SummaryBuilder creates concise, readable summaries for AI assistants

The components work together to provide intelligent data discovery and analysis capabilities for open data portals.
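
The weighted scoring described for the RelevanceScorer can be illustrated with a small standalone sketch. This is not the server's implementation: the `RelevanceWeights` dataclass below is a hypothetical stand-in that only reuses the documented default weights, `score_dataset` is an illustrative name, and the matching rule (case-insensitive substring, each field counted at most once) is an assumption. The dataset keys (`title`, `notes`, `tags`, `organization`, `resources`) follow the usual CKAN package dictionary layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelevanceWeights:
    """Hypothetical stand-in carrying the default weights listed in this README."""
    title: int = 15
    description: int = 7
    tags: int = 5
    organization: int = 3
    resource: int = 2


def score_dataset(query: str, dataset: dict,
                  weights: RelevanceWeights = RelevanceWeights()) -> int:
    """Add each field's weight once if the query occurs in that field.

    Assumption: a "match" is a case-insensitive substring hit.
    """
    q = query.strip().lower()
    if not q:
        return 0

    def hit(text) -> bool:
        return q in (text or "").lower()

    score = 0
    if hit(dataset.get("title")):
        score += weights.title
    if hit(dataset.get("notes")):  # CKAN stores the description in "notes"
        score += weights.description
    if any(hit(tag.get("name")) for tag in dataset.get("tags", [])):
        score += weights.tags
    org = dataset.get("organization") or {}
    if hit(org.get("title")):
        score += weights.organization
    if any(hit(res.get("name")) or hit(res.get("format"))
           for res in dataset.get("resources", [])):
        score += weights.resource
    return score


example = {
    "title": "Traffic Volumes - Toronto Transportation",
    "notes": "This dataset contains traffic volume counts.",
    "tags": [{"name": "transportation"}, {"name": "traffic"}],
    "organization": {"title": "City of Toronto"},
    "resources": [{"name": "traffic-volumes.csv", "format": "CSV"}],
}

print(score_dataset("traffic", example))  # 15 + 7 + 5 + 2 = 29
```

With this sample record, the query `traffic` hits the title, description, tags, and a resource name but not the organization, which reproduces the score of 29 shown in the example output.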

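
The two UpdateFrequencyAnalyzer methods (explicit `refresh_rate`, then inference from `metadata_modified`) might be combined roughly as follows. This is a sketch under stated assumptions, not the server's code: `FrequencyThresholds` here is a hypothetical stand-in with the documented defaults, `categorize` and `EXPLICIT_RATES` are illustrative names, the boundaries are treated as inclusive, and unrecognized `refresh_rate` values are assumed to fall through to timestamp inference.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FrequencyThresholds:
    """Hypothetical stand-in carrying the default thresholds from this README."""
    frequent_days: int = 14
    monthly_days: int = 45
    quarterly_days: int = 120


# Assumed mapping from refresh_rate strings to explicit categories.
EXPLICIT_RATES = {
    "daily": "DAILY",
    "weekly": "WEEKLY",
    "monthly": "MONTHLY",
    "quarterly": "QUARTERLY",
    "annually": "ANNUALLY",
}


def categorize(dataset: dict, now: datetime,
               thresholds: FrequencyThresholds = FrequencyThresholds()) -> str:
    """Prefer the declared refresh_rate; otherwise infer from metadata_modified."""
    declared = (dataset.get("refresh_rate") or "").strip().lower()
    if declared in EXPLICIT_RATES:
        return EXPLICIT_RATES[declared]

    modified = datetime.fromisoformat(dataset["metadata_modified"])
    if modified.tzinfo is None:
        # CKAN timestamps are typically naive UTC; make them comparable to `now`.
        modified = modified.replace(tzinfo=timezone.utc)

    age_days = (now - modified).days
    if age_days <= thresholds.frequent_days:
        return "FREQUENT"
    if age_days <= thresholds.monthly_days:
        return "MONTHLY"
    if age_days <= thresholds.quarterly_days:
        return "QUARTERLY"
    return "INFREQUENT"


now = datetime(2026, 1, 31, tzinfo=timezone.utc)
print(categorize({"refresh_rate": "Daily"}, now))
print(categorize({"metadata_modified": "2026-01-25T00:00:00"}, now))
print(categorize({"metadata_modified": "2025-01-01T00:00:00"}, now))
```

Passing `now` explicitly keeps the function deterministic, which makes custom thresholds easy to check against fixed timestamps.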