socrata-mcp-server

Name: socrata-mcp-server
Author: cyanheads

by io.github.cyanheads

Server Details

Search and query government open-data portals (Socrata SODA API).

Status: Healthy
Last Tested: 2026-07-28 17:31
Transport: Streamable HTTP
Repository: cyanheads/socrata-mcp-server
GitHub Stars: 1

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

Overall: A (4.6/5.0)

Tool Descriptions: A

Average 4.5/5 across 6 of 6 tools scored.

Server Coherence: A

Disambiguation: 5/5

Each tool has a clear, distinct purpose: listing portals, searching datasets, fetching metadata, querying, and handling large results via DataFrame. No overlap or ambiguity.

Naming Consistency: 5/5

Four of the six tools follow the pattern 'socrata_<verb>_<noun>' in snake_case. The two dataframe tools invert it to 'socrata_dataframe_<verb>' but keep the same prefix and casing, so the names remain predictable.

Tool Count: 5/5

With 6 tools, the server is well-scoped for the domain of accessing Socrata open data portals. Each tool serves a necessary step in the workflow without redundancy or missing coverage.

Completeness: 5/5

The tool set covers the full lifecycle: discover portals, find datasets, examine schemas, query data, and handle large results via DataFrame. No obvious gaps for the stated purpose.
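
The lifecycle above can be sketched as a sequence of MCP `tools/call` requests. This is a minimal illustration, assuming a client that sends JSON-RPC 2.0 messages over the listed transport. The helper name `tool_call`, the search terms, and the dataset ID `kzjm-xkqj` (the pattern example from the tool descriptions, not a verified dataset) are placeholders.

```python
import json
from itertools import count

# Hypothetical request-id generator; any unique id scheme would do.
_ids = count(1)


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request body for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Discover a portal, find a dataset, read its schema, then query it.
workflow = [
    tool_call("socrata_list_portals", {"query": "seattle"}),
    tool_call(
        "socrata_find_datasets",
        {"domain": "data.seattle.gov", "query": "permits", "only": "datasets"},
    ),
    tool_call(
        "socrata_get_dataset",
        {"domain": "data.seattle.gov", "dataset_id": "kzjm-xkqj"},  # placeholder ID
    ),
    tool_call(
        "socrata_query_dataset",
        {"domain": "data.seattle.gov", "dataset_id": "kzjm-xkqj", "limit": 100},
    ),
]

if __name__ == "__main__":
    print(json.dumps(workflow[0], indent=2))
```

In practice each step would use values returned by the previous one (the domain from the portal list, the dataset ID from the search results) rather than hard-coded strings.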

Available Tools

6 tools

socrata_dataframe_describe: Describe DataCanvas Tables

Read-only, Idempotent

List registered tables in a DataCanvas session — schema, row count, column names, and registration time. Shows what datasets are available for SQL queries via socrata_dataframe_query. Only meaningful when CANVAS_PROVIDER_TYPE=duckdb is set. Use after socrata_query_dataset spills a large result set to canvas.

Parameters

Name	Required	Description
`canvas_id`	No	Canvas ID returned by socrata_query_dataset when a large result spills to canvas. Required in practice when canvas is enabled — canvases cannot be enumerated, so omitting it fails with canvas_id_required instead of listing tables.

Output Schema

Name	Required	Description
`notice`	No	Status message when canvas is not enabled or no tables are registered. Absent when tables are present.
`tables`	Yes	Tables available for SQL queries. Empty when none registered.
`canvas_id`	No	Canvas ID resolved, when canvas is enabled.
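
Because `canvas_id` is optional in the schema but required in practice, a client may want to fail early instead of waiting for `canvas_id_required`. The sketch below assumes the response is available as a plain dict with the fields listed above. The function names and the sample notice text are hypothetical.

```python
def describe_arguments(canvas_id):
    """Build arguments for socrata_dataframe_describe, failing early if no canvas_id."""
    if not canvas_id:
        # Mirrors the server-side canvas_id_required failure described above.
        raise ValueError(
            "canvas_id_required: pass the canvas_id returned by socrata_query_dataset"
        )
    return {"canvas_id": canvas_id}


def describe_status(response):
    """Summarize a describe response: table count if any, otherwise the notice."""
    tables = response.get("tables", [])
    if tables:
        return f"{len(tables)} table(s) registered"
    # notice is only present when canvas is disabled or no tables are registered.
    return response.get("notice", "no tables registered")
```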

socrata_dataframe_query: Query DataCanvas Table

Read-only, Idempotent

Run SELECT-only SQL against a DataCanvas table populated by socrata_query_dataset. DuckDB infers types from spilled data, so numeric columns that SODA returned as strings become queryable with numeric comparisons (year > 2020, amount < 500). Only works when CANVAS_PROVIDER_TYPE=duckdb is set. Use socrata_dataframe_describe to see registered tables and their schemas.

Parameters

Name	Required	Description
`sql`	Yes	SELECT-only SQL to run against registered canvas tables. DDL, DML, and file-reading functions are rejected. Use table names from socrata_dataframe_describe.
`limit`	No	Max rows to return (1–10000). Default 1000.
`canvas_id`	Yes	Canvas ID returned from socrata_query_dataset or socrata_dataframe_describe.

Output Schema

Name	Required	Description
`cap`	No	The row limit that was applied when capped.
`sql`	Yes	SQL that was executed.
`rows`	Yes	Query result rows. DuckDB may return native JS types (number, boolean, null) for numeric/boolean columns.
`shown`	No	Rows returned in this response when capped.
`notice`	No	Guidance when the SQL returned zero rows. Absent when rows are present.
`canvas_id`	Yes	Canvas ID queried.
`row_count`	Yes	Number of rows returned.
`truncated`	No	True when results were capped at the limit — more rows match the query.
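
A client-side pre-check for the constraints above might look like the following sketch. It only approximates the server's rules (the server remains the authority on what counts as SELECT-only), and the function names and the table name `permits` are hypothetical.

```python
import re


def dataframe_query_arguments(canvas_id, sql, limit=1000):
    """Build arguments for socrata_dataframe_query with basic local validation."""
    if not canvas_id:
        raise ValueError("canvas_id is required")
    if not 1 <= limit <= 10000:
        raise ValueError("limit must be between 1 and 10000")
    # Conservative check: the statement must begin with the SELECT keyword.
    if not re.match(r"\s*select\b", sql, re.IGNORECASE):
        raise ValueError("only SELECT statements are accepted")
    return {"canvas_id": canvas_id, "sql": sql, "limit": limit}


def summarize(response):
    """Turn a query response into a one-line status string."""
    if response["row_count"] == 0:
        return response.get("notice", "no rows returned")
    if response.get("truncated"):
        shown = response.get("shown", response["row_count"])
        return f"{shown} rows shown, capped at {response.get('cap')}; more rows match"
    return f"{response['row_count']} rows returned"
```

For example, `dataframe_query_arguments("abc123defg", "SELECT * FROM permits WHERE year > 2020")` passes the check, while a `DROP TABLE` statement is rejected before any request is sent.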

socrata_find_datasets: Find Socrata Datasets

Read-only, Idempotent

Search for datasets across all Socrata-powered government open-data portals, or scope to one portal with the domain parameter. Returns dataset IDs, names, abbreviated column lists, domains, and update timestamps. Use socrata_get_dataset to fetch the full typed column schema before writing queries — columnNames here are preview-only and lack type information.

Parameters

Name	Required	Description
`only`	No	Filter by asset type. Omit to include all types. Usually "datasets" is what you want.
`tags`	No	Filter by tags (e.g. ["covid19", "permits"]).
`limit`	No	Number of results to return (1–100). Default 10.
`order`	No	Sort order. Defaults to relevance. Use updated_at to surface recently-refreshed datasets.
`query`	No	Full-text search across dataset names and descriptions. Omit to browse without filtering.
`domain`	No	Scope search to a single portal (e.g. data.seattle.gov, data.cityofnewyork.us). Omit to search all portals.
`offset`	No	Pagination offset. Default 0.
`categories`	No	Filter by domain categories (e.g. ["Public Safety", "Transportation"]).

Output Schema

Name	Required	Description
`notice`	No	Recovery hint when results are empty — echoes filters and suggests how to broaden. Absent on non-empty result pages.
`results`	Yes	Matching datasets. Empty when no results.
`totalCount`	Yes	Total matches before pagination. 0 when empty.
`effectiveQuery`	No	Search query applied, for reference.
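
Paging through search results with `limit`, `offset`, and `totalCount` can be sketched as follows. The helper names are hypothetical, and defaulting `only` to "datasets" is a choice made for this example based on the parameter note above, not the server's own default.

```python
def find_datasets_arguments(query=None, domain=None, limit=10, offset=0, only="datasets"):
    """Build arguments for socrata_find_datasets, omitting unset optional filters."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    args = {"limit": limit, "offset": offset}
    for key, value in (("query", query), ("domain", domain), ("only", only)):
        if value is not None:
            args[key] = value
    return args


def page_offsets(total_count, limit=10):
    """Offsets needed to walk every page, given totalCount from the first response."""
    return list(range(0, total_count, limit))
```

With `totalCount` of 25 and the default limit of 10, `page_offsets(25)` yields three requests at offsets 0, 10, and 20.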

socrata_get_dataset: Get Dataset Schema

Read-only, Idempotent

Fetch full metadata and column schema for a Socrata dataset by ID. Returns field names, data types, descriptions, row count, and licensing. Always call this before writing a socrata_query_dataset call, because the column types determine correct WHERE clause syntax: Number columns accept bare literals (year=2023), while Text columns require single-quoted strings (year='2023').

Parameters

Name	Required	Description
`domain`	No	Portal domain (e.g. data.seattle.gov). Defaults to SOCRATA_DEFAULT_DOMAIN env var or data.seattle.gov.
`dataset_id`	Yes	Four-by-four dataset ID matching pattern like kzjm-xkqj. Obtain from socrata_find_datasets.

Output Schema

Name	Required	Description
`name`	Yes	Dataset display name.
`tags`	Yes	Associated tags.
`domain`	Yes	Portal domain hosting this dataset.
`columns`	Yes	Column schema. Computed region columns (:@computed_region_*) are excluded to reduce noise.
`license`	No	License name when available.
`category`	No	Domain category when available.
`row_count`	No	Approximate row count when available. See row_count_source for provenance.
`dataset_id`	Yes	Four-by-four dataset ID.
`description`	No	Dataset description when available.
`data_updated_at`	No	ISO 8601 timestamp of last data update when available.
`row_count_source`	No	How row_count was obtained: 'top_level_cached_contents' — reported directly by the portal's views metadata; 'column_cached_contents' — derived as the maximum per-column cached count when the top-level value is absent. Absent when row_count is absent.
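
The quoting rule for WHERE clauses can be sketched as a small helper that picks the literal form from the column's data type. This assumes the type is reported as the strings "Number" and "Text", as in the description above. The function names are hypothetical, and types other than Number are treated as quoted strings for simplicity.

```python
def soql_literal(value, data_type):
    """Render a value as a SoQL literal according to the column's data type."""
    if data_type.lower() == "number":
        float(value)  # raises ValueError if the value is not numeric
        return str(value)
    # Text and other types: single-quote, doubling any embedded single quotes.
    escaped = str(value).replace("'", "''")
    return f"'{escaped}'"


def where_equals(column, value, data_type):
    """Build a simple equality condition for the where parameter."""
    return f"{column}={soql_literal(value, data_type)}"
```

For instance, `where_equals("year", 2023, "Number")` gives `year=2023`, while the same value against a Text column gives `year='2023'`.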

socrata_list_portals: List Socrata Portals

Read-only, Idempotent

List known Socrata-powered government open-data portals with their domain, organization name, and approximate dataset count. The catalog is a curated list of 40 well-known portals; dataset counts are fetched from the Discovery API and cached for ~24 hours. Filtering is client-side substring match on the query parameter. Use this first when you do not know which portal to target, then pass the domain to socrata_find_datasets.

Parameters

Name	Required	Description
`limit`	No	Max portals to return (1–200). Default 50.
`query`	No	Keyword to filter portal names or organization names (case-insensitive substring match). Omit to list all portals.
`offset`	No	Pagination offset. Default 0.

Output Schema

Name	Required	Description
`notice`	No	Recovery hint when no portals matched the filter. Absent on non-empty pages.
`portals`	Yes	Matching portals. Empty when no results.
`totalCount`	Yes	Total portals before pagination. 0 when empty.
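
The filtering and pagination behavior described for this tool (case-insensitive substring match, then `offset` and `limit`) can be illustrated with a short sketch. It is not the server's implementation. The entry keys `domain` and `organization` and the sample organization names are assumptions made for the example.

```python
# Illustrative entries only; the real catalog and its field names may differ.
SAMPLE_PORTALS = [
    {"domain": "data.seattle.gov", "organization": "City of Seattle"},
    {"domain": "data.cityofnewyork.us", "organization": "City of New York"},
]


def filter_portals(portals, query=None, limit=50, offset=0):
    """Substring-filter portals, then paginate; totalCount is the pre-pagination count."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if query:
        needle = query.lower()
        portals = [
            p for p in portals
            if needle in p["domain"].lower() or needle in p["organization"].lower()
        ]
    return {"portals": portals[offset:offset + limit], "totalCount": len(portals)}
```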

socrata_query_dataset: Query Dataset

Read-only, Idempotent

Execute a SoQL query against any dataset on any Socrata portal. Use the search parameter for quick full-text lookup, or combine select/where/group/having/order for full analytical control. Returns rows plus the assembled SoQL string so you can learn the pattern. All SODA 2.1 row values are strings even for numeric columns — check dataType from socrata_get_dataset to determine correct WHERE quoting: Number columns use bare literals (year=2023), Text columns use single-quoted strings (year='2023'). To enumerate distinct values, use select="col, count(*) as n" with group="col" and order="n DESC". When CANVAS_PROVIDER_TYPE=duckdb and rows fill the limit, results spill to a DataCanvas table for SQL-based analysis.

Parameters

Name	Required	Description
`group`	No	SoQL GROUP BY clause. Requires an aggregate function in select.
`limit`	No	Max rows to return (1–5000). Default 100. Use with offset for pagination.
`order`	No	SoQL ORDER BY clause, e.g. "total_deaths DESC" or "date ASC".
`where`	No	SoQL WHERE clause. Check column dataType from socrata_get_dataset first — Number columns: year=2023, Text columns: year='2023'. Operators: =, !=, >, <, LIKE, IN(...), BETWEEN, IS NULL, starts_with(), contains(), AND, OR, NOT.
`domain`	No	Portal domain (e.g. data.seattle.gov). Defaults to SOCRATA_DEFAULT_DOMAIN or data.seattle.gov.
`having`	No	SoQL HAVING clause. Filters on aggregated results, e.g. count > 100.
`offset`	No	Row offset for pagination. Default 0.
`search`	No	Full-text search across all text columns ($q). For field-specific filtering, use where instead.
`select`	No	SoQL SELECT clause — column names, aliases, aggregates: "state, sum(deaths) as total_deaths". Omit for all columns.
`canvas_id`	No	Optional 10-char DataCanvas token from a prior call. Omit on first call when CANVAS_PROVIDER_TYPE=duckdb to mint a fresh canvas. Large result sets spill here automatically.
`dataset_id`	Yes	Four-by-four dataset ID (e.g. kzjm-xkqj). Obtain from socrata_find_datasets.
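
The distinct-values pattern from the description (select with `count(*)`, group by the column, order by the count) can be packaged as an argument builder. This is a sketch: the function name is hypothetical, and `kzjm-xkqj` and `state` in the usage note are placeholders.

```python
def distinct_values_arguments(dataset_id, column, domain=None, limit=100):
    """Build socrata_query_dataset arguments that enumerate a column's distinct values."""
    if not 1 <= limit <= 5000:
        raise ValueError("limit must be between 1 and 5000")
    args = {
        "dataset_id": dataset_id,
        "select": f"{column}, count(*) as n",  # aggregate required by group
        "group": column,
        "order": "n DESC",  # most frequent values first
        "limit": limit,
    }
    if domain:
        args["domain"] = domain  # otherwise the server's default domain applies
    return args
```

Calling `distinct_values_arguments("kzjm-xkqj", "state")` produces a grouped query, so the response would not carry `total_count`, which is only reported for plain row queries.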

Output Schema

Name	Required	Description
`cap`	No	The row limit that was applied when capped.
`rows`	Yes	Result rows. Scalar values are strings (SODA 2.1); geo/location columns return nested objects. Use column schema from socrata_get_dataset for type context.
`shown`	No	Rows returned in this response when capped.
`domain`	Yes	Portal domain queried.
`notice`	No	Guidance when the query returned zero rows — suggests narrowing or reviewing the SoQL. Absent on non-empty result sets.
`canvas_id`	No	DataCanvas token when results spilled (requires CANVAS_PROVIDER_TYPE=duckdb). Pass to socrata_dataframe_query to run SQL over the staged rows — a bounded copy of the matching set (up to 50,000 rows, reported in canvas_row_count), not the full set when total_count exceeds that cap. Page with offset to reach rows beyond it.
`row_count`	Yes	Rows returned in this response.
`truncated`	No	True when rows filled the limit — more rows may match (see total_count when present). Spills to canvas when enabled.
`dataset_id`	Yes	Dataset ID queried.
`total_count`	No	Total matching source rows when a plain row query is truncated (row_count < total_count). Absent when the full result fits and for grouped/aggregate queries (group set), where a source-row count would not describe the returned groups.
`assembled_query`	Yes	SoQL clauses assembled for this request — useful for learning the syntax.
`canvas_row_count`	No	Rows staged onto the DataCanvas — a bounded copy of the matching result set (capped at 50,000). Fewer than total_count when the match exceeds the cap. Present only when canvas_id is.
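
Two follow-up steps a client often needs are converting the string-valued rows to numbers and deciding whether another page is worth requesting. The sketch below assumes the response is a plain dict with the fields listed above. The function names are hypothetical, and the set of numeric columns would come from the schema returned by socrata_get_dataset.

```python
def coerce_numeric(rows, numeric_columns):
    """Return copies of rows with the named columns converted from strings to floats."""
    converted = []
    for row in rows:
        new_row = dict(row)
        for col in numeric_columns:
            value = new_row.get(col)
            if isinstance(value, str):
                try:
                    new_row[col] = float(value)
                except ValueError:
                    pass  # leave unparseable values untouched
        converted.append(new_row)
    return converted


def next_offset(response, offset):
    """Offset for the next page, or None when the current page was the last one."""
    if not response.get("truncated"):
        return None
    candidate = offset + response["row_count"]
    total = response.get("total_count")  # absent for grouped queries
    if total is not None and candidate >= total:
        return None
    return candidate
```

When canvas is enabled, running SQL through socrata_dataframe_query is the alternative to local coercion, since DuckDB infers the numeric types from the staged rows.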

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
