socrata-mcp-server

Server Details

Search and query government open-data portals (Socrata SODA API).

Status: Healthy
Transport: Streamable HTTP
Repository: cyanheads/socrata-mcp-server
GitHub Stars: 1

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.5/5 across 6 of 6 tools scored.

Server Coherence: A
Disambiguation: 5/5

Each tool targets a distinct action: listing portals, searching datasets, fetching metadata, querying, and DataCanvas operations. No overlap in purpose.

Naming Consistency: 5/5

All tools follow the 'socrata_verb_noun' pattern in snake_case, consistent and predictable.

Tool Count: 5/5

6 tools cover the full workflow from portal discovery to dataset querying and large-result handling, well-scoped for the domain.

Completeness: 5/5

Covers portal listing, dataset search, metadata retrieval, query execution, and DataCanvas integration. No obvious gaps for standard Socrata interactions.

Available Tools

6 tools
socrata_dataframe_describe: Describe DataCanvas Tables (A)
Annotations: Read-only, Idempotent

List registered tables in a DataCanvas session — schema, row count, column names, and registration time. Shows what datasets are available for SQL queries via socrata_dataframe_query. Only meaningful when CANVAS_PROVIDER_TYPE=duckdb is set. Use after socrata_query_dataset spills a large result set to canvas.
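
As a rough illustration, an MCP client invokes a tool like this one with a JSON-RPC `tools/call` request. The sketch below builds such a payload in Python. The helper name `build_tool_call` and the canvas ID value are hypothetical; only the tool name and the `canvas_id` parameter come from this listing.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP JSON-RPC 2.0 `tools/call` request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Omit canvas_id to list every table visible in the current session.
list_all = build_tool_call(1, "socrata_dataframe_describe", {})

# Pass a canvas_id (placeholder value) returned by socrata_query_dataset.
one_canvas = build_tool_call(
    2, "socrata_dataframe_describe", {"canvas_id": "abc123defg"}
)

print(json.dumps(one_canvas, indent=2))
```

Most MCP client libraries build this envelope for you, so the payload is shown only to make the argument shape concrete.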

Parameters
Name | Required | Description
canvas_id | No | Canvas ID returned from socrata_query_dataset. Omit to list all tables visible in the current session.

Output Schema

Name | Required | Description
tables | Yes | Tables available for SQL queries. Empty when none registered.
message | No | Status message when canvas is not enabled or no tables are registered. Absent when tables are present.
canvas_id | No | Canvas ID resolved, when canvas is enabled.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint. The description adds context: the dependency on DuckDB environment, the intended sequencing after socrata_query_dataset, and the specific output fields (schema, row count, column names, registration time). This enriches the agent's understanding beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, each serving a distinct purpose: stating what the tool does, linking to a sibling tool, giving a prerequisite condition, and placing the tool in sequence. It is front-loaded with the core functionality and has no redundant words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (one optional parameter, output schema present), the description covers purpose, usage context, and a prerequisite condition. It does not discuss error cases or authentication, but these are minor omissions for a read-only listing tool. The presence of an output schema means return values need not be detailed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema description for canvas_id is already informative: 'Canvas ID returned from socrata_query_dataset. Omit to list all tables visible in the current session.' The tool description does not add further parameter details beyond the schema, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists registered tables in a DataCanvas session, providing schema, row count, column names, and registration time. It explicitly relates to socrata_dataframe_query, distinguishing it from sibling tools like socrata_find_datasets or socrata_query_dataset.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides specific usage guidance: 'Use after socrata_query_dataset spills a large result set to canvas' and 'Only meaningful when CANVAS_PROVIDER_TYPE=duckdb is set.' It implies an order of operations but does not compare to alternatives or state when not to use it, which would be needed for a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

socrata_dataframe_query: Query DataCanvas Table (A)
Annotations: Read-only, Idempotent

Run SELECT-only SQL against a DataCanvas table populated by socrata_query_dataset. DuckDB infers types from spilled data, so numeric columns that SODA returned as strings become queryable with numeric comparisons (year > 2020, amount < 500). Only works when CANVAS_PROVIDER_TYPE=duckdb is set. Use socrata_dataframe_describe to see registered tables and their schemas.
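
To see why the type inference matters, compare string and numeric comparison on values shaped like SODA output. This is a plain-Python illustration, not DuckDB or server code, and the sample rows are invented for the example.

```python
# Rows shaped like SODA 2.1 output: every scalar arrives as a string.
rows = [
    {"year": "2019", "amount": "90"},
    {"year": "2021", "amount": "1200"},
    {"year": "2023", "amount": "450"},
]

# Comparing as strings is lexicographic: "1200" < "500" because "1" < "5",
# while "90" is excluded because "9" > "5".
as_strings = [r for r in rows if r["amount"] < "500"]

# Comparing after numeric conversion gives the intended result.
as_numbers = [r for r in rows if float(r["amount"]) < 500]

print([r["amount"] for r in as_strings])  # ['1200', '450']
print([r["amount"] for r in as_numbers])  # ['90', '450']
```

Against a canvas table, the intended filter would be ordinary SQL along the lines of `SELECT * FROM <table> WHERE amount < 500`, with the table name taken from socrata_dataframe_describe.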

Parameters
Name | Required | Description
sql | Yes | SELECT-only SQL to run against registered canvas tables. DDL, DML, and file-reading functions are rejected. Use table names from socrata_dataframe_describe.
limit | No | Max rows to return (1–10000). Default 1000.
canvas_id | Yes | Canvas ID returned from socrata_query_dataset or socrata_dataframe_describe.

Output Schema

Name | Required | Description
sql | Yes | SQL that was executed.
rows | Yes | Query result rows. DuckDB may return native JS types (number, boolean, null) for numeric/boolean columns.
canvas_id | Yes | Canvas ID queried.
row_count | Yes | Number of rows returned.

Behavior: 4/5

Annotations already mark the tool as read-only and idempotent. The description adds behavioral details about DuckDB type inference (e.g., numeric columns from string data become queryable with comparisons) and reinforces the SELECT-only restriction, providing additional context beyond the annotations.

Conciseness: 5/5

The description is concise: four sentences that front-load the primary purpose, followed by the type-inference behavior, the prerequisite, and a pointer to the sibling tool, with no redundant information. Every sentence contributes to understanding the tool.

Completeness: 4/5

Given the presence of an output schema and high schema coverage, the description adequately covers prerequisites, type inference behavior, and complementary tools. It may lack details on error handling or performance, but for a query tool it is sufficiently complete.

Parameters: 4/5

Input schema has 100% description coverage for all three parameters. The description adds value by clarifying that SQL must be SELECT-only and that table names come from socrata_dataframe_describe, complementing the schema without repeating it.

Purpose: 5/5

The description clearly states the tool runs SELECT-only SQL against a DataCanvas table, specifying the resource and the type of operation. It distinguishes from sibling tools like socrata_dataframe_describe and socrata_query_dataset by noting that it works on tables populated by the latter, making its purpose unambiguous.

Usage Guidelines: 4/5

The description provides explicit guidance on when to use the tool by stating it only works when CANVAS_PROVIDER_TYPE=duckdb is set, and directs to socrata_dataframe_describe for schema exploration. However, it does not explicitly note when not to use it, though the condition is clear.

socrata_find_datasets: Find Socrata Datasets (A)
Annotations: Read-only, Idempotent

Search for datasets across all Socrata-powered government open-data portals, or scope to one portal with the domain parameter. Returns dataset IDs, names, abbreviated column lists, domains, and update timestamps. Use socrata_get_dataset to fetch the full typed column schema before writing queries — columnNames here are preview-only and lack type information.
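
A minimal sketch of paging through results with `limit` and `offset` until `total_count` is reached. The `call_tool` argument is a hypothetical stand-in for whatever MCP client function you use, and the shape of each item in `results` is not specified by this listing, so the fake below serves made-up IDs.

```python
def find_all_datasets(call_tool, query, domain=None, page_size=100):
    """Collect every match by advancing offset until total_count is reached."""
    collected = []
    offset = 0
    while True:
        args = {"query": query, "limit": page_size, "offset": offset}
        if domain is not None:
            args["domain"] = domain  # scope the search to one portal
        page = call_tool("socrata_find_datasets", args)
        collected.extend(page["results"])
        offset += page_size
        # Stop on an empty page or once every match has been seen.
        if not page["results"] or offset >= page["total_count"]:
            return collected


def fake_call_tool(name, args):
    """Stand-in for a real MCP client call; serves 250 made-up results."""
    data = [{"id": f"fake-{i:04d}"} for i in range(250)]
    start = args["offset"]
    return {"results": data[start:start + args["limit"]], "total_count": len(data)}


datasets = find_all_datasets(fake_call_tool, "permits", domain="data.seattle.gov")
print(len(datasets))  # 250
```

The page size of 100 matches the documented upper bound for `limit`; a smaller value works the same way with more round trips.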

Parameters
Name | Required | Description
only | No | Filter by asset type. Omit to include all types. Usually "datasets" is what you want.
tags | No | Filter by tags (e.g. ["covid19", "permits"]).
limit | No | Number of results to return (1–100). Default 10.
order | No | Sort order. Defaults to relevance. Use updated_at to surface recently-refreshed datasets.
query | No | Full-text search across dataset names and descriptions. Omit to browse without filtering.
domain | No | Scope search to a single portal (e.g. data.seattle.gov, data.cityofnewyork.us). Omit to search all portals.
offset | No | Pagination offset. Default 0.
categories | No | Filter by domain categories (e.g. ["Public Safety", "Transportation"]).

Output Schema

Name | Required | Description
query | No | Search query applied, for reference.
message | No | Recovery hint when results are empty — echoes filters and suggests how to broaden. Absent on non-empty result pages.
results | Yes | Matching datasets. Empty when no results.
total_count | Yes | Total matches before pagination. 0 when empty.

Behavior: 4/5

Annotations already provide readOnlyHint, openWorldHint, and idempotentHint. The description adds that column names lack type information (preview-only), which is critical behavioral context beyond annotations. No contradictions.

Conciseness: 5/5

The description is three well-structured sentences: the first states purpose and scope, the second lists the output, and the third gives usage guidance. No filler; every sentence earns its place.

Completeness: 5/5

Given 8 optional parameters, full schema coverage, and an output schema, the description adequately covers return values, scoping, and limitation warnings. No gaps remain.

Parameters: 3/5

Schema coverage is 100% with detailed parameter descriptions (enum values, defaults, ranges). The description references the 'domain' parameter but does not add significant meaning beyond the schema. Baseline 3 is appropriate.

Purpose: 5/5

The description clearly states the tool searches for datasets across Socrata portals, lists return fields (IDs, names, columns, etc.), and contrasts with sibling socrata_get_dataset. The verb 'search' and resource 'datasets' are explicit and distinguishable.

Usage Guidelines: 5/5

The description explicitly advises to use socrata_get_dataset for full schema before writing queries, and notes that columnNames here are preview-only. This is a clear when-to-use and when-not-to-use directive, referencing an alternative sibling.

socrata_get_dataset: Get Dataset Schema (A)
Annotations: Read-only, Idempotent

Fetch full metadata and column schema for a Socrata dataset by ID. Returns field names, data types, descriptions, row count, and licensing. Always call this before writing a socrata_query_dataset — the column types determine correct WHERE clause syntax: Number columns accept bare literals (year=2023) while Text columns require single-quoted strings (year='2023').
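
The quoting rule can be sketched as a small helper that picks the literal form from the column's data type. The function names are hypothetical, the type labels are assumed to be spelled "Number" and "Text" (compared case-insensitively), and doubling an embedded single quote is an assumption about SoQL string escaping. Dates, booleans, and other types are not handled here.

```python
def soql_literal(value, data_type):
    """Render a value as a SoQL literal based on the column's data type."""
    if data_type.lower() == "number":
        # Number columns take bare literals: year=2023
        return str(value)
    # Text columns take single-quoted strings; double any embedded quote.
    return "'" + str(value).replace("'", "''") + "'"


def where_equals(column, value, data_type):
    """Build a `column=literal` fragment for a WHERE clause."""
    return f"{column}={soql_literal(value, data_type)}"


print(where_equals("year", 2023, "Number"))  # year=2023
print(where_equals("year", 2023, "Text"))    # year='2023'
```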

Parameters
Name | Required | Description
domain | No | Portal domain (e.g. data.seattle.gov). Defaults to SOCRATA_DEFAULT_DOMAIN env var or data.seattle.gov.
dataset_id | Yes | Four-by-four dataset ID matching pattern like kzjm-xkqj. Obtain from socrata_find_datasets.

Output Schema

Name | Required | Description
name | Yes | Dataset display name.
tags | Yes | Associated tags.
domain | Yes | Portal domain hosting this dataset.
columns | Yes | Column schema. Computed region columns (:@computed_region_*) are excluded to reduce noise.
license | No | License name when available.
category | No | Domain category when available.
row_count | No | Approximate row count when available.
dataset_id | Yes | Four-by-four dataset ID.
description | No | Dataset description when available.
data_updated_at | No | ISO 8601 timestamp of last data update when available.

Behavior: 5/5

Annotations already indicate readOnlyHint=true and idempotentHint=true. The description adds valuable behavioral context: returns field names, data types, descriptions, row count, and licensing. Also warns about query implications. No contradiction.

Conciseness: 5/5

Three sentences, front-loaded with purpose. The last sentence provides essential guidance. No redundant information. Every word earns its place.

Completeness: 5/5

Given the presence of an output schema, description does not need to detail return values. It already mentions key return fields and provides critical usage context. The tool is simple and well-described.

Parameters: 4/5

Input schema has 100% coverage with descriptions for both parameters. Description adds beyond schema: explains domain default behavior and source for dataset_id (from socrata_find_datasets). This adds meaningful context, justifying a score above baseline 3.

Purpose: 5/5

The description clearly states the tool fetches full metadata and column schema for a Socrata dataset by ID. The verb 'Fetch' and resource 'metadata and column schema' are specific. It distinguishes from siblings like socrata_find_datasets (which finds datasets) and socrata_query_dataset (which queries data).

Usage Guidelines: 5/5

Description explicitly directs 'Always call this before writing a socrata_query_dataset' and explains why: column types determine correct WHERE clause syntax. Provides concrete examples (Number vs Text columns). This is excellent usage guidance.

socrata_list_portals: List Socrata Portals (A)
Annotations: Read-only, Idempotent

List known Socrata-powered government open-data portals with their domain, organization name, and dataset count. Backed by the Discovery API domains catalog. Filtering is client-side substring match on the query parameter. Use this first when you do not know which portal to target, then pass the domain to socrata_find_datasets.
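
The client-side filtering described above could look roughly like the following. This is a sketch of the technique, not the server's code: the `domain` and `organization` field names are hypothetical, and it assumes `total_count` is computed after filtering but before pagination.

```python
def filter_portals(portals, query=None, limit=50, offset=0):
    """Case-insensitive substring filter, then offset/limit pagination."""
    if query:
        needle = query.lower()
        portals = [
            p for p in portals
            if needle in p["domain"].lower()
            or needle in p["organization"].lower()
        ]
    return {
        "portals": portals[offset:offset + limit],
        "total_count": len(portals),
    }


# Sample entries for illustration only.
sample = [
    {"domain": "data.seattle.gov", "organization": "City of Seattle"},
    {"domain": "data.cityofnewyork.us", "organization": "City of New York"},
]

result = filter_portals(sample, query="SEATTLE")
print(result["total_count"], result["portals"][0]["domain"])  # 1 data.seattle.gov
```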

Parameters
Name | Required | Description
limit | No | Max portals to return (1–200). Default 50.
query | No | Keyword to filter portal names or organization names (case-insensitive substring match). Omit to list all portals.
offset | No | Pagination offset. Default 0.

Output Schema

Name | Required | Description
message | No | Recovery hint when no portals matched the filter. Absent on non-empty pages.
portals | Yes | Matching portals. Empty when no results.
total_count | Yes | Total portals before pagination. 0 when empty.

Behavior: 4/5

Discloses that filtering is client-side substring match, adding behavioral context beyond annotations (readOnlyHint, idempotentHint). Could also mention data freshness or performance implications, but overall informative.

Conciseness: 5/5

Four focused sentences with front-loaded purpose and closing usage guidance. No wasted words.

Completeness: 5/5

Given 100% schema description coverage, output schema present, and annotations covering safety, the description leaves no gaps: explains what is returned, client-side filtering, and tool workflow.

Parameters: 4/5

Schema covers all parameters; description adds value by specifying that query filtering is client-side and clarifying use of limit/offset. Enhances understanding beyond schema alone.

Purpose: 5/5

Clearly states the verb 'List' and resource 'portals', specifies returned fields (domain, organization name, dataset count), and distinguishes itself from siblings like socrata_find_datasets by being the first step when portal is unknown.

Usage Guidelines: 5/5

Explicitly advises to 'Use this first when you do not know which portal to target, then pass the domain to socrata_find_datasets', providing clear when-to-use and next step.

socrata_query_dataset: Query Dataset (A)
Annotations: Read-only, Idempotent

Execute a SoQL query against any dataset on any Socrata portal. Use the search parameter for quick full-text lookup, or combine select/where/group/having/order for full analytical control. Returns rows plus the assembled SoQL string so you can learn the pattern. All SODA 2.1 row values are strings even for numeric columns — check dataType from socrata_get_dataset to determine correct WHERE quoting: Number columns use bare literals (year=2023), Text columns use single-quoted strings (year='2023'). To enumerate distinct values, use select="col, count(*) as n" with group="col" and order="n DESC". When CANVAS_PROVIDER_TYPE=duckdb and rows fill the limit, results spill to a DataCanvas table for SQL-based analysis.
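
For orientation, the clauses map onto `$`-prefixed SODA query parameters on a dataset's resource endpoint. The sketch below assembles such a URL for the distinct-values pattern from the description. The function name `build_soda_url` is hypothetical, and how the server itself assembles requests is not shown in this listing; `col` is the placeholder column name from the description.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_soda_url(domain, dataset_id, select=None, where=None, group=None,
                   having=None, order=None, search=None, limit=100, offset=0):
    """Assemble a SODA 2.1 resource URL from SoQL clauses."""
    clauses = {
        "$select": select, "$where": where, "$group": group,
        "$having": having, "$order": order, "$q": search,
        "$limit": limit, "$offset": offset,
    }
    # Drop clauses that were not supplied so they do not appear in the URL.
    params = {k: v for k, v in clauses.items() if v is not None}
    return f"https://{domain}/resource/{dataset_id}.json?{urlencode(params)}"


# Enumerate distinct values of a column, most frequent first.
url = build_soda_url(
    "data.seattle.gov", "kzjm-xkqj",
    select="col, count(*) as n", group="col", order="n DESC",
)

# Decode the query string again to inspect what was sent.
print(parse_qs(urlsplit(url).query)["$select"])  # ['col, count(*) as n']
```

Leaving URL encoding to `urlencode` avoids hand-escaping spaces, commas, and parentheses in the clauses.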

Parameters
Name | Required | Description
group | No | SoQL GROUP BY clause. Requires an aggregate function in select.
limit | No | Max rows to return (1–5000). Default 100. Use with offset for pagination.
order | No | SoQL ORDER BY clause, e.g. "total_deaths DESC" or "date ASC".
where | No | SoQL WHERE clause. Check column dataType from socrata_get_dataset first — Number columns: year=2023, Text columns: year='2023'. Operators: =, !=, >, <, LIKE, IN(...), BETWEEN, IS NULL, starts_with(), contains(), AND, OR, NOT.
domain | No | Portal domain (e.g. data.seattle.gov). Defaults to SOCRATA_DEFAULT_DOMAIN or data.seattle.gov.
having | No | SoQL HAVING clause. Filters on aggregated results, e.g. count > 100.
offset | No | Row offset for pagination. Default 0.
search | No | Full-text search across all text columns ($q). For field-specific filtering, use where instead.
select | No | SoQL SELECT clause — column names, aliases, aggregates: "state, sum(deaths) as total_deaths". Omit for all columns.
canvas_id | No | Optional 10-char DataCanvas token from a prior call. Omit on first call when CANVAS_PROVIDER_TYPE=duckdb to mint a fresh canvas. Large result sets spill here automatically.
dataset_id | Yes | Four-by-four dataset ID (e.g. kzjm-xkqj). Obtain from socrata_find_datasets.

Output Schema

Name | Required | Description
rows | Yes | Result rows. Scalar values are strings (SODA 2.1); geo/location columns return nested objects. Use column schema from socrata_get_dataset for type context.
domain | Yes | Portal domain queried.
canvas_id | No | DataCanvas token when results spilled (requires CANVAS_PROVIDER_TYPE=duckdb). Pass to socrata_dataframe_query for SQL over the full result set.
row_count | Yes | Rows returned in this response.
dataset_id | Yes | Dataset ID queried.
total_count | No | Total matching rows when result is truncated (row_count < total_count). Absent when the full result fits.
assembled_query | Yes | SoQL clauses assembled for this request — useful for learning the syntax.

Behavior: 5/5

Annotations already indicate read-only and idempotent. The description adds significant behavioral details: return of SoQL string, string formatting for all values (including numeric), dependence on dataType for quoting, and spillover to DataCanvas under specific conditions. No contradictions with annotations.

Conciseness: 4/5

The description is detailed but well-structured, front-loading the core purpose. Every sentence adds useful information, though the length could be trimmed slightly without losing meaning. It earns its space with critical quoting and spillover details.

Completeness: 5/5

Given the tool's complexity (11 parameters, output schema present), the description covers all essential behavioral aspects: quoting rules, spillover mechanism, return of SoQL string, and caveats about string values. No major gaps are apparent.

Parameters: 4/5

Schema descriptions cover all parameters (100% coverage), providing a baseline of 3. The description adds extra context for several parameters, such as search being full-text ($q), domain defaulting to SOCRATA_DEFAULT_DOMAIN, and canvas_id usage for spillover, which increases value.

Purpose: 4/5

The description clearly states the tool executes SoQL queries on any Socrata portal, specifying both quick search and full analytical control. However, it does not explicitly differentiate from the sibling tool socrata_dataframe_query, which likely also performs queries.

Usage Guidelines: 4/5

Guidance is provided on when to use search vs. full SoQL clauses, and tips for quoting and distinct enumeration. It also mentions spillover behavior. However, it lacks explicit when-not-to-use or comparisons with sibling tools like socrata_dataframe_query.
