summarize_resource
Profiles a data file and computes per-column statistics: counts, types, nulls, distinct values, min/max/mean, and top values. This lets an AI model choose filters and aggregations without loading raw rows.
Instructions
Auto-generated profile: row count, types, nulls, distinct, min/max/mean, top values.
Downloads the file (up to 100 MB) and runs DuckDB COUNT/DISTINCT/aggregate queries on each column. Returns one compact dict of statistics per column. The model uses this to decide which filters and aggregations to apply next, without any raw rows in its context. For columns with many distinct values (e.g. names), 'top_values' is omitted and only counts are returned.
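The per-column profiling described above can be sketched roughly as follows. This is not the tool's implementation: it swaps in Python's built-in `sqlite3` for DuckDB so that it runs without extra dependencies, and the function name `profile_table`, the dict keys, and the `max_distinct_for_top` cutoff are all assumptions made for illustration.

```python
import sqlite3


def profile_table(conn, table, top_n=5, max_distinct_for_top=20):
    """Return one compact stats dict per column (hypothetical field names).

    Assumes `table` and its column names are trusted identifiers.
    """
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    row_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    columns = []
    for col in cols:
        q = f'"{col}"'
        # One aggregate query per column: non-null count, distinct, min, max.
        non_null, distinct, lo, hi = conn.execute(
            f"SELECT COUNT({q}), COUNT(DISTINCT {q}), MIN({q}), MAX({q}) FROM {table}"
        ).fetchone()
        types = sorted(
            r[0]
            for r in conn.execute(
                f"SELECT DISTINCT typeof({q}) FROM {table} WHERE {q} IS NOT NULL"
            )
        )
        stats = {
            "name": col,
            "types": types,
            "count": non_null,
            "nulls": row_count - non_null,
            "distinct": distinct,
            "min": lo,
            "max": hi,
        }
        # A mean only makes sense when every non-null value is numeric.
        if types and set(types) <= {"integer", "real"}:
            stats["mean"] = conn.execute(
                f"SELECT AVG({q}) FROM {table}"
            ).fetchone()[0]
        # Skip top values for high-cardinality columns (assumed cutoff).
        if distinct <= max_distinct_for_top:
            stats["top_values"] = conn.execute(
                f"SELECT {q}, COUNT(*) AS n FROM {table} WHERE {q} IS NOT NULL "
                f"GROUP BY {q} ORDER BY n DESC, {q} LIMIT ?",
                (top_n,),
            ).fetchall()
        columns.append(stats)
    return {"row_count": row_count, "column_count": len(cols), "columns": columns}


# Tiny in-memory table standing in for a downloaded file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (city TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("Oslo", 700), ("Oslo", 710), ("Bergen", 290), (None, 100)],
)
result = profile_table(conn, "t")
```

The result holds only aggregates, so a model reading it sees the shape of the data (null counts, cardinality, ranges) without any raw rows.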
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Direct URL to the file (CKAN resource 'url' field). | |
| format | Yes | Format declared in CKAN. Accepts: csv, tsv, xlsx, json. | |
| max_categorical_top_n | No | Top-N most-frequent values per categorical column (1-50). | |
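As an illustration of the input constraints in the table above, the following sketch builds and checks an argument dict before calling the tool. The helper `build_arguments` and the example URL are hypothetical, and lowercasing the format is an assumption; the table states only the accepted values and the 1-50 range.

```python
ACCEPTED_FORMATS = {"csv", "tsv", "xlsx", "json"}


def build_arguments(url, format, max_categorical_top_n=None):
    """Validate inputs against the documented schema and return the arguments."""
    fmt = format.strip().lower()  # assumption: formats are compared in lowercase
    if fmt not in ACCEPTED_FORMATS:
        raise ValueError(f"unsupported format: {format!r}")
    args = {"url": url, "format": fmt}
    # The optional parameter is only sent when the caller sets it.
    if max_categorical_top_n is not None:
        if not 1 <= max_categorical_top_n <= 50:
            raise ValueError("max_categorical_top_n must be between 1 and 50")
        args["max_categorical_top_n"] = max_categorical_top_n
    return args


# Hypothetical resource URL, for illustration only.
arguments = build_arguments("https://example.org/data.csv", "CSV", 10)
```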
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| error | No | | |
| hint | No | | |
| source_url | No | | |
| format | No | | |
| cache | No | | |
| row_count | No | | |
| column_count | No | | |
| columns | No | | |
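The output schema lists field names but not their types, so the following is only a sketch of how a caller might read a response. It assumes that `error` and `hint` are strings, that `columns` is a list of dicts, and that each column dict has a `name` key; none of this is confirmed by the schema, and `describe_profile` is a hypothetical helper.

```python
def describe_profile(response):
    """Summarize a response dict in one line, handling the error case first."""
    if response.get("error"):
        hint = response.get("hint")
        suffix = f" (hint: {hint})" if hint else ""
        return f"error: {response['error']}{suffix}"
    # Assumed shape: each entry in "columns" is a dict with a "name" key.
    names = [col.get("name", "?") for col in response.get("columns", [])]
    return (
        f"{response.get('row_count')} rows, "
        f"{response.get('column_count')} columns: {', '.join(names)}"
    )


# Made-up responses, for illustration only.
ok = {"row_count": 4, "column_count": 2, "columns": [{"name": "city"}, {"name": "population"}]}
failed = {"error": "file too large", "hint": "limit is 100 MB"}
```

Since every output field is marked as not required, the sketch reads each one with `.get` instead of indexing directly.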