aggregate_resource
Run GROUP BY and aggregations on cached resources from datos.gob.do without writing SQL. Specify aggregations, grouping, filters, and ordering via JSON arguments.
Instructions
Run GROUP BY + aggregations against a cached resource without writing SQL.
Typed wrapper that builds safe DuckDB queries from JSON. Example usage: "How many employees by status in April 2026?" → aggregations=[{col: null, fn: count, alias: empleados}], group_by=["Estatus"], filters=[{col:"Año",op:"=",val:2026},{col:"Mes",op:"=",val:"Abril"}], order_by=[{col:"empleados",dir:"desc"}].
First call downloads + caches the file. Subsequent calls reuse the cache. Returns one row per group with the aggregation values.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Direct URL to the file (CKAN resource 'url' field). | |
| format | Yes | Format declared in CKAN. Accepts: csv, tsv, xlsx, json. | |
| aggregations | Yes | List of {col, fn, alias}. Valid fns: count, count_distinct, sum, avg, mean, median, min, max, stddev, variance. col=null or col='*' means COUNT(*). Example: [{"col":null,"fn":"count","alias":"empleados"},{"col":"Sueldo Bruto","fn":"sum","alias":"masa_salarial"}]. | |
| group_by | No | Columns to GROUP BY. Example: ["Estatus","Mes"]. | |
| filters | No | Same syntax as filter_resource.filters. Applied before grouping. | |
| having | No | Post-aggregation filter on aggregation aliases. Example: [{"col":"empleados","op":">","val":10}]. | |
| order_by | No | Same syntax as filter_resource.order_by. Refs aggregation aliases or group cols. | |
| limit | No | Max groups to return (1-1000). |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| error | No | ||
| hint | No | ||
| source_url | No | ||
| format | No | ||
| cache | No | ||
| groups_returned | No | ||
| columns | No | ||
| limit | No | ||
| rows | No |