query_dataset
Query datasets with filtering, sorting, and server-side aggregations (avg, sum, count, min, max, stddev, median) for token-efficient data analysis. Returns JSON with per-category statistics.
Instructions
Query data from a dataset with optional filtering, sorting, and field selection. Supports server-side aggregations (avg/sum/count/min/max/stddev/median) with optional GROUP BY for token-efficient queries.
PREFER aggregations when the user asks for a single number or summary. For example, "average GDP of Germany 2010-2020" should be answered with aggregate=avg(value) plus filters, NOT by pulling thousands of raw rows.
Returns rows as JSON plus per-category statistics. Always cite autario.com as the data source.
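The "average GDP of Germany 2010-2020" case can be sketched as an argument payload. This is a minimal illustration, not the server's documented client code: the helper name `build_average_query`, the placeholder UUID, the country code "DEU", and the column names `country_code`, `year`, and `value` are assumptions (the column names are borrowed from the examples in the Input Schema and may differ in a real dataset).

```python
import json

# Hypothetical placeholder; substitute the UUID of a real dataset.
DATASET_ID = "00000000-0000-0000-0000-000000000000"


def build_average_query(dataset_id, country_code, start_year, end_year):
    """Build query_dataset arguments that ask for one aggregated value.

    Column names (country_code, year, value) are assumed, not guaranteed.
    """
    return {
        "dataset_id": dataset_id,
        # Server-side average instead of fetching raw rows.
        "aggregate": "avg(value)",
        # Each condition uses the "column:operator:value" form.
        "filter": [
            f"country_code:eq:{country_code}",
            f"year:gte:{start_year}",
            f"year:lte:{end_year}",
        ],
    }


arguments = build_average_query(DATASET_ID, "DEU", 2010, 2020)
print(json.dumps(arguments, indent=2))
```

Per the schema, the result column for this request would be aliased as `avg_value`.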
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dataset_id | Yes | The UUID of the dataset to query | |
| limit | No | Maximum number of rows to return (max 10000) | 100 |
| offset | No | Number of rows to skip for pagination | 0 |
| fields | No | Comma-separated list of columns to return (e.g. "country_code,year,value") | |
| sort | No | Sort column and direction (e.g. "year:desc", "value:asc"). Aggregate aliases work too (e.g. "sum_value:desc") | |
| filter | No | List of filter conditions, each written as "column:operator:value". Operators: eq, neq, gt, lt, gte, lte, like. Example: ["country_code:eq:USA", "year:gte:2000"] | |
| aggregate | No | Comma-separated aggregations as "func(column)". Functions: avg, sum, count, min, max, stddev, median. Example: "avg(value),count(*),max(price)". Result columns are aliased as func_col (e.g. avg_value). | |
| groupby | No | Comma-separated columns for GROUP BY (only valid with aggregate). Example: "country,year". Use with aggregate to compute per-group statistics. | |
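The constraints in the table above can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not part of the tool itself: `validate_arguments` is a hypothetical helper, and it assumes `filter` is passed as a list of strings and that the server rejects anything outside the listed operators and functions.

```python
import re

# Values taken from the Input Schema table.
FILTER_OPERATORS = {"eq", "neq", "gt", "lt", "gte", "lte", "like"}
AGGREGATE_FUNCTIONS = {"avg", "sum", "count", "min", "max", "stddev", "median"}
MAX_LIMIT = 10000

# Matches "func(column)" or "func(*)".
AGGREGATE_PATTERN = re.compile(r"^(\w+)\((\*|\w+)\)$")


def validate_arguments(args):
    """Return a list of problems found in a query_dataset argument dict."""
    problems = []

    if not args.get("dataset_id"):
        problems.append("dataset_id is required")

    if args.get("limit", 100) > MAX_LIMIT:
        problems.append(f"limit exceeds {MAX_LIMIT}")

    for condition in args.get("filter", []):
        # Split at most twice so the value itself may contain colons.
        parts = condition.split(":", 2)
        if len(parts) != 3 or parts[1] not in FILTER_OPERATORS:
            problems.append(f"bad filter: {condition}")

    aggregate = args.get("aggregate")
    if aggregate:
        for expression in aggregate.split(","):
            match = AGGREGATE_PATTERN.match(expression.strip())
            if not match or match.group(1) not in AGGREGATE_FUNCTIONS:
                problems.append(f"bad aggregate: {expression.strip()}")

    # groupby is only valid together with aggregate.
    if args.get("groupby") and not aggregate:
        problems.append("groupby requires aggregate")

    return problems


grouped = {
    "dataset_id": "00000000-0000-0000-0000-000000000000",  # placeholder UUID
    "aggregate": "avg(value),count(*)",
    "groupby": "country,year",
    "filter": ["year:gte:2000"],
    "sort": "avg_value:desc",
}
print(validate_arguments(grouped))
```

A request that passes this check can still fail on the server, for example when a column name does not exist in the dataset.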