stats
Compute summary statistics (count, sum, mean, median, min, max, stddev) for filtered data, with optional group-by support.
Instructions
Aggregates statistics (count, sum, mean, median, min, max, stddev) for one measure across all rows matching the filters, optionally grouped by a dimension.
Without group_by: returns one stats payload over all matching rows.
With group_by: returns per-group stats in a single call. This suits
"distribution of X by Y" queries that would otherwise require N filtered
calls, one per group.
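To make the two modes concrete, here is a minimal local sketch of what they compute. It is not the tool's implementation: `summarize`, `grouped_stats`, and the sample rows are hypothetical, and whether the tool reports sample or population standard deviation is not stated, so the sketch assumes sample.

```python
from collections import defaultdict
import statistics


def summarize(values):
    """Return the seven summary statistics named in the description."""
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        # Assumption: sample standard deviation; the tool may use population.
        "stddev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


def grouped_stats(rows, measure, group_by=None):
    """Mimic the two response shapes: one payload, or one payload per group."""
    if group_by is None:
        return {"statistics": summarize([row[measure] for row in rows])}
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group_by]].append(row[measure])
    return {
        "by": group_by,
        "groups": [
            {"key": key, "statistics": summarize(values)}
            for key, values in sorted(buckets.items())
        ],
    }


# Made-up rows, for illustration only.
rows = [
    {"state": "NSW", "income": 50000},
    {"state": "NSW", "income": 60000},
    {"state": "ACT", "income": 70000},
]
result = grouped_stats(rows, "income", group_by="state")
```

Without `group_by`, the same rows would need one filtered call per state to produce the per-state figures that `result["groups"]` holds here.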
Examples:
# Single aggregate over NSW postcodes
stats("IND_POSTCODE_MEDIAN", "median_taxable_income_2022_23",
filters={"state": "nsw"})
# → {statistics: {count: 587, mean: 55017, median: 53484, ...}}
# Stats grouped by state — one call instead of 8
stats("IND_POSTCODE_MEDIAN", "median_taxable_income_2022_23",
group_by="state")
# → {by: "state", groups: [
# {key: "ACT", statistics: {...}},
# {key: "NSW", statistics: {...}},
# ...
# ]}
# Tax payable per income year across the corporate sector
stats("CORP_TRANSPARENCY", "tax_payable", group_by="income_year")
Returns:
Without group_by: dict with statistics field.
With group_by: dict with by and groups fields; each group
carries key, statistics, plus the same envelope
metadata (dataset_id, unit, attribution, etc.).
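Because the response shape depends on whether group_by was set, calling code may want to normalise both shapes before use. The helper below is a sketch under that assumption: `iter_statistics` is a hypothetical name, responses are assumed to arrive as plain Python dicts, and the per-group counts in the sample are placeholders rather than real results.

```python
def iter_statistics(response):
    """Yield (key, statistics) pairs from either response shape.

    Ungrouped responses yield a single pair whose key is None.
    """
    if "groups" in response:
        for group in response["groups"]:
            yield group["key"], group["statistics"]
    else:
        yield None, response["statistics"]


# Ungrouped shape, using the figures from the NSW example above.
single = {"statistics": {"count": 587, "mean": 55017, "median": 53484}}

# Grouped shape; the counts here are placeholders, not real data.
grouped = {
    "by": "state",
    "groups": [
        {"key": "ACT", "statistics": {"count": 1}},
        {"key": "NSW", "statistics": {"count": 2}},
    ],
}

for key, stats_payload in iter_statistics(grouped):
    print(key, stats_payload["count"])
```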
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dataset_id | Yes | Curated dataset ID. Use search_datasets() / list_curated(). | |
| measure | Yes | The measure key to aggregate over. Use describe_dataset() to see available measures. | |
| filters | No | Optional dimension filters — same shape as get_data. | |
| group_by | No | Optional dimension key to partition rows by. When set, returns per-group statistics instead of a single aggregate. Caps at 200 groups to keep responses bounded; exceeding the cap returns the first 200 groups by row order and sets a `groups_truncated` flag in the response. | |
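Since a capped response silently omits groups beyond the first 200, callers may want to check the flag before treating the result as complete. The sketch below assumes `groups_truncated` is a top-level boolean that is absent or false when nothing was cut; the table does not say where the flag sits, and `complete_groups` is a hypothetical helper.

```python
def complete_groups(response):
    """Return the groups list, or raise if the 200-group cap cut it short."""
    # Assumption: the flag is a top-level boolean, absent when not truncated.
    if response.get("groups_truncated", False):
        raise ValueError(
            "group_by result was truncated at the group cap; "
            "narrow the filters or group by a coarser dimension"
        )
    return response["groups"]


# Placeholder responses, for illustration only.
ok = {"by": "state", "groups": [{"key": "ACT", "statistics": {}}]}
capped = {"by": "postcode", "groups": [], "groups_truncated": True}

groups = complete_groups(ok)
```

Raising rather than returning a partial list is a design choice for this sketch: it keeps a truncated grouping from being mistaken for the full distribution.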
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |