top_n
Retrieve the top or bottom N records from a WGEA dataset ranked by a numeric measure, such as number of employees or responses. Supports filtering by dimensions and reporting year for targeted analysis.
Instructions
Return the N rows with the largest (or smallest) value of a measure.
Ranks across one WGEA reporting year (the latest by default, or a
specific year via reporting_year=). This is the most common agent
workflow — "show me the top 10 X by Y" — collapsed into a single
server-side call: rank-and-slice happens on the server so the agent
never has to fetch a full table just to take the top of it.
Examples:

    # 10 employers with the most women managers (latest reporting year)
    top_n("WORKFORCE_COMPOSITION", "n_employees", n=10,
          filters={"gender": "Women", "manager_category": "Manager"})

    # 5 ANZSIC divisions with the fewest Yes responses on Gender Pay Gap
    top_n("GENDER_EQUALITY_ACTIONS", "n_responses", n=5, direction="bottom",
          filters={"section": "Gender Pay Gap", "response": "Yes"})

    # Top 5 employers in Mining by total workforce in 2023-24
    top_n("WORKFORCE_COMPOSITION", "n_employees", n=5,
          filters={"anzsic_division": "Mining"},
          reporting_year="2023-24")

Returns:
DataResponse with at most n records, sorted by measure value
in the requested direction. Other fields (reporting_year, unit,
attribution) match a regular get_data call.
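The rank-and-slice behaviour described above can be illustrated with a small local sketch. This is not the server's implementation, which the documentation does not show; it only mirrors the stated contract (sort by the measure, keep at most n rows, in the requested direction). The `rank_and_slice` helper, the `employer` field, and the sample rows are hypothetical.

```python
from typing import Any


def rank_and_slice(
    records: list[dict[str, Any]],
    measure: str,
    n: int,
    direction: str = "top",
) -> list[dict[str, Any]]:
    """Return at most n records ordered by `measure`.

    Hypothetical local stand-in for what top_n does server-side.
    """
    if direction not in ("top", "bottom"):
        raise ValueError("direction must be 'top' or 'bottom'")
    # 'top' sorts largest-first, 'bottom' sorts smallest-first.
    ordered = sorted(
        records,
        key=lambda row: row[measure],
        reverse=(direction == "top"),
    )
    return ordered[:n]


# Made-up rows for illustration only; not real WGEA data.
rows = [
    {"employer": "A", "n_employees": 120},
    {"employer": "B", "n_employees": 45},
    {"employer": "C", "n_employees": 300},
    {"employer": "D", "n_employees": 80},
]

top_two = rank_and_slice(rows, "n_employees", n=2)
bottom_two = rank_and_slice(rows, "n_employees", n=2, direction="bottom")
```

Doing this on the server means only the n ranked rows cross the wire, rather than the full table.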
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dataset_id | Yes | Curated dataset ID. Use search_datasets() / list_curated(). | |
| measure | Yes | Numeric measure column to rank by. WGEA measures are `n_employees` (WORKFORCE_COMPOSITION, WORKFORCE_MANAGEMENT) or `n_responses` (the other five questionnaire datasets). Use describe_dataset() to confirm. | |
| n | No | How many top (or bottom) rows to return. | |
| filters | No | Optional dimension filters, same shape as get_data. | |
| direction | No | 'top' returns the N rows with the LARGEST measure values (highest n_employees, biggest n_responses, etc.). 'bottom' returns the SMALLEST. | top |
| reporting_year | No | Optional single WGEA reporting year to restrict the ranking to. Format: 'YYYY-YY' (e.g. '2024-25') or 'YYYY' (e.g. '2024'). Defaults to the latest reporting year present in the data so the rank is a clean 'top N at the current reporting year' view. | |
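A caller may want to check the `reporting_year` format before sending a request. The sketch below accepts the two documented shapes, 'YYYY-YY' and 'YYYY'. The helper name is hypothetical, and the rule that the two-digit suffix must be the following year is an assumption inferred from the examples ('2023-24', '2024-25'); the server's own validation may differ.

```python
import re

# Four-digit start year, optionally followed by "-YY".
_REPORTING_YEAR = re.compile(r"(\d{4})(?:-(\d{2}))?")


def looks_like_reporting_year(value: str) -> bool:
    """Client-side format check for reporting_year (hypothetical helper)."""
    match = _REPORTING_YEAR.fullmatch(value)
    if match is None:
        return False
    start, suffix = match.groups()
    if suffix is None:
        # Plain 'YYYY' form.
        return True
    # Assumption: the suffix is the last two digits of the next year.
    return int(suffix) == (int(start) + 1) % 100
```

For example, `looks_like_reporting_year("2024-25")` passes, while `"2024/25"` and `"2024-26"` are rejected under this assumption.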
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dataset_id | Yes | ||
| dataset_name | Yes | ||
| query | No | ||
| reporting_year | No | ||
| period | No | Canonical period bounds {start, end} for cross-sister consumers. Populated alongside the WGEA-specific reporting_year. For a single reporting year, both bounds match; for multi-year spans, they bracket the range. | |
| unit | No | ||
| row_count | No | ||
| records | No | ||
| csv | No | ||
| source | No | Workplace Gender Equality Agency | |
| attribution | No | Source: Workplace Gender Equality Agency. Licensed under Creative Commons Attribution 3.0 Australia (https://creativecommons.org/licenses/by/3.0/au/). Original dataset: https://data.gov.au/data/dataset/wgea-dataset | |
| retrieved_at | Yes | ||
| source_url | Yes | ||
| download_url | No | ||
| did_you_mean | No | ||
| stale | No | ||
| stale_reason | No | ||
| truncated_at | No | ||
| server_version | No | ||
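Several of the optional output fields (`stale`, `stale_reason`, `truncated_at`, `did_you_mean`, `row_count`) look like signals a caller should inspect before trusting a ranking. The sketch below assumes the response arrives as a plain dict keyed by the field names in the table. The schema does not describe the meaning or types of these fields, so the interpretation in the comments is an assumption, and the helper and sample values are hypothetical.

```python
from typing import Any


def response_warnings(response: dict[str, Any]) -> list[str]:
    """Collect human-readable caveats from a top_n response (hypothetical helper)."""
    warnings: list[str] = []
    records = response.get("records") or []

    # Assumption: row_count, when present, counts the returned records.
    row_count = response.get("row_count")
    if row_count is not None and row_count != len(records):
        warnings.append(
            f"row_count is {row_count} but {len(records)} records were returned"
        )

    # Assumption: stale is truthy when the server considers its data outdated.
    if response.get("stale"):
        reason = response.get("stale_reason") or "no reason given"
        warnings.append(f"data may be stale: {reason}")

    # Assumption: truncated_at is set only when the result was cut short.
    if response.get("truncated_at") is not None:
        warnings.append(f"result truncated at {response['truncated_at']}")

    if response.get("did_you_mean"):
        warnings.append(f"server suggestion: {response['did_you_mean']}")

    return warnings


# Made-up response for illustration only.
sample = {
    "dataset_id": "WORKFORCE_COMPOSITION",
    "row_count": 2,
    "records": [{"n_employees": 300}, {"n_employees": 120}],
    "stale": True,
    "stale_reason": "upstream refresh pending",
}
caveats = response_warnings(sample)
```

An empty list means none of these optional signals were set on the response.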