get_data

Query a curated WGEA dataset and return observations.

Examples:

# Gender breakdown at Commonwealth Bank
resp = await get_data(
    "WORKFORCE_COMPOSITION",
    filters={"employer_name": "Commonwealth Bank"},
)

# Promotions to manager by gender at Westpac in 2024-25
resp = await get_data(
    "WORKFORCE_MANAGEMENT",
    filters={"employer_name": "Westpac", "movement_type": "Promotions",
             "manager_category": "Managers"},
    start_period="2024-25",
    end_period="2024-25",
)

# Which employers in mining set gender targets?
resp = await get_data(
    "GENDER_EQUALITY_ACTIONS",
    filters={"anzsic_division": "Mining",
             "section": "Gender Pay Gap",
             "response": "Yes"},
)

# Sexual harassment policy responses across financial services
resp = await get_data(
    "HARM_PREVENTION",
    filters={"anzsic_division": "Financial and Insurance Services",
             "subsection": "Sexual Harassment"},
)

Returns: a DataResponse containing records (or csv), unit, reporting_year, row_count, the source URL, the download_url actually used, "did you mean?" fuzzy hints when the employer-name filter has no exact match, and the CC-BY 3.0 AU attribution.
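A minimal sketch of how a caller might react to the fields listed above, in particular the "did you mean?" hints. It assumes the response has been converted to a plain dict and that `did_you_mean` is a list of strings; neither is confirmed by this page, and `summarize_response` is a hypothetical helper, not part of the tool.

```python
from typing import Any


def summarize_response(resp: dict[str, Any]) -> str:
    """Return a one-line summary, surfacing fuzzy hints when nothing matched."""
    rows = resp.get("row_count") or 0
    # Assumption: did_you_mean is a list of suggested employer names.
    hints = resp.get("did_you_mean") or []
    if rows == 0 and hints:
        return "No rows; did you mean: " + ", ".join(hints)
    year = resp.get("reporting_year") or "unknown year"
    return f"{rows} rows for {year} from {resp.get('source_url')}"


# Hypothetical payload shaped after the field names in the output schema.
no_match = {
    "row_count": 0,
    "did_you_mean": ["Commonwealth Bank of Australia"],
    "source_url": "https://data.gov.au/data/dataset/wgea-dataset",
}
print(summarize_response(no_match))
```

When `row_count` is zero and hints are present, retrying with one of the suggested names is usually the simplest next step.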

Name	Required	Description	Default
`format`	No	Response shape. 'records' (default): flat list of observations. 'series': grouped by measure. 'csv': pandas CSV string in `csv` field.	records
`filters`	No	Dimension filters. Keys are plain-English aliases from the dataset's describe_dataset response. Values are matched against the source data; pass a list to OR across values. Permissive dimensions (e.g. employer_name, question_text) accept any string and support fuzzy matching — try {'employer_name': 'CBA'} or {'employer_name': 'commonwealth*'} for wildcard substring search.
`max_rows`	No	Cap on returned rows after filtering. Max 10000. Tighten filters to narrow further.	2000
`dataset_id`	Yes	Curated dataset ID. Use the search or list-curated endpoint/tool to discover.
`end_period`	No	Inclusive end reporting year. Same format as start_period.
`start_period`	No	Inclusive start reporting year. Format: 'YYYY-YY' (e.g. '2023-24') or 'YYYY' (matched against WGEA's reporting_year column). Bare int years like 2023 are coerced to '2023' automatically.
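The parameter rules above can be checked on the client before a call is made. The sketch below is an illustration only: `normalize_period` and `build_args` are hypothetical helpers, and rejecting an out-of-range `max_rows` is a client-side choice here, since this page does not say whether the server clamps or rejects such values.

```python
import re
from typing import Any, Optional, Union

# Accepts 'YYYY' or 'YYYY-YY', the two formats described for start_period.
_PERIOD_RE = re.compile(r"\d{4}(-\d{2})?")


def normalize_period(value: Union[int, str, None]) -> Optional[str]:
    """Coerce a bare int year to a string and validate the period format."""
    if value is None:
        return None
    text = str(value).strip()
    if not _PERIOD_RE.fullmatch(text):
        raise ValueError(f"expected 'YYYY' or 'YYYY-YY', got {value!r}")
    return text


def build_args(
    dataset_id: str,
    filters: Optional[dict[str, Any]] = None,
    start_period: Union[int, str, None] = None,
    end_period: Union[int, str, None] = None,
    max_rows: int = 2000,
) -> dict[str, Any]:
    """Assemble get_data arguments, validating them against the documented limits."""
    if not 1 <= max_rows <= 10000:
        raise ValueError("max_rows must be between 1 and 10000")
    args: dict[str, Any] = {"dataset_id": dataset_id, "max_rows": max_rows}
    if filters:
        args["filters"] = dict(filters)
    for key, value in (("start_period", start_period), ("end_period", end_period)):
        period = normalize_period(value)
        if period is not None:
            args[key] = period
    return args


# A list value ORs across employers, as described for `filters`.
args = build_args(
    "WORKFORCE_COMPOSITION",
    filters={"employer_name": ["Westpac", "Commonwealth Bank"]},
    start_period=2023,
    end_period="2024-25",
)
print(args)
```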

Name	Required	Description	Default
`csv`	No
`unit`	No
`query`	No
`stale`	No
`period`	No	Canonical period bounds {start, end} for cross-sister consumers. Populated alongside the wgea-specific reporting_year. For a single reporting year both bounds match; for multi-year spans they bracket the range.
`source`	No		Workplace Gender Equality Agency
`records`	No
`row_count`	No
`dataset_id`	Yes
`source_url`	Yes
`attribution`	No		Source: Workplace Gender Equality Agency. Licensed under Creative Commons Attribution 3.0 Australia (https://creativecommons.org/licenses/by/3.0/au/). Original dataset: https://data.gov.au/data/dataset/wgea-dataset
`dataset_name`	Yes
`did_you_mean`	No
`download_url`	No
`retrieved_at`	Yes
`stale_reason`	No
`truncated_at`	No
`reporting_year`	No
`server_version`	No
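The output fields above carry no type information on this page, so the following is a sketch under stated assumptions: the response is a plain dict, `csv` is a CSV string with a header row, `records` is a list of dicts, `truncated_at` is a row count or null, and `stale` is a boolean. `rows_from_response` and `warnings_for` are hypothetical helpers, and the column names in the sample are invented for illustration.

```python
import csv
import io
from typing import Any


def rows_from_response(resp: dict[str, Any]) -> list[dict[str, Any]]:
    """Return rows as dicts whether the response used 'csv' or 'records' format."""
    if resp.get("csv"):
        return list(csv.DictReader(io.StringIO(resp["csv"])))
    return list(resp.get("records") or [])


def warnings_for(resp: dict[str, Any]) -> list[str]:
    """Collect caveats a caller should show alongside the data."""
    notes = []
    if resp.get("truncated_at") is not None:
        notes.append(f"truncated at {resp['truncated_at']} rows")
    if resp.get("stale"):
        notes.append(f"stale: {resp.get('stale_reason') or 'no reason given'}")
    return notes


# Hypothetical payload; the column names are not taken from a real dataset.
sample = {
    "csv": "employer_name,gender,value\nExample Pty Ltd,Women,10\n",
    "truncated_at": 2000,
    "stale": False,
}
rows = rows_from_response(sample)
print(rows, warnings_for(sample))
```

Values parsed from the `csv` field arrive as strings, so numeric columns need explicit conversion before any arithmetic.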
