get_data

Retrieve workforce composition, pay gap, and gender equality data from the WGEA dataset by applying filters on employer, industry, or reporting period.

Instructions

Query a curated WGEA dataset and return observations.

Examples:

# Gender breakdown at Commonwealth Bank
resp = await get_data(
    "WORKFORCE_COMPOSITION",
    filters={"employer_name": "Commonwealth Bank"},
)

# Promotions to manager by gender at Westpac in 2024-25
resp = await get_data(
    "WORKFORCE_MANAGEMENT",
    filters={"employer_name": "Westpac", "movement_type": "Promotions",
             "manager_category": "Managers"},
    start_period="2024-25",
    end_period="2024-25",
)

# Which employers in mining set gender targets?
resp = await get_data(
    "GENDER_EQUALITY_ACTIONS",
    filters={"anzsic_division": "Mining",
             "section": "Gender Pay Gap",
             "response": "Yes"},
)

# Sexual harassment policy responses across financial services
resp = await get_data(
    "HARM_PREVENTION",
    filters={"anzsic_division": "Financial and Insurance Services",
             "subsection": "Sexual Harassment"},
)

Returns: DataResponse with records (or csv), unit, reporting_year, row_count, source URL, the actual download_url used, "did you mean?" fuzzy hints if the employer-name filter didn't match exactly, and CC-BY 3.0 AU attribution.
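
The return fields listed above suggest a simple pattern for a caller: check row_count first, and fall back to the fuzzy hints when nothing matched. The following is a minimal sketch, not part of the server itself. It assumes the response has been converted to a plain dict and that did_you_mean is a list of candidate strings; the helper name summarize_response is hypothetical.

```python
from typing import Any


def summarize_response(resp: dict[str, Any]) -> str:
    """Return a one-line summary of a DataResponse-shaped dict.

    Assumption: did_you_mean is a list of suggested employer names.
    """
    hints = resp.get("did_you_mean") or []
    rows = resp.get("row_count") or 0
    if rows == 0 and hints:
        # Nothing matched, so surface the server's suggestions instead.
        return "No rows matched; did you mean: " + ", ".join(hints)
    year = resp.get("reporting_year") or "unknown year"
    source = resp.get("source_url", "unknown source")
    return f"{rows} rows for {year} from {source}"
```

A caller could use the hint branch to retry the query with one of the suggested names rather than reporting an empty result.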

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| format | No | Response shape. 'records' (default): flat list of observations. 'series': grouped by measure. 'csv': pandas CSV string in `csv` field. | records |
| filters | No | Dimension filters. Keys are plain-English aliases from the dataset's describe_dataset response. Values are matched against the source data; pass a list to OR across values. Permissive dimensions (e.g. employer_name, question_text) accept any string and support fuzzy matching: try {'employer_name': 'CBA'} or {'employer_name': 'commonwealth*'} for wildcard substring search. | |
| max_rows | No | Cap on returned rows after filtering. Default 2000. Max 10000. Tighten filters to narrow further. | |
| dataset_id | Yes | Curated dataset ID. Use the search or list-curated endpoint/tool to discover. | |
| end_period | No | Inclusive end reporting year. Same format as start_period. | |
| start_period | No | Inclusive start reporting year. Format: 'YYYY-YY' (e.g. '2023-24') or 'YYYY' (matched against WGEA's reporting_year column). Bare int years like 2023 are coerced to '2023' automatically. | |
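
The parameter constraints above (allowed formats, the 10000-row cap, and the 'YYYY' or 'YYYY-YY' period format) can be checked on the client before a call is made. The following is a minimal sketch under stated assumptions: build_get_data_args and normalize_period are hypothetical helper names, and the validation mirrors only what the schema text says, not the server's actual implementation.

```python
import re
from typing import Any, Optional, Union

ALLOWED_FORMATS = {"records", "series", "csv"}
MAX_ROWS_LIMIT = 10_000
# 'YYYY' or 'YYYY-YY', as described for start_period.
PERIOD_PATTERN = re.compile(r"\d{4}(-\d{2})?")


def normalize_period(value: Union[int, str, None]) -> Optional[str]:
    """Coerce a bare int year to a string and validate the period format."""
    if value is None:
        return None
    text = str(value)
    if not PERIOD_PATTERN.fullmatch(text):
        raise ValueError(f"period must look like 'YYYY' or 'YYYY-YY', got {text!r}")
    return text


def build_get_data_args(
    dataset_id: str,
    filters: Optional[dict[str, Any]] = None,
    start_period: Union[int, str, None] = None,
    end_period: Union[int, str, None] = None,
    fmt: str = "records",
    max_rows: int = 2000,
) -> dict[str, Any]:
    """Assemble a get_data argument dict, rejecting out-of-range values."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 1 <= max_rows <= MAX_ROWS_LIMIT:
        raise ValueError(f"max_rows must be between 1 and {MAX_ROWS_LIMIT}")
    args: dict[str, Any] = {"dataset_id": dataset_id, "format": fmt, "max_rows": max_rows}
    if filters:
        # A list value means OR across the listed values.
        args["filters"] = filters
    start = normalize_period(start_period)
    end = normalize_period(end_period)
    if start is not None:
        args["start_period"] = start
    if end is not None:
        args["end_period"] = end
    return args
```

For example, build_get_data_args("WORKFORCE_COMPOSITION", filters={"employer_name": ["Westpac", "CBA"]}, start_period=2023) produces a dict whose start_period is the string '2023'.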

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| csv | No | | |
| unit | No | | |
| query | No | | |
| stale | No | | |
| period | No | Canonical period bounds {start, end} for cross-sister consumers. Populated alongside the wgea-specific reporting_year. For a single reporting year both bounds match; for multi-year spans they bracket the range. | |
| source | No | | Workplace Gender Equality Agency |
| records | No | | |
| row_count | No | | |
| dataset_id | Yes | | |
| source_url | Yes | | |
| attribution | No | | Source: Workplace Gender Equality Agency. Licensed under Creative Commons Attribution 3.0 Australia (https://creativecommons.org/licenses/by/3.0/au/). Original dataset: https://data.gov.au/data/dataset/wgea-dataset |
| dataset_name | Yes | | |
| did_you_mean | No | | |
| download_url | No | | |
| retrieved_at | Yes | | |
| stale_reason | No | | |
| truncated_at | No | | |
| reporting_year | No | | |
| server_version | No | | |
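
Because only four output fields are marked as required, a client may want to validate a payload before relying on it. The following is a minimal sketch of a typed container for the response. The field names and required flags come from the output schema above, but the Python types are assumptions (the schema listing does not show them), only a subset of the optional fields is modelled, and parse_response is a hypothetical helper name.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional

# Fields marked "Yes" in the Required column of the output schema.
REQUIRED_FIELDS = ("dataset_id", "source_url", "dataset_name", "retrieved_at")


@dataclass
class DataResponse:
    # Required fields (types are assumed, not taken from the schema).
    dataset_id: str
    source_url: str
    dataset_name: str
    retrieved_at: str
    # A subset of the optional fields; all default to None when absent.
    records: Optional[list] = None
    csv: Optional[str] = None
    row_count: Optional[int] = None
    reporting_year: Optional[str] = None
    did_you_mean: Optional[list] = None
    download_url: Optional[str] = None
    attribution: Optional[str] = None


def parse_response(payload: dict[str, Any]) -> DataResponse:
    """Build a DataResponse, failing loudly if a required field is missing."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    known = {f.name for f in fields(DataResponse)}
    # Ignore fields this sketch does not model (e.g. server_version).
    return DataResponse(**{k: v for k, v in payload.items() if k in known})
```

Dropping unmodelled keys keeps the sketch tolerant of fields it does not know about, at the cost of silently discarding them.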
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It transparently describes the return structure, including fields like row_count, download_url, and fuzzy hints. It also mentions fuzzy matching on employer_name and wildcard support. However, it does not discuss error handling, rate limits, or idempotency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with separate sections for examples and return information. It is front-loaded with the core purpose. While it is somewhat lengthy, every sentence serves a purpose, and the organization helps readability without sacrificing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, no annotations, but with output schema), the description provides sufficient context. It covers practical usage through examples, parameter semantics, and return details. The agent can effectively decide when and how to invoke this tool without ambiguous gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% coverage, establishing a baseline of 3. The description adds value beyond schema by providing detailed examples for filters and start_period format, explaining fuzzy matching, and clarifying the default and maximum for max_rows. This enriches the semantic understanding for the agent.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function: 'Query a curated WGEA dataset and return observations.' It uses specific verbs and resources, and provides multiple concrete examples covering different datasets and filters, leaving no ambiguity about what the tool does.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes examples showing typical usage contexts but does not explicitly state when to use this tool versus alternatives like search_datasets or describe_dataset. There is no guidance on when not to use this tool or when a sibling would be more appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Bigred97/wgea-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.