World Bank Data360 MCP Server

by llnOrmll

retrieve_data_tool

Retrieve World Bank economic and social indicator data for specific countries, years, and indicators with customizable filters and sorting options.

Instructions

[STEP 3/3] Retrieve actual data from World Bank Data360.

⚠️ PREREQUISITE: Call get_temporal_coverage first to get latest_year.

🚨 CRITICAL TYPE REQUIREMENTS 🚨

When calling this tool, you MUST pass parameters with the EXACT types shown below. Common mistakes that cause validation errors:

❌ INCORRECT: {"limit": "10"} ← limit as STRING (causes error!)
✅ CORRECT: {"limit": 10} ← limit as INTEGER

❌ INCORRECT: {"exclude_aggregates": "true"} ← boolean as STRING
✅ CORRECT: {"exclude_aggregates": true} ← boolean as BOOLEAN

❌ INCORRECT: {"year": 2023} ← year as NUMBER
✅ CORRECT: {"year": "2023"} ← year as STRING

📋 PARAMETER TYPES - MUST MATCH EXACTLY:

STRING parameters (use quotes in JSON):
  indicator: "WB_WDI_SP_POP_TOTL"
  database: "WB_WDI"
  year: "2023"
  countries: "USA,CHN,JPN"
  sex: "M" or "F" or "_T"
  age: "0-14"
  sort_order: "desc" or "asc"

INTEGER parameters (no quotes in JSON):
  limit: 10 (default: 20)

BOOLEAN parameters (no quotes in JSON):
  exclude_aggregates: true or false (default: true)
  compact_response: true or false (default: true)
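
A caller that assembles arguments programmatically can normalize the types before sending them. The sketch below is illustrative only: `normalize_args` is a hypothetical helper, not part of the server, and it assumes nothing beyond the parameter names and types listed above.

```python
import json
from typing import Any

# Parameter groups as listed in the tool description.
STRING_PARAMS = {"indicator", "database", "year", "countries", "sex", "age", "sort_order"}
INTEGER_PARAMS = {"limit"}
BOOLEAN_PARAMS = {"exclude_aggregates", "compact_response"}


def normalize_args(raw: dict[str, Any]) -> dict[str, Any]:
    """Coerce loosely typed arguments to the types the tool description asks for."""
    out: dict[str, Any] = {}
    for key, value in raw.items():
        if value is None:
            continue  # omit unset optional parameters
        if key in STRING_PARAMS:
            out[key] = str(value)
        elif key in INTEGER_PARAMS:
            out[key] = int(value)
        elif key in BOOLEAN_PARAMS:
            if isinstance(value, str):
                lowered = value.strip().lower()
                if lowered not in ("true", "false"):
                    raise ValueError(f"{key} must be true or false, got {value!r}")
                out[key] = lowered == "true"
            else:
                out[key] = bool(value)
        else:
            raise KeyError(f"unknown parameter: {key}")
    return out


# Each of the three mistakes called out above is corrected before serialization.
args = normalize_args({
    "indicator": "WB_WDI_SP_POP_TOTL",
    "database": "WB_WDI",
    "year": 2023,
    "limit": "10",
    "exclude_aggregates": "true",
})
print(json.dumps(args))
```

Raising on unknown keys, rather than passing them through, keeps typos in parameter names from reaching the tool unnoticed.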

🎯 CORRECT JSON EXAMPLES:

Example 1 - Top 10 countries by population:
  {
    "indicator": "WB_WDI_SP_POP_TOTL",
    "database": "WB_WDI",
    "year": "2023",
    "limit": 10,
    "sort_order": "desc",
    "exclude_aggregates": true
  }

Example 2 - Specific countries GDP:
  {
    "indicator": "WB_WDI_NY_GDP_MKTP_CD",
    "database": "WB_WDI",
    "year": "2023",
    "countries": "USA,CHN,JPN"
  }

Example 3 - All data with aggregates:
  {
    "indicator": "WB_WDI_SP_POP_TOTL",
    "database": "WB_WDI",
    "year": "2022",
    "exclude_aggregates": false
  }

⚡ HOW IT WORKS:

  • countries parameter: API fetches ONLY those countries (efficient)

  • exclude_aggregates: Filters out 47 regional/income codes (ARB, AFE, WLD, HIC, etc.)
    ⚠️ DEFAULT is TRUE - only individual countries returned
    Set to false to include aggregates like "World", "High income", "Arab World"

  • sort_order: Sorts by OBS_VALUE before limiting

  • limit parameter: Returns top N records to minimize tokens
    ⚠️ DEFAULT is 20 - provides reasonable default, override if you need more

  • compact_response: Returns only essential fields (country, country_name, year, value)
    ⚠️ DEFAULT is TRUE - minimizes token usage by ~75%
    Set to false if you need all fields (REF_AREA, TIME_PERIOD, OBS_VALUE, UNIT_MEASURE, etc.)
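
The order of these steps matters: aggregates are dropped first, then rows are sorted, and only then is the limit applied. Here is a minimal sketch of that order on made-up records. The country codes, values, and the three-entry aggregate set are illustrative assumptions, not real World Bank data, and `process` is a hypothetical stand-in for the server's own logic.

```python
from typing import Any

# Hypothetical subset of aggregate codes; the server's list has 47 entries.
AGGREGATES = {"WLD", "HIC", "ARB"}

# Made-up records using the Data360 field names mentioned above.
records: list[dict[str, Any]] = [
    {"REF_AREA": "WLD", "REF_AREA_label": "World", "TIME_PERIOD": "2023", "OBS_VALUE": "900"},
    {"REF_AREA": "AAA", "REF_AREA_label": "Country A", "TIME_PERIOD": "2023", "OBS_VALUE": "30"},
    {"REF_AREA": "BBB", "REF_AREA_label": "Country B", "TIME_PERIOD": "2023", "OBS_VALUE": "200"},
    {"REF_AREA": "CCC", "REF_AREA_label": "Country C", "TIME_PERIOD": "2023", "OBS_VALUE": "75"},
]


def process(
    rows: list[dict[str, Any]],
    limit: int = 20,
    sort_order: str = "desc",
    exclude_aggregates: bool = True,
    compact_response: bool = True,
) -> list[dict[str, Any]]:
    # 1. Drop regional/income aggregates.
    if exclude_aggregates:
        rows = [r for r in rows if r["REF_AREA"] not in AGGREGATES]
    # 2. Sort numerically by OBS_VALUE (values arrive as strings here).
    rows = sorted(rows, key=lambda r: float(r["OBS_VALUE"]), reverse=(sort_order == "desc"))
    # 3. Apply the limit only after sorting, so "top N" is meaningful.
    rows = rows[:limit]
    # 4. Keep only the essential fields.
    if compact_response:
        rows = [
            {"country": r["REF_AREA"], "country_name": r["REF_AREA_label"],
             "year": r["TIME_PERIOD"], "value": r["OBS_VALUE"]}
            for r in rows
        ]
    return rows


top_two = process(records, limit=2)
print([r["country"] for r in top_two])  # → ['BBB', 'CCC']
```

With `exclude_aggregates=False` the "World" row would sort first and take one of the two slots, which is why the default filters aggregates before ranking.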

📊 AFTER RECEIVING DATA: Format results as a markdown table:

  • Sort by value (highest to lowest)

  • Add rank numbers

  • Format with thousand separators

  • Include country names (not just codes)
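
The four formatting rules above can be sketched as a small function. This is an illustration under stated assumptions: `to_markdown_table` is a hypothetical helper, the sample rows are made up, and the input is assumed to use the compact-response fields (country, country_name, year, value).

```python
def to_markdown_table(rows: list[dict], value_header: str = "Value") -> str:
    """Render compact-response rows as a ranked markdown table."""
    # Skip rows without a value, then sort highest to lowest.
    ranked = sorted(
        (r for r in rows if r.get("value") is not None),
        key=lambda r: float(r["value"]),
        reverse=True,
    )
    lines = [f"| Rank | Country | {value_header} |", "| ---: | :--- | ---: |"]
    for rank, row in enumerate(ranked, start=1):
        # Prefer the country name, but keep the code for reference.
        name = row.get("country_name") or row["country"]
        # ",.0f" rounds to a whole number and inserts thousand separators.
        lines.append(f"| {rank} | {name} ({row['country']}) | {float(row['value']):,.0f} |")
    return "\n".join(lines)


sample = [
    {"country": "AAA", "country_name": "Country A", "year": "2023", "value": "1234567"},
    {"country": "BBB", "country_name": "Country B", "year": "2023", "value": "98765432.4"},
]
print(to_markdown_table(sample, value_header="Population"))
```

Rounding to whole numbers suits counts such as population; an indicator expressed as a ratio or percentage would need a different format specifier.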

Returns: Data records with summary statistics.

Input Schema

Name                Required  Description  Default
indicator           Yes
database            Yes
year                No
countries           No
sex                 No
age                 No
limit               No
sort_order          No                     desc
exclude_aggregates  No
compact_response    No

Implementation Reference

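  • Handler function for 'retrieve_data_tool', registered via the @server.tool() decorator. It converts the year parameter to a string and delegates to the retrieve_data helper function. Because the signature accepts str | int | None and applies str(), the handler itself tolerates an integer year, even though the docstring asks callers to send a string. The docstring repeats the parameter types and usage examples shown above.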
        def retrieve_data_tool(
            indicator: str,
            database: str,
            year: str | int | None = None,
            countries: str | None = None,
            sex: str | None = None,
            age: str | None = None,
            limit: int = 20,
            sort_order: str = "desc",
            exclude_aggregates: bool = True,
            compact_response: bool = True
        ) -> dict[str, Any]:
            """[STEP 3/3] Retrieve actual data from World Bank Data360.
    
    ⚠️ PREREQUISITE: Call get_temporal_coverage first to get latest_year.
    
    🚨 CRITICAL TYPE REQUIREMENTS 🚨
    
    When calling this tool, you MUST pass parameters with the EXACT types shown below.
    Common mistakes that cause validation errors:
    
    ❌ INCORRECT: {"limit": "10"}      ← limit as STRING (causes error!)
    ✅ CORRECT:   {"limit": 10}        ← limit as INTEGER
    
    ❌ INCORRECT: {"exclude_aggregates": "true"}    ← boolean as STRING
    ✅ CORRECT:   {"exclude_aggregates": true}      ← boolean as BOOLEAN
    
    ❌ INCORRECT: {"year": 2023}       ← year as NUMBER
    ✅ CORRECT:   {"year": "2023"}     ← year as STRING
    
    📋 PARAMETER TYPES - MUST MATCH EXACTLY:
    
    STRING parameters (use quotes in JSON):
      indicator: "WB_WDI_SP_POP_TOTL"
      database: "WB_WDI"
      year: "2023"
      countries: "USA,CHN,JPN"
      sex: "M" or "F" or "_T"
      age: "0-14"
      sort_order: "desc" or "asc"
    
    INTEGER parameters (no quotes in JSON):
      limit: 10 (default: 20)
    
    BOOLEAN parameters (no quotes in JSON):
      exclude_aggregates: true or false (default: true)
      compact_response: true or false (default: true)
    
    🎯 CORRECT JSON EXAMPLES:
    
    Example 1 - Top 10 countries by population:
    {
      "indicator": "WB_WDI_SP_POP_TOTL",
      "database": "WB_WDI",
      "year": "2023",
      "limit": 10,
      "sort_order": "desc",
      "exclude_aggregates": true
    }
    
    Example 2 - Specific countries GDP:
    {
      "indicator": "WB_WDI_NY_GDP_MKTP_CD",
      "database": "WB_WDI",
      "year": "2023",
      "countries": "USA,CHN,JPN"
    }
    
    Example 3 - All data with aggregates:
    {
      "indicator": "WB_WDI_SP_POP_TOTL",
      "database": "WB_WDI",
      "year": "2022",
      "exclude_aggregates": false
    }
    
    ⚡ HOW IT WORKS:
    - countries parameter: API fetches ONLY those countries (efficient)
    - exclude_aggregates: Filters out 47 regional/income codes (ARB, AFE, WLD, HIC, etc.)
      ⚠️ DEFAULT is TRUE - only individual countries returned
      Set to false to include aggregates like "World", "High income", "Arab World"
    - sort_order: Sorts by OBS_VALUE before limiting
    - limit parameter: Returns top N records to minimize tokens
      ⚠️ DEFAULT is 20 - provides reasonable default, override if you need more
    - compact_response: Returns only essential fields (country, country_name, year, value)
      ⚠️ DEFAULT is TRUE - minimizes token usage by ~75%
      Set to false if you need all fields (REF_AREA, TIME_PERIOD, OBS_VALUE, UNIT_MEASURE, etc.)
    
    📊 AFTER RECEIVING DATA:
    Format results as markdown table:
    - Sort by value (highest to lowest)
    - Add rank numbers
    - Format with thousand separators
    - Include country names (not just codes)
    
    Returns: Data records with summary statistics."""
            year_str = str(year) if year is not None else None
    
            return retrieve_data(
                indicator, database, year_str, countries, sex, age,
                limit, sort_order, exclude_aggregates, compact_response
            )
  • Core helper function implementing the data retrieval logic: constructs API parameters, handles pagination loop, fetches data from Data360 endpoint, applies client-side filtering (exclude aggregates), sorting by OBS_VALUE, limiting results, compacting response fields, and generating summary statistics.
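    from typing import Any

    import requests

    # DATA_ENDPOINT and AGGREGATE_CODES are module-level constants that this
    # excerpt references but does not define.

    def retrieve_data(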
        indicator: str,
        database: str,
        year: str | None = None,
        countries: str | None = None,
        sex: str | None = None,
        age: str | None = None,
        limit: int | None = 20,
        sort_order: str = "desc",
        exclude_aggregates: bool = True,
        compact_response: bool = True
    ) -> dict[str, Any]:
        """Retrieve actual data from World Bank API"""
        
        params = {
            "DATABASE_ID": database,
            "INDICATOR": indicator,
            "skip": 0,
        }
    
        # Apply filters
        if year:
            params["timePeriodFrom"] = year
            params["timePeriodTo"] = year
        
        if countries:
            params["REF_AREA"] = countries
        
        if sex:
            params["SEX"] = sex
        
        if age:
            params["AGE"] = age
    
        all_data = []
    
        try:
            # Pagination loop (max 10000 records as safety limit)
            while len(all_data) < 10000:
                response = requests.get(
                    DATA_ENDPOINT,
                    params=params,
                    headers={"Accept": "application/json"},
                    timeout=30,
                )
                response.raise_for_status()
    
                data = response.json()
                values = data.get("value", [])
    
                if not values:
                    break
    
                all_data.extend(values)
    
                total_count = data.get("count", 0)
                if len(all_data) >= total_count:
                    break
    
                params["skip"] = len(all_data)
    
            # CLIENT-SIDE: Filter out regional/income aggregates if requested
            if exclude_aggregates and all_data:
                all_data = [d for d in all_data if d.get("REF_AREA") not in AGGREGATE_CODES]
    
            # CLIENT-SIDE: Sort by OBS_VALUE if requested
            if all_data and sort_order:
                data_with_values = [d for d in all_data if d.get("OBS_VALUE") is not None]
                data_without_values = [d for d in all_data if d.get("OBS_VALUE") is None]
    
                reverse_order = (sort_order.lower() == "desc")
                try:
                    sorted_data = sorted(
                        data_with_values,
                        key=lambda x: float(str(x.get("OBS_VALUE", "0"))),
                        reverse=reverse_order
                    )
                    all_data = sorted_data + data_without_values
                except (ValueError, TypeError) as e:
                    # If sorting fails, return error in response
                    return {
                        "success": False,
                        "error": f"Sorting failed: {str(e)}. OBS_VALUE type: {type(data_with_values[0].get('OBS_VALUE')) if data_with_values else 'no data'}"
                    }
    
            # CLIENT-SIDE: Apply limit to reduce tokens sent to Claude
            display_data = all_data[:limit] if limit else all_data
    
            # Generate summary (before compacting to preserve field names)
            unique_countries = set(d.get("REF_AREA") for d in display_data if d.get("REF_AREA"))
            unique_years = sorted(set(d.get("TIME_PERIOD") for d in display_data if d.get("TIME_PERIOD")))
    
            # CLIENT-SIDE: Compact response to minimize tokens (only essential fields)
            if compact_response and display_data:
                display_data = [
                    {
                        "country": d.get("REF_AREA"),
                        "country_name": d.get("REF_AREA_label"),
                        "year": d.get("TIME_PERIOD"),
                        "value": d.get("OBS_VALUE"),
                    }
                    for d in display_data
                ]
    
            return {
                "success": True,
                "record_count": len(display_data),
                "total_available": len(all_data),
                "data": display_data,
                "summary": {
                    "countries": len(unique_countries),
                    "years": unique_years,
                    "applied_filters": {
                        "year": year,
                        "countries": countries,
                        "sex": sex,
                        "age": age
                    }
                }
            }
            
        except Exception as e:
            return {"success": False, "error": str(e)}
  • The @server.tool() decorator registers the retrieve_data_tool as an MCP tool in the FastMCP server.
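
The skip-based pagination in retrieve_data can be exercised without network access. The sketch below replaces the HTTP call with a hypothetical `fetch_page` stub that mimics the two response keys the helper reads (`value` and `count`). The stub, the page size of 3, and the made-up rows are assumptions for illustration, not properties of the Data360 API.

```python
from typing import Any, Callable

# Eight made-up rows standing in for an API result set.
FAKE_ROWS = [{"REF_AREA": f"C{i:02d}", "OBS_VALUE": str(i)} for i in range(8)]


def fetch_page(skip: int, page_size: int = 3) -> dict[str, Any]:
    """Hypothetical stand-in for the HTTP request: one page plus the total count."""
    return {"count": len(FAKE_ROWS), "value": FAKE_ROWS[skip:skip + page_size]}


def fetch_all(
    fetch: Callable[[int], dict[str, Any]],
    max_records: int = 10000,
) -> list[dict[str, Any]]:
    """Collect pages until the reported count is reached or a page comes back empty."""
    rows: list[dict[str, Any]] = []
    skip = 0
    while len(rows) < max_records:
        page = fetch(skip)
        values = page.get("value", [])
        if not values:
            break
        rows.extend(values)
        if len(rows) >= page.get("count", 0):
            break
        skip = len(rows)  # the next request starts after the rows already collected
    return rows


all_rows = fetch_all(fetch_page)
print(len(all_rows))  # → 8
```

As in the helper above, the safety limit is checked only between pages, so the final page can push the total slightly past `max_records`.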

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/llnOrmll/world-bank-data-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.