retrieve_data_tool
Retrieve World Bank economic and social indicator data for specific countries, years, and indicators with customizable filters and sorting options.
Instructions
[STEP 3/3] Retrieve actual data from World Bank Data360.
⚠️ PREREQUISITE: Call get_temporal_coverage first to get latest_year.
🚨 CRITICAL TYPE REQUIREMENTS 🚨
When calling this tool, you MUST pass parameters with the EXACT types shown below. Common mistakes that cause validation errors:
❌ INCORRECT: {"limit": "10"} ← limit as STRING (causes error!)
✅ CORRECT: {"limit": 10} ← limit as INTEGER
❌ INCORRECT: {"exclude_aggregates": "true"} ← boolean as STRING
✅ CORRECT: {"exclude_aggregates": true} ← boolean as BOOLEAN
❌ INCORRECT: {"year": 2023} ← year as NUMBER
✅ CORRECT: {"year": "2023"} ← year as STRING
📋 PARAMETER TYPES - MUST MATCH EXACTLY:
STRING parameters (use quotes in JSON):
indicator: "WB_WDI_SP_POP_TOTL"
database: "WB_WDI"
year: "2023"
countries: "USA,CHN,JPN"
sex: "M" or "F" or "_T"
age: "0-14"
sort_order: "desc" or "asc"
INTEGER parameters (no quotes in JSON):
limit: 10 (default: 20)
BOOLEAN parameters (no quotes in JSON):
exclude_aggregates: true or false (default: true)
compact_response: true or false (default: true)
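A minimal sketch of a client-side check for the type rules above, assuming the arguments are assembled as a Python dict before being serialized to JSON. `EXPECTED_TYPES` and `find_type_errors` are hypothetical names used for illustration; they are not part of the server.

```python
from typing import Any

# Expected Python type per parameter, taken from the type list above.
EXPECTED_TYPES: dict[str, type] = {
    "indicator": str,
    "database": str,
    "year": str,
    "countries": str,
    "sex": str,
    "age": str,
    "sort_order": str,
    "limit": int,
    "exclude_aggregates": bool,
    "compact_response": bool,
}


def find_type_errors(arguments: dict[str, Any]) -> list[str]:
    """Return one message per argument whose type differs from the documented one."""
    errors: list[str] = []
    for name, value in arguments.items():
        expected = EXPECTED_TYPES.get(name)
        if expected is None:
            errors.append(f"{name}: unknown parameter")
        elif expected is int and isinstance(value, bool):
            # bool is a subclass of int in Python, so reject it explicitly.
            errors.append(f"{name}: expected int, got bool")
        elif not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors


print(find_type_errors({"limit": "10", "year": 2023}))
```

Running such a check before the call surfaces the three mistakes listed above locally, rather than as a validation error from the tool.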
🎯 CORRECT JSON EXAMPLES:
Example 1 - Top 10 countries by population: { "indicator": "WB_WDI_SP_POP_TOTL", "database": "WB_WDI", "year": "2023", "limit": 10, "sort_order": "desc", "exclude_aggregates": true }
Example 2 - Specific countries GDP: { "indicator": "WB_WDI_NY_GDP_MKTP_CD", "database": "WB_WDI", "year": "2023", "countries": "USA,CHN,JPN" }
Example 3 - All data with aggregates: { "indicator": "WB_WDI_SP_POP_TOTL", "database": "WB_WDI", "year": "2022", "exclude_aggregates": false }
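As an illustration of how the examples above map onto native types, the following sketch builds the Example 1 arguments in Python and serializes them with the standard `json` module. How the payload is then sent to the tool depends on the MCP client and is not shown; the variable names are illustrative only.

```python
import json

# Example 1 from above, written with native Python types.
arguments = {
    "indicator": "WB_WDI_SP_POP_TOTL",
    "database": "WB_WDI",
    "year": str(2023),           # year is sent as a string
    "limit": 10,                 # integer: no quotes in the JSON output
    "sort_order": "desc",
    "exclude_aggregates": True,  # Python True serializes to JSON true
}

payload = json.dumps(arguments)
print(payload)
```

Keeping the values as `int` and `bool` in Python, rather than pre-formatting them as strings, is what makes the serialized JSON match the documented types.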
⚡ HOW IT WORKS:
countries parameter: API fetches ONLY those countries (efficient)
exclude_aggregates: Filters out 47 regional/income codes (ARB, AFE, WLD, HIC, etc.). ⚠️ DEFAULT is TRUE: only individual countries are returned. Set to false to include aggregates like "World", "High income", "Arab World".
sort_order: Sorts by OBS_VALUE before limiting
limit parameter: Returns the top N records to minimize tokens. ⚠️ DEFAULT is 20: override it if you need more.
compact_response: Returns only essential fields (country, country_name, year, value). ⚠️ DEFAULT is TRUE, which reduces token usage by ~75%. Set to false if you need all fields (REF_AREA, TIME_PERIOD, OBS_VALUE, UNIT_MEASURE, etc.).
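The order of operations described above (filter aggregates, sort, limit, compact) can be sketched on in-memory records. This is a simplified illustration, not the server code: the records and their values are made up, `AGGREGATE_CODES` holds only the four codes named above rather than the full 47, and `process` is a hypothetical name.

```python
# Only the aggregate codes named above; the real list has 47 entries.
AGGREGATE_CODES = {"ARB", "AFE", "WLD", "HIC"}

# Made-up records in the full (non-compact) field layout; values are not real data.
records = [
    {"REF_AREA": "USA", "REF_AREA_label": "United States", "TIME_PERIOD": "2023", "OBS_VALUE": "300"},
    {"REF_AREA": "WLD", "REF_AREA_label": "World", "TIME_PERIOD": "2023", "OBS_VALUE": "8000"},
    {"REF_AREA": "CHN", "REF_AREA_label": "China", "TIME_PERIOD": "2023", "OBS_VALUE": "1400"},
    {"REF_AREA": "JPN", "REF_AREA_label": "Japan", "TIME_PERIOD": "2023", "OBS_VALUE": "120"},
]


def process(records, limit=20, sort_order="desc",
            exclude_aggregates=True, compact_response=True):
    rows = list(records)
    # 1. Drop regional/income aggregates.
    if exclude_aggregates:
        rows = [r for r in rows if r["REF_AREA"] not in AGGREGATE_CODES]
    # 2. Sort by OBS_VALUE before limiting, so the limit keeps the top N.
    rows.sort(key=lambda r: float(r["OBS_VALUE"]), reverse=(sort_order == "desc"))
    # 3. Keep the first `limit` rows.
    rows = rows[:limit]
    # 4. Reduce each row to the four essential fields.
    if compact_response:
        rows = [
            {
                "country": r["REF_AREA"],
                "country_name": r["REF_AREA_label"],
                "year": r["TIME_PERIOD"],
                "value": r["OBS_VALUE"],
            }
            for r in rows
        ]
    return rows


top_two = process(records, limit=2)
print(top_two)
```

Because sorting happens before the limit is applied, `limit=2` keeps the two largest values; with `exclude_aggregates=False`, the "WLD" record would rank first instead.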
📊 AFTER RECEIVING DATA: Format results as a markdown table:
Sort by value (highest to lowest)
Add rank numbers
Format with thousand separators
Include country names (not just codes)
Returns: Data records with summary statistics.
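The formatting guidance above could be applied with a small helper like the one below. It is a sketch that assumes compact records (country, country_name, year, value); `to_markdown_table` is a hypothetical name, and the sample rows use placeholder countries and values rather than real data.

```python
def to_markdown_table(rows: list[dict]) -> str:
    """Render compact records as a ranked markdown table, highest value first."""
    ranked = sorted(rows, key=lambda r: float(r["value"]), reverse=True)
    lines = [
        "| Rank | Country | Code | Year | Value |",
        "|---|---|---|---|---|",
    ]
    for rank, row in enumerate(ranked, start=1):
        value = float(row["value"])
        # ",.0f" adds thousand separators and drops the decimal part.
        lines.append(
            f"| {rank} | {row['country_name']} | {row['country']} "
            f"| {row['year']} | {value:,.0f} |"
        )
    return "\n".join(lines)


# Placeholder rows, deliberately out of order to show the sort.
sample = [
    {"country": "BBB", "country_name": "Country B", "year": "2023", "value": 98765},
    {"country": "AAA", "country_name": "Country A", "year": "2023", "value": 1234567},
]

table = to_markdown_table(sample)
print(table)
```

The `,.0f` format suits whole-number indicators such as population; ratios or percentages would need a format that keeps decimals.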
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
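| indicator | Yes | Indicator ID (string), e.g. "WB_WDI_SP_POP_TOTL" | |
| database | Yes | Database ID (string), e.g. "WB_WDI" | |
| year | No | Year (string), e.g. "2023" | |
| countries | No | Comma-separated country codes (string), e.g. "USA,CHN,JPN" | |
| sex | No | "M", "F", or "_T" (string) | |
| age | No | Age group (string), e.g. "0-14" | |
| limit | No | Maximum number of records to return (integer) | 20 |
| sort_order | No | "desc" or "asc" (string); sorts by OBS_VALUE | desc |
| exclude_aggregates | No | Filter out regional/income aggregates (boolean) | true |
| compact_response | No | Return only essential fields (boolean) | true |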
Implementation Reference
- src/world_bank_mcp/server.py:426-527 (handler) Handler function for 'retrieve_data_tool', registered via the @server.tool() decorator. It converts the year parameter to a string and delegates to the retrieve_data helper function. Its docstring specifies parameter types and usage examples.

      def retrieve_data_tool(
          indicator: str,
          database: str,
          year: str | int | None = None,
          countries: str | None = None,
          sex: str | None = None,
          age: str | None = None,
          limit: int = 20,
          sort_order: str = "desc",
          exclude_aggregates: bool = True,
          compact_response: bool = True
      ) -> dict[str, Any]:
          """[STEP 3/3] Retrieve actual data from World Bank Data360.

          ⚠️ PREREQUISITE: Call get_temporal_coverage first to get latest_year.

          🚨 CRITICAL TYPE REQUIREMENTS 🚨
          When calling this tool, you MUST pass parameters with the EXACT types shown below.
          Common mistakes that cause validation errors:

          ❌ INCORRECT: {"limit": "10"} ← limit as STRING (causes error!)
          ✅ CORRECT: {"limit": 10} ← limit as INTEGER

          ❌ INCORRECT: {"exclude_aggregates": "true"} ← boolean as STRING
          ✅ CORRECT: {"exclude_aggregates": true} ← boolean as BOOLEAN

          ❌ INCORRECT: {"year": 2023} ← year as NUMBER
          ✅ CORRECT: {"year": "2023"} ← year as STRING

          📋 PARAMETER TYPES - MUST MATCH EXACTLY:

          STRING parameters (use quotes in JSON):
              indicator: "WB_WDI_SP_POP_TOTL"
              database: "WB_WDI"
              year: "2023"
              countries: "USA,CHN,JPN"
              sex: "M" or "F" or "_T"
              age: "0-14"
              sort_order: "desc" or "asc"

          INTEGER parameters (no quotes in JSON):
              limit: 10 (default: 20)

          BOOLEAN parameters (no quotes in JSON):
              exclude_aggregates: true or false (default: true)
              compact_response: true or false (default: true)

          🎯 CORRECT JSON EXAMPLES:

          Example 1 - Top 10 countries by population:
          { "indicator": "WB_WDI_SP_POP_TOTL", "database": "WB_WDI", "year": "2023", "limit": 10, "sort_order": "desc", "exclude_aggregates": true }

          Example 2 - Specific countries GDP:
          { "indicator": "WB_WDI_NY_GDP_MKTP_CD", "database": "WB_WDI", "year": "2023", "countries": "USA,CHN,JPN" }

          Example 3 - All data with aggregates:
          { "indicator": "WB_WDI_SP_POP_TOTL", "database": "WB_WDI", "year": "2022", "exclude_aggregates": false }

          ⚡ HOW IT WORKS:
          - countries parameter: API fetches ONLY those countries (efficient)
          - exclude_aggregates: Filters out 47 regional/income codes (ARB, AFE, WLD, HIC, etc.)
            ⚠️ DEFAULT is TRUE - only individual countries returned
            Set to false to include aggregates like "World", "High income", "Arab World"
          - sort_order: Sorts by OBS_VALUE before limiting
          - limit parameter: Returns top N records to minimize tokens
            ⚠️ DEFAULT is 20 - provides reasonable default, override if you need more
          - compact_response: Returns only essential fields (country, country_name, year, value)
            ⚠️ DEFAULT is TRUE - minimizes token usage by ~75%
            Set to false if you need all fields (REF_AREA, TIME_PERIOD, OBS_VALUE, UNIT_MEASURE, etc.)

          📊 AFTER RECEIVING DATA:
          Format results as markdown table:
          - Sort by value (highest to lowest)
          - Add rank numbers
          - Format with thousand separators
          - Include country names (not just codes)

          Returns: Data records with summary statistics."""
          year_str = str(year) if year is not None else None
          return retrieve_data(
              indicator, database, year_str, countries, sex, age,
              limit, sort_order, exclude_aggregates, compact_response
          )
- src/world_bank_mcp/server.py:126-249 (helper) Core helper function implementing the data retrieval logic: it constructs the API parameters, runs the pagination loop, fetches data from the Data360 endpoint, applies client-side filtering (excluding aggregates), sorts by OBS_VALUE, limits the results, compacts the response fields, and generates summary statistics.

      def retrieve_data(
          indicator: str,
          database: str,
          year: str | None = None,
          countries: str | None = None,
          sex: str | None = None,
          age: str | None = None,
          limit: int | None = 20,
          sort_order: str = "desc",
          exclude_aggregates: bool = True,
          compact_response: bool = True
      ) -> dict[str, Any]:
          """Retrieve actual data from World Bank API"""
          params = {
              "DATABASE_ID": database,
              "INDICATOR": indicator,
              "skip": 0,
          }

          # Apply filters
          if year:
              params["timePeriodFrom"] = year
              params["timePeriodTo"] = year
          if countries:
              params["REF_AREA"] = countries
          if sex:
              params["SEX"] = sex
          if age:
              params["AGE"] = age

          all_data = []

          try:
              # Pagination loop (max 10000 records as safety limit)
              while len(all_data) < 10000:
                  response = requests.get(
                      DATA_ENDPOINT,
                      params=params,
                      headers={"Accept": "application/json"},
                      timeout=30,
                  )
                  response.raise_for_status()
                  data = response.json()

                  values = data.get("value", [])
                  if not values:
                      break

                  all_data.extend(values)

                  total_count = data.get("count", 0)
                  if len(all_data) >= total_count:
                      break

                  params["skip"] = len(all_data)

              # CLIENT-SIDE: Filter out regional/income aggregates if requested
              if exclude_aggregates and all_data:
                  all_data = [d for d in all_data if d.get("REF_AREA") not in AGGREGATE_CODES]

              # CLIENT-SIDE: Sort by OBS_VALUE if requested
              if all_data and sort_order:
                  data_with_values = [d for d in all_data if d.get("OBS_VALUE") is not None]
                  data_without_values = [d for d in all_data if d.get("OBS_VALUE") is None]

                  reverse_order = (sort_order.lower() == "desc")

                  try:
                      sorted_data = sorted(
                          data_with_values,
                          key=lambda x: float(str(x.get("OBS_VALUE", "0"))),
                          reverse=reverse_order
                      )
                      all_data = sorted_data + data_without_values
                  except (ValueError, TypeError) as e:
                      # If sorting fails, return error in response
                      return {
                          "success": False,
                          "error": f"Sorting failed: {str(e)}. OBS_VALUE type: {type(data_with_values[0].get('OBS_VALUE')) if data_with_values else 'no data'}"
                      }

              # CLIENT-SIDE: Apply limit to reduce tokens sent to Claude
              display_data = all_data[:limit] if limit else all_data

              # Generate summary (before compacting to preserve field names)
              unique_countries = set(d.get("REF_AREA") for d in display_data if d.get("REF_AREA"))
              unique_years = sorted(set(d.get("TIME_PERIOD") for d in display_data if d.get("TIME_PERIOD")))

              # CLIENT-SIDE: Compact response to minimize tokens (only essential fields)
              if compact_response and display_data:
                  display_data = [
                      {
                          "country": d.get("REF_AREA"),
                          "country_name": d.get("REF_AREA_label"),
                          "year": d.get("TIME_PERIOD"),
                          "value": d.get("OBS_VALUE"),
                      }
                      for d in display_data
                  ]

              return {
                  "success": True,
                  "record_count": len(display_data),
                  "total_available": len(all_data),
                  "data": display_data,
                  "summary": {
                      "countries": len(unique_countries),
                      "years": unique_years,
                      "applied_filters": {
                          "year": year,
                          "countries": countries,
                          "sex": sex,
                          "age": age
                      }
                  }
              }

          except Exception as e:
              return {"success": False, "error": str(e)}
- src/world_bank_mcp/server.py:426-527 (registration) The @server.tool() decorator registers retrieve_data_tool as an MCP tool in the FastMCP server. The registered function is the same code listed under the (handler) entry.
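The skip-based pagination loop in the helper can be exercised without network access by replacing the HTTP call with a stub. The sketch below follows the same stopping rules (empty page, reported `count` reached, safety cap); `fetch_all` and `fake_fetch_page` are hypothetical names, and the stub's records are placeholders.

```python
from typing import Any, Callable


def fetch_all(fetch_page: Callable[[int], dict[str, Any]],
              max_records: int = 10000) -> list[dict]:
    """Collect pages until the reported count is reached or a page is empty."""
    all_data: list[dict] = []
    skip = 0
    while len(all_data) < max_records:
        page = fetch_page(skip)
        values = page.get("value", [])
        if not values:
            break  # nothing more to read
        all_data.extend(values)
        if len(all_data) >= page.get("count", 0):
            break  # everything the API reported has been collected
        skip = len(all_data)  # next request starts after what we already have
    return all_data


# Hypothetical stand-in for the HTTP request: 5 placeholder records, 2 per page.
FAKE_RECORDS = [{"REF_AREA": f"C{i}"} for i in range(5)]
calls: list[int] = []


def fake_fetch_page(skip: int) -> dict[str, Any]:
    calls.append(skip)
    return {"count": len(FAKE_RECORDS), "value": FAKE_RECORDS[skip:skip + 2]}


result = fetch_all(fake_fetch_page)
print(calls, len(result))
```

Passing the page fetcher in as a function keeps the loop logic testable on its own; in the helper above, the equivalent step is the `requests.get` call with `params["skip"]` advanced after each page.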