list_datasets

View available clinical datasets and their status, including active selection and database support for EHR data and clinical notes queries.

Instructions

📋 List all available datasets and their status.

Returns: A formatted string listing available datasets, indicating which one is active, and showing availability of local database and BigQuery support.
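
To give a feel for the returned string, here is a minimal sketch of the formatting step, modelled on the handler excerpt in the Implementation Reference. The function `render_dataset_listing`, its parameters, and the sample dataset names are assumptions made for illustration; the real tool looks these values up itself and only prints the BigQuery line when the dataset is found in its registry.

```python
def render_dataset_listing(active, backend_name, availability, bigquery_support):
    """Build a status listing in the style described above.

    All arguments are hypothetical stand-ins for values the real tool
    discovers on its own (active dataset, M4_BACKEND, local file detection).
    """
    if not availability:
        return "No datasets detected."

    backend = "local (DuckDB)" if backend_name == "duckdb" else "cloud (BigQuery)"
    lines = [f"Active dataset: {active}\n", f"Backend: {backend}\n"]

    for label, info in availability.items():
        marker = " (Active)" if label == active else ""
        lines.append(f"=== {label.upper()}{marker} ===")
        lines.append(f" Local Parquet: {'✅' if info['parquet_present'] else '❌'}")
        lines.append(f" Local Database: {'✅' if info['db_present'] else '❌'}")
        # Simplification: always print the BigQuery line.
        lines.append(f" BigQuery Support: {'✅' if bigquery_support.get(label) else '❌'}")
        lines.append("")

    return "\n".join(lines)


# Made-up example data; these dataset names are illustrative only.
listing = render_dataset_listing(
    active="demo-a",
    backend_name="duckdb",
    availability={
        "demo-a": {"parquet_present": True, "db_present": True},
        "demo-b": {"parquet_present": True, "db_present": False},
    },
    bigquery_support={"demo-a": True, "demo-b": False},
)
print(listing)
```

Each dataset gets its own `=== NAME ===` block, and the active one is tagged with `(Active)`.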

Input Schema


Name	Required	Description	Default
No arguments
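
Because the tool takes no arguments, a client calls it with an empty `arguments` object. The sketch below builds such a request as an MCP `tools/call` JSON-RPC message; the request `id` is an arbitrary value chosen for this example, and how the message is transported (stdio, HTTP) depends on the client.

```python
import json

# MCP tool invocations are JSON-RPC 2.0 requests using the "tools/call" method.
request = {
    "jsonrpc": "2.0",
    "id": 1,  # arbitrary request id chosen for this example
    "method": "tools/call",
    "params": {
        "name": "list_datasets",
        "arguments": {},  # the input schema defines no parameters
    },
}

payload = json.dumps(request)
print(payload)
```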

Implementation Reference

src/m4/core/tools/management.py:36-90 (handler)
ListDatasetsTool class: defines the tool named 'list_datasets', its input/output models, and an invoke() method that builds a formatted list of available datasets, including active status, backend, local files, and BigQuery support.
class ListDatasetsTool:
    """Tool for listing all available datasets.

    This tool shows which datasets are configured and available,
    both locally (DuckDB) and remotely (BigQuery).
    """

    name = "list_datasets"
    description = "📋 List all available medical datasets"
    input_model = ListDatasetsInput
    output_model = ToolOutput

    # Management tools have no modality requirements - always available
    required_modalities: frozenset[Modality] = frozenset()
    supported_datasets: frozenset[str] | None = None  # Always available

    def invoke(
        self, dataset: DatasetDefinition, params: ListDatasetsInput
    ) -> ToolOutput:
        """List all available datasets with their status."""
        active = get_active_dataset() or "(unset)"
        availability = detect_available_local_datasets()
        backend_name = os.getenv("M4_BACKEND", "duckdb")

        if not availability:
            return ToolOutput(result="No datasets detected.")

        output = [f"Active dataset: {active}\n"]
        output.append(
            f"Backend: {'local (DuckDB)' if backend_name == 'duckdb' else 'cloud (BigQuery)'}\n"
        )

        for label, info in availability.items():
            is_active = " (Active)" if label == active else ""
            output.append(f"=== {label.upper()}{is_active} ===")

            parquet_icon = "✅" if info["parquet_present"] else "❌"
            db_icon = "✅" if info["db_present"] else "❌"

            output.append(f" Local Parquet: {parquet_icon}")
            output.append(f" Local Database: {db_icon}")

            # BigQuery status
            ds_def = DatasetRegistry.get(label)
            if ds_def:
                bq_status = "✅" if ds_def.bigquery_dataset_ids else "❌"
                output.append(f" BigQuery Support: {bq_status}")

            output.append("")

        return ToolOutput(result="\n".join(output))

    def is_compatible(self, dataset: DatasetDefinition) -> bool:
        """Management tools are always compatible."""
        return True
src/m4/core/tools/management.py:22-26 (schema)
ListDatasetsInput dataclass: schema for tool input (empty, no parameters required).
@dataclass
class ListDatasetsInput(ToolInput):
    """Input for list_datasets tool."""

    pass  # No parameters needed
src/m4/core/tools/__init__.py:60-60 (registration)
Registration of ListDatasetsTool instance in ToolRegistry during init_tools().
ToolRegistry.register(ListDatasetsTool())
src/m4/mcp_server.py:66-77 (registration)
MCP tool registration: the list_datasets() wrapper, decorated with @mcp.tool(), retrieves the ListDatasetsTool from the registry and invokes it on the active dataset.
@mcp.tool()
def list_datasets() -> str:
    """📋 List all available datasets and their status.

    Returns:
        A formatted string listing available datasets, indicating which one is active,
        and showing availability of local database and BigQuery support.
    """
    tool = ToolRegistry.get("list_datasets")
    dataset = DatasetRegistry.get_active()
    return tool.invoke(dataset, ListDatasetsInput()).result
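
The register-then-look-up flow across these excerpts can be sketched in isolation. This is an illustrative sketch only: the `ToolRegistry` body, the simplified `ToolOutput`, and the stub `invoke()` below are assumptions, since the page shows how the registry is called but not how it is implemented, and the MCP decorator is left out so the snippet runs without extra dependencies.

```python
from dataclasses import dataclass


@dataclass
class ToolOutput:
    """Simplified stand-in for the project's output model."""
    result: str


@dataclass
class ListDatasetsInput:
    """Empty input: the tool takes no parameters."""


class ToolRegistry:
    """Hypothetical minimal registry keyed by tool name."""

    _tools = {}

    @classmethod
    def register(cls, tool):
        cls._tools[tool.name] = tool

    @classmethod
    def get(cls, name):
        return cls._tools[name]


class ListDatasetsTool:
    name = "list_datasets"

    def invoke(self, dataset, params):
        # Stub body: the real handler formats a full status listing.
        return ToolOutput(result=f"Active dataset: {dataset or '(unset)'}")


# Registration happens once at startup (init_tools() in the source).
ToolRegistry.register(ListDatasetsTool())


def list_datasets(active_dataset=None):
    """Wrapper in the style of the MCP entry point, minus the decorator."""
    tool = ToolRegistry.get("list_datasets")
    return tool.invoke(active_dataset, ListDatasetsInput()).result


print(list_datasets())
```

Keeping the wrapper this thin means the MCP layer only resolves the tool by name and unwraps `.result`, while the formatting logic stays in the tool class.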
