list_datasets
View the available clinical datasets and their status, including which dataset is active and whether local database and BigQuery backends are available for EHR data and clinical notes queries.
Instructions
📋 List all available datasets and their status.
Returns: A formatted string listing available datasets, indicating which one is active, and showing availability of local database and BigQuery support.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
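
Because the tool takes no arguments, a call carries an empty `arguments` object. The sketch below builds such a request, assuming the server is reached through the standard MCP `tools/call` JSON-RPC method. The helper name `build_list_datasets_request` is hypothetical and is not part of the project.

```python
import json


def build_list_datasets_request(request_id: int) -> dict:
    """Build a JSON-RPC 2.0 request that calls list_datasets with no arguments.

    Hypothetical helper for illustration only; transport and session
    handling are left out.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "list_datasets",
            "arguments": {},  # the input schema defines no parameters
        },
    }


if __name__ == "__main__":
    # Serialize the request the way it would be sent over the wire.
    print(json.dumps(build_list_datasets_request(1), indent=2))
```

In practice an MCP client library would normally build and send this message, so the raw payload is mainly useful for debugging.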
Implementation Reference
- src/m4/core/tools/management.py:36-90 (handler): ListDatasetsTool class. Defines the tool with name 'list_datasets', input/output models, and an invoke() method that generates a formatted list of available datasets including active status, backend, local files, and BigQuery support.

      class ListDatasetsTool:
          """Tool for listing all available datasets.

          This tool shows which datasets are configured and available,
          both locally (DuckDB) and remotely (BigQuery).
          """

          name = "list_datasets"
          description = "📋 List all available medical datasets"
          input_model = ListDatasetsInput
          output_model = ToolOutput

          # Management tools have no modality requirements - always available
          required_modalities: frozenset[Modality] = frozenset()
          supported_datasets: frozenset[str] | None = None  # Always available

          def invoke(
              self, dataset: DatasetDefinition, params: ListDatasetsInput
          ) -> ToolOutput:
              """List all available datasets with their status."""
              active = get_active_dataset() or "(unset)"
              availability = detect_available_local_datasets()
              backend_name = os.getenv("M4_BACKEND", "duckdb")

              if not availability:
                  return ToolOutput(result="No datasets detected.")

              output = [f"Active dataset: {active}\n"]
              output.append(
                  f"Backend: {'local (DuckDB)' if backend_name == 'duckdb' else 'cloud (BigQuery)'}\n"
              )

              for label, info in availability.items():
                  is_active = " (Active)" if label == active else ""
                  output.append(f"=== {label.upper()}{is_active} ===")

                  parquet_icon = "✅" if info["parquet_present"] else "❌"
                  db_icon = "✅" if info["db_present"] else "❌"

                  output.append(f" Local Parquet: {parquet_icon}")
                  output.append(f" Local Database: {db_icon}")

                  # BigQuery status
                  ds_def = DatasetRegistry.get(label)
                  if ds_def:
                      bq_status = "✅" if ds_def.bigquery_dataset_ids else "❌"
                      output.append(f" BigQuery Support: {bq_status}")

                  output.append("")

              return ToolOutput(result="\n".join(output))

          def is_compatible(self, dataset: DatasetDefinition) -> bool:
              """Management tools are always compatible."""
              return True
- ListDatasetsInput dataclass: schema for tool input (empty, no parameters required).

      @dataclass
      class ListDatasetsInput(ToolInput):
          """Input for list_datasets tool."""

          pass  # No parameters needed
- src/m4/core/tools/__init__.py:60-60 (registration): Registration of a ListDatasetsTool instance in ToolRegistry during init_tools().

      ToolRegistry.register(ListDatasetsTool())
- src/m4/mcp_server.py:66-77 (registration): MCP tool registration. The @mcp.tool() def list_datasets() wrapper retrieves and invokes the ListDatasetsTool.

      @mcp.tool()
      def list_datasets() -> str:
          """📋 List all available datasets and their status.

          Returns:
              A formatted string listing available datasets, indicating which one is active,
              and showing availability of local database and BigQuery support.
          """
          tool = ToolRegistry.get("list_datasets")
          dataset = DatasetRegistry.get_active()
          return tool.invoke(dataset, ListDatasetsInput()).result
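
To see what the returned string looks like without installing the project, the handler's formatting logic can be approximated in isolation. This is a minimal sketch, not the project's code: `format_dataset_listing`, the dataset labels, and the sample dictionaries are all hypothetical stand-ins for `get_active_dataset()`, `detect_available_local_datasets()`, and `DatasetRegistry`.

```python
def format_dataset_listing(active, backend_name, availability, bigquery_support):
    """Approximate the list_datasets report from plain dictionaries.

    availability:     label -> {"parquet_present": bool, "db_present": bool}
    bigquery_support: label -> bool; labels missing here get no BigQuery line,
                      mirroring the handler's `if ds_def:` guard.
    """
    if not availability:
        return "No datasets detected."

    def icon(flag):
        return "✅" if flag else "❌"

    backend = "local (DuckDB)" if backend_name == "duckdb" else "cloud (BigQuery)"
    lines = [f"Active dataset: {active}\n", f"Backend: {backend}\n"]

    for label, info in availability.items():
        marker = " (Active)" if label == active else ""
        lines.append(f"=== {label.upper()}{marker} ===")
        lines.append(f" Local Parquet: {icon(info['parquet_present'])}")
        lines.append(f" Local Database: {icon(info['db_present'])}")
        if label in bigquery_support:
            lines.append(f" BigQuery Support: {icon(bigquery_support[label])}")
        lines.append("")  # blank line between datasets

    return "\n".join(lines)


# Hypothetical sample data; the labels are illustrative, not real dataset names.
SAMPLE_AVAILABILITY = {
    "example-demo": {"parquet_present": True, "db_present": True},
    "example-full": {"parquet_present": False, "db_present": False},
}
SAMPLE_BIGQUERY = {"example-demo": False, "example-full": True}

if __name__ == "__main__":
    print(
        format_dataset_listing(
            "example-demo", "duckdb", SAMPLE_AVAILABILITY, SAMPLE_BIGQUERY
        )
    )
```

Passing the state in as arguments instead of reading the environment and registries keeps the sketch testable. The real handler reads `M4_BACKEND` and the registries directly.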