get_info
Retrieve comprehensive details about Hugging Face datasets, including descriptions, features, splits, and statistics. Validate dataset accessibility before fetching information.
Instructions
Get detailed information about a Hugging Face dataset including description, features, splits, and statistics. Run validate first to check if the dataset exists and is accessible.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| auth_token | No | Hugging Face auth token for private/gated datasets | |
| dataset | Yes | Hugging Face dataset identifier in the format owner/dataset | |
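As an illustration of how these inputs are passed, here is a minimal sketch of a tool-call request. It assumes the server speaks the Model Context Protocol (suggested by `list_tools()` and `types.TextContent` in the implementation reference below) and is reached with a JSON-RPC `tools/call` request. The helper name `build_tool_call` is hypothetical and not part of the server.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC `tools/call` request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Only `dataset` is required; add `auth_token` for private or gated datasets.
request = build_tool_call("get_info", {"dataset": "ylecun/mnist"})
print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally build and send this message for you; the sketch only shows where `dataset` and `auth_token` end up.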
Input Schema (JSON Schema)
{
  "properties": {
    "auth_token": {
      "description": "Hugging Face auth token for private/gated datasets",
      "optional": true,
      "type": "string"
    },
    "dataset": {
      "description": "Hugging Face dataset identifier in the format owner/dataset",
      "examples": [
        "ylecun/mnist",
        "stanfordnlp/imdb"
      ],
      "pattern": "^[^/]+/[^/]+$",
      "type": "string"
    }
  },
  "required": [
    "dataset"
  ],
  "type": "object"
}
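To make the constraints in this schema concrete, the following sketch checks a set of arguments by hand using only the standard library. The function name `check_arguments` and the wording of the messages are assumptions for illustration; the server itself may rely on a JSON Schema validator or on no client-side validation at all.

```python
import re
from typing import List

# Pattern copied from the schema: exactly one "/" separating owner and dataset.
DATASET_PATTERN = re.compile(r"^[^/]+/[^/]+$")


def check_arguments(arguments: dict) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    dataset = arguments.get("dataset")
    if dataset is None:
        problems.append("missing required field: dataset")
    elif not isinstance(dataset, str) or not DATASET_PATTERN.search(dataset):
        problems.append("dataset must be a string of the form owner/dataset")
    token = arguments.get("auth_token")
    if token is not None and not isinstance(token, str):
        problems.append("auth_token must be a string")
    return problems


print(check_arguments({"dataset": "stanfordnlp/imdb"}))  # valid identifier
print(check_arguments({"dataset": "imdb"}))              # missing the owner part
```

Checking the identifier before calling the tool avoids a round trip for inputs that can never match, such as a bare dataset name without an owner.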
Implementation Reference
- src/dataset_viewer/server.py:472-492 (handler)
  Executes the get_info tool: queries the Hugging Face datasets-server /info endpoint with the dataset parameter and returns the response as formatted JSON, or a "not found" message on HTTP 404.

      if name == "get_info":
          dataset = arguments["dataset"]
          try:
              response = await DatasetViewerAPI(auth_token=auth_token).client.get("/info", params={"dataset": dataset})
              response.raise_for_status()
              result = response.json()
              return [
                  types.TextContent(
                      type="text",
                      text=json.dumps(result, indent=2)
                  )
              ]
          except httpx.HTTPStatusError as e:
              if e.response.status_code == 404:
                  return [
                      types.TextContent(
                          type="text",
                          text=f"Dataset '{dataset}' not found"
                      )
                  ]
              raise
- src/dataset_viewer/server.py:222-241 (registration)
  Registers the get_info tool in list_tools() with name, description, and input schema definition.

      name="get_info",
      description="Get detailed information about a Hugging Face dataset including description, features, splits, and statistics. Run validate first to check if the dataset exists and is accessible.",
      inputSchema={
          "type": "object",
          "properties": {
              "dataset": {
                  "type": "string",
                  "description": "Hugging Face dataset identifier in the format owner/dataset",
                  "pattern": "^[^/]+/[^/]+$",
                  "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
              },
              "auth_token": {
                  "type": "string",
                  "description": "Hugging Face auth token for private/gated datasets",
                  "optional": True
              }
          },
          "required": ["dataset"],
      }
      ),
- src/dataset_viewer/server.py:224-240 (schema)
  Input schema for the get_info tool, defining dataset (required, pattern-validated) and the optional auth_token.

      inputSchema={
          "type": "object",
          "properties": {
              "dataset": {
                  "type": "string",
                  "description": "Hugging Face dataset identifier in the format owner/dataset",
                  "pattern": "^[^/]+/[^/]+$",
                  "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
              },
              "auth_token": {
                  "type": "string",
                  "description": "Hugging Face auth token for private/gated datasets",
                  "optional": True
              }
          },
          "required": ["dataset"],
      }
- src/dataset_viewer/server.py:57-68 (helper)
  The DatasetViewerAPI.get_info helper method performs the core API call for dataset info and is reused in state caching.

      async def get_info(self, dataset: str) -> dict:
          """Get detailed information about a dataset"""
          try:
              # Get detailed dataset info
              response = await self.client.get("/info", params={"dataset": dataset})
              response.raise_for_status()
              return response.json()
          except httpx.HTTPStatusError as e:
              if e.response.status_code == 404:
                  raise ValueError(f"Dataset '{dataset}' not found")
              raise
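The handler and helper above share one pattern: call `/info` with a `dataset` query parameter, return the JSON on success, and turn HTTP 404 into a "not found" message while re-raising other errors. The sketch below reproduces that pattern with the standard library instead of `httpx`, so it can be exercised offline. Several things here are assumptions rather than facts from the source: the base URL (the source only shows the `/info` path), sending the token as a Bearer header (how `DatasetViewerAPI` uses `auth_token` is not shown), and the names `build_info_request`, `fetch_info_text`, and `opener`.

```python
import io
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumption: public datasets-server base URL; not stated in the source.
BASE_URL = "https://datasets-server.huggingface.co"


def build_info_request(dataset, auth_token=None):
    """Build the GET /info request for a dataset (hypothetical helper)."""
    query = urllib.parse.urlencode({"dataset": dataset})
    request = urllib.request.Request(f"{BASE_URL}/info?{query}")
    if auth_token:
        # Assumption: the token is sent as a Bearer authorization header.
        request.add_header("Authorization", f"Bearer {auth_token}")
    return request


def fetch_info_text(dataset, auth_token=None, opener=urllib.request.urlopen):
    """Return formatted JSON, or a not-found message on HTTP 404."""
    try:
        with opener(build_info_request(dataset, auth_token)) as response:
            return json.dumps(json.load(response), indent=2)
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return f"Dataset '{dataset}' not found"
        raise  # other HTTP errors propagate, as in the handler


# Offline demonstration with stand-in openers instead of a real network call.
def fake_ok(request):
    return io.BytesIO(b'{"dataset_info": {}}')


def fake_missing(request):
    raise urllib.error.HTTPError(request.full_url, 404, "Not Found", None, None)


print(fetch_info_text("ylecun/mnist", opener=fake_ok))
print(fetch_info_text("nobody/nothing", opener=fake_missing))
```

Passing the opener as a parameter is a design choice for testability only; leaving it at its default makes a real HTTP request. The `{"dataset_info": {}}` payload is a placeholder, not a claim about the real response shape.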