get_info
Retrieve detailed information about Hugging Face datasets including description, features, splits, and statistics. Validate dataset accessibility before fetching comprehensive metadata.
Instructions
Get detailed information about a Hugging Face dataset including description, features, splits, and statistics. Run validate first to check if the dataset exists and is accessible.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dataset | Yes | Hugging Face dataset identifier in the format owner/dataset | |
| auth_token | No | Hugging Face auth token for private/gated datasets | |
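
The table above can be exercised with a small client-side check before the tool is called. This is a minimal sketch, not part of the server: `build_get_info_arguments` is a hypothetical helper name, and the only piece taken from the source is the `owner/dataset` pattern that appears in the tool's input schema.

```python
import re

# Pattern copied from the "dataset" property of the tool's inputSchema.
DATASET_PATTERN = re.compile(r"^[^/]+/[^/]+$")


def build_get_info_arguments(dataset, auth_token=None):
    """Hypothetical helper: validate and assemble arguments for get_info."""
    if not DATASET_PATTERN.match(dataset):
        raise ValueError(f"Expected 'owner/dataset', got {dataset!r}")
    arguments = {"dataset": dataset}
    # auth_token is optional, so it is only included when provided.
    if auth_token is not None:
        arguments["auth_token"] = auth_token
    return arguments


print(build_get_info_arguments("ylecun/mnist"))
```

Rejecting a malformed identifier locally avoids a round trip to the server for input that the schema would not accept anyway.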
Implementation Reference
- src/dataset_viewer/server.py:472-492 (handler) The handler logic within the MCP tool dispatcher (@server.call_tool) that executes the get_info tool. It queries the Hugging Face datasets-server API /info endpoint directly and returns the dataset information as formatted JSON text content, handling 404 errors gracefully.

      if name == "get_info":
          dataset = arguments["dataset"]
          try:
              response = await DatasetViewerAPI(auth_token=auth_token).client.get("/info", params={"dataset": dataset})
              response.raise_for_status()
              result = response.json()
              return [
                  types.TextContent(
                      type="text",
                      text=json.dumps(result, indent=2)
                  )
              ]
          except httpx.HTTPStatusError as e:
              if e.response.status_code == 404:
                  return [
                      types.TextContent(
                          type="text",
                          text=f"Dataset '{dataset}' not found"
                      )
                  ]
              raise

- src/dataset_viewer/server.py:221-241 (registration) Registration of the 'get_info' MCP tool in the @server.list_tools() handler, defining the tool name, description, and input schema for dataset parameter.

      types.Tool(
          name="get_info",
          description="Get detailed information about a Hugging Face dataset including description, features, splits, and statistics. Run validate first to check if the dataset exists and is accessible.",
          inputSchema={
              "type": "object",
              "properties": {
                  "dataset": {
                      "type": "string",
                      "description": "Hugging Face dataset identifier in the format owner/dataset",
                      "pattern": "^[^/]+/[^/]+$",
                      "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
                  },
                  "auth_token": {
                      "type": "string",
                      "description": "Hugging Face auth token for private/gated datasets",
                      "optional": True
                  }
              },
              "required": ["dataset"],
          }
      ),

- src/dataset_viewer/server.py:224-240 (schema) Input schema definition for the get_info tool, specifying the required 'dataset' parameter and optional auth_token.

      inputSchema={
          "type": "object",
          "properties": {
              "dataset": {
                  "type": "string",
                  "description": "Hugging Face dataset identifier in the format owner/dataset",
                  "pattern": "^[^/]+/[^/]+$",
                  "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
              },
              "auth_token": {
                  "type": "string",
                  "description": "Hugging Face auth token for private/gated datasets",
                  "optional": True
              }
          },
          "required": ["dataset"],
      }

- src/dataset_viewer/server.py:57-68 (helper) Helper method in DatasetViewerAPI class that performs the core API call for dataset info (similar to tool handler logic), used for caching dataset state.

      async def get_info(self, dataset: str) -> dict:
          """Get detailed information about a dataset"""
          try:
              # Get detailed dataset info
              response = await self.client.get("/info", params={"dataset": dataset})
              response.raise_for_status()
              return response.json()
          except httpx.HTTPStatusError as e:
              if e.response.status_code == 404:
                  raise ValueError(f"Dataset '{dataset}' not found")
              raise
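
The request and response handling described in the entries above can be approximated with the standard library alone. This is a rough sketch under stated assumptions, not the server's code: `BASE_URL`, `build_info_request`, and `format_info_result` are illustrative names, the base URL is assumed to be the public datasets-server host (the source only shows the `/info` path), and sending the token as a Bearer header is an assumption about how `DatasetViewerAPI` uses `auth_token`.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public datasets-server host; the source only shows the "/info" path.
BASE_URL = "https://datasets-server.huggingface.co"


def build_info_request(dataset, auth_token=None):
    """Build (but do not send) a GET request for the /info endpoint."""
    query = urllib.parse.urlencode({"dataset": dataset})
    request = urllib.request.Request(f"{BASE_URL}/info?{query}")
    if auth_token:
        # Assumption: the token travels as a Bearer Authorization header.
        request.add_header("Authorization", f"Bearer {auth_token}")
    return request


def format_info_result(dataset, status_code, payload):
    """Mirror the handler's outcomes: not-found text, pretty JSON, or an error."""
    if status_code == 404:
        return f"Dataset '{dataset}' not found"
    if status_code >= 400:
        # The handler re-raises other HTTP errors; a plain exception stands in here.
        raise RuntimeError(f"Unexpected HTTP status {status_code}")
    return json.dumps(payload, indent=2)


request = build_info_request("ylecun/mnist")
print(request.full_url)
```

Keeping request construction separate from result formatting makes both pieces checkable without network access; an actual call would pass the request to `urllib.request.urlopen` or an HTTP client of choice.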