get_info

Retrieve detailed information about Hugging Face datasets including description, features, splits, and statistics. Validate dataset accessibility before fetching comprehensive metadata.

Instructions

Get detailed information about a Hugging Face dataset including description, features, splits, and statistics. Run validate first to check if the dataset exists and is accessible.

Input Schema

Name	Required	Description	Default
`dataset`	Yes	Hugging Face dataset identifier in the format owner/dataset
`auth_token`	No	Hugging Face auth token for private/gated datasets
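
The table above can be checked on the client side before calling the tool. The following is a minimal sketch, not part of the server: `check_arguments` is a hypothetical helper, and the regular expression is the `pattern` value from the tool's registered input schema.

```python
import re

# Pattern taken from the "dataset" property of the tool's input schema.
DATASET_PATTERN = re.compile(r"^[^/]+/[^/]+$")


def check_arguments(arguments: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means valid)."""
    problems = []

    # "dataset" is the only required argument.
    dataset = arguments.get("dataset")
    if not isinstance(dataset, str):
        problems.append("'dataset' is required and must be a string")
    elif not DATASET_PATTERN.fullmatch(dataset):
        problems.append("'dataset' must look like owner/dataset")

    # "auth_token" is optional, but must be a string when present.
    token = arguments.get("auth_token")
    if token is not None and not isinstance(token, str):
        problems.append("'auth_token' must be a string when provided")

    return problems


if __name__ == "__main__":
    print(check_arguments({"dataset": "ylecun/mnist"}))
    print(check_arguments({"dataset": "mnist"}))
```

An identifier without an owner, or with more than one slash, is rejected by the pattern.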

Implementation Reference

src/dataset_viewer/server.py:472-492 (handler)
The handler branch inside the MCP tool dispatcher (@server.call_tool) that executes the get_info tool. It queries the /info endpoint of the Hugging Face datasets-server API directly and returns the dataset information as indented JSON text content. A 404 response is turned into a "Dataset '...' not found" text message; any other HTTP error is re-raised.
    if name == "get_info":
        dataset = arguments["dataset"]
        try:
            response = await DatasetViewerAPI(auth_token=auth_token).client.get("/info", params={"dataset": dataset})
            response.raise_for_status()
            result = response.json()
            return [
                types.TextContent(
                    type="text",
                    text=json.dumps(result, indent=2)
                )
            ]
        except httpx.HTTPStatusError as e:
            if e.response.status_code == 404:
                return [
                    types.TextContent(
                        type="text",
                        text=f"Dataset '{dataset}' not found"
                    )
                ]
            raise
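
The request and the response handling in this branch can be illustrated without the MCP server or httpx. The sketch below uses only the standard library and makes two assumptions that the excerpt does not confirm: the base URL `https://datasets-server.huggingface.co`, and that the token is sent as an `Authorization: Bearer` header. `build_info_request` and `render_result` are hypothetical names.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: public datasets-server base URL; the excerpt only names "/info".
BASE_URL = "https://datasets-server.huggingface.co"


def build_info_request(dataset: str, auth_token: Optional[str] = None) -> Request:
    """Build (but do not send) a GET request for the /info endpoint."""
    query = urlencode({"dataset": dataset})
    request = Request(f"{BASE_URL}/info?{query}")
    if auth_token:
        # Assumption: the token is passed as a bearer token.
        request.add_header("Authorization", f"Bearer {auth_token}")
    return request


def render_result(status_code: int, body: str, dataset: str) -> str:
    """Mirror the handler's output: pretty JSON, or a not-found message on 404."""
    if status_code == 404:
        return f"Dataset '{dataset}' not found"
    if status_code >= 400:
        # The handler re-raises other HTTP errors; here a generic error stands in.
        raise RuntimeError(f"Unexpected HTTP status {status_code}")
    return json.dumps(json.loads(body), indent=2)


if __name__ == "__main__":
    req = build_info_request("ylecun/mnist")
    print(req.full_url)
    print(render_result(404, "", "ylecun/mnist"))
```

Keeping request construction separate from result rendering makes both halves easy to test without network access.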
src/dataset_viewer/server.py:221-241 (registration)
Registration of the 'get_info' MCP tool in the @server.list_tools() handler, defining the tool name, description, and input schema (the required dataset parameter and the optional auth_token).
    types.Tool(
        name="get_info",
        description="Get detailed information about a Hugging Face dataset including description, features, splits, and statistics. Run validate first to check if the dataset exists and is accessible.",
        inputSchema={
            "type": "object",
            "properties": {
                "dataset": {
                    "type": "string",
                    "description": "Hugging Face dataset identifier in the format owner/dataset",
                    "pattern": "^[^/]+/[^/]+$",
                    "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
                },
                "auth_token": {
                    "type": "string",
                    "description": "Hugging Face auth token for private/gated datasets",
                    "optional": True
                }
            },
            "required": ["dataset"],
        }
    ),
src/dataset_viewer/server.py:224-240 (schema)
Input schema definition for the get_info tool, specifying the required 'dataset' parameter and optional auth_token.
    inputSchema={
        "type": "object",
        "properties": {
            "dataset": {
                "type": "string",
                "description": "Hugging Face dataset identifier in the format owner/dataset",
                "pattern": "^[^/]+/[^/]+$",
                "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
            },
            "auth_token": {
                "type": "string",
                "description": "Hugging Face auth token for private/gated datasets",
                "optional": True
            }
        },
        "required": ["dataset"],
    }
src/dataset_viewer/server.py:57-68 (helper)
Helper method in the DatasetViewerAPI class that performs the same /info API call as the tool handler and is used for caching dataset state. Unlike the handler, it raises ValueError on a 404 instead of returning a text message.
    async def get_info(self, dataset: str) -> dict:
        """Get detailed information about a dataset"""
        try:
            # Get detailed dataset info
            response = await self.client.get("/info", params={"dataset": dataset})
            response.raise_for_status()
            return response.json()
        except httpx.HTTPStatusError as e:
            if e.response.status_code == 404:
                raise ValueError(f"Dataset '{dataset}' not found")
            raise
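
Because this helper signals a missing dataset with ValueError, a caller has to catch that exception to produce a readable message. The following sketch shows one way a caller might do so. `StubAPI` is a hypothetical, network-free stand-in that only imitates the helper's contract (return a dict, or raise ValueError when the dataset is unknown), and `info_as_text` is likewise an illustrative name.

```python
import asyncio
import json


class StubAPI:
    """Hypothetical stand-in for DatasetViewerAPI; performs no network access."""

    def __init__(self, known: dict):
        self._known = known

    async def get_info(self, dataset: str) -> dict:
        # Same contract as the helper: dict on success, ValueError when missing.
        if dataset not in self._known:
            raise ValueError(f"Dataset '{dataset}' not found")
        return self._known[dataset]


async def info_as_text(api, dataset: str) -> str:
    """Return pretty-printed JSON, or the error message if the dataset is missing."""
    try:
        info = await api.get_info(dataset)
    except ValueError as exc:
        return str(exc)
    return json.dumps(info, indent=2)


if __name__ == "__main__":
    api = StubAPI({"ylecun/mnist": {"splits": ["train", "test"]}})
    print(asyncio.run(info_as_text(api, "ylecun/mnist")))
    print(asyncio.run(info_as_text(api, "nobody/nothing")))
```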

Dataset Viewer MCP Server