Dataset Viewer MCP Server

by privetin

get_info

Retrieve detailed information about Hugging Face datasets including description, features, splits, and statistics. Validate dataset accessibility before fetching comprehensive metadata.

Instructions

Get detailed information about a Hugging Face dataset including description, features, splits, and statistics. Run validate first to check if the dataset exists and is accessible.

Input Schema

Name        Required  Description                                                  Default
dataset     Yes       Hugging Face dataset identifier in the format owner/dataset  -
auth_token  No        Hugging Face auth token for private/gated datasets           -
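
As a rough illustration of the schema above, the sketch below checks a set of arguments on the client side before the tool is called. The regular expression is the `pattern` value from the tool's registered input schema; the function name `check_get_info_args` and its error messages are hypothetical and are not part of the server.

```python
import re

# Pattern copied from the tool's input schema: owner and dataset
# separated by exactly one "/".
DATASET_PATTERN = re.compile(r"^[^/]+/[^/]+$")


def check_get_info_args(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid.

    Hypothetical helper for illustration only, not part of the server.
    """
    problems = []

    dataset = arguments.get("dataset")
    if dataset is None:
        problems.append("'dataset' is required")
    elif not isinstance(dataset, str) or not DATASET_PATTERN.match(dataset):
        problems.append("'dataset' must look like owner/dataset")

    # auth_token is optional, but must be a string when present.
    token = arguments.get("auth_token")
    if token is not None and not isinstance(token, str):
        problems.append("'auth_token' must be a string when provided")

    return problems
```

A caller might run this check first and only invoke get_info when the returned list is empty.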

Implementation Reference

  • The handler logic within the MCP tool dispatcher (@server.call_tool) that executes the get_info tool. It queries the /info endpoint of the Hugging Face datasets-server API directly and returns the dataset information as formatted JSON text content. An HTTP 404 is turned into a "not found" message; any other HTTP error is re-raised.
    if name == "get_info":
        dataset = arguments["dataset"]
        try:
            # Query the /info endpoint for the requested dataset
            response = await DatasetViewerAPI(auth_token=auth_token).client.get("/info", params={"dataset": dataset})
            response.raise_for_status()
            result = response.json()
            return [
                types.TextContent(
                    type="text",
                    text=json.dumps(result, indent=2)
                )
            ]
        except httpx.HTTPStatusError as e:
            # A 404 becomes a readable message; other errors propagate
            if e.response.status_code == 404:
                return [
                    types.TextContent(
                        type="text",
                        text=f"Dataset '{dataset}' not found"
                    )
                ]
            raise
  • Registration of the 'get_info' MCP tool in the @server.list_tools() handler, defining the tool name, description, and input schema for the dataset and auth_token parameters.
    types.Tool(
        name="get_info",
        description="Get detailed information about a Hugging Face dataset including description, features, splits, and statistics. Run validate first to check if the dataset exists and is accessible.",
        inputSchema={
            "type": "object",
            "properties": {
                "dataset": {
                    "type": "string",
                    "description": "Hugging Face dataset identifier in the format owner/dataset",
                    "pattern": "^[^/]+/[^/]+$",
                    "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
                },
                "auth_token": {
                    "type": "string",
                    "description": "Hugging Face auth token for private/gated datasets",
                    "optional": True
                }
            },
            "required": ["dataset"],
        }
    ),
  • Input schema definition for the get_info tool, specifying the required 'dataset' parameter and optional auth_token.
    inputSchema={
        "type": "object",
        "properties": {
            "dataset": {
                "type": "string",
                "description": "Hugging Face dataset identifier in the format owner/dataset",
                "pattern": "^[^/]+/[^/]+$",
                "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
            },
            "auth_token": {
                "type": "string",
                "description": "Hugging Face auth token for private/gated datasets",
                "optional": True
            }
        },
        "required": ["dataset"],
    }
  • Helper method in the DatasetViewerAPI class that performs the same /info call as the tool handler, used for caching dataset state. Unlike the handler, it raises a ValueError on HTTP 404 instead of returning a text message.
    async def get_info(self, dataset: str) -> dict:
        """Get detailed information about a dataset"""
        try:
            # Get detailed dataset info
            response = await self.client.get("/info", params={"dataset": dataset})
            response.raise_for_status()
            return response.json()
        except httpx.HTTPStatusError as e:
            if e.response.status_code == 404:
                raise ValueError(f"Dataset '{dataset}' not found")
            raise
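
The handler and the helper differ mainly in how they report an HTTP 404. The sketch below isolates the handler's response mapping as a pure function so it can be exercised without network access. The name `info_result_text` is hypothetical, and `RuntimeError` is used here only as a stand-in for the `httpx.HTTPStatusError` that the real code re-raises.

```python
import json


def info_result_text(status_code: int, payload: dict, dataset: str) -> str:
    """Return the text the get_info handler would send for a given response.

    Hypothetical helper for illustration; it mirrors the handler's
    visible behavior but is not part of the server.
    """
    if status_code == 404:
        # Same message as the handler's 404 branch.
        return f"Dataset '{dataset}' not found"
    if status_code >= 400:
        # Stand-in for the re-raised httpx.HTTPStatusError.
        raise RuntimeError(f"HTTP {status_code} from /info")
    # Success: pretty-printed JSON, as in the handler.
    return json.dumps(payload, indent=2)
```

For example, `info_result_text(404, {}, "owner/missing")` yields the not-found message, while a 200 response yields the payload serialized with two-space indentation.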

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/privetin/dataset-viewer'
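
The same request can be sketched in Python using only the standard library. This assumes the endpoint returns a JSON body, which the page does not state explicitly; the `Accept` header and the function names `build_request` and `fetch_server_entry` are likewise illustrative assumptions.

```python
import json
import urllib.request

# URL taken from the curl command above.
SERVER_URL = "https://glama.ai/api/mcp/v1/servers/privetin/dataset-viewer"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build the GET request; the Accept header is an assumption."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_entry(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_server_entry()` performs the network request; `build_request()` alone can be used to inspect the URL and headers first.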

If you have feedback or need assistance with the MCP directory API, please join our Discord server.