Dataset Viewer MCP Server

by privetin

get_first_rows

Retrieve initial rows from a Hugging Face dataset split to preview data structure and content for analysis or validation.

Instructions

Get first rows from a Hugging Face dataset split

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| dataset | Yes | Hugging Face dataset identifier in the format owner/dataset | |
| config | Yes | Dataset configuration/subset name. Use get_info to list available configs | |
| split | Yes | Dataset split name. Splits partition the data for training/evaluation | |
| auth_token | No | Hugging Face auth token for private/gated datasets | |
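
As a rough illustration of how a client might check a payload before calling the tool, the sketch below applies the rules from the input schema: three required arguments and a dataset identifier of the form owner/dataset. The helper name `check_first_rows_args` is hypothetical and not part of the server, and the example values are borrowed from the schema's own examples, so they may not pair up as a real dataset/config combination.

```python
import re

# Required argument names, taken from the tool's input schema.
REQUIRED = ("dataset", "config", "split")

# owner/dataset: exactly one slash with non-empty text on both sides.
DATASET_PATTERN = re.compile(r"^[^/]+/[^/]+$")


def check_first_rows_args(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid.

    Hypothetical helper for illustration only; the server does not ship it.
    """
    problems = [
        f"missing required argument: {name}"
        for name in REQUIRED
        if name not in arguments
    ]
    dataset = arguments.get("dataset")
    if isinstance(dataset, str) and not DATASET_PATTERN.match(dataset):
        problems.append("dataset must look like owner/dataset")
    return problems


if __name__ == "__main__":
    payload = {"dataset": "ylecun/mnist", "config": "default", "split": "train"}
    print(check_first_rows_args(payload))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass; `auth_token` is deliberately not checked because the schema does not require it.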

Implementation Reference

  • MCP tool handler logic for 'get_first_rows': extracts the arguments, instantiates DatasetViewerAPI, calls its get_first_rows method, and returns the result as JSON-formatted TextContent.
    elif name == "get_first_rows":
        # Required arguments, as declared in the tool's input schema.
        dataset = arguments["dataset"]
        config = arguments["config"]
        split = arguments["split"]
        # auth_token is not assigned in this excerpt; it is presumably read
        # from the arguments earlier in the handler.
        first_rows = await DatasetViewerAPI(auth_token=auth_token).get_first_rows(dataset, config=config, split=split)
        # Serialize the API response as pretty-printed JSON text.
        return [
            types.TextContent(
                type="text",
                text=json.dumps(first_rows, indent=2)
            )
        ]
  • Tool registration inside the @server.list_tools() handler, defining the name, description, and input schema for 'get_first_rows'.
    types.Tool(
        name="get_first_rows",
        description="Get first rows from a Hugging Face dataset split",
        inputSchema={
            "type": "object",
            "properties": {
                "dataset": {
                    "type": "string",
                    "description": "Hugging Face dataset identifier in the format owner/dataset",
                    "pattern": "^[^/]+/[^/]+$",
                    "examples": ["ylecun/mnist", "stanfordnlp/imdb"]
                },
                "config": {
                    "type": "string",
                    "description": "Dataset configuration/subset name. Use get_info to list available configs",
                    "examples": ["default", "en", "es"]
                },
                "split": {
                    "type": "string",
                    "description": "Dataset split name. Splits partition the data for training/evaluation",
                    "examples": ["train", "validation", "test"]
                },
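                "auth_token": {
                    "type": "string",
                    "description": "Hugging Face auth token for private/gated datasets",
                    # Note: "optional" is not a standard JSON Schema keyword; the
                    # field is optional because it is absent from "required".
                    "optional": True
                }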
            },
            "required": ["dataset", "config", "split"],
        }
    ),
  • Core helper method on the DatasetViewerAPI class that sends an HTTP GET request to the '/first-rows' endpoint of the Hugging Face datasets-server and returns the first rows of the specified dataset, config, and split.
    async def get_first_rows(self, dataset: str, config: str, split: str) -> dict:
        """Get the first few rows of a dataset split."""
        # All three values are sent as query-string parameters.
        params = {
            "dataset": dataset,
            "config": config,
            "split": split
        }
        response = await self.client.get("/first-rows", params=params)
        # Raise on 4xx/5xx responses instead of returning an error body.
        response.raise_for_status()
        return response.json()
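
To make the request itself concrete, here is a minimal standard-library sketch of the same call. It assumes the public datasets-server base URL `https://datasets-server.huggingface.co` and a `Bearer` authorization header, neither of which appears in the excerpts above. The names `build_first_rows_request` and `fetch_first_rows` are hypothetical, and the real server uses an async HTTP client rather than `urllib`.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumption: public datasets-server base URL (not shown in the excerpts).
BASE_URL = "https://datasets-server.huggingface.co"


def build_first_rows_request(
    dataset: str,
    config: str,
    split: str,
    auth_token: Optional[str] = None,
) -> urllib.request.Request:
    """Build the GET request for /first-rows without sending it."""
    query = urllib.parse.urlencode(
        {"dataset": dataset, "config": config, "split": split}
    )
    request = urllib.request.Request(f"{BASE_URL}/first-rows?{query}", method="GET")
    if auth_token:
        # Assumption: the token is sent as a Bearer header for private/gated datasets.
        request.add_header("Authorization", f"Bearer {auth_token}")
    return request


def fetch_first_rows(
    dataset: str,
    config: str,
    split: str,
    auth_token: Optional[str] = None,
) -> dict:
    """Send the request and decode the JSON body.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, which plays
    the role of raise_for_status() in the excerpt above.
    """
    request = build_first_rows_request(dataset, config, split, auth_token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the URL and header logic checkable without network access; calling `fetch_first_rows` performs a live HTTP request.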

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/privetin/dataset-viewer'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.