This server allows browsing, analyzing, searching, filtering, and retrieving data from Hugging Face datasets.
Browse datasets: Access datasets using the dataset:// URI scheme, including configurations and splits.
Validate datasets: Check if a dataset exists and is accessible.
Get detailed info: Retrieve dataset descriptions, features, splits, and statistics.
Paginate rows: Fetch dataset contents in paginated chunks.
Fetch first rows: Retrieve the initial rows from a dataset split.
Extract statistics: Analyze dataset statistics for splits.
Search dataset: Find text within a dataset using search queries.
Filter rows: Apply SQL-like conditions to filter dataset rows.
Download datasets: Export entire datasets in Parquet format.
Handle authentication: Access private datasets using Hugging Face authentication tokens.
Dataset Viewer MCP Server
An MCP server for interacting with the Hugging Face Dataset Viewer API, providing capabilities to browse and analyze datasets hosted on the Hugging Face Hub.
Features
Resources
Uses dataset:// URI scheme for accessing Hugging Face datasets
Supports dataset configurations and splits
Provides paginated access to dataset contents
Handles authentication for private datasets
Supports searching and filtering dataset contents
Provides dataset statistics and analysis
Tools
The server provides the following tools:
validate
Check if a dataset exists and is accessible
Parameters:
dataset: Dataset identifier (e.g. 'stanfordnlp/imdb')
auth_token (optional): For private datasets
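As a rough illustration of what a validity check involves, the sketch below builds a request for the Dataset Viewer API's `/is-valid` endpoint. It assumes the `validate` tool maps onto that endpoint, which the text above does not confirm. The helper name `build_validate_request` is hypothetical and not part of this server.

```python
from typing import Optional
from urllib.parse import urlencode

# Public base URL of the Hugging Face Dataset Viewer API.
API_BASE = "https://datasets-server.huggingface.co"


def build_validate_request(dataset: str, auth_token: Optional[str] = None):
    """Return (url, headers) for an /is-valid check (hypothetical helper)."""
    # urlencode percent-encodes the slash in identifiers such as "owner/name".
    url = f"{API_BASE}/is-valid?{urlencode({'dataset': dataset})}"
    # Private datasets need a bearer token; public ones need no header at all.
    headers = {"Authorization": f"Bearer {auth_token}"} if auth_token else {}
    return url, headers


url, headers = build_validate_request("stanfordnlp/imdb")
print(url)
```

The returned pair can be passed to any HTTP client, for example `urllib.request.Request(url, headers=headers)`.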
get_info
Get detailed information about a dataset
Parameters:
dataset: Dataset identifier
auth_token (optional): For private datasets
get_rows
Get paginated contents of a dataset
Parameters:
dataset: Dataset identifier
config: Configuration name
split: Split name
page (optional): Page number (0-based)
auth_token (optional): For private datasets
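The Dataset Viewer API's `/rows` endpoint takes `offset` and `length` rather than a page number, so a 0-based `page` has to be translated. The sketch below shows one way to do that, assuming a fixed page size of 100 rows (the maximum `/rows` accepts per request). The helper name and the example config and split values are illustrative, and the server's actual page size may differ.

```python
from urllib.parse import urlencode

API_BASE = "https://datasets-server.huggingface.co"
PAGE_SIZE = 100  # assumption: one page equals the API's 100-row maximum


def build_rows_url(dataset: str, config: str, split: str, page: int = 0) -> str:
    """Translate a 0-based page number into an offset/length query."""
    if page < 0:
        raise ValueError("page must be 0 or greater")
    params = {
        "dataset": dataset,
        "config": config,
        "split": split,
        "offset": page * PAGE_SIZE,  # page 0 -> rows 0..99, page 1 -> 100..199
        "length": PAGE_SIZE,
    }
    return f"{API_BASE}/rows?{urlencode(params)}"


# Example values only; config and split names depend on the dataset.
rows_url = build_rows_url("stanfordnlp/imdb", "plain_text", "train", page=2)
print(rows_url)
```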
get_first_rows
Get first rows from a dataset split
Parameters:
dataset: Dataset identifier
config: Configuration name
split: Split name
auth_token (optional): For private datasets
get_statistics
Get statistics about a dataset split
Parameters:
dataset: Dataset identifier
config: Configuration name
split: Split name
auth_token (optional): For private datasets
search_dataset
Search for text within a dataset
Parameters:
dataset: Dataset identifier
config: Configuration name
split: Split name
query: Text to search for
auth_token (optional): For private datasets
filter
Filter rows using SQL-like conditions
Parameters:
dataset: Dataset identifier
config: Configuration name
split: Split name
where: SQL WHERE clause (e.g. "score > 0.5")
orderby (optional): SQL ORDER BY clause
page (optional): Page number (0-based)
auth_token (optional): For private datasets
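The `where` and `orderby` values contain spaces and operators, so they must be URL-encoded before they reach the Dataset Viewer API's `/filter` endpoint. The following is a minimal sketch of that encoding. It assumes the same 100-row page size as the API's per-request maximum, and `build_filter_url`, the `score` column, and the config and split names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://datasets-server.huggingface.co"
PAGE_SIZE = 100  # assumption: one page equals the API's 100-row maximum


def build_filter_url(dataset: str, config: str, split: str, where: str,
                     orderby: Optional[str] = None, page: int = 0) -> str:
    """Build a /filter query with an encoded WHERE and optional ORDER BY."""
    params = {"dataset": dataset, "config": config, "split": split, "where": where}
    if orderby:
        # Only send orderby when the caller supplied one.
        params["orderby"] = orderby
    params["offset"] = page * PAGE_SIZE
    params["length"] = PAGE_SIZE
    # urlencode escapes spaces as "+" and ">" as "%3E".
    return f"{API_BASE}/filter?{urlencode(params)}"


filter_url = build_filter_url(
    "stanfordnlp/imdb", "plain_text", "train",
    where="score > 0.5", orderby="score DESC", page=1,
)
print(filter_url)
```

Building the query with `urlencode` rather than string concatenation avoids malformed requests when a condition contains quotes or comparison operators.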
get_parquet
Download entire dataset in Parquet format
Parameters:
dataset: Dataset identifier
auth_token (optional): For private datasets
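The Dataset Viewer API's `/parquet` endpoint responds with a `parquet_files` list whose entries carry `config`, `split`, and `url` fields. Assuming the `get_parquet` tool relies on that response (not confirmed above), the sketch below groups the file URLs by configuration and split. The helper name is hypothetical and the sample payload uses made-up placeholder URLs purely for illustration.

```python
from collections import defaultdict


def group_parquet_urls(payload: dict) -> dict:
    """Group Parquet file URLs by (config, split)."""
    grouped = defaultdict(list)
    for entry in payload.get("parquet_files", []):
        grouped[(entry["config"], entry["split"])].append(entry["url"])
    return dict(grouped)


# Illustrative payload with placeholder URLs, not real API output.
sample = {
    "parquet_files": [
        {"config": "default", "split": "train", "url": "https://example.com/train/0000.parquet"},
        {"config": "default", "split": "train", "url": "https://example.com/train/0001.parquet"},
        {"config": "default", "split": "test", "url": "https://example.com/test/0000.parquet"},
    ]
}

files = group_parquet_urls(sample)
for (config, split), urls in files.items():
    print(config, split, len(urls))
```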
Installation
Prerequisites
Python 3.12 or higher
uv - Fast Python package installer and resolver
Setup
Clone the repository:
Create a virtual environment and install:
Configuration
Environment Variables
HUGGINGFACE_TOKEN: Your Hugging Face API token for accessing private datasets
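A minimal sketch of how a client might turn this variable into request headers is shown below. It assumes the token is sent as a standard `Authorization: Bearer` header, which is how the Hugging Face Hub accepts user tokens. The `auth_headers` helper is hypothetical and only illustrates the idea.

```python
import os
from typing import Mapping, Optional


def auth_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return bearer-token headers if HUGGINGFACE_TOKEN is set, else {}."""
    # Accepting a mapping makes the helper easy to test without touching os.environ.
    source = os.environ if env is None else env
    token = source.get("HUGGINGFACE_TOKEN")
    return {"Authorization": f"Bearer {token}"} if token else {}


# Public datasets work without a token, so an empty dict is a valid result.
print(auth_headers({}))
```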
Claude Desktop Integration
Add the following to your Claude Desktop config file:
On Windows: %APPDATA%\Claude\claude_desktop_config.json
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
License
MIT License - see LICENSE for details