Hugging Face Hub MCP Server

by michaelwaves

hf_list_datasets

Search, filter, and retrieve detailed metadata for datasets on the Hugging Face Hub, including downloads, likes, and tags. Refine results by author, search terms, or tags for targeted exploration.

Instructions

Get information from all datasets in the Hub. Supports filtering by search terms, authors, tags, and more. Returns paginated results with dataset metadata including downloads, likes, and tags.

Input Schema

Name | Required | Description | Default
author | No | Filter datasets by author or organization (e.g., 'huggingface', 'microsoft') | (none)
config | No | Whether to also fetch the repo config | (none)
direction | No | Sort direction: '-1' for descending, anything else for ascending | (none)
filter | No | Filter based on tags (e.g., 'task_categories:text-classification', 'languages:en') | (none)
full | No | Whether to fetch most dataset data including all tags and files | (none)
limit | No | Limit the number of datasets fetched | (none)
search | No | Filter based on substrings for repos and their usernames (e.g., 'pets', 'microsoft') | (none)
sort | No | Property to use when sorting (e.g., 'downloads', 'author') | (none)
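
As a rough illustration of how these optional parameters might be combined, the sketch below defines a type mirroring the schema and serializes a sample argument object into a query string. The `DatasetSearchArgs` interface and the `toQueryString` helper are assumptions made for this example and are not part of the server's source. The server itself hands the arguments to Axios, which does its own serialization.

```typescript
// Hypothetical type mirroring the input schema; every field is optional.
interface DatasetSearchArgs {
  search?: string;
  author?: string;
  filter?: string;
  sort?: string;
  direction?: string;
  limit?: number;
  full?: boolean;
  config?: boolean;
}

// Hypothetical helper: turn the defined fields into a URL query string.
function toQueryString(args: DatasetSearchArgs): string {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(args)) {
    if (value !== undefined) {
      params.set(key, String(value));
    }
  }
  return params.toString();
}

// Five most-downloaded datasets from one author, in descending order.
const exampleArgs: DatasetSearchArgs = {
  author: "huggingface",
  sort: "downloads",
  direction: "-1",
  limit: 5,
};

console.log(toQueryString(exampleArgs));
// prints "author=huggingface&sort=downloads&direction=-1&limit=5"
```

Because no field is required, an empty object is also a valid call and yields an empty query string.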

Implementation Reference

  • The MCP tool handler for 'hf_list_datasets': validates arguments using isDatasetSearchArgs, calls client.getDatasets(), and formats the CallToolResult.
    export async function handleListDatasets(
      client: HuggingFaceClient,
      args: unknown
    ): Promise<CallToolResult> {
      try {
        if (!isDatasetSearchArgs(args)) {
          throw new Error("Invalid arguments for hf_list_datasets");
        }
        const results = await client.getDatasets(args as Record<string, any>);
        return {
          content: [{ type: "text", text: results }],
          isError: false,
        };
      } catch (error) {
        return {
          content: [
            {
              type: "text",
              text: `Error: ${error instanceof Error ? error.message : String(error)}`,
            },
          ],
          isError: true,
        };
      }
    }
  • The tool definition for 'hf_list_datasets' including name, description, and detailed inputSchema for filtering and pagination parameters.
    export const listDatasetsToolDefinition: Tool = {
      name: "hf_list_datasets",
      description:
        "Get information from all datasets in the Hub. Supports filtering by search terms, authors, tags, and more. " +
        "Returns paginated results with dataset metadata including downloads, likes, and tags.",
      inputSchema: {
        type: "object",
        properties: {
          search: {
            type: "string",
            description: "Filter based on substrings for repos and their usernames (e.g., 'pets', 'microsoft')"
          },
          author: {
            type: "string",
            description: "Filter datasets by author or organization (e.g., 'huggingface', 'microsoft')"
          },
          filter: {
            type: "string",
            description: "Filter based on tags (e.g., 'task_categories:text-classification', 'languages:en')"
          },
          sort: {
            type: "string",
            description: "Property to use when sorting (e.g., 'downloads', 'author')"
          },
          direction: {
            type: "string",
            description: "Sort direction: '-1' for descending, anything else for ascending"
          },
          limit: {
            type: "number",
            description: "Limit the number of datasets fetched"
          },
          full: {
            type: "boolean",
            description: "Whether to fetch most dataset data including all tags and files"
          },
          config: {
            type: "boolean",
            description: "Whether to also fetch the repo config"
          }
        },
        required: []
      }
    };
  • Core helper method in HuggingFaceClient: performs HTTP GET to Hugging Face Hub API '/api/datasets' endpoint with query params, returns pretty-printed JSON string of the response data.
    async getDatasets(params: Record<string, any> = {}): Promise<string> {
      try {
        const response: AxiosResponse = await this.httpClient.get('/api/datasets', { params });
        return JSON.stringify(response.data, null, 2);
      } catch (error) {
        throw new Error(`Failed to fetch datasets: ${error instanceof Error ? error.message : String(error)}`);
      }
    }
  • src/server.ts:81-82 (registration)
    Registration and dispatch of 'hf_list_datasets' handler in the main HuggingFaceServer's CallToolRequestHandler switch statement.
    case 'hf_list_datasets':
      return handleListDatasets(this.client, args);
  • src/server.ts:55-66 (registration)
    Registration of 'hf_list_datasets' tool definition (listDatasetsToolDefinition) in the ListToolsRequestHandler for tool discovery.
    this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        listModelsToolDefinition,
        getModelInfoToolDefinition,
        getModelTagsToolDefinition,
        listDatasetsToolDefinition,
        getDatasetInfoToolDefinition,
        getDatasetParquetToolDefinition,
        getCroissantToolDefinition,
        getDatasetTagsToolDefinition
      ],
    }));
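
The handler above relies on `isDatasetSearchArgs`, whose body is not shown on this page. As a minimal sketch of what such a type guard could look like, assuming it only checks property types against the inputSchema, the hypothetical `isDatasetSearchArgsSketch` below accepts plain objects whose known keys carry the expected primitive type. The real guard may be stricter or more lenient, for example about unknown keys.

```typescript
// Expected primitive type for each property, copied from the inputSchema.
const expectedTypes: Record<string, "string" | "number" | "boolean"> = {
  search: "string",
  author: "string",
  filter: "string",
  sort: "string",
  direction: "string",
  limit: "number",
  full: "boolean",
  config: "boolean",
};

// Hypothetical guard (an assumption, not the server's actual code).
function isDatasetSearchArgsSketch(
  args: unknown
): args is Record<string, string | number | boolean> {
  // Reject primitives, null, and arrays: only plain objects qualify.
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return false;
  }
  return Object.entries(args).every(([key, value]) => {
    const expected = expectedTypes[key];
    // This sketch rejects unknown keys and keys with the wrong type.
    return expected !== undefined && typeof value === expected;
  });
}

console.log(isDatasetSearchArgsSketch({ author: "microsoft", limit: 10 })); // true
console.log(isDatasetSearchArgsSketch({ limit: "10" })); // false
```

Validating before calling `client.getDatasets()` keeps malformed input from reaching the Hub API, and the handler's `catch` block turns the thrown error into a result with `isError: true`.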

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/michaelwaves/hf-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.