list_parquet_files
Retrieve direct download URLs for Parquet files from Hugging Face datasets to enable data processing and analysis.
Instructions
Get URLs for the dataset's Parquet files for direct download or processing
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dataset | Yes | Dataset ID (e.g., 'stanfordnlp/imdb') | |
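To make the input schema concrete, here is a minimal sketch of the JSON-RPC message an MCP client would send to invoke this tool. The `tools/call` method and the `name`/`arguments` parameters come from the Model Context Protocol; the `ToolCallRequest` type and the `buildListParquetFilesRequest` helper are hypothetical names introduced only for illustration and are not part of this server.

```typescript
// Shape of an MCP "tools/call" JSON-RPC request (only the fields used here).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper: builds the request for list_parquet_files.
// `dataset` is the only argument and is required by the input schema.
function buildListParquetFilesRequest(
  dataset: string,
  id: number = 1
): ToolCallRequest {
  if (dataset.trim() === "") {
    throw new Error("dataset must be a non-empty dataset ID");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "list_parquet_files",
      arguments: { dataset },
    },
  };
}

// Example: request the Parquet file list for the dataset ID used in the schema.
const exampleRequest = buildListParquetFilesRequest("stanfordnlp/imdb");
console.log(JSON.stringify(exampleRequest, null, 2));
```

In practice an MCP client library usually builds this message for you; the sketch only shows which values the tool expects.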
Implementation Reference
- src/tools/list-parquet-files.ts:26-40 (handler) The handler function that executes the logic for listing Parquet files for a given dataset.

  ```typescript
  async ({ dataset }) => {
    const data = await fetchDatasetViewer<ParquetResponse>("/parquet", {
      dataset,
    });
    return {
      content: [
        {
          type: "text" as const,
          text: JSON.stringify(data.parquet_files, null, 2),
        },
      ],
    };
  }
  );
  ```

- src/tools/list-parquet-files.ts:19-41 (registration) Registration function that defines the "list_parquet_files" tool and its parameters.

  ```typescript
  export function registerListParquetFiles(server: McpServer) {
    server.tool(
      "list_parquet_files",
      "Get URLs for the dataset's Parquet files for direct download or processing",
      {
        dataset: z.string().describe("Dataset ID (e.g., 'stanfordnlp/imdb')"),
      },
      async ({ dataset }) => {
        const data = await fetchDatasetViewer<ParquetResponse>("/parquet", {
          dataset,
        });
        return {
          content: [
            {
              type: "text" as const,
              text: JSON.stringify(data.parquet_files, null, 2),
            },
          ],
        };
      }
    );
  }
  ```

- src/tools/list-parquet-files.ts:5-17 (schema) Type definition for the response structure returned by the Parquet file API.

  ```typescript
  interface ParquetResponse {
    parquet_files: Array<{
      dataset: string;
      config: string;
      split: string;
      url: string;
      filename: string;
      size: number;
    }>;
    pending: unknown[];
    failed: unknown[];
    partial: boolean;
  }
  ```
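Because the handler returns `parquet_files` as pretty-printed JSON text, a caller has to parse that text before using the URLs. The following is a minimal sketch of one way to do so, grouping files by config and split and totaling their sizes. `ParquetFile`, `SplitSummary`, and `summarizeBySplit` are hypothetical names, and the sample entries use made-up placeholder URLs and sizes rather than real output.

```typescript
// Mirrors one entry of `parquet_files` in the ParquetResponse interface.
interface ParquetFile {
  dataset: string;
  config: string;
  split: string;
  url: string;
  filename: string;
  size: number;
}

interface SplitSummary {
  urls: string[];
  totalBytes: number;
}

// Hypothetical helper: groups files under a "config/split" key.
function summarizeBySplit(files: ParquetFile[]): Map<string, SplitSummary> {
  const summary = new Map<string, SplitSummary>();
  for (const file of files) {
    const key = `${file.config}/${file.split}`;
    const entry = summary.get(key) ?? { urls: [], totalBytes: 0 };
    entry.urls.push(file.url);
    entry.totalBytes += file.size;
    summary.set(key, entry);
  }
  return summary;
}

// Stand-in for the tool's text output (illustrative values only).
const toolText: string = JSON.stringify([
  {
    dataset: "example/dataset",
    config: "default",
    split: "train",
    url: "https://example.com/train/0000.parquet",
    filename: "0000.parquet",
    size: 1000,
  },
  {
    dataset: "example/dataset",
    config: "default",
    split: "test",
    url: "https://example.com/test/0000.parquet",
    filename: "0000.parquet",
    size: 400,
  },
]);

const parsedFiles = JSON.parse(toolText) as ParquetFile[];
summarizeBySplit(parsedFiles).forEach((entry, key) => {
  console.log(`${key}: ${entry.urls.length} file(s), ${entry.totalBytes} bytes`);
});
```

Keying on both `config` and `split` avoids merging splits that share a name across different configs. Note that the handler drops the `pending`, `failed`, and `partial` fields, so a caller working only from the tool output cannot tell whether the list is complete.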