
deepghs_generate_cheesechaser_script

Generate a Python script to selectively download specific images from indexed HuggingFace datasets without downloading entire archives, using post IDs to extract only needed files.

Instructions

Generate a cheesechaser Python script to download images from an indexed DeepGHS dataset.

cheesechaser is DeepGHS's tool for selectively downloading images from HuggingFace datasets that are stored as indexed tar archives. Instead of downloading entire multi-GB tar files, you provide a list of post IDs and it extracts only those images.

This avoids fetching whole archives when you only need specific images from large datasets such as:

  • deepghs/danbooru2024 (~8M images, hundreds of GB total)

  • deepghs/gelbooru-webp-4Mpixel (millions of images)

  • deepghs/sankaku_full (millions of images)

Args: params (GenerateCheesechaserScriptInput):

  • repo_id (str): HF dataset repo ID (e.g. 'deepghs/danbooru2024')

  • output_dir (str): Local directory to save downloaded images

  • post_ids (Optional[list[int]]): Specific post IDs to download

  • max_workers (int): Parallel download threads (1–16, default: 4)
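As a rough illustration of the input shape described above, the fields can be modelled with a standard-library dataclass. This is only a sketch: the real GenerateCheesechaserScriptInput is not shown on this page and may be defined differently (for example with a validation library), and the range check in `__post_init__` is an assumption based on the documented 1–16 limit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerateCheesechaserScriptInput:
    """Hypothetical stand-in for the tool's input model."""

    repo_id: str                           # e.g. 'deepghs/danbooru2024'
    output_dir: str                        # local directory for downloaded images
    post_ids: Optional[list[int]] = None   # specific post IDs to download
    max_workers: int = 4                   # parallel download threads

    def __post_init__(self) -> None:
        # Assumed check: enforce the documented 1-16 range.
        if not 1 <= self.max_workers <= 16:
            raise ValueError("max_workers must be between 1 and 16")


params = GenerateCheesechaserScriptInput(
    repo_id="deepghs/danbooru2024",
    output_dir="./images",
    post_ids=[1, 2, 3],
)
```

Leaving max_workers unset falls back to the documented default of 4, and a value outside 1–16 raises ValueError in this sketch.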

Returns: str: Complete cheesechaser Python script with inline comments, plus guidance on how to find post IDs from Danbooru/Gelbooru search results.
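To give a sense of what a returned script might look like, the sketch below assembles a small cheesechaser script as a string. It is not the tool's actual output. The helper name `build_download_script` is hypothetical, and the generated code uses `DanbooruNewestDataPool` with `batch_download_to_directory`, following the usage example in the cheesechaser README; which pool class corresponds to a given repo_id is an assumption you would need to check against the cheesechaser documentation.

```python
def build_download_script(output_dir, post_ids, max_workers=4):
    """Return the text of a small cheesechaser download script (illustrative only)."""
    lines = [
        "from cheesechaser.datapool import DanbooruNewestDataPool",
        "",
        "# Assumption: this pool class matches the dataset you want;",
        "# pick the pool that corresponds to your repo_id.",
        "pool = DanbooruNewestDataPool()",
        "",
        "# Fetch only the listed post IDs instead of whole tar archives.",
        "pool.batch_download_to_directory(",
        f"    resource_ids={list(post_ids)!r},",
        f"    dst_dir={output_dir!r},",
        f"    max_workers={int(max_workers)},",
        ")",
    ]
    return "\n".join(lines) + "\n"


script = build_download_script("./images", [1, 2, 3])
compile(script, "<generated>", "exec")  # syntax check only; nothing is downloaded
print(script)
```

Running the generated script itself requires cheesechaser to be installed and network access to HuggingFace.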

Input Schema

Name | Required | Description | Default
params | Yes | |


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/citronlegacy/deepghs-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.