Batch Search and Download

encode_batch_download
Idempotent

Search for ENCODE genomic data files matching specific criteria and download them in batch. Preview downloads first with dry-run mode, then execute to retrieve files like FASTQ, BAM, or BED formats based on organism, assay, or tissue parameters.

Instructions

Search for files and download them all in batch.

The tool first searches for files matching the criteria, then downloads them. By default it runs with dry_run=True, which only previews what would be downloaded. Set dry_run=False to actually download.

WHEN TO USE: Use for searching and downloading files in one step. Always use dry_run=True first to preview. For specific file accessions, use encode_download_files.

RELATED TOOLS: encode_download_files, encode_search_files
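The preview-first guidance above can be sketched as a small two-phase helper. This is a minimal sketch, not the server's implementation: `call_tool` and `approve` are hypothetical callables supplied by the caller (for example, a thin wrapper around whatever MCP client is in use), and nothing is assumed about the shape of the JSON the tool returns.

```python
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed JSON result.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def preview_then_download(
    call_tool: ToolCaller,
    approve: Callable[[Any], bool],
    **criteria: Any,
) -> Dict[str, Any]:
    """Run encode_batch_download as a dry run, then for real only if approved."""
    # Phase 1: force dry_run=True, overriding any dry_run value in criteria.
    preview = call_tool("encode_batch_download", {**criteria, "dry_run": True})

    # The caller inspects the preview and decides whether to proceed.
    if not approve(preview):
        return {"preview": preview, "result": None}

    # Phase 2: repeat the same criteria with dry_run=False to download.
    result = call_tool("encode_batch_download", {**criteria, "dry_run": False})
    return {"preview": preview, "result": result}
```

Passing the approval step as a callback keeps the helper independent of the preview format: it can be a human confirmation prompt or a programmatic check.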

Examples:

  • Download all BED files from human pancreas ChIP-seq: file_format="bed", assay_title="Histone ChIP-seq", organ="pancreas", download_dir="/data/encode", dry_run=False

  • Preview FASTQ downloads for mouse brain RNA-seq: file_format="fastq", assay_title="RNA-seq", organ="brain", organism="Mus musculus", download_dir="/data/encode"

  • Download IDR peaks for H3K27me3 in GRCh38: output_type="IDR thresholded peaks", target="H3K27me3", assembly="GRCh38", download_dir="/data/encode", dry_run=False
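As an illustration, the three examples above could be sent as MCP `tools/call` requests over JSON-RPC 2.0. The argument values are copied from the examples; the helper name `tools_call_request` and the request ids are assumptions made for this sketch, and how the request is transported depends on the client.

```python
import json
from typing import Any, Dict

TOOL_NAME = "encode_batch_download"


def tools_call_request(arguments: Dict[str, Any], request_id: int = 1) -> Dict[str, Any]:
    """Wrap tool arguments in a JSON-RPC 2.0 `tools/call` request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": TOOL_NAME, "arguments": arguments},
    }


# Example 1: download all BED files from human pancreas ChIP-seq.
PANCREAS_BED = {
    "file_format": "bed",
    "assay_title": "Histone ChIP-seq",
    "organ": "pancreas",
    "download_dir": "/data/encode",
    "dry_run": False,
}

# Example 2: preview only. dry_run is omitted, so the documented default applies.
MOUSE_BRAIN_FASTQ = {
    "file_format": "fastq",
    "assay_title": "RNA-seq",
    "organ": "brain",
    "organism": "Mus musculus",
    "download_dir": "/data/encode",
}

# Example 3: download IDR thresholded peaks for H3K27me3 on GRCh38.
H3K27ME3_IDR = {
    "output_type": "IDR thresholded peaks",
    "target": "H3K27me3",
    "assembly": "GRCh38",
    "download_dir": "/data/encode",
    "dry_run": False,
}

if __name__ == "__main__":
    examples = [PANCREAS_BED, MOUSE_BRAIN_FASTQ, H3K27ME3_IDR]
    for request_id, arguments in enumerate(examples, start=1):
        print(json.dumps(tools_call_request(arguments, request_id), indent=2))
```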

Args:

  • download_dir: Local directory to save files

  • file_format: File format filter ("fastq", "bam", "bed", "bigWig", etc.)

  • output_type: Output type filter ("reads", "peaks", "signal", etc.)

  • output_category: Output category ("raw data", "alignment", "annotation", etc.)

  • assembly: Genome assembly ("GRCh38", "mm10", etc.)

  • assay_title: Assay type ("Histone ChIP-seq", "ATAC-seq", "RNA-seq", etc.)

  • organism: Organism (default: "Homo sapiens")

  • organ: Organ/tissue ("pancreas", "brain", "liver", etc.)

  • biosample_type: Biosample type ("tissue", "cell line", "primary cell", etc.)

  • target: ChIP/CUT&RUN target ("H3K27me3", "CTCF", etc.)

  • preferred_default: If True, only download default/recommended files

  • organize_by: File organization ("flat", "experiment", "format", "experiment_format")

  • verify_md5: Verify downloads with MD5 checksums (default True)

  • limit: Max files to download (default 100, safety limit)

  • dry_run: If True (default), only preview what would be downloaded. Set False to download.

Returns: JSON with download preview (dry_run=True) or download results (dry_run=False).

Input Schema

Name | Required | Description | Default
download_dir | Yes | |
file_format | No | |
output_type | No | |
output_category | No | |
assembly | No | |
assay_title | No | |
organism | No | | Homo sapiens
organ | No | |
biosample_type | No | |
target | No | |
preferred_default | No | |
organize_by | No | | experiment
verify_md5 | No | |
limit | No | |
dry_run | No | |
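Because the schema carries no per-parameter descriptions, a client may want a pre-flight check before calling the tool. The following is a sketch of one way to do that, assuming the defaults stated in the Args text and the schema table; the function name `normalize_args` is hypothetical, the positive-integer rule for `limit` is an assumption, and the server may validate or apply defaults differently.

```python
from typing import Any, Dict

# Defaults as documented: organism and organize_by from the schema table,
# verify_md5, limit, and dry_run from the Args text.
DEFAULTS: Dict[str, Any] = {
    "organism": "Homo sapiens",
    "organize_by": "experiment",
    "verify_md5": True,
    "limit": 100,
    "dry_run": True,
}

# All 15 documented parameters; only download_dir is required.
ALLOWED = {
    "download_dir", "file_format", "output_type", "output_category",
    "assembly", "assay_title", "organism", "organ", "biosample_type",
    "target", "preferred_default", "organize_by", "verify_md5",
    "limit", "dry_run",
}

ORGANIZE_BY = {"flat", "experiment", "format", "experiment_format"}


def normalize_args(args: Dict[str, Any]) -> Dict[str, Any]:
    """Check arguments against the documented parameters and fill in defaults."""
    unknown = set(args) - ALLOWED
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    if not args.get("download_dir"):
        raise ValueError("download_dir is required")

    merged = {**DEFAULTS, **args}
    if merged["organize_by"] not in ORGANIZE_BY:
        raise ValueError(f"organize_by must be one of {sorted(ORGANIZE_BY)}")

    # Assumption: limit must be a positive integer (bool is rejected explicitly
    # because Python treats it as an int subclass).
    limit = merged["limit"]
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return merged
```

For example, `normalize_args({"download_dir": "/data/encode", "file_format": "bed"})` returns the two supplied values plus the five documented defaults, so the resulting call is a dry run capped at 100 files.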

Output Schema

Name | Required | Description | Default
result | Yes | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond annotations: it explains the two-step process (search then download), the default dry-run mode for previewing, and the safety limit parameter. While annotations cover idempotency and non-destructiveness, the description provides practical usage details that enhance transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (purpose, usage guidelines, examples, args, returns) and front-loaded key information. While comprehensive, it remains efficient with no wasted sentences; each section serves a distinct purpose in guiding the agent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (15 parameters, batch operations) and the presence of an output schema (returns JSON), the description is complete. It covers purpose, usage, examples, parameter semantics, and behavioral details, providing all necessary context for the agent to use the tool effectively without redundancy.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description compensates well by listing all 15 parameters with brief explanations in the 'Args' section. It adds meaning beyond the schema by explaining defaults (e.g., organism default, verify_md5 default True) and providing context for parameters like dry_run and limit. However, some parameters lack detailed semantic context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'Search for files and download them all in batch,' specifying both the search and download actions. It distinguishes from siblings by mentioning encode_download_files for specific file accessions and encode_search_files as a related tool, showing clear differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes an explicit 'WHEN TO USE' section that provides clear guidance: use for searching and downloading in one step, always use dry_run=True first to preview, and use encode_download_files for specific file accessions. It also lists related tools, offering clear alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ammawla/encode-toolkit'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.