encode-toolkit

Batch Search and Download

encode_batch_download

Idempotent

Search for ENCODE files by criteria like format, assay, or organism, then preview or download all matching files in batch.

Instructions

Search for files and download them all in batch.

First searches for files matching the criteria, then downloads them. By default runs in dry_run mode to preview what would be downloaded. Set dry_run=False to actually download.

WHEN TO USE: Use for searching and downloading files in one step. Always use dry_run=True first to preview. For specific file accessions, use encode_download_files.

RELATED TOOLS: encode_download_files, encode_search_files
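The preview-first guidance can be sketched as a small wrapper. This is a minimal illustration, not the toolkit's own code: `call_tool` is a hypothetical callable standing in for whatever MCP client invokes `encode_batch_download`, and `confirm` is a hypothetical hook that inspects the preview and decides whether to proceed.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical client signature: (tool_name, arguments) -> result.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def preview_then_download(
    call_tool: ToolCaller,
    confirm: Callable[[Any], bool],
    **criteria: Any,
) -> Optional[Any]:
    """Run a dry-run preview, then download only if `confirm` approves."""
    # Step 1: preview. dry_run=True is the documented default; it is set
    # explicitly here so the intent is visible in the request.
    preview = call_tool("encode_batch_download", {**criteria, "dry_run": True})
    if not confirm(preview):
        return None  # Nothing is downloaded when the preview is rejected.
    # Step 2: repeat the call with identical criteria and dry_run=False.
    return call_tool("encode_batch_download", {**criteria, "dry_run": False})
```

Passing the same criteria to both calls keeps the downloaded set aligned with what was previewed, assuming the search results do not change between the two calls.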

Examples:

Download all BED files from human pancreas ChIP-seq: file_format="bed", assay_title="Histone ChIP-seq", organ="pancreas", download_dir="/data/encode", dry_run=False
Preview FASTQ downloads for mouse brain RNA-seq: file_format="fastq", assay_title="total RNA-seq", organ="brain", organism="Mus musculus", download_dir="/data/encode"
Download IDR peaks for H3K27me3 in GRCh38: output_type="IDR thresholded peaks", target="H3K27me3", assembly="GRCh38", download_dir="/data/encode", dry_run=False
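When the tool is invoked over MCP, each example above becomes the `arguments` object of a `tools/call` request. The sketch below builds that JSON-RPC payload for the first example. The helper name `build_tool_call` and the request `id` are illustrative assumptions, and the transport (stdio, HTTP) is deliberately left out.

```python
import json
from typing import Any, Dict


def build_tool_call(arguments: Dict[str, Any], request_id: int = 1) -> Dict[str, Any]:
    """Wrap tool arguments in an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # illustrative; any unique request id works
        "method": "tools/call",
        "params": {"name": "encode_batch_download", "arguments": arguments},
    }


# First example from the list: BED files from human pancreas Histone ChIP-seq.
pancreas_bed = build_tool_call(
    {
        "file_format": "bed",
        "assay_title": "Histone ChIP-seq",
        "organ": "pancreas",
        "download_dir": "/data/encode",
        "dry_run": False,  # actually download; omit to get the default preview
    }
)

print(json.dumps(pancreas_bed, indent=2))
```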

Args:
download_dir: Local directory to save files
file_format: File format filter ("fastq", "bam", "bed", "bigWig", etc.)
output_type: Output type filter ("reads", "peaks", "signal", etc.)
output_category: Output category ("raw data", "alignment", "annotation", etc.)
assembly: Genome assembly ("GRCh38", "mm10", etc.)
assay_title: Assay type ("Histone ChIP-seq", "ATAC-seq", "total RNA-seq", etc.)
organism: Organism (default: "Homo sapiens")
organ: Organ/tissue ("pancreas", "brain", "liver", etc.)
biosample_type: Biosample type ("tissue", "cell line", "primary cell", etc.)
target: ChIP/CUT&RUN target ("H3K27me3", "CTCF", etc.)
preferred_default: If True, only download default/recommended files
organize_by: File organization ("flat", "experiment", "format", "experiment_format")
verify_md5: Verify downloads with MD5 checksums (default True)
limit: Max files to download (default 100, safety limit)
dry_run: If True (default), only preview what would be downloaded. Set False to download.

Returns: JSON with download preview (dry_run=True) or download results (dry_run=False).

Input Schema

Name	Required	Default
`download_dir`	Yes
`file_format`	No
`output_type`	No
`output_category`	No
`assembly`	No
`assay_title`	No
`organism`	No	Homo sapiens
`organ`	No
`biosample_type`	No
`target`	No
`preferred_default`	No
`organize_by`	No	experiment
`verify_md5`	No
`limit`	No
`dry_run`	No
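The defaults and allowed values stated in the description and the table above can be collected into a small client-side check. This is a sketch under stated assumptions: `normalize_arguments` is a hypothetical helper, and whether the server applies or enforces these values in the same way is not confirmed by this page.

```python
from typing import Any, Dict

# Defaults as documented in the description and the input schema table.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "organism": "Homo sapiens",
    "organize_by": "experiment",
    "verify_md5": True,
    "limit": 100,
    "dry_run": True,
}

# Values listed for organize_by in the Args description.
ORGANIZE_BY_CHOICES = {"flat", "experiment", "format", "experiment_format"}


def normalize_arguments(arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in documented defaults and reject obviously invalid input."""
    if not arguments.get("download_dir"):
        # download_dir is the only parameter marked as required.
        raise ValueError("download_dir is required")
    merged = {**DOCUMENTED_DEFAULTS, **arguments}
    if merged["organize_by"] not in ORGANIZE_BY_CHOICES:
        raise ValueError(f"unsupported organize_by: {merged['organize_by']!r}")
    return merged
```

Calling `normalize_arguments({"download_dir": "/data/encode", "file_format": "bed"})` yields a request that still previews only, because `dry_run` falls back to its documented default of True.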

Output Schema

Name	Required	Description	Default
`result`	Yes

Tool Definition Quality

A 4.9/5.0

Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses the two-step process (search then download) and the default dry_run mode. The annotations (readOnlyHint=false, destructiveHint=false, idempotentHint=true, openWorldHint=true) are consistent with the description's behavior: downloading is not read-only but is idempotent and not destructive. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but slightly verbose. It includes a parameter list with repeated explanations; some could be shortened. However, it is well-structured with clear sections (purpose, behavior, when-to-use, examples, args) and front-loads key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (15 parameters, 1 required) and the presence of an output schema, the description covers all necessary aspects: purpose, behavior, when-to-use, examples, and parameter descriptions. The return value is described as 'JSON with download preview or download results.' No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description fully compensates by explaining each parameter's meaning, default values, and example values (e.g., file_format='bed', assay_title='Histone ChIP-seq'). This adds significant value beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Search for files and download them all in batch.' It uses specific verbs ('search and download') and resource ('files'). It distinguishes itself from siblings encode_download_files (for specific files) and encode_search_files (only search).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'Use for searching and downloading files in one step.' Provides guidance to always use dry_run=True first. Names alternative tool for specific file accessions: 'use encode_download_files.' Lists related tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ammawla/encode-toolkit'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.