entrez_fetch
Fetch biological records from NCBI databases by unique identifier (UID) and retrieve them in formats such as XML, FASTA, or abstracts.
Instructions
Fetch full records from NCBI Entrez by UID.
Args:
- database: Database name (e.g., 'pubmed', 'nucleotide', 'gene', 'protein')
- ids: Single ID, comma-separated string, or list of IDs
- rettype: Return type - 'xml', 'gb', 'fasta', 'abstract', etc. (default: 'xml')
- retmode: Return mode - 'xml', 'text', 'json' (default: 'xml')
- use_cache: Whether to use cached results (default: True, TTL: 7 days)
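The three accepted shapes of `ids` can be reduced to one list before a call. The sketch below is only an illustration of that idea: `normalize_ids` is a hypothetical helper, not part of the tool, and the tool's own handling of whitespace or empty fragments may differ.

```python
from typing import Iterable, List, Union


def normalize_ids(ids: Union[str, int, Iterable[Union[str, int]]]) -> List[str]:
    """Hypothetical helper: coerce any accepted `ids` form into a list of strings."""
    if isinstance(ids, (str, int)):
        # A single ID, or a comma-separated string such as "672,675".
        parts = str(ids).split(",")
    else:
        # Any other iterable (list, tuple, ...) of IDs.
        parts = [str(item) for item in ids]
    # Strip surrounding whitespace and drop empty fragments (e.g. a trailing comma).
    return [part.strip() for part in parts if part.strip()]
```

For example, `normalize_ids("672, 675,")` and `normalize_ids(["672", "675"])` yield the same two-element list.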
Returns: Dictionary containing:
- data: Raw data in requested format (parsed if XML, raw text otherwise)
- ids: List of IDs fetched
- count: Number of records retrieved
- format: Return type/mode used
- database: Database queried
- cached: Whether result was from cache (if use_cache=True)
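As a rough sketch of how a caller might read the returned dictionary, the function below branches on whether `data` is a parsed structure or a raw string. The `sample` dictionary is hand-written for illustration, not real tool output; in particular, the exact shape of the `format` value is an assumption.

```python
def summarize_result(result: dict) -> str:
    """Hypothetical helper: describe a result using only the documented keys."""
    data = result["data"]
    # XML mode is documented as a parsed dict/list; text mode as a raw string.
    kind = "parsed XML" if isinstance(data, (dict, list)) else "raw text"
    # `cached` is only documented as present when use_cache=True, so use .get().
    source = "cache" if result.get("cached") else "NCBI"
    return f"{result['count']} record(s) from {result['database']} as {kind} via {source}"


# Hand-written placeholder result; every value here is illustrative.
sample = {
    "data": ">placeholder FASTA header\nPLACEHOLDER",
    "ids": ["NP_000198.1"],
    "count": 1,
    "format": "fasta/text",  # assumed representation of rettype/retmode
    "database": "protein",
    "cached": False,
}

print(summarize_result(sample))
```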
Examples:

    >>> entrez_fetch("pubmed", "12345678", rettype="abstract", retmode="xml")
    >>> entrez_fetch("nucleotide", ["NM_000207", "NM_001127"], rettype="fasta", retmode="text")
    >>> entrez_fetch("gene", "672", rettype="xml")
    >>> entrez_fetch("protein", "NP_000198.1", rettype="fasta", retmode="text")
Notes:
- For >100 IDs, consider batching to avoid timeouts
- Valid rettype/retmode combinations depend on database
- XML mode returns parsed Python dict/list structure
- Text mode returns raw string data
- Rate limited to 3 req/sec (or 10 req/sec with API key)
- Cached results have 7 day TTL since record data is relatively static
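The batching advice above could be sketched as follows. Both `chunked` and `fetch_in_batches` are hypothetical helpers; the `fetch` argument stands in for however `entrez_fetch` is invoked in your environment, and the 0.34-second default pause is an assumption chosen to stay under 3 requests per second without an API key.

```python
import time
from typing import Any, Callable, Iterator, List, Sequence


def chunked(ids: Sequence[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def fetch_in_batches(
    fetch: Callable[..., Any],
    database: str,
    ids: Sequence[str],
    size: int = 100,
    min_interval: float = 0.34,
    **kwargs: Any,
) -> List[Any]:
    """Call `fetch(database, batch, **kwargs)` once per batch, pausing between calls."""
    results = []
    for index, batch in enumerate(chunked(ids, size)):
        if index > 0:
            # Pause between requests to respect the documented rate limit.
            time.sleep(min_interval)
        results.append(fetch(database, batch, **kwargs))
    return results
```

A call such as `fetch_in_batches(entrez_fetch, "nucleotide", many_ids, rettype="fasta", retmode="text")` would then return one result dictionary per batch, which the caller can merge as needed.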
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
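| database | Yes | Database name (e.g., 'pubmed', 'nucleotide', 'gene', 'protein') | |
| ids | Yes | Single ID, comma-separated string, or list of IDs | |
| rettype | No | Return type: 'xml', 'gb', 'fasta', 'abstract', etc. | xml |
| retmode | No | Return mode: 'xml', 'text', 'json' | xml |
| use_cache | No | Whether to use cached results (TTL: 7 days) | True |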