NCBI Datasets MCP Server
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@NCBI Datasets MCP Server search for complete E. coli genomes".
That's it! The server will respond to your query, and you can continue using it as needed.

Unofficial NCBI Datasets MCP Server
A Model Context Protocol (MCP) server that provides comprehensive access to the NCBI Datasets API. This server enables seamless integration with NCBI's vast collection of genomic, taxonomic, and biological data through 31 specialized tools.
Developed by Augmented Nature
Features
31 comprehensive tools covering all major NCBI Datasets functionality
12 organized categories of biological data operations
Resource templates for direct URI-based data access
Full TypeScript implementation with proper error handling
Rate limiting and caching for optimal performance
Environment variable configuration for API keys
Installation
npm install
npm run build
Configuration
Environment Variables
NCBI_API_KEY (optional): Your NCBI API key for higher rate limits and priority access
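As a rough illustration of how an optional key like this might be consumed, the sketch below builds request headers from an environment object and adds the key only when one is set. The function name buildHeaders and the Env type are hypothetical, and the "api-key" header name is an assumption based on NCBI's public examples, not something confirmed by this server's source.

```typescript
// Minimal environment shape; in Node.js, process.env would satisfy it.
type Env = { [name: string]: string | undefined };

// Build request headers, adding the API key only when one is configured.
// Assumption: the key is sent in an "api-key" header.
function buildHeaders(env: Env): { [name: string]: string } {
  const headers: { [name: string]: string } = { accept: "application/json" };
  const key = (env["NCBI_API_KEY"] || "").trim();
  if (key.length > 0) {
    headers["api-key"] = key;
  }
  return headers;
}

// With and without a configured key.
const withKey = buildHeaders({ NCBI_API_KEY: "your_api_key_here" });
const withoutKey = buildHeaders({});
console.log(Object.keys(withKey), Object.keys(withoutKey));
```

Passing the environment in as a parameter, rather than reading it globally, keeps the helper easy to test; a Node.js caller would typically pass process.env.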
MCP Configuration
Add to your MCP settings:
{
"mcpServers": {
"ncbi-datasets-server": {
"command": "node",
"args": ["/path/to/ncbi-datasets-server/build/index.js"],
"env": {
"NCBI_API_KEY": "your_api_key_here"
}
}
}
}
Available Tools
Genome Operations
search_genomes - Search genome assemblies by organism, keywords, or criteria
get_genome_info - Get detailed information for a specific genome assembly
get_genome_summary - Get summary statistics for a genome assembly
Gene Operations
search_genes - Search genes by symbol, name, organism, or location
get_gene_info - Get detailed information for a specific gene
get_gene_sequences - Retrieve sequences for a specific gene
Taxonomy Operations
search_taxonomy - Search taxonomic information by organism name
get_taxonomy_info - Get detailed taxonomic information for a taxon
get_organism_info - Get organism-specific information and datasets
Assembly Operations
search_assemblies - Search genome assemblies with detailed filtering
get_assembly_info - Get detailed metadata and statistics for assemblies
get_assembly_reports - Get assembly quality reports and validation info
download_genome_data - Get download URLs for genome data files
batch_assembly_info - Get information for multiple assemblies
Virus Operations
search_virus_genomes - Search viral genome assemblies
get_virus_info - Get detailed information for viral genomes
Protein Operations
search_proteins - Search protein sequences by name or function
get_protein_info - Get detailed information for specific proteins
Annotation Operations
get_genome_annotation - Get annotation information for assemblies
search_genome_features - Search for specific genomic features
Comparative Genomics
compare_genomes - Compare two or more genome assemblies
find_orthologs - Find orthologous genes across organisms
Sequence Operations
get_sequence_data - Retrieve sequence data for genomes/genes/proteins
blast_search - Perform BLAST search against NCBI databases
Phylogenetic Operations
get_phylogenetic_tree - Get phylogenetic tree data for organisms
get_taxonomic_lineage - Get complete taxonomic lineage
Statistics Operations
get_database_stats - Get statistics about NCBI Datasets content
search_by_bioproject - Search datasets by BioProject accession
search_by_biosample - Search datasets by BioSample accession
Quality Control
get_assembly_quality - Get quality metrics for genome assemblies
validate_sequences - Validate sequence data and check for issues
Usage Examples
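The examples in this section are written as objects with a "tool" name and an "arguments" map. Assuming the client talks to the server over MCP's JSON-RPC 2.0 transport, each one would be wrapped in a "tools/call" request. The sketch below shows that wrapping; the type and function names (ToolExample, ToolCallRequest, toToolCallRequest) are hypothetical and not part of this server.

```typescript
// Shape used by the examples in this README.
interface ToolExample {
  tool: string;
  arguments: { [key: string]: unknown };
}

// JSON-RPC 2.0 request for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { [key: string]: unknown } };
}

// Wrap a README-style example in a tools/call request with the given id.
function toToolCallRequest(example: ToolExample, id: number): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: example.tool, arguments: example.arguments },
  };
}

const genomeSearchRequest = toToolCallRequest(
  {
    tool: "search_genomes",
    arguments: { tax_id: 511145, assembly_level: "complete", max_results: 10 },
  },
  1
);
console.log(JSON.stringify(genomeSearchRequest, null, 2));
```

In practice an MCP client library builds this envelope for you; the sketch only makes the mapping from the examples to the wire format explicit.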
Genome Analysis
// Search for E. coli genomes
{
"tool": "search_genomes",
"arguments": {
"tax_id": 511145,
"assembly_level": "complete",
"max_results": 10
}
}
// Get detailed genome information
{
"tool": "get_genome_info",
"arguments": {
"accession": "GCF_000005845.2",
"include_annotation": true
}
}
// Get genome summary statistics
{
"tool": "get_genome_summary",
"arguments": {
"accession": "GCF_000005845.2"
}
}
Gene Research
// Search for BRCA1 gene
{
"tool": "search_genes",
"arguments": {
"gene_symbol": "BRCA1",
"organism": "Homo sapiens",
"max_results": 5
}
}
// Get detailed gene information
{
"tool": "get_gene_info",
"arguments": {
"gene_id": 672,
"include_sequences": true
}
}
// Get gene sequences
{
"tool": "get_gene_sequences",
"arguments": {
"gene_id": 672,
"sequence_type": "transcript"
}
}
Taxonomic Analysis
// Search taxonomy by organism name
{
"tool": "search_taxonomy",
"arguments": {
"query": "Escherichia coli",
"max_results": 10
}
}
// Get detailed taxonomic information
{
"tool": "get_taxonomy_info",
"arguments": {
"tax_id": 511145,
"include_lineage": true
}
}
// Get organism information
{
"tool": "get_organism_info",
"arguments": {
"organism": "Escherichia coli"
}
}
Assembly Operations
// Search assemblies with filtering
{
"tool": "search_assemblies",
"arguments": {
"query": "human",
"assembly_level": "chromosome",
"assembly_source": "refseq",
"max_results": 20
}
}
// Get assembly information
{
"tool": "get_assembly_info",
"arguments": {
"assembly_accession": "GCF_000001405.40",
"include_annotation": true
}
}
// Batch assembly lookup
{
"tool": "batch_assembly_info",
"arguments": {
"accessions": ["GCF_000001405.40", "GCF_000005825.2", "GCF_000002305.1"]
}
}
Comparative Genomics
// Compare multiple genomes
{
"tool": "compare_genomes",
"arguments": {
"accessions": ["GCF_000005845.2", "GCF_000001405.40"],
"comparison_type": "basic_stats",
"include_orthologs": true
}
}
// Find orthologous genes
{
"tool": "find_orthologs",
"arguments": {
"gene_symbol": "BRCA1",
"source_organism": "Homo sapiens",
"target_organisms": ["Mus musculus", "Rattus norvegicus"],
"similarity_threshold": 80
}
}
Virus Research
// Search viral genomes
{
"tool": "search_virus_genomes",
"arguments": {
"virus_name": "SARS-CoV-2",
"host": "Homo sapiens",
"max_results": 50
}
}
// Get viral genome information
{
"tool": "get_virus_info",
"arguments": {
"accession": "NC_045512.2",
"include_proteins": true,
"include_metadata": true
}
}
Resource Templates
The server provides resource templates for direct data access:
ncbi://genome/{accession} - Complete genome assembly information
ncbi://gene/{gene_id} - Gene information with annotations
ncbi://taxonomy/{tax_id} - Taxonomic classification and lineage
ncbi://assembly/{assembly_accession} - Assembly metadata and statistics
ncbi://search/{data_type}/{query} - Search results for specified queries
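The templates above can be filled in with a small helper. The following is a minimal sketch, not the server's own code: buildResourceUri and buildSearchUri are hypothetical names, and percent-encoding the placeholder values is an assumption about how the server expects special characters to be passed.

```typescript
// Resource kinds that take a single identifier.
type ResourceKind = "genome" | "gene" | "taxonomy" | "assembly";

// Build an ncbi:// URI for a single-identifier resource template.
function buildResourceUri(kind: ResourceKind, id: string | number): string {
  return "ncbi://" + kind + "/" + encodeURIComponent(String(id));
}

// Build a search URI of the form ncbi://search/{data_type}/{query}.
function buildSearchUri(dataType: string, query: string): string {
  return (
    "ncbi://search/" +
    encodeURIComponent(dataType) +
    "/" +
    encodeURIComponent(query)
  );
}

console.log(buildResourceUri("genome", "GCF_000005845.2"));
console.log(buildResourceUri("gene", 672));
console.log(buildSearchUri("genome", "Escherichia coli"));
```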
API Rate Limits
Without API key: 3 requests per second
With API key: 10 requests per second with priority access
To obtain an API key, visit: https://www.ncbi.nlm.nih.gov/account/settings/
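To make these limits concrete, a client can space its requests by a minimum interval derived from the allowed rate. The sketch below is one simple way to do that and is not taken from the server's implementation; RequestSpacer and the helper names are hypothetical. Time is passed in explicitly so the logic can be checked without real delays.

```typescript
// Requests per second allowed by NCBI, as stated above.
function requestsPerSecond(hasApiKey: boolean): number {
  return hasApiKey ? 10 : 3;
}

// Minimum spacing between requests, rounded up so the limit is never exceeded.
function minIntervalMs(hasApiKey: boolean): number {
  return Math.ceil(1000 / requestsPerSecond(hasApiKey));
}

// Tracks when the next request may be sent.
class RequestSpacer {
  private intervalMs: number;
  private nextAllowedAt: number;

  constructor(intervalMs: number) {
    this.intervalMs = intervalMs;
    this.nextAllowedAt = 0;
  }

  // Returns how long (in ms) the caller should wait before sending
  // a request that was issued at time `now`.
  reserve(now: number): number {
    const sendAt = Math.max(now, this.nextAllowedAt);
    this.nextAllowedAt = sendAt + this.intervalMs;
    return sendAt - now;
  }
}

// Without a key: three requests issued at t=0 are spread out over time.
const spacer = new RequestSpacer(minIntervalMs(false));
console.log(spacer.reserve(0), spacer.reserve(0), spacer.reserve(0));
```

A fixed interval is stricter than a true "N per second" window because it forbids short bursts, which makes it a conservative choice when the exact enforcement rule is unknown.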
Error Handling
The server implements comprehensive error handling:
Network errors: Automatic retry with exponential backoff
Rate limiting: Intelligent request queuing and throttling
Invalid parameters: Clear validation error messages
API errors: Detailed error reporting with context
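The retry behaviour described for network errors can be sketched as follows. This is an illustrative version under assumed parameters (base delay, cap, and attempt count are all hypothetical), not the server's actual retry code. The delay doubles after each failed attempt until it reaches a cap.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Simple sleep helper based on setTimeout.
function sleepMs(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Retry an async operation, sleeping between failed attempts.
// Rethrows the last error once maxAttempts is exhausted.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  baseMs: number,
  maxMs: number,
  sleep: (ms: number) => Promise<void>
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // No point in sleeping after the final attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError;
}

// Delays for the first five retries with a 500 ms base and an 8 s cap.
console.log([0, 1, 2, 3, 4].map((n) => backoffDelayMs(n, 500, 8000)));
```

A caller would typically pass sleepMs as the last argument, for example retryWithBackoff(fetchSomething, 4, 500, 8000, sleepMs), where fetchSomething is whatever request function the caller defines. Production code often adds random jitter to the delay, which this sketch leaves out.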
Data Sources
This server accesses data from:
NCBI Datasets API v2: Primary genomic and assembly data
NCBI Taxonomy: Taxonomic classifications and lineages
NCBI Gene: Gene annotations and sequences
NCBI Assembly: Assembly metadata and quality metrics
NCBI BioProject/BioSample: Project and sample information
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
Support
For issues related to:
Server functionality: Open an issue in this repository
NCBI data: Consult NCBI Datasets documentation
API access: Contact NCBI support for API-related questions