uniprot-unipressed-mcp

by pansapiens

uniprot_search

Search the UniProt protein database using query syntax to find proteins by gene, organism, keyword, length, and other fields.

Instructions

Search the UniProt protein database using query syntax.

Args: query: UniProt query string. Examples:
    - gene:BRCA1 - Search by gene name
    - organism_id:9606 - Human proteins (NCBI taxonomy ID)
    - (gene:BRCA*) AND (organism_id:10090) - Mouse BRCA genes with wildcard
    - length:[500 TO 700] - Proteins of specific length range
    - keyword:kinase - By UniProt keyword
    - family:serpin - By protein family
    - ec:3.2.1.23 - By enzyme classification
    - database:pfam - With Pfam cross-references
    - reviewed:true - Only Swiss-Prot reviewed entries

    Available query fields (see https://www.uniprot.org/help/query-fields):
    - accession: Primary/canonical isoform accessions (e.g., accession:P62988)
    - active: Active/obsolete status (e.g., active:false)
    - lit_author: Reference author (e.g., lit_author:ashburner)
    - protein_name: Protein name (e.g., protein_name:CD233)
    - chebi: ChEBI identifier (e.g., chebi:18420)
    - xref_count_pdb: Cross-reference count (e.g., xref_count_pdb:[20 TO *])
    - date_created: Creation date (e.g., date_created:[2012-10-01 TO *])
    - date_modified: Last modification date (e.g., date_modified:[2012-01-01 TO 2019-03-01])
    - date_sequence_modified: Sequence modification date (e.g., date_sequence_modified:[2012-01-01 TO 2012-03-01])
    - database: Database cross-reference (e.g., database:pfam)
    - xref: Cross-reference (e.g., xref:pdb-1aut)
    - ec: Enzyme Commission number (e.g., ec:3.2.1.23)
    - existence: Protein existence level (e.g., existence:3)
    - family: Protein family (e.g., family:serpin)
    - fragment: Fragment status (e.g., fragment:true)
    - gene: Gene name (e.g., gene:HPSE)
    - gene_exact: Exact gene name (e.g., gene_exact:HPSE)
    - go: Gene Ontology term (e.g., go:0015629)
    - virus_host_name: Virus host name
    - virus_host_id: Virus host ID (e.g., virus_host_id:10090)
    - accession_id: Primary accession (e.g., accession_id:P00750)
    - inchikey: InChIKey identifier (e.g., inchikey:WQZGKKKJIJFFOK-GASJEMHNSA-N)
    - interactor: Interacting protein (e.g., interactor:P00520)
    - keyword: Keyword (e.g., keyword:toxin or keyword:KW-0800)
    - length: Sequence length range (e.g., length:[500 TO 700])
    - mass: Molecular mass range (e.g., mass:[500000 TO *])
    - cc_mass_spectrometry: Mass spectrometry method (e.g., cc_mass_spectrometry:maldi)
    - encoded_in: Gene location (e.g., encoded_in:Mitochondrion)
    - organism_name: Organism name (e.g., organism_name:"Ovis aries")
    - organism_id: Organism taxonomy ID (e.g., organism_id:9940)
    - plasmid: Plasmid name (e.g., plasmid:ColE1)
    - proteome: Proteome ID (e.g., proteome:UP000005640)
    - proteomecomponent: Proteome component (e.g., proteomecomponent:"chromosome 1")
    - sec_acc: Secondary accession (e.g., sec_acc:P02023)
    - reviewed: Reviewed status (e.g., reviewed:true)
    - scope: Reference scope (e.g., scope:mutagenesis)
    - is_isoform: Isoform sequence, combined with accession (e.g., accession:P05067-9 AND is_isoform:true)
    - strain: Organism strain (e.g., strain:wistar)
    - taxonomy_name: Taxonomy name (e.g., taxonomy_name:mammal)
    - taxonomy_id: Taxonomy ID (e.g., taxonomy_id:40674)
    - tissue: Tissue type (e.g., tissue:liver)
    - cc_webresource: Web resource (e.g., cc_webresource:wikipedia)
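
The query fields above combine into a single query string. As a minimal sketch, the helpers below assemble such strings in Python. The names `field`, `range_clause`, and `all_of` are hypothetical and are not part of this server or of UniProt; they only illustrate the `field:value`, `[low TO high]`, and `(a) AND (b)` patterns shown in the examples.

```python
def field(name: str, value: str) -> str:
    """Return one `name:value` clause, quoting values that contain spaces."""
    if " " in value and not value.startswith('"'):
        value = f'"{value}"'
    return f"{name}:{value}"


def range_clause(name: str, low: object, high: object = "*") -> str:
    """Return a range clause such as length:[500 TO 700]; '*' means open-ended."""
    return f"{name}:[{low} TO {high}]"


def all_of(*clauses: str) -> str:
    """Join clauses with AND, wrapping each clause in parentheses."""
    return " AND ".join(f"({clause})" for clause in clauses)


# Mouse BRCA genes, mirroring the wildcard example in the Args section.
query = all_of(field("gene", "BRCA*"), field("organism_id", "10090"))
```

Wrapping every clause in parentheses keeps the combined string unambiguous when a clause itself contains spaces or a range.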
    
database: UniProt database to search. One of: uniprotkb (default), uniparc, uniref

limit: Maximum number of results per page (1-100, default 10)

fields: Optional list of return fields to include. If not specified, all fields
    are returned. Available return fields (see https://www.uniprot.org/help/return_fields):
    
    Names & Taxonomy:
    - accession, id, gene_names, gene_primary, gene_synonym, gene_oln, gene_orf
    - organism_name, organism_id, protein_name, xref_proteomes
    - lineage, lineage_ids, virus_hosts
    
    Sequences:
    - cc_alternative_products, ft_var_seq, cc_sc_epred, fragment, encoded_in
    - length, mass, cc_mass_spectrometry, ft_variant, ft_non_cons, ft_non_std
    - ft_non_ter, cc_polymorphism, cc_rna_editing, sequence, cc_sequence_caution
    - ft_conflict, ft_unsure, sequence_version
    
    Function:
    - absorption, ft_act_site, cc_activity_regulation, ft_binding, cc_catalytic_activity
    - cc_cofactor, ft_dna_bind, ec, cc_function, kinetics, cc_pathway
    - ph_dependence, redox_potential, rhea, ft_site, temp_dependence
    
    Miscellaneous:
    - annotation_score, cc_caution, comment_count, feature_count, keywordid, keyword
    - cc_miscellaneous, protein_existence, reviewed, tools, uniparc_id
    
    Interaction:
    - cc_interaction, cc_subunit
    
    Expression:
    - cc_developmental_stage, cc_induction, cc_tissue_specificity
    
    Gene Ontology (GO):
    - go_p, go_c, go, go_f, go_id
    
    Pathology & Biotech:
    - cc_allergen, cc_biotechnology, cc_disruption_phenotype, cc_disease
    - ft_mutagen, cc_pharmaceutical, cc_toxic_dose
    
    Subcellular location:
    - ft_intramem, cc_subcellular_location, ft_topo_dom, ft_transmem
    
    PTM / Processing:
    - ft_chain, ft_crosslnk, ft_disulfid, ft_carbohyd, ft_init_met, ft_lipid
    - ft_mod_res, ft_peptide, cc_ptm, ft_propep, ft_signal, ft_transit
    
    Structure:
    - structure_3d, ft_strand, ft_helix, ft_turn
    
    Publications:
    - lit_pubmed_id
    
    Date:
    - date_created, date_modified, date_sequence_modified, version
    
    Family & Domains:
    - ft_coiled, ft_compbias, cc_domain, ft_domain, ft_motif, protein_families
    - ft_region, ft_repeat, ft_zn_fing
    
    Cross-references:
    - See https://www.uniprot.org/help/return_fields for cross-reference fields
    
cursor: Pagination cursor from a previous search result's 'nextCursor' field.
    Pass this to retrieve the next page of results.

response_format: Response format. One of: 'json' (default) or 'toon'.
    - 'json': Returns response in JSON format
    - 'toon': Returns response in TOON format
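
The parameters described above can be checked on the client side before the tool is called. The following is a sketch only: `build_search_args` is a hypothetical helper, and the checks simply restate the constraints listed in this description (allowed databases, the 1-100 limit range, the two response formats). Whether the server enforces them in the same way is an assumption.

```python
from typing import Any, Dict, List, Optional

ALLOWED_DATABASES = {"uniprotkb", "uniparc", "uniref"}
ALLOWED_FORMATS = {"json", "toon"}


def build_search_args(
    query: str,
    database: str = "uniprotkb",
    limit: int = 10,
    fields: Optional[List[str]] = None,
    cursor: Optional[str] = None,
    response_format: str = "json",
) -> Dict[str, Any]:
    """Build an arguments dict for uniprot_search (hypothetical helper)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if database not in ALLOWED_DATABASES:
        raise ValueError(f"database must be one of {sorted(ALLOWED_DATABASES)}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"response_format must be one of {sorted(ALLOWED_FORMATS)}")

    args: Dict[str, Any] = {
        "query": query,
        "database": database,
        "limit": limit,
        "response_format": response_format,
    }
    # Optional parameters are left out entirely when not supplied.
    if fields:
        args["fields"] = list(fields)
    if cursor:
        args["cursor"] = cursor
    return args


args = build_search_args("gene:BRCA1", fields=["accession", "gene_primary"], limit=5)
```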

Returns: When response_format='json': JSON object with:
- results: Array of matching protein entries
- total: Total number of matching entries (if available)
- nextCursor: Cursor string for retrieving the next page (if more results exist)

When response_format='toon': TOON-formatted string with:
- results: Array of matching protein entries
- total: Total number of matching entries (if available)
- nextCursor: Cursor string for retrieving the next page (if more results exist)
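
Cursor pagination with the fields above can be sketched as a loop that feeds each page's `nextCursor` back in as `cursor`. The `call_tool` parameter is a hypothetical stand-in for however your MCP client invokes `uniprot_search`; it is assumed here to take the arguments dict and return the JSON payload as text, already unwrapped from any outer `result` envelope.

```python
import json
from typing import Any, Callable, Dict, Iterator, Optional


def iter_results(
    call_tool: Callable[[Dict[str, Any]], str],
    query: str,
    limit: int = 10,
    max_pages: int = 5,
) -> Iterator[Any]:
    """Yield entries page by page until nextCursor is absent or max_pages is hit."""
    cursor: Optional[str] = None
    for _ in range(max_pages):
        args: Dict[str, Any] = {
            "query": query,
            "limit": limit,
            "response_format": "json",
        }
        if cursor is not None:
            args["cursor"] = cursor

        page = json.loads(call_tool(args))
        yield from page.get("results", [])

        # A missing or empty nextCursor means there are no more pages.
        cursor = page.get("nextCursor")
        if not cursor:
            break
```

The `max_pages` cap is a design choice for this sketch: a broad query can match a very large number of entries, so bounding the loop avoids fetching far more pages than intended.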

See https://www.uniprot.org/help/query-fields for full query syntax documentation.

Input Schema

Name             Required  Description  Default
query            Yes       -            -
database         No        -            uniprotkb
limit            No        -            -
fields           No        -            -
cursor           No        -            -
response_format  No        -            json

Output Schema

Name    Required  Description  Default
result  Yes       -            -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It explains that the tool queries UniProt databases, supports pagination via cursor, and describes the response formats. It does not mention destructive behavior or rate limits, which is acceptable for a read-only search tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but well-structured, with sections for query syntax, parameters, and return values. It is front-loaded with the basic purpose. Although the description is not concise, its structure makes it easy to navigate, and all of the information is relevant given the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple databases, extensive query fields, pagination, response format), the description is complete. It covers all parameters, explains output schema with results, total, and nextCursor, and provides links to external documentation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage, but the description compensates fully by detailing each parameter: query with numerous examples, database with allowed values, limit with range, fields with a categorized list, cursor for pagination, and response_format with enum values and descriptions. This adds significant meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it searches the UniProt protein database using query syntax, with a specific verb and resource. The sibling tool name 'uniprot_fetch' implies a fetch operation, so the search purpose is distinct and clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While the description provides extensive query examples and lists available fields, it does not explicitly state when to use this tool versus the sibling 'uniprot_fetch' tool. There is no guidance on when not to use it, leaving the agent to infer usage context from examples.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
