Glama

Search Compounds

pubchem_search_compounds
Read-only, Idempotent

Search PubChem for compounds by identifier (name, SMILES, InChIKey), formula, substructure, superstructure, or Tanimoto similarity. Optionally retrieve properties to avoid a follow-up call.

Instructions

Search PubChem for chemical compounds by identifier (name, SMILES, or InChIKey, batched up to 25), molecular formula in Hill notation, substructure or superstructure containment, or 2D Tanimoto similarity. Optionally hydrate results with properties to avoid a follow-up pubchem_get_compound_details call.
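
Because identifier lookups are batched at up to 25 per call, a longer list has to be split before calling the tool. The following is a minimal sketch of that splitting, not part of the server itself: `batch_identifiers` and `identifier_calls` are hypothetical helper names, and only the argument names `searchType`, `identifierType`, and `identifiers` come from the schema on this page.

```python
from typing import Iterator

MAX_BATCH = 25  # per-call identifier limit stated in the tool description


def batch_identifiers(identifiers: list[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` identifiers."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(identifiers), size):
        yield identifiers[start:start + size]


def identifier_calls(identifier_type: str, identifiers: list[str]) -> list[dict]:
    """Build one argument object per batch (hypothetical helper, not a server API)."""
    return [
        {"searchType": "identifier", "identifierType": identifier_type, "identifiers": batch}
        for batch in batch_identifiers(identifiers)
    ]


if __name__ == "__main__":
    names = [f"compound-{i}" for i in range(60)]  # placeholder names, not real compounds
    calls = identifier_calls("name", names)
    print([len(c["identifiers"]) for c in calls])  # [25, 25, 10]
```

Each resulting dictionary would be sent as the arguments of one pubchem_search_compounds call; how the calls are dispatched depends on the MCP client in use.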

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| searchType | Yes | Search strategy. "identifier": name/SMILES/InChIKey lookup. "formula": molecular formula. "substructure": find compounds containing the query as a substructure. "superstructure": find compounds that are themselves substructures of the query. "similarity": 2D Tanimoto similarity to the query. | |
| identifierType | No | Required for identifier search. Type of chemical identifier: "name", "smiles", or "inchikey". | |
| identifiers | No | Required for identifier search. Array of identifiers to resolve (1-25). Examples: ["aspirin", "ibuprofen"] for name, ["CC(=O)OC1=CC=CC=C1C(=O)O"] for SMILES, ["BSYNRYMUTXBXSQ-UHFFFAOYSA-N"] for inchikey (27-char block format). | |
| formula | No | Required for formula search. Molecular formula in Hill notation (e.g. "C6H12O6", "CaH2O2"). | |
| allowOtherElements | No | Formula search only. When true, includes compounds with additional elements beyond the formula. | |
| query | No | Required for substructure/superstructure/similarity searches. A SMILES string (e.g. "CC(=O)O") or PubChem CID as a string (e.g. "2244"). | |
| queryType | No | Required for structure/similarity searches. Format of the query: "smiles" or "cid". | |
| threshold | No | Similarity search only. Minimum Tanimoto similarity (70-100). 90+ for close analogs, 70-80 for scaffold hops. Default: 90. | |
| maxResults | No | Maximum CIDs to return (1-200). Default: 20. | |
| properties | No | Optional: fetch these properties for each result, avoiding a follow-up details call. E.g. ["MolecularFormula", "MolecularWeight", "CanonicalSMILES"]. | |
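
The conditional requirements in the input schema (which fields each searchType needs, plus the numeric ranges) can be checked on the client before a call is made. This is a sketch under stated assumptions: `build_search_args` is a hypothetical helper, and rejecting `threshold` outside similarity search is a client-side choice made here; the page does not say how the server treats that case.

```python
import json

STRUCTURE_SEARCHES = {"substructure", "superstructure", "similarity"}
SEARCH_TYPES = {"identifier", "formula"} | STRUCTURE_SEARCHES


def build_search_args(searchType: str, **fields) -> dict:
    """Check fields against the documented constraints and return the argument object."""
    if searchType not in SEARCH_TYPES:
        raise ValueError(f"unknown searchType: {searchType!r}")

    if searchType == "identifier":
        if fields.get("identifierType") not in {"name", "smiles", "inchikey"}:
            raise ValueError("identifierType must be 'name', 'smiles', or 'inchikey'")
        if not 1 <= len(fields.get("identifiers") or []) <= 25:
            raise ValueError("identifiers must contain 1-25 entries")
    elif searchType == "formula":
        if not fields.get("formula"):
            raise ValueError("formula is required for formula search")
    else:  # substructure, superstructure, similarity
        if not fields.get("query"):
            raise ValueError("query is required for structure/similarity searches")
        if fields.get("queryType") not in {"smiles", "cid"}:
            raise ValueError("queryType must be 'smiles' or 'cid'")

    threshold = fields.get("threshold")
    if threshold is not None:
        # Assumption: treat threshold on a non-similarity search as a caller error.
        if searchType != "similarity":
            raise ValueError("threshold applies to similarity search only")
        if not 70 <= threshold <= 100:
            raise ValueError("threshold must be in 70-100")

    max_results = fields.get("maxResults")
    if max_results is not None and not 1 <= max_results <= 200:
        raise ValueError("maxResults must be in 1-200")

    return {"searchType": searchType, **fields}


if __name__ == "__main__":
    args = build_search_args(
        "similarity",
        query="2244",        # PubChem CID passed as a string, as the schema describes
        queryType="cid",
        threshold=90,
        maxResults=20,
        properties=["MolecularFormula", "MolecularWeight"],
    )
    print(json.dumps(args, indent=2))
```

Keeping the field names in the schema's own camelCase means the returned dictionary can be serialized directly as the tool-call arguments.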

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| results | Yes | Matching compounds. | |
| searchType | Yes | Search strategy used: identifier, formula, substructure, superstructure, or similarity. | |
| totalFound | Yes | Total CIDs found before the maxResults cap. | |
| truncated | No | True when CIDs were capped at maxResults; more matches exist than returned. | |
| shown | No | CIDs returned after the maxResults cap. | |
| cap | No | The maxResults cap that was applied. | |
| notice | No | Recovery guidance when no compounds matched. Echoes the search strategy and suggests how to broaden. Absent when results were returned. | |
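
The optional output fields interact: `notice` appears only when nothing matched, while `truncated`, `shown`, and `cap` describe a capped result set. The sketch below shows one way a caller might read them. `summarize_response` is a hypothetical helper, the sample response is hand-written for illustration, and the shape of each `results` item (a `cid` key here) is an assumption, since this page does not list it.

```python
def summarize_response(response: dict) -> str:
    """Turn the documented output fields into a one-line status message."""
    results = response.get("results", [])
    notice = response.get("notice")

    # `notice` is documented as present only when no compounds matched.
    if not results and notice:
        return f"No matches ({response['searchType']}): {notice}"

    total = response["totalFound"]
    shown = response.get("shown", len(results))

    if response.get("truncated"):
        cap = response.get("cap", shown)
        return f"Showing {shown} of {total} matches (capped at {cap}); raise maxResults to see more."
    return f"Showing all {shown} of {total} matches."


if __name__ == "__main__":
    # Hand-written sample; the item shape {"cid": ...} is assumed, not documented here.
    capped = {
        "results": [{"cid": 2244}, {"cid": 2245}],
        "searchType": "similarity",
        "totalFound": 57,
        "truncated": True,
        "shown": 2,
        "cap": 2,
    }
    print(summarize_response(capped))
```

Since maxResults tops out at 200, a truncated response at that cap calls for a narrower query instead (for example a higher similarity threshold).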

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already state readOnlyHint, idempotentHint, openWorldHint. Description adds batching limit (up to 25 identifiers) and the optional property hydration, providing useful behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences cover all search types and hydration, with no wasted words. The core purpose is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Because an output schema exists, the description need not detail return values. It covers all search modes, batching, and the hydration option, which is complete for a search tool of this complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds practical guidance on threshold values ('90+ for close analogs, 70-80 for scaffold hops') and shows that the identifiers examples cover several identifier types, enhancing understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description specifies verb 'Search PubChem' and resource 'chemical compounds' with explicit search strategies (identifier, formula, substructure, superstructure, similarity). It distinguishes from sibling pubchem_get_compound_details by noting that hydration can avoid a follow-up call.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context: lists the search types and their uses, and notes that hydrating properties can avoid a pubchem_get_compound_details call. An explicit 'when not to use' is missing, but the hydration hint serves as a comparative guideline.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cyanheads/pubchem-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.