Server Details

MCP server for PubChem. Search compounds, properties, safety, bioactivity, xrefs, and summaries.

Status: Healthy
Transport: Streamable HTTP
Repository: cyanheads/pubchem-mcp-server
GitHub Stars: 8
Server Listing: pubchem-mcp-server

Tool Descriptions: A

Average 4.3/5 across 8 of 8 tools scored.

Server Coherence: A

Disambiguation: 5/5

Every tool has a clearly distinct purpose with no ambiguity: get_bioactivity focuses on assay results, get_compound_details on properties, get_image on visual representation, get_safety on hazard data, get_xrefs on external references, get_summary on entity descriptions, search_assays on target-based assay discovery, and search_compounds on compound lookup. The descriptions reinforce non-overlapping scopes.

Naming Consistency: 5/5

All tools follow a consistent 'pubchem_verb_noun' pattern with snake_case throughout: pubchem_get_bioactivity, pubchem_get_compound_details, pubchem_get_compound_image, pubchem_get_compound_safety, pubchem_get_compound_xrefs, pubchem_get_summary, pubchem_search_assays, pubchem_search_compounds. This predictability aids agent navigation.

Tool Count: 5/5

With 8 tools, the server is well-scoped for PubChem data access, covering compound retrieval, search, safety, bioactivity, cross-references, and summaries. Each tool earns its place by addressing a specific aspect of chemical and biological data without bloat or redundancy.

Completeness: 4/5

The toolset provides comprehensive coverage for querying and retrieving PubChem data, including compound details, bioactivity, safety, and searches. Minor gaps remain (no dedicated batch tools beyond the built-in batching, and no advanced filtering in searches), but agents can work around them with the provided capabilities.

Available Tools

8 tools
pubchem_get_bioactivity (Get Bioactivity): A
Read-only · Idempotent

Get a compound's bioactivity profile: which assays tested it, activity outcomes (Active/Inactive/Inconclusive), target information (gene symbols, protein names), and quantitative values (IC50, EC50, Ki, etc.). Filter by outcome to focus on active results.

Parameters (JSON Schema)

  • cid (required): PubChem Compound ID.
  • maxResults: Max assay results to return (1-100). Well-studied compounds have thousands of records. Default: 20.
  • outcomeFilter: Filter by activity outcome. "active" shows only assays where the compound showed activity — most useful for understanding biological profile. Default: "all".

Output Schema

  • cid (required): PubChem Compound ID.
  • results (required): Assay results matching the filter.
  • activeCount (required): Assays with "Active" outcome.
  • totalAssays (required): Total unique assays for this compound.
  • inactiveCount (required): Assays with "Inactive" outcome.
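The parameter constraints above can be sketched as a client-side payload builder. The helper name `bioactivity_args` is hypothetical; only the field names, the 1-100 range, and the defaults come from the schema above (only "all" and "active" filter values are documented here).

```python
# Hypothetical helper that builds arguments for pubchem_get_bioactivity.
# Field names, ranges, and defaults follow the schema above; the function
# itself is illustrative, not part of the server.
def bioactivity_args(cid, max_results=20, outcome_filter="all"):
    if not 1 <= max_results <= 100:
        raise ValueError("maxResults must be 1-100")
    if outcome_filter not in ("all", "active"):
        raise ValueError("only 'all' and 'active' are documented here")
    return {"cid": cid, "maxResults": max_results, "outcomeFilter": outcome_filter}

# Example: focus on active results for aspirin (CID 2244).
args = bioactivity_args(2244, max_results=50, outcome_filter="active")
```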
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, idempotentHint, and openWorldHint. The description adds valuable context about the return structure (what constitutes a 'bioactivity profile': assays, outcomes, targets, quantitative values) but does not elaborate on open-world implications (incomplete data coverage) or rate limiting concerns beyond the schema's pagination hint.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two tightly constructed sentences with zero waste. First sentence front-loads the complete data model (assays, outcomes, targets, quantitative values). Second sentence provides actionable filtering guidance. Every clause earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description appropriately summarizes return content without redundant specification. Adequately covers the tool's scope for a data retrieval operation, though could be strengthened by noting edge cases (e.g., compounds with no bioactivity data).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing baseline 3. Description reinforces the outcomeFilter parameter's purpose ('Filter by outcome to focus on active results'), aligning with schema guidance, but does not add substantial semantic meaning beyond what the well-documented schema already provides for cid or maxResults.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description provides specific verb ('Get') and resource ('compound's bioactivity profile'), and clearly distinguishes from siblings by detailing unique content: assays tested, activity outcomes, target information, and quantitative values (IC50, EC50, Ki), which none of the other compound tools (details, image, safety, xrefs) provide.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides implied usage guidance by highlighting the outcome filter ('Filter by outcome to focus on active results'), but lacks explicit when-to-use/when-not-to-use distinctions versus siblings like pubchem_search_assays (which searches for assays by criteria) or pubchem_get_compound_details (general metadata).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_details (Get Compound Details): A
Read-only · Idempotent

Get detailed compound information by CID. Returns physicochemical properties (molecular weight, SMILES, InChIKey, XLogP, TPSA, etc.), optionally with a textual description (pharmacology, mechanism, therapeutic use), all known synonyms, drug-likeness assessment (Lipinski/Veber rules), and/or pharmacological classification (FDA classes, MeSH classes, ATC codes). Efficiently batches up to 100 CIDs.

Parameters (JSON Schema)

  • cids (required): PubChem Compound IDs to fetch (1-100). Batched efficiently.
  • properties: Properties to retrieve. Defaults to a core set: MolecularFormula, MolecularWeight, IUPACName, CanonicalSMILES, IsomericSMILES, InChIKey, XLogP, TPSA, HBondDonorCount, HBondAcceptorCount, RotatableBondCount, HeavyAtomCount, Charge, Complexity.
  • includeSynonyms: Fetch all known names and synonyms (trade names, systematic names, registry numbers).
  • maxDescriptions: Max number of distinct description entries per compound (1-20). PubChem returns near-duplicate summaries from many depositors; we dedup and cap to keep responses focused. Default: 3.
  • includeDescription: Include textual descriptions (pharmacology, mechanism, therapeutic use) attributed by source. Well-studied compounds have many overlapping summaries — capped via maxDescriptions. Slower when enabled — prefer small CID batches.
  • includeDrugLikeness: Compute drug-likeness assessment: Lipinski Rule of Five (MW, XLogP, HBD, HBA) and Veber rules (TPSA, rotatable bonds). No extra API calls — computed from properties.
  • includeClassification: Include pharmacological classification: FDA Established Pharmacologic Classes, mechanisms of action, MeSH classes, and ATC codes. Slower when enabled — prefer small CID batches.

Output Schema

  • compounds (required): Compound detail records.
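A minimal sketch of how a client might assemble these arguments. The helper `details_args` is hypothetical; the field names, the 100-CID batch cap, and the "slower when enabled" caveat come from the schema above.

```python
# Hypothetical payload builder for pubchem_get_compound_details.
# Mirrors the documented field names and the 1-100 CID batch cap.
def details_args(cids, properties=None, include_synonyms=False,
                 include_description=False, max_descriptions=3):
    if not 1 <= len(cids) <= 100:
        raise ValueError("cids accepts 1-100 IDs per call")
    args = {"cids": list(cids)}
    if properties is not None:
        args["properties"] = properties
    if include_synonyms:
        args["includeSynonyms"] = True
    if include_description:
        # Slower when enabled: the docs suggest small CID batches here.
        args["includeDescription"] = True
        args["maxDescriptions"] = max_descriptions
    return args

# Two example CIDs, hydrating only the two properties we need:
args = details_args([2244, 3672], properties=["MolecularWeight", "XLogP"])
```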
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare read-only/idempotent safety; description adds crucial cost/performance behavior missing from annotations: 'Adds one API call per CID' for description/classification flags, and 'Efficiently batches' for throughput optimization. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two dense sentences with zero waste. Front-loaded with core purpose ('Get detailed compound information by CID'), followed by parenthetical enumeration of return types, and closes with operational constraint ('Efficiently batches up to 100 CIDs'). Every clause earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists and annotations cover safety profile, description provides complete operational context: batch limits, default property sets, API cost implications for expensive flags, and content scope. No gaps remain for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% so baseline is 3. Description adds semantic grouping (physicochemical vs pharmacological vs classification) and concrete examples (XLogP, TPSA, Lipinski rules) that help the agent map user requests to the correct boolean flags and property selections beyond raw schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action ('Get detailed compound information by CID') and distinguishes from siblings: contrasts with search_compounds (retrieval by ID vs search), get_compound_image (data vs media), and get_compound_safety (general properties vs safety). Specific resource and verb are clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear operational context through 'by CID' (implies prerequisite IDs) and 'Efficiently batches up to 100 CIDs' (usage constraint). Lists optional flags with their content domains, implicitly guiding when to enable each. Lacks explicit contrast with pubchem_get_summary but has strong implicit guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_image (Get Compound Image): A
Read-only · Idempotent

Fetch a 2D structure diagram (PNG image) for a compound by CID.

Parameters (JSON Schema)

  • cid (required): PubChem Compound ID.
  • size: Image size: "small" (100x100) or "large" (300x300). Default: "large".

Output Schema

  • cid (required): PubChem Compound ID.
  • width (required): Image width in pixels.
  • height (required): Image height in pixels.
  • mimeType (required): Image MIME type.
  • imageBase64 (required): Base64-encoded PNG image data.
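Since the PNG arrives as base64 text in imageBase64, a client decodes it to bytes before saving. A sketch with stand-in data (not a real response):

```python
import base64

# Stand-in payload mimicking the output schema above; fake_png is just
# the PNG signature plus filler, not an actual structure diagram.
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
payload = {
    "cid": 2244,
    "mimeType": "image/png",
    "imageBase64": base64.b64encode(fake_png).decode("ascii"),
}

# Decode back to bytes; a real client might then write them to disk:
image_bytes = base64.b64decode(payload["imageBase64"])
# open(f"cid_{payload['cid']}.png", "wb").write(image_bytes)
```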
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, idempotentHint=true, and openWorldHint=true. The description adds valuable context beyond these by specifying the output is a PNG format 2D diagram, which helps the agent understand the binary/image nature of the response.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with zero waste. It front-loads the action (Fetch) and precisely defines the deliverable (PNG image) and identifier (CID) without filler words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple two-parameter fetch operation with an output schema present, the description is complete. It adequately explains what the tool returns (PNG image) without needing to detail return values, given the schema covers inputs completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema fully documents both parameters (CID and size with dimensions). The description mentions 'by CID', reinforcing the required parameter, but adds no additional semantic detail beyond the comprehensive schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states a specific verb (Fetch), resource (2D structure diagram), format (PNG), and lookup method (by CID). It clearly distinguishes this image-fetching tool from siblings that retrieve 'details', 'safety', 'bioactivity', or perform searches.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While it doesn't explicitly name alternatives or 'when-not-to-use' clauses, the description provides clear context by specifying '2D structure diagram (PNG image)', which implicitly distinguishes it from data-retrieval siblings like get_compound_details or get_compound_safety.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_safety (Get Compound Safety): A
Read-only · Idempotent

Get GHS (Globally Harmonized System) hazard classification and safety data for a compound. Returns signal word, pictograms, hazard statements (H-codes), and precautionary statements (P-codes). Data sourced from PubChem depositors — source attribution included.

Parameters (JSON Schema)

  • cid (required): PubChem Compound ID.

Output Schema

  • cid (required): PubChem Compound ID.
  • ghs: GHS classification data.
  • source: Data source attribution.
  • hasData (required): Whether GHS safety data is available for this compound.
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations establish read-only, idempotent, and open-world traits. The description adds valuable behavioral context not present in annotations: data provenance ('sourced from PubChem depositors') and the inclusion of source attribution. It also previews the return structure without contradicting the output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three clauses: purpose declaration, return value enumeration, and data source attribution. Every clause earns its place; there is no redundancy or unnecessary verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input schema (one well-documented parameter), presence of annotations, and existence of an output schema, the description is complete. It appropriately focuses on value-add elements (data source, specific safety data types) rather than repeating structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage for the single 'cid' parameter, the schema carries the semantic burden. The description neither repeats nor extends the parameter documentation, which is acceptable given the complete schema coverage, meeting the baseline expectation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Get') and clearly identifies the resource (GHS hazard classification and safety data). It distinguishes itself from sibling tools like get_compound_details and get_bioactivity by explicitly targeting safety-specific data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by enumerating specific return values (signal word, pictograms, H-codes, P-codes), suggesting when to use the tool. However, it lacks explicit guidance on when to prefer this over get_compound_details or other siblings, and states no prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_xrefs (Get Compound Cross-References): A
Read-only · Idempotent

Get external database cross-references for a compound: PubMed citations, patent IDs, gene/protein associations, registry numbers, and taxonomy IDs. Results are capped per type with total counts reported.

Parameters (JSON Schema)

  • cid (required): PubChem Compound ID.
  • xrefTypes (required): Cross-reference types to retrieve. Options: RegistryID, RN (CAS numbers), PubMedID, PatentID, GeneID, ProteinGI, TaxonomyID.
  • maxPerType: Max IDs to return per xref type (1-500). A compound may have thousands of PubMed references — this cap prevents bloat. Total count always reported. Default: 50.

Output Schema

  • cid (required): PubChem Compound ID.
  • xrefs (required): Cross-references grouped by type.
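The enum of xref types and the per-type cap can be sketched as a client-side check. The helper `xrefs_args` is hypothetical; the type names and the 1-500 range come from the schema above.

```python
# Hypothetical builder for pubchem_get_compound_xrefs arguments.
# The type names and maxPerType range mirror the documented schema.
VALID_XREF_TYPES = {"RegistryID", "RN", "PubMedID", "PatentID",
                    "GeneID", "ProteinGI", "TaxonomyID"}

def xrefs_args(cid, xref_types, max_per_type=50):
    unknown = set(xref_types) - VALID_XREF_TYPES
    if unknown:
        raise ValueError(f"unknown xref types: {unknown}")
    if not 1 <= max_per_type <= 500:
        raise ValueError("maxPerType must be 1-500")
    return {"cid": cid, "xrefTypes": list(xref_types), "maxPerType": max_per_type}

# CAS numbers and PubMed citations for a compound, capped at 100 each:
args = xrefs_args(2244, ["RN", "PubMedID"], max_per_type=100)
```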
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare read-only, idempotent, and open-world traits. The description adds valuable behavioral context not in annotations: the capping mechanism ('capped per type with total counts reported') and explicitly maps xref types to human-readable categories (e.g., 'RN' to 'registry numbers').

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with zero waste: first sentence defines purpose and scope with specific examples; second sentence discloses the critical capping limitation. Information is front-loaded and dense.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (which handles return value documentation) and 100% parameter coverage, the description adequately covers the tool's purpose, xref types, and result-limiting behavior without redundancy.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the baseline is 3. The description reinforces the xrefTypes by listing them in prose, translating technical enum values (e.g., 'RN', 'ProteinGI') to domain concepts ('registry numbers', 'gene/protein associations'), but does not add significant semantic depth beyond the comprehensive schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool 'Get[s] external database cross-references for a compound' and enumerates specific xref types (PubMed citations, patent IDs, gene/protein associations, registry numbers, taxonomy IDs), distinguishing it from sibling tools that retrieve bioactivity, images, or general compound details.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While it lacks explicit 'use this instead of X' comparisons, the description provides clear contextual scope (external database cross-references only) and explains the capping behavior ('Results are capped per type'), which implicitly guides usage for large result sets.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_summary (Get Entity Summary): A
Read-only · Idempotent

Get descriptive summaries for PubChem entities by ID. Supports assays (AID), genes (Gene ID), proteins (UniProt accession), and taxonomy (Tax ID). Up to 10 per call.

Parameters (JSON Schema)

  • entityType (required): Entity type. Determines ID format and returned fields.
  • identifiers (required): Entity identifiers (1-10). Type depends on entityType:
      - assay: AID (number), e.g. [1000]
      - gene: Gene ID (number), e.g. [1956]
      - protein: UniProt accession (string), e.g. ["P00533"]
      - taxonomy: Tax ID (number), e.g. [9606]

Output Schema

  • summaries (required): Summary results.
  • entityType (required): Entity type queried.
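The identifier format depends on entityType, as the schema above spells out. A hypothetical client-side check mirroring that mapping (the helper name and the ID_KINDS table are illustrative):

```python
# Numeric IDs for assays, genes, and taxonomy; string accessions for proteins,
# per the schema above.
ID_KINDS = {"assay": int, "gene": int, "protein": str, "taxonomy": int}

def summary_args(entity_type, identifiers):
    kind = ID_KINDS.get(entity_type)
    if kind is None:
        raise ValueError(f"unknown entityType: {entity_type}")
    if not 1 <= len(identifiers) <= 10:
        raise ValueError("1-10 identifiers per call")
    if not all(isinstance(i, kind) for i in identifiers):
        raise ValueError(f"{entity_type} identifiers must be {kind.__name__}")
    return {"entityType": entity_type, "identifiers": identifiers}

summary_args("protein", ["P00533"])   # UniProt accession (string)
summary_args("gene", [1956])          # NCBI Gene ID (number)
```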
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, openWorldHint, and idempotentHint. The description adds valuable behavioral context beyond these: the batch limit constraint (10 per call) and clarifies that returned data consists of 'descriptive summaries.' No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences total with zero waste. First sentence covers purpose and entity scope; second covers batch limit. Information is front-loaded and dense.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 2-parameter read operation with 100% schema coverage and existing output schema, the description is complete. It covers tool purpose, supported entity scope, and batch constraints without needing to detail return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage with detailed type mapping for identifiers. Description reinforces this with examples in prose (AID, Gene ID, etc.) but adds no significant semantic information beyond what the schema already provides. Baseline 3 appropriate for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool 'Get[s] descriptive summaries for PubChem entities by ID' with specific verb and resource. It explicitly distinguishes from compound-focused siblings by listing supported entity types: assays, genes, proteins, and taxonomy.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context by enumerating supported entity types (AID, Gene ID, UniProt, Tax ID) and the 'Up to 10 per call' constraint. Lacks explicit 'when not to use' guidance regarding compounds, though the enum and sibling tool names make this implicitly clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_search_assays (Search Assays): A
Read-only · Idempotent

Find PubChem bioassays associated with a biological target. Search by gene symbol (e.g. "EGFR"), protein name, NCBI Gene ID, or UniProt accession. Returns assay IDs (AIDs) which can be explored further with pubchem_get_summary.

Parameters (JSON Schema)

  • maxResults: Max AIDs to return (1-200). Popular targets may have thousands of assays. Default: 50.
  • targetType (required): Target identifier type. "genesymbol" and "proteinname" accept text names. "geneid" accepts NCBI Gene IDs. "proteinaccession" accepts UniProt accessions.
  • targetQuery (required): Target identifier. Examples: "EGFR" (genesymbol), "Epidermal growth factor receptor" (proteinname), "1956" (geneid), "P00533" (proteinaccession).

Output Schema

  • aids (required): PubChem Assay IDs.
  • targetType (required): Target identifier type used.
  • totalFound (required): Total AIDs found.
  • targetQuery (required): Target identifier searched.
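A minimal sketch of a client building these arguments. The helper `assay_search_args` is hypothetical; the four target types, the 1-200 range, and the default come from the schema above.

```python
# Hypothetical argument builder for pubchem_search_assays.
TARGET_TYPES = {"genesymbol", "proteinname", "geneid", "proteinaccession"}

def assay_search_args(target_type, target_query, max_results=50):
    if target_type not in TARGET_TYPES:
        raise ValueError(f"unknown targetType: {target_type}")
    if not 1 <= max_results <= 200:
        raise ValueError("maxResults must be 1-200")
    return {"targetType": target_type, "targetQuery": target_query,
            "maxResults": max_results}

# EGFR assays by gene symbol; the returned AIDs can then be passed to
# pubchem_get_summary with entityType "assay".
args = assay_search_args("genesymbol", "EGFR")
```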
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnly/idempotent/openWorld traits, so the description appropriately focuses on adding return-value context (AIDs) and search scope constraints. It clarifies what the tool produces without repeating safety annotations, though it could mention handling of unmatched targets.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences: purpose declaration, input specification with examples, and output/next-step guidance. No redundant information; every sentence earns its place. Front-loaded with the core action 'Find PubChem bioassays'.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the 100% schema coverage and existence of output schema, the description provides sufficient context by explaining the biological target search paradigm and the AID return format. It appropriately avoids duplicating detailed parameter documentation already present in the schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema fully documents all three parameters including examples and ranges. The description reinforces the target types (gene symbol, protein name, etc.) but primarily provides conceptual framing rather than new semantic details beyond the structured schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states 'Find PubChem bioassays associated with a biological target' with specific verb and resource. It clearly distinguishes from compound-focused siblings (pubchem_search_compounds) by specifying 'biological target' and from getter tools by describing the search functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear workflow guidance by noting that returned AIDs 'can be explored further with pubchem_get_summary', establishing the tool's position in the chain. However, it lacks explicit 'when not to use' guidance regarding similar tools like pubchem_get_bioactivity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_search_compounds (Search Compounds): A
Read-only · Idempotent

Search PubChem for chemical compounds. Five search modes:

  • identifier: Resolve compound names, SMILES, or InChIKeys to CIDs (batch up to 25)

  • formula: Find compounds by molecular formula (Hill notation, e.g. "C6H12O6")

  • substructure: Find compounds containing a substructure (SMILES or CID)

  • superstructure: Find compounds that are substructures of the query

  • similarity: Find structurally similar compounds by 2D Tanimoto similarity

Optionally hydrate results with properties to avoid a follow-up details call.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| query | No | Required for substructure/superstructure/similarity searches. A SMILES string or PubChem CID (as string) for the query structure. | — |
| formula | No | Required for formula search. Molecular formula in Hill notation (e.g. "C6H12O6", "CaH2O2"). | — |
| queryType | No | Required for structure/similarity searches. Format of the query: "smiles" or "cid". | — |
| threshold | No | Similarity search only. Minimum Tanimoto similarity (70-100). 90+ for close analogs, 70-80 for scaffold hops. | 90 |
| maxResults | No | Maximum CIDs to return (1-200). | 20 |
| properties | No | Optional: fetch these properties for each result, avoiding a follow-up details call. E.g. ["MolecularFormula", "MolecularWeight", "CanonicalSMILES"]. | — |
| searchType | Yes | Search strategy: "identifier" (name/SMILES/InChIKey lookup), "formula", "substructure", "superstructure", or "similarity". | — |
| identifiers | No | Required for identifier search. Array of identifiers to resolve (1-25). Examples: ["aspirin", "ibuprofen"] for name, ["CC(=O)OC1=CC=CC=C1C(=O)O"] for SMILES. | — |
| identifierType | No | Required for identifier search. Type of chemical identifier: "name", "smiles", or "inchikey". | — |
| allowOtherElements | No | Formula search only. When true, includes compounds with additional elements beyond the formula. | — |
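The conditional "Required" rules in the parameter list (query only for structure searches, identifiers only for identifier search, and so on) can be checked client-side before the call. A minimal sketch, assuming only the parameter names, modes, and ranges stated above:

```python
def validate_search_args(args: dict) -> list[str]:
    """Check the conditional requirements implied by the parameter table."""
    errors = []
    st = args.get("searchType")
    if st == "identifier":
        ids = args.get("identifiers")
        if not ids:
            errors.append("identifiers is required for identifier search")
        elif not 1 <= len(ids) <= 25:
            errors.append("identifiers must contain 1-25 entries")
        if args.get("identifierType") not in ("name", "smiles", "inchikey"):
            errors.append("identifierType is required for identifier search")
    elif st == "formula":
        if not args.get("formula"):
            errors.append("formula is required for formula search")
    elif st in ("substructure", "superstructure", "similarity"):
        if not args.get("query"):
            errors.append("query is required for structure/similarity searches")
        if args.get("queryType") not in ("smiles", "cid"):
            errors.append("queryType is required for structure/similarity searches")
        if st == "similarity" and not 70 <= args.get("threshold", 90) <= 100:
            errors.append("threshold must be 70-100")
    else:
        errors.append("unknown searchType")
    return errors
```

Catching these mismatches before the round-trip matters most in agent loops, where a rejected call costs a full tool invocation.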

Output Schema (JSON Schema)

| Name | Required | Description |
| --- | --- | --- |
| results | Yes | Matching compounds. |
| searchType | Yes | The search strategy used. |
| totalFound | Yes | Total CIDs found (before maxResults cap). |
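Because totalFound is counted before the maxResults cap, a caller can detect truncation by comparing it to the returned result count. A sketch with a hypothetical response — the three top-level field names come from the output schema, but the per-result shape (here a cid key) is an assumption:

```python
# Hypothetical response; only results/searchType/totalFound are schema-backed.
response = {
    "searchType": "similarity",
    "totalFound": 312,
    "results": [{"cid": 2244}, {"cid": 2662}],
}

# totalFound counts all matches before the maxResults cap, so a larger
# totalFound than len(results) means the result list was truncated
truncated = response["totalFound"] > len(response["results"])
```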
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and idempotentHint=true. The description adds valuable behavioral context by explaining batch constraints ('up to 25' for identifiers), the Hill notation requirement for formulas, and the hydration behavior. It does not contradict annotations, though it could mention pagination or rate limiting for a 5.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely efficient structure: one sentence framing, bulleted list of five modes with parenthetical examples, and a final sentence on hydration. No redundant words; every phrase conveys specific search behavior or constraints. Excellent use of formatting for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (10 parameters, 5 distinct search modes) and presence of an output schema, the description adequately covers search strategy selection and result optimization. It appropriately omits return value details (covered by output schema) but could briefly mention pagination behavior for completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the baseline is 3. The description elevates this by providing conceptual groupings of the searchType parameter (explaining the five modes conceptually) and noting the constraint 'batch up to 25' for identifiers, adding strategic context beyond the schema's individual field descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Search PubChem for chemical compounds' and distinguishes five specific search modes (identifier, formula, substructure, superstructure, similarity). It clearly differentiates from sibling 'get' tools by focusing on search functionality and discovery rather than retrieval of specific records.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance on when to use each of the five search modes (e.g., 'Resolve compound names...', 'Find compounds by molecular formula'). Critically, it notes the hydration option to 'avoid a follow-up details call,' directly referencing the sibling pubchem_get_compound_details tool and guiding optimization decisions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
