Glama

Server Details

Search PubChem compounds, properties, safety data, bioactivity, and cross-references.

Status: Healthy
Last Tested:
Transport: Streamable HTTP
URL:
Repository: cyanheads/pubchem-mcp-server
GitHub Stars: 9
Server Listing: pubchem-mcp-server

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client -> Glama -> MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.3/5 across 8 of 8 tools scored.

Server Coherence: A
Disambiguation: 5/5

Every tool has a clearly distinct purpose with no ambiguity: get_bioactivity focuses on assay results, get_compound_details on properties, get_image on visual representation, get_safety on hazard data, get_xrefs on external references, get_summary on entity descriptions, search_assays on target-based assay discovery, and search_compounds on compound lookup. The descriptions reinforce non-overlapping scopes.

Naming Consistency: 5/5

All tools follow a consistent 'pubchem_verb_noun' pattern with snake_case throughout: pubchem_get_bioactivity, pubchem_get_compound_details, pubchem_get_compound_image, pubchem_get_compound_safety, pubchem_get_compound_xrefs, pubchem_get_summary, pubchem_search_assays, pubchem_search_compounds. This predictability aids agent navigation.

Tool Count: 5/5

With 8 tools, the server is well-scoped for PubChem data access, covering compound retrieval, search, safety, bioactivity, cross-references, and summaries. Each tool earns its place by addressing a specific aspect of chemical and biological data without bloat or redundancy.

Completeness: 4/5

The toolset provides comprehensive coverage for querying and retrieving PubChem data, including compound details, bioactivity, safety, and searches. Minor gaps exist, such as no explicit tools for batch operations beyond those mentioned or advanced filtering in searches, but agents can work around these with the provided capabilities.

Available Tools

10 tools
pubchem_get_bioactivity: Get Bioactivity (A)
Read-only, Idempotent

Get a compound's bioactivity profile: which assays tested it, activity outcomes (Active/Inactive/Inconclusive), target identifiers (NCBI Gene ID, UniProt/GenBank accession), and quantitative values (IC50, EC50, Ki, etc.). Filter by outcome and/or a specific molecular target (NCBI Gene ID or protein accession) to focus the profile — e.g. "is this compound active against target T?".

Parameters
Name | Required | Description | Default
cid | Yes | PubChem Compound ID. Resolve from name/SMILES with pubchem_search_compounds. | -
maxResults | No | Max assay results to return (1-100). Well-studied compounds have thousands of records. Default: 20. | -
targetGeneId | No | Filter to assays against this NCBI Gene ID. Obtain Gene IDs from pubchem_search_assays or the targetGeneId field of an unfiltered result here. Combine with outcomeFilter="active" to answer "is this compound active against target T?". | -
outcomeFilter | No | Filter by activity outcome. "active" shows only assays where the compound showed activity, which is most useful for understanding the biological profile. Default: "all". | all
targetAccession | No | Filter to assays against this target protein accession (UniProt/GenBank), e.g. "P35354". Obtain accessions from pubchem_search_assays or the targetAccession field of an unfiltered result here. | -

Output Schema

Name | Required | Description
cid | Yes | PubChem Compound ID.
notice | No | Recovery guidance when the filter yields no results or the compound has no bioactivity data.
results | Yes | Assay results matching the filter.
activeCount | Yes | Assays with "Active" outcome.
totalAssays | Yes | Total unique assays for this compound.
targetFilter | No | Target filter applied (gene ID and/or protein accession), when set.
filteredCount | Yes | Assays matching the outcome and target filters, before the maxResults cap.
inactiveCount | Yes | Assays with "Inactive" outcome.
outcomeFilter | Yes | Outcome filter applied: active, inactive, or all.
returnedCount | Yes | Assays returned after the maxResults cap.
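
As a rough illustration of how the inputs and counters above fit together, the following Python sketch builds an MCP `tools/call` request for this tool and checks whether the `maxResults` cap cut off part of the filtered set. The JSON-RPC envelope follows the usual MCP `tools/call` shape; the helper names, the example CID and Gene ID, and the sample counters are assumptions made for illustration and are not output from the server.

```python
import json


def build_bioactivity_call(cid, outcome_filter="all", target_gene_id=None,
                           max_results=20, request_id=1):
    """Build a JSON-RPC 'tools/call' request for pubchem_get_bioactivity."""
    if not 1 <= max_results <= 100:
        raise ValueError("maxResults must be between 1 and 100")
    if outcome_filter not in ("active", "inactive", "all"):
        raise ValueError("outcomeFilter must be active, inactive, or all")
    arguments = {"cid": cid, "maxResults": max_results,
                 "outcomeFilter": outcome_filter}
    if target_gene_id is not None:
        arguments["targetGeneId"] = target_gene_id
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/call",
            "params": {"name": "pubchem_get_bioactivity",
                       "arguments": arguments}}


def was_truncated(result):
    """True when more assays matched the filters than were returned."""
    return result["filteredCount"] > result["returnedCount"]


# Example CID and Gene ID, chosen only to show the request shape.
request = build_bioactivity_call(cid=2244, outcome_filter="active",
                                 target_gene_id=5743)
print(json.dumps(request, indent=2))

# Hand-written sample counters (not real PubChem data).
sample = {"filteredCount": 57, "returnedCount": 20}
print(was_truncated(sample))
```

When `was_truncated` is true, a caller could raise `maxResults` (up to 100) or narrow the target filter rather than assume the returned list is complete.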
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, idempotentHint, and openWorldHint. The description adds valuable context about the return structure (what constitutes a 'bioactivity profile': assays, outcomes, targets, quantitative values) but does not elaborate on open-world implications (incomplete data coverage) or rate limiting concerns beyond the schema's pagination hint.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two tightly constructed sentences with zero waste. First sentence front-loads the complete data model (assays, outcomes, targets, quantitative values). Second sentence provides actionable filtering guidance. Every clause earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description appropriately summarizes return content without redundant specification. Adequately covers the tool's scope for a data retrieval operation, though could be strengthened by noting edge cases (e.g., compounds with no bioactivity data).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing baseline 3. Description reinforces the outcomeFilter parameter's purpose ('Filter by outcome to focus on active results'), aligning with schema guidance, but does not add substantial semantic meaning beyond what the well-documented schema already provides for cid or maxResults.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description provides specific verb ('Get') and resource ('compound's bioactivity profile'), and clearly distinguishes from siblings by detailing unique content: assays tested, activity outcomes, target information, and quantitative values (IC50, EC50, Ki), which none of the other compound tools (details, image, safety, xrefs) provide.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides implied usage guidance by highlighting the outcome filter ('Filter by outcome to focus on active results'), but lacks explicit when-to-use/when-not-to-use distinctions versus siblings like pubchem_search_assays (which searches for assays by criteria) or pubchem_get_compound_details (general metadata).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_3d_structure: Get Compound 3D Structure (A)
Read-only, Idempotent

Get a compound's default 3D conformer — atomic coordinates and bonds — for one CID. format="json" (default) returns parsed atoms and bonds the model can reason over directly; format="sdf" returns the raw V2000 SDF text for passthrough to docking, rendering, or conformer tools. Optionally lists alternate conformer IDs. Not every compound has computed 3D coordinates (large molecules, mixtures, and some salts do not).

Parameters
Name | Required | Description | Default
cid | Yes | PubChem Compound ID. Resolve from name/SMILES with pubchem_search_compounds. | -
format | No | Output format. "json" (default) returns parsed atoms and bonds. "sdf" returns the raw V2000 SDF text for passthrough to other tools. | json
includeAlternateConformerIds | No | List the IDs of additional computed conformers beyond the default. Adds one extra API call. Default: false. | -

Output Schema

Name | Required | Description
cid | Yes | PubChem Compound ID.
sdf | No | Raw V2000 SDF text. Populated when format="sdf".
atoms | No | Parsed atoms. Populated when format="json".
bonds | No | Parsed bonds. Populated when format="json".
atomCount | Yes | Number of atoms in the conformer.
bondCount | Yes | Number of bonds in the conformer.
conformerId | No | Default (primary) conformer ID. Present when includeAlternateConformerIds is set.
alternateConformerIds | No | Conformer IDs beyond the default. Present when includeAlternateConformerIds is set and alternates exist.
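
For callers that request format="sdf" and pass the text on to other tools, a quick sanity check can be sketched in Python: read the atom and bond counts from the V2000 counts line (the fourth line of a molfile) and compare them with `atomCount` and `bondCount`. The function names and the tiny hand-written record below are assumptions for illustration; the coordinates are placeholders and not a real PubChem conformer, and whether the server's counts always match the SDF is itself an assumption.

```python
def sdf_counts(sdf_text):
    """Read atom and bond counts from the counts line of a V2000 molfile.

    The first three lines are the header block; the fourth is the counts
    line, whose first two 3-character fields hold the atom and bond counts.
    """
    lines = sdf_text.splitlines()
    if len(lines) < 4 or "V2000" not in lines[3]:
        raise ValueError("not a V2000 molfile")
    counts_line = lines[3]
    return int(counts_line[0:3]), int(counts_line[3:6])


def counts_match(result):
    """Compare the SDF counts line with the atomCount/bondCount fields."""
    expected = (result["atomCount"], result["bondCount"])
    return sdf_counts(result["sdf"]) == expected


# Hand-written placeholder record; the coordinates are not a real conformer.
SAMPLE_SDF = "\n".join([
    "placeholder",
    "  hand-written",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9500    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.3000    0.9000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
    "$$$$",
])
sample_result = {"cid": 0, "sdf": SAMPLE_SDF, "atomCount": 3, "bondCount": 2}
print(sdf_counts(SAMPLE_SDF))
print(counts_match(sample_result))
```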
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds behavioral details beyond annotations (readOnlyHint, idempotentHint, openWorldHint). It explains output formats (json for reasoning, sdf for passthrough) and that includeAlternateConformerIds adds an API call. It also notes the limitation about missing 3D coordinates.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four concise sentences, each adding essential information: primary action, format behaviors, optional alternate conformer IDs, and limitations. No redundant or vague language. Front-loaded with the core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 parameters, existing output schema) and the combination of schema and annotations, the description is complete. It covers purpose, usage constraints, format outcomes, and optional parameter effects. The agent has sufficient context to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

While schema coverage is 100%, the description adds practical context: json returns parsed atoms/bonds, sdf returns raw SDF for passthrough, and includeAlternateConformerIds triggers an extra API call. This enriches the parameter meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Get a compound's default 3D conformer — atomic coordinates and bonds — for one CID.' This specific verb+resource combination distinguishes it from sibling tools like pubchem_get_compound_details or pubchem_get_compound_image.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains that not every compound has 3D coordinates (large molecules, mixtures, salts), guiding when not to use. It also mentions resolving CIDs from pubchem_search_compounds in the parameter description. However, it lacks explicit comparisons to other siblings for when to choose this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_details: Get Compound Details (A)
Read-only, Idempotent

Get detailed compound information by CID. Returns physicochemical properties (molecular weight, SMILES, InChIKey, XLogP, TPSA, etc.), optionally with a textual description (pharmacology, mechanism, therapeutic use), all known synonyms, drug-likeness assessment (Lipinski/Veber rules), and/or pharmacological classification (FDA classes, MeSH classes, ATC codes). Efficiently batches up to 100 CIDs.

Parameters
Name | Required | Description | Default
cids | Yes | PubChem Compound IDs to fetch (1-100). Batched efficiently. Resolve from names/SMILES with pubchem_search_compounds. | -
properties | No | Properties to retrieve. Defaults to a core set: MolecularFormula, MolecularWeight, IUPACName, CanonicalSMILES, IsomericSMILES, InChIKey, XLogP, TPSA, HBondDonorCount, HBondAcceptorCount, RotatableBondCount, HeavyAtomCount, Charge, Complexity. | -
includeSynonyms | No | Fetch all known names and synonyms (trade names, systematic names, registry numbers). One API call per CID, so slower than the property batch for large CID lists. | -
maxDescriptions | No | Max number of distinct description entries per compound (1-20). PubChem returns near-duplicate summaries from many depositors; we dedup and cap to keep responses focused. Default: 3. | -
includeDescription | No | Include textual descriptions (pharmacology, mechanism, therapeutic use) attributed by source. Well-studied compounds have many overlapping summaries, capped via maxDescriptions. Fetched only for the first 10 CIDs in the batch; remaining CIDs return without descriptions. | -
includeDrugLikeness | No | Compute drug-likeness assessment: Lipinski Rule of Five (MW, XLogP, HBD, HBA) and Veber rules (TPSA, rotatable bonds). No extra API calls; computed from properties. | -
includeClassification | No | Include pharmacological classification: FDA Established Pharmacologic Classes, mechanisms of action, MeSH classes, and ATC codes. Fetched only for the first 10 CIDs in the batch; remaining CIDs return without classification. | -

Output Schema

Name | Required | Description
compounds | Yes | Compound detail records.
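
The includeDrugLikeness flag is described as a computation over already-fetched properties. A minimal Python sketch of such a check is shown below, using the commonly cited thresholds (Lipinski: MW <= 500, logP <= 5, at most 5 H-bond donors and 10 acceptors, with at most one violation tolerated; Veber: TPSA <= 140 and at most 10 rotatable bonds). The server's exact thresholds and output field names are not documented here, so the function, its result keys, and the sample property values are all assumptions for illustration.

```python
def drug_likeness(props):
    """Evaluate Lipinski and Veber rules from a property dictionary.

    Keys follow the PubChem property names listed for the 'properties'
    parameter; values are cast to float in case they arrive as strings.
    """
    lipinski_violations = sum([
        float(props["MolecularWeight"]) > 500,
        float(props["XLogP"]) > 5,
        int(props["HBondDonorCount"]) > 5,
        int(props["HBondAcceptorCount"]) > 10,
    ])
    veber_pass = (float(props["TPSA"]) <= 140
                  and int(props["RotatableBondCount"]) <= 10)
    return {
        "lipinskiViolations": lipinski_violations,   # hypothetical key names
        "lipinskiPass": lipinski_violations <= 1,
        "veberPass": veber_pass,
    }


# Illustrative property values, not fetched from PubChem.
small_molecule = {"MolecularWeight": "180.16", "XLogP": 1.2,
                  "HBondDonorCount": 1, "HBondAcceptorCount": 4,
                  "TPSA": 63.6, "RotatableBondCount": 3}
large_molecule = {"MolecularWeight": 650.0, "XLogP": 6.1,
                  "HBondDonorCount": 2, "HBondAcceptorCount": 8,
                  "TPSA": 150.0, "RotatableBondCount": 12}
print(drug_likeness(small_molecule))
print(drug_likeness(large_molecule))
```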
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare read-only/idempotent safety; description adds crucial cost/performance behavior missing from annotations: 'Adds one API call per CID' for description/classification flags, and 'Efficiently batches' for throughput optimization. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two dense sentences with zero waste. Front-loaded with core purpose ('Get detailed compound information by CID'), followed by parenthetical enumeration of return types, and closes with operational constraint ('Efficiently batches up to 100 CIDs'). Every clause earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists and annotations cover safety profile, description provides complete operational context: batch limits, default property sets, API cost implications for expensive flags, and content scope. No gaps remain for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% so baseline is 3. Description adds semantic grouping (physicochemical vs pharmacological vs classification) and concrete examples (XLogP, TPSA, Lipinski rules) that help the agent map user requests to the correct boolean flags and property selections beyond raw schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action ('Get detailed compound information by CID') and distinguishes from siblings: contrasts with search_compounds (retrieval by ID vs search), get_compound_image (data vs media), and get_compound_safety (general properties vs safety). Specific resource and verb are clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear operational context through 'by CID' (implies prerequisite IDs) and 'Efficiently batches up to 100 CIDs' (usage constraint). Lists optional flags with their content domains, implicitly guiding when to enable each. Lacks explicit contrast with pubchem_get_summary but has strong implicit guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_image: Get Compound Image (A)
Read-only, Idempotent

Fetch a 2D structure diagram (PNG image) for a compound by CID.

Parameters
Name | Required | Description | Default
cid | Yes | PubChem Compound ID. Resolve from name/SMILES with pubchem_search_compounds. | -
size | No | Image size: "small" (100x100) or "large" (300x300). Default: "large". | large

Output Schema

Name | Required | Description
cid | Yes | PubChem Compound ID.
width | Yes | Image width in pixels.
height | Yes | Image height in pixels.
mimeType | Yes | MIME type, always "image/png".
imageBase64 | Yes | Base64-encoded PNG image data.
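
Because the image arrives as base64 text rather than raw bytes, a client has to decode it before saving or rendering. The Python sketch below decodes `imageBase64` and checks for the standard 8-byte PNG signature; the function name is an assumption, and the sample payload contains only the signature bytes as a stand-in, not a real structure diagram.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_image(result):
    """Decode imageBase64 and check that the bytes start like a PNG file."""
    if result["mimeType"] != "image/png":
        raise ValueError("unexpected MIME type: " + result["mimeType"])
    data = base64.b64decode(result["imageBase64"])
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return data


# Stand-in payload: only the 8-byte PNG signature, not a real structure image.
sample_result = {
    "cid": 0, "width": 300, "height": 300, "mimeType": "image/png",
    "imageBase64": base64.b64encode(PNG_SIGNATURE).decode("ascii"),
}
png_bytes = decode_image(sample_result)
print(len(png_bytes))
```

With a real response, the returned bytes could be written to disk in binary mode (for example to a file named structure.png) or handed to an image library.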
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, idempotentHint=true, and openWorldHint=true. The description adds valuable context beyond these by specifying the output is a PNG format 2D diagram, which helps the agent understand the binary/image nature of the response.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with zero waste. It front-loads the action (Fetch) and precisely defines the deliverable (PNG image) and identifier (CID) without filler words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple two-parameter fetch operation with an output schema present, the description is complete. It adequately explains what the tool returns (PNG image) without needing to detail return values, given the schema covers inputs completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema fully documents both parameters (CID and size with dimensions). The description mentions 'by CID', reinforcing the required parameter, but adds no additional semantic detail beyond the comprehensive schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states a specific verb (Fetch), resource (2D structure diagram), format (PNG), and lookup method (by CID). It clearly distinguishes this image-fetching tool from siblings that retrieve 'details', 'safety', 'bioactivity', or perform searches.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While it doesn't explicitly name alternatives or 'when-not-to-use' clauses, the description provides clear context by specifying '2D structure diagram (PNG image)', which implicitly distinguishes it from data-retrieval siblings like get_compound_details or get_compound_safety.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_interactions: Get Compound Interactions (A)
Read-only, Idempotent

Get a compound's interaction data: drug-drug interactions (DrugBank), drug-food interactions, and chemical-target interactions (binding/activity from BindingDB, ChEMBL, and others). Each entry carries its originating source. Richest for approved drugs; many compounds have no deposited interaction records.

Parameters
Name | Required | Description | Default
cid | Yes | PubChem Compound ID. Resolve from name/SMILES with pubchem_search_compounds. | -
kinds | No | Interaction kinds to fetch. "drug-drug" (interactions with other drugs), "drug-food" (dietary interactions), "target" (binding/activity against molecular targets). Default: ["drug-drug"]. | -
maxEntries | No | Max entries per kind (1-50). Well-studied drugs have a long tail of interactions. Default: 10. | -

Output Schema

Name | Required | Description
cid | Yes | PubChem Compound ID.
notice | No | Guidance when no interaction data was found for the requested kinds.
entries | Yes | Interaction entries across the requested kinds.
failedKinds | No | Interaction kinds that could not be retrieved (comma-separated). The returned entries cover the kinds that succeeded; retry to re-attempt the failed ones.
returnedCount | Yes | Total interaction entries returned across all kinds.
requestedKinds | Yes | Interaction kinds requested (comma-separated).
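
Since `failedKinds` is a comma-separated string while the `kinds` input is a list, a retry needs a small conversion step. The following Python sketch shows one way to build the arguments for a follow-up call that covers only the failed kinds; the helper names and the sample result are assumptions for illustration, and the exact spacing of the comma-separated string is not documented, so the parser tolerates both forms.

```python
def kinds_to_retry(result):
    """Return the interaction kinds listed in failedKinds, if any."""
    failed = result.get("failedKinds") or ""
    return [kind.strip() for kind in failed.split(",") if kind.strip()]


def build_retry_arguments(result, max_entries=10):
    """Build arguments for a follow-up call covering only the failed kinds."""
    if not 1 <= max_entries <= 50:
        raise ValueError("maxEntries must be between 1 and 50")
    kinds = kinds_to_retry(result)
    if not kinds:
        return None  # nothing failed, so no retry is needed
    return {"cid": result["cid"], "kinds": kinds, "maxEntries": max_entries}


# Hand-written sample result (not real PubChem data).
partial_result = {
    "cid": 0, "entries": [], "returnedCount": 0,
    "requestedKinds": "drug-drug,drug-food,target",
    "failedKinds": "drug-food, target",
}
print(build_retry_arguments(partial_result))
```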
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. The description adds context: each entry carries its source, and data coverage is uneven. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences plus a brief caveat. All information is essential, front-loaded, and no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and rich annotations, the description fully covers what the tool does, its parameter semantics, data sources, and coverage limitations. An agent can confidently select and invoke it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with parameter descriptions. The description enriches the 'kinds' parameter by listing specific source databases (DrugBank, BindingDB, ChEMBL). It also reinforces the cid resolution hint from the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves a compound's interaction data, listing specific types (drug-drug, drug-food, target) and naming authoritative sources (DrugBank, BindingDB, ChEMBL). It distinguishes from siblings like pubchem_get_bioactivity by focusing on interactions rather than single bioassays.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description notes that richness varies (richest for approved drugs, many have no records) and implicitly advises use for interaction queries. However, it does not explicitly compare with specific sibling tools or state when to avoid it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_safety: Get Compound Safety (A)
Read-only, Idempotent

Get GHS (Globally Harmonized System) hazard classification and safety data for one or more compounds by CID. Returns signal word, pictograms, hazard statements (H-codes), and precautionary statements (P-codes) per compound. Data sourced from PubChem depositors — source attribution included.

Parameters
Name | Required | Description | Default
cids | Yes | PubChem Compound IDs to fetch safety data for (1-25). Resolve from names/SMILES with pubchem_search_compounds. | -

Output Schema

Name | Required | Description
notice | No | Cross-tool guidance when one or more CIDs have no GHS data, pointing to an alternative source.
results | Yes | Safety results, one per requested CID (input order preserved).
withDataCount | Yes | CIDs with GHS safety data available.
requestedCount | Yes | CIDs requested.
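
The 1-25 limit on `cids` means longer CID lists have to be split across several calls. A minimal Python sketch of that batching is given below; the helper names are assumptions, and the CIDs are arbitrary integers used only to show the batch sizes. The same chunking helper would apply to pubchem_get_compound_details with a batch size of 100.

```python
def chunk_cids(cids, batch_size=25):
    """Split a CID list into consecutive batches, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [cids[i:i + batch_size] for i in range(0, len(cids), batch_size)]


def build_safety_calls(cids):
    """Build one tools/call params object per batch of up to 25 CIDs."""
    return [
        {"name": "pubchem_get_compound_safety", "arguments": {"cids": batch}}
        for batch in chunk_cids(cids, batch_size=25)
    ]


# Arbitrary integers standing in for CIDs.
calls = build_safety_calls(list(range(1, 61)))
print([len(call["arguments"]["cids"]) for call in calls])
```

Because results preserve input order within each call, concatenating the per-batch `results` lists in call order should line up with the original CID list.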
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations establish read-only, idempotent, and open-world traits. The description adds valuable behavioral context not present in annotations: data provenance ('sourced from PubChem depositors') and the inclusion of source attribution. It also previews the return structure without contradicting the output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three clauses: purpose declaration, return value enumeration, and data source attribution. Every clause earns its place; there is no redundancy or unnecessary verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input schema (one well-documented parameter), presence of annotations, and existence of an output schema, the description is complete. It appropriately focuses on value-add elements (data source, specific safety data types) rather than repeating structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage for the single 'cids' parameter, the schema carries the semantic burden. The description neither repeats nor extends the parameter documentation, which is acceptable given the complete schema coverage, meeting the baseline expectation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Get') and clearly identifies the resource (GHS hazard classification and safety data). It distinguishes itself from sibling tools like get_compound_details and get_bioactivity by explicitly targeting safety-specific data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by enumerating specific return values (signal word, pictograms, H-codes, P-codes), suggesting when to use the tool. However, it lacks explicit guidance on when to prefer this over get_compound_details or other siblings, and states no prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_compound_xrefs: Get Compound Cross-References (A)
Read-only, Idempotent

Get external database cross-references for a compound: PubMed citations, patent IDs, gene/protein associations, registry numbers, and taxonomy IDs. Results are capped per type with total counts reported.

Parameters
Name | Required | Description | Default
cid | Yes | PubChem Compound ID. Resolve from name/SMILES with pubchem_search_compounds. | -
xrefTypes | Yes | Cross-reference types to retrieve. String IDs: RegistryID (DSSTox/EPA registry numbers), RN (CAS numbers), PatentID. Numeric IDs: PubMedID, GeneID (NCBI Gene), ProteinGI (legacy NCBI Protein GI), TaxonomyID. | -
maxPerType | No | Max IDs to return per xref type (1-500). A compound may have thousands of PubMed references. Total count always reported. Default: 50. | -

Output Schema

Name | Required | Description
cid | Yes | PubChem Compound ID.
xrefs | Yes | Cross-references grouped by type.
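
Since `xrefTypes` is required and limited to the seven listed type names, validating the arguments on the client side can catch mistakes such as passing "CAS" instead of "RN" before a call is made. The Python sketch below is one way to do that; the function name and the example CID are assumptions, and the server may apply its own validation independently.

```python
XREF_TYPES = {"RegistryID", "RN", "PatentID",
              "PubMedID", "GeneID", "ProteinGI", "TaxonomyID"}


def build_xrefs_arguments(cid, xref_types, max_per_type=50):
    """Validate and assemble arguments for pubchem_get_compound_xrefs."""
    if not xref_types:
        raise ValueError("xrefTypes is required and must not be empty")
    unknown = sorted(set(xref_types) - XREF_TYPES)
    if unknown:
        raise ValueError("unknown xref types: " + ", ".join(unknown))
    if not 1 <= max_per_type <= 500:
        raise ValueError("maxPerType must be between 1 and 500")
    return {"cid": cid, "xrefTypes": list(xref_types),
            "maxPerType": max_per_type}


# Example CID used only to show the argument shape.
arguments = build_xrefs_arguments(2244, ["RN", "PubMedID"], max_per_type=25)
print(arguments)
```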
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare read-only, idempotent, and open-world traits. The description adds valuable behavioral context not in annotations: the capping mechanism ('capped per type with total counts reported') and explicitly maps xref types to human-readable categories (e.g., 'RN' to 'registry numbers').

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with zero waste: first sentence defines purpose and scope with specific examples; second sentence discloses the critical capping limitation. Information is front-loaded and dense.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (which handles return value documentation) and 100% parameter coverage, the description adequately covers the tool's purpose, xref types, and result-limiting behavior without redundancy.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the baseline is 3. The description reinforces the xrefTypes by listing them in prose, translating technical enum values (e.g., 'RN', 'ProteinGI') to domain concepts ('registry numbers', 'gene/protein associations'), but does not add significant semantic depth beyond the comprehensive schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool 'Get[s] external database cross-references for a compound' and enumerates specific xref types (PubMed citations, patent IDs, gene/protein associations, registry numbers, taxonomy IDs), distinguishing it from sibling tools that retrieve bioactivity, images, or general compound details.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While it lacks explicit 'use this instead of X' comparisons, the description provides clear contextual scope (external database cross-references only) and explains the capping behavior ('Results are capped per type'), which implicitly guides usage for large result sets.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_get_summary: Get Entity Summary (grade A)
Read-only, Idempotent

Get descriptive summaries for PubChem entities by ID. Supports assays (AID), genes (Gene ID), proteins (UniProt accession), and taxonomy (Tax ID). Up to 10 per call.

Parameters
Name | Required | Description | Default
entityType | Yes | Entity type. Determines ID format and returned fields. | -
identifiers | Yes | Entity identifiers (1-10). Type depends on entityType: assay: AID (number), e.g. [1000]; gene: Gene ID (number), e.g. [1956]; protein: UniProt accession (string), e.g. ["P00533"]; taxonomy: Tax ID (number), e.g. [9606]. | -
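
The coupling between `entityType` and the identifier type can be checked before a call is made. The following is a minimal sketch, assuming the four entity type values are spelled exactly as in the list above; `build_summary_arguments` is a hypothetical helper, not part of the server.

```python
# Expected Python type of each identifier, per the identifiers row above.
ID_KIND = {"assay": int, "gene": int, "protein": str, "taxonomy": int}


def build_summary_arguments(entity_type, identifiers):
    """Hypothetical helper: validate and build arguments for pubchem_get_summary."""
    if entity_type not in ID_KIND:
        raise ValueError(f"unsupported entityType: {entity_type!r}")
    if not 1 <= len(identifiers) <= 10:
        raise ValueError("identifiers must contain between 1 and 10 entries")
    expected = ID_KIND[entity_type]
    for ident in identifiers:
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(ident, bool) or not isinstance(ident, expected):
            raise TypeError(f"{entity_type} identifiers must be {expected.__name__}, got {ident!r}")
    return {"entityType": entity_type, "identifiers": list(identifiers)}


print(build_summary_arguments("protein", ["P00533"]))
print(build_summary_arguments("taxonomy", [9606]))
```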

Output Schema

Parameters
Name | Required | Description
notice | No | Recovery guidance when one or more identifiers were not found.
summaries | Yes | Summary results.
entityType | Yes | Entity type queried.
foundCount | Yes | Identifiers resolved to a summary.
requestedCount | Yes | Identifiers requested.
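
One way a caller might use these fields is to compare `foundCount` with `requestedCount` and surface `notice` when they differ. The sketch below works on a made-up result dictionary; the contents of `summaries` and the notice wording are placeholders, since neither is specified here.

```python
def report_missing(result):
    """Return a short message if some identifiers were not resolved, else None."""
    missing = result["requestedCount"] - result["foundCount"]
    if missing <= 0:
        return None
    # notice is optional in the output schema, so fall back to a generic message.
    notice = result.get("notice", "no recovery guidance provided")
    return f"{missing} of {result['requestedCount']} identifiers not found: {notice}"


# Made-up example result; the summary entries are empty placeholders.
example = {
    "entityType": "gene",
    "requestedCount": 3,
    "foundCount": 2,
    "summaries": [{}, {}],
    "notice": "example notice text",
}
print(report_missing(example))
```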
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, openWorldHint, and idempotentHint. The description adds valuable behavioral context beyond these: the batch limit constraint (10 per call) and clarifies that returned data consists of 'descriptive summaries.' No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences with zero waste. The first two cover purpose and entity scope; the third covers the batch limit. Information is front-loaded and dense.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 2-parameter read operation with 100% schema coverage and existing output schema, the description is complete. It covers tool purpose, supported entity scope, and batch constraints without needing to detail return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage with detailed type mapping for identifiers. Description reinforces this with examples in prose (AID, Gene ID, etc.) but adds no significant semantic information beyond what the schema already provides. Baseline 3 appropriate for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool 'Get[s] descriptive summaries for PubChem entities by ID' with specific verb and resource. It explicitly distinguishes from compound-focused siblings by listing supported entity types: assays, genes, proteins, and taxonomy.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context by enumerating supported entity types (AID, Gene ID, UniProt, Tax ID) and the 'Up to 10 per call' constraint. Lacks explicit 'when not to use' guidance regarding compounds, though the enum and sibling tool names make this implicitly clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_search_assays: Search Assays (grade A)
Read-only, Idempotent

Find PubChem bioassays associated with a biological target. Search by gene symbol (e.g. "EGFR"), protein name, NCBI Gene ID, or UniProt accession. Returns assay IDs (AIDs) which can be explored further with pubchem_get_summary.

Parameters
Name | Required | Description | Default
maxResults | No | Max AIDs to return (1-200). Popular targets may have thousands of assays. | 50
targetType | Yes | Target identifier type. "genesymbol" and "proteinname" accept text names. "geneid" accepts NCBI Gene IDs. "proteinaccession" accepts UniProt accessions. | -
targetQuery | Yes | Target identifier. Examples: "EGFR" (genesymbol), "Epidermal growth factor receptor" (proteinname), "1956" (geneid), "P00533" (proteinaccession). | -
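
Note that the `geneid` example above is written as the string "1956", which suggests `targetQuery` is text for every target type. The sketch below follows that reading and converts numeric input to a string; `build_assay_search_arguments` is a hypothetical helper, and the string-only assumption is inferred from the examples rather than confirmed by the schema.

```python
TARGET_TYPES = {"genesymbol", "proteinname", "geneid", "proteinaccession"}


def build_assay_search_arguments(target_type, target_query, max_results=50):
    """Hypothetical helper: validate and build arguments for pubchem_search_assays."""
    if target_type not in TARGET_TYPES:
        raise ValueError(f"unsupported targetType: {target_type!r}")
    # Assumption: targetQuery is always sent as text, even for numeric Gene IDs.
    query = str(target_query).strip()
    if not query:
        raise ValueError("targetQuery must not be empty")
    if not 1 <= max_results <= 200:
        raise ValueError("maxResults must be between 1 and 200")
    return {"targetType": target_type, "targetQuery": query, "maxResults": max_results}


print(build_assay_search_arguments("genesymbol", "EGFR"))
print(build_assay_search_arguments("geneid", 1956, max_results=25))
```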

Output Schema

Parameters
Name | Required | Description
aids | Yes | PubChem Assay IDs.
notice | No | Recovery guidance when no assays matched: echoes the target and suggests alternative search types. Absent when assays were returned.
targetType | Yes | Target identifier type used: genesymbol, proteinname, geneid, or proteinaccession.
totalFound | Yes | Total AIDs found before the maxResults cap.
targetQuery | Yes | Target identifier searched.
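
Because pubchem_get_summary accepts at most 10 identifiers per call while a search can return up to 200 AIDs, following up on a search result means splitting the AIDs into batches. A minimal sketch of that step, using a made-up search result with placeholder AIDs:

```python
def batch_aids_for_summary(search_result, batch_size=10):
    """Split returned AIDs into argument sets for pubchem_get_summary (max 10 each)."""
    if not 1 <= batch_size <= 10:
        raise ValueError("batch_size must be between 1 and 10")
    aids = search_result["aids"]
    return [
        {"entityType": "assay", "identifiers": aids[start:start + batch_size]}
        for start in range(0, len(aids), batch_size)
    ]


# Made-up search result; the AIDs are placeholder numbers, not real search output.
example_result = {
    "targetType": "genesymbol",
    "targetQuery": "EGFR",
    "totalFound": 23,
    "aids": list(range(1000, 1023)),
}
for arguments in batch_aids_for_summary(example_result):
    print(len(arguments["identifiers"]), arguments["identifiers"][:3])
```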
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnly/idempotent/openWorld traits, so the description appropriately focuses on adding return-value context (AIDs) and search scope constraints. It clarifies what the tool produces without repeating safety annotations, though it could mention handling of unmatched targets.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences: purpose declaration, input specification with examples, and output/next-step guidance. No redundant information; every sentence earns its place. Front-loaded with the core action 'Find PubChem bioassays'.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the 100% schema coverage and existence of output schema, the description provides sufficient context by explaining the biological target search paradigm and the AID return format. It appropriately avoids duplicating detailed parameter documentation already present in the schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema fully documents all three parameters including examples and ranges. The description reinforces the target types (gene symbol, protein name, etc.) but primarily provides conceptual framing rather than new semantic details beyond the structured schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states 'Find PubChem bioassays associated with a biological target' with specific verb and resource. It clearly distinguishes from compound-focused siblings (pubchem_search_compounds) by specifying 'biological target' and from getter tools by describing the search functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear workflow guidance by noting that returned AIDs 'can be explored further with pubchem_get_summary', establishing the tool's position in the chain. However, it lacks explicit 'when not to use' guidance regarding similar tools like pubchem_get_bioactivity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubchem_search_compounds: Search Compounds (grade A)
Read-only, Idempotent

Search PubChem for chemical compounds by identifier (name, SMILES, or InChIKey, batched up to 25), molecular formula in Hill notation, substructure or superstructure containment, or 2D Tanimoto similarity. Optionally hydrate results with properties to avoid a follow-up pubchem_get_compound_details call.
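
For readers unfamiliar with the similarity mode: the Tanimoto coefficient of two fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below computes it on toy bit sets. It does not reproduce PubChem's actual fingerprint, and reading the tool's threshold as a percentage is an inference from its 70-100 range.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def passes_threshold(bits_a, bits_b, threshold=90):
    """Compare the coefficient, scaled to a percentage, against a 0-100 threshold."""
    return tanimoto(bits_a, bits_b) * 100 >= threshold


# Toy fingerprints: 9 shared bits, 11 bits in the union, so 9/11 (about 0.82).
query_bits = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
candidate_bits = {1, 2, 3, 4, 5, 6, 7, 8, 9, 11}
print(round(tanimoto(query_bits, candidate_bits), 3))
print(passes_threshold(query_bits, candidate_bits, threshold=80))
print(passes_threshold(query_bits, candidate_bits))
```

In this toy case the candidate would pass a threshold of 80 but not the default of 90.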

Parameters
Name | Required | Description | Default
query | No | Required for substructure/superstructure/similarity searches. A SMILES string (e.g. "CC(=O)O") or PubChem CID as a string (e.g. "2244"). | -
formula | No | Required for formula search. Molecular formula in Hill notation (e.g. "C6H12O6", "CaH2O2"). | -
queryType | No | Required for structure/similarity searches. Format of the query: "smiles" or "cid". | -
threshold | No | Similarity search only. Minimum Tanimoto similarity (70-100). 90+ for close analogs, 70-80 for scaffold hops. | 90
maxResults | No | Maximum CIDs to return (1-200). | 20
properties | No | Optional: fetch these properties for each result, avoiding a follow-up details call. E.g. ["MolecularFormula", "MolecularWeight", "CanonicalSMILES"]. | -
searchType | Yes | Search strategy. "identifier": name/SMILES/InChIKey lookup. "formula": molecular formula. "substructure": find compounds containing the query as a substructure. "superstructure": find compounds that are themselves substructures of the query. "similarity": 2D Tanimoto similarity to the query. | -
identifiers | No | Required for identifier search. Array of identifiers to resolve (1-25). Examples: ["aspirin", "ibuprofen"] for name, ["CC(=O)OC1=CC=CC=C1C(=O)O"] for SMILES, ["BSYNRYMUTXBXSQ-UHFFFAOYSA-N"] for inchikey (27-char block format). | -
identifierType | No | Required for identifier search. Type of chemical identifier: "name", "smiles", or "inchikey". | -
allowOtherElements | No | Formula search only. When true, includes compounds with additional elements beyond the formula. | -
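
Most parameters here are marked optional but become required depending on `searchType`. The following sketch encodes those dependencies as a client-side check. `check_search_arguments` is a hypothetical helper; whether the server rejects or ignores a mode-specific parameter sent with the wrong mode is not stated, so the checker only flags it.

```python
# Parameters each search mode requires, per the descriptions above.
REQUIRED = {
    "identifier": ("identifiers", "identifierType"),
    "formula": ("formula",),
    "substructure": ("query", "queryType"),
    "superstructure": ("query", "queryType"),
    "similarity": ("query", "queryType"),
}


def check_search_arguments(args):
    """Return a list of problems; an empty list means the arguments look consistent."""
    search_type = args.get("searchType")
    if search_type not in REQUIRED:
        return [f"unknown searchType: {search_type!r}"]
    problems = [
        f"{search_type} search requires {name}"
        for name in REQUIRED[search_type]
        if name not in args
    ]
    if "threshold" in args:
        if search_type != "similarity":
            problems.append("threshold applies to similarity search only")
        elif not 70 <= args["threshold"] <= 100:
            problems.append("threshold must be between 70 and 100")
    if "allowOtherElements" in args and search_type != "formula":
        problems.append("allowOtherElements applies to formula search only")
    if len(args.get("identifiers", [])) > 25:
        problems.append("identifiers accepts at most 25 entries")
    if not 1 <= args.get("maxResults", 20) <= 200:
        problems.append("maxResults must be between 1 and 200")
    return problems


print(check_search_arguments({
    "searchType": "similarity",
    "query": "CC(=O)O",
    "queryType": "smiles",
    "threshold": 90,
}))
print(check_search_arguments({"searchType": "formula"}))
```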

Output Schema

Parameters
Name | Required | Description
notice | No | Recovery guidance when no compounds matched: echoes search strategy and suggests how to broaden. Absent when results were returned.
results | Yes | Matching compounds.
searchType | Yes | Search strategy used: identifier, formula, substructure, superstructure, or similarity.
totalFound | Yes | Total CIDs found before the maxResults cap.
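
Since `totalFound` is counted before the `maxResults` cap, comparing it with the number of returned results tells a caller whether the list was truncated. A small sketch on made-up data follows; the `cid` key inside each result is a hypothetical field name, as the shape of a result entry is not shown here.

```python
def describe_search_result(result):
    """Summarize a search result: empty, truncated by maxResults, or complete."""
    returned = len(result["results"])
    if returned == 0:
        # notice is only present when nothing matched.
        return result.get("notice", "no matches")
    if result["totalFound"] > returned:
        return f"showing {returned} of {result['totalFound']} matches"
    return f"all {returned} matches returned"


# Made-up results; the "cid" key is a hypothetical field name.
truncated = {"searchType": "substructure", "totalFound": 1200, "results": [{"cid": 2244}, {"cid": 176}]}
complete = {"searchType": "identifier", "totalFound": 1, "results": [{"cid": 2244}]}
print(describe_search_result(truncated))
print(describe_search_result(complete))
```

A truncated result is a cue to raise `maxResults` (up to 200) or narrow the query.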
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and idempotentHint=true. The description adds valuable behavioral context by explaining batch constraints ('up to 25' for identifiers), the Hill notation requirement for formulas, and the hydration behavior. It does not contradict annotations, though it could mention pagination or rate limiting for a 5.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely efficient structure: one sentence framing, bulleted list of five modes with parenthetical examples, and a final sentence on hydration. No redundant words; every phrase conveys specific search behavior or constraints. Excellent use of formatting for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (10 parameters, 5 distinct search modes) and presence of an output schema, the description adequately covers search strategy selection and result optimization. It appropriately omits return value details (covered by output schema) but could briefly mention pagination behavior for completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the baseline is 3. The description elevates this by providing conceptual groupings of the searchType parameter (explaining the five modes conceptually) and noting the constraint 'batch up to 25' for identifiers, adding strategic context beyond the schema's individual field descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Search PubChem for chemical compounds' and distinguishes five specific search modes (identifier, formula, substructure, superstructure, similarity). It clearly differentiates from sibling 'get' tools by focusing on search functionality and discovery rather than retrieval of specific records.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance on when to use each of the five search modes (e.g., 'Resolve compound names...', 'Find compounds by molecular formula'). Critically, it notes the hydration option to 'avoid a follow-up details call,' directly referencing the sibling pubchem_get_compound_details tool and guiding optimization decisions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
