pubchem-mcp-server
Server Details
MCP server for PubChem. Search compounds, properties, safety, bioactivity, xrefs, and summaries.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: cyanheads/pubchem-mcp-server
- GitHub Stars: 8
- Server Listing: pubchem-mcp-server
Tool Definition Quality
Average 4.3/5 across 8 of 8 tools scored.
Every tool has a clearly distinct purpose with no ambiguity: get_bioactivity focuses on assay results, get_compound_details on properties, get_image on visual representation, get_safety on hazard data, get_xrefs on external references, get_summary on entity descriptions, search_assays on target-based assay discovery, and search_compounds on compound lookup. The descriptions reinforce non-overlapping scopes.
All tools follow a consistent 'pubchem_verb_noun' pattern with snake_case throughout: pubchem_get_bioactivity, pubchem_get_compound_details, pubchem_get_compound_image, pubchem_get_compound_safety, pubchem_get_compound_xrefs, pubchem_get_summary, pubchem_search_assays, pubchem_search_compounds. This predictability aids agent navigation.
With 8 tools, the server is well-scoped for PubChem data access, covering compound retrieval, search, safety, bioactivity, cross-references, and summaries. Each tool earns its place by addressing a specific aspect of chemical and biological data without bloat or redundancy.
The toolset provides comprehensive coverage for querying and retrieving PubChem data, including compound details, bioactivity, safety, and searches. Minor gaps exist, such as the lack of dedicated batch tools beyond the built-in CID batching and of advanced filtering in searches, but agents can work around these with the provided capabilities.
Available Tools (8)
pubchem_get_bioactivity: Get Bioactivity (A, Read-only, Idempotent)
Get a compound's bioactivity profile: which assays tested it, activity outcomes (Active/Inactive/Inconclusive), target information (gene symbols, protein names), and quantitative values (IC50, EC50, Ki, etc.). Filter by outcome to focus on active results.
| Name | Required | Description | Default |
|---|---|---|---|
| cid | Yes | PubChem Compound ID. | |
| maxResults | No | Max assay results to return (1-100). Well-studied compounds have thousands of records. Default: 20. | 20 |
| outcomeFilter | No | Filter by activity outcome. "active" shows only assays where the compound showed activity — most useful for understanding biological profile. Default: "all". | all |
Output Schema
| Name | Required | Description |
|---|---|---|
| cid | Yes | PubChem Compound ID. |
| results | Yes | Assay results matching the filter. |
| activeCount | Yes | Assays with "Active" outcome. |
| totalAssays | Yes | Total unique assays for this compound. |
| inactiveCount | Yes | Assays with "Inactive" outcome. |
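The input constraints above can be sketched as a small pre-flight check. This is a hypothetical helper, not part of the server; the CID value and the full `outcomeFilter` enum beyond "active"/"all" are assumptions inferred from the outcome values listed in the description.

```python
def validate_bioactivity_args(args: dict) -> bool:
    """Check documented pubchem_get_bioactivity constraints before calling."""
    if not isinstance(args.get("cid"), int):
        return False  # cid is required
    if not 1 <= args.get("maxResults", 20) <= 100:
        return False  # maxResults: 1-100, default 20
    # outcomeFilter defaults to "all"; members beyond "active"/"all" are assumed
    return args.get("outcomeFilter", "all") in {"all", "active", "inactive", "inconclusive"}

args = {"cid": 2244, "maxResults": 50, "outcomeFilter": "active"}  # 2244: assumed example CID
assert validate_bioactivity_args(args)
assert not validate_bioactivity_args({"cid": 2244, "maxResults": 500})
```

Filtering to "active" keeps responses small for well-studied compounds with thousands of assay records.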
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint, idempotentHint, and openWorldHint. The description adds valuable context about the return structure (what constitutes a 'bioactivity profile': assays, outcomes, targets, quantitative values) but does not elaborate on open-world implications (incomplete data coverage) or rate limiting concerns beyond the schema's pagination hint.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two tightly constructed sentences with zero waste. First sentence front-loads the complete data model (assays, outcomes, targets, quantitative values). Second sentence provides actionable filtering guidance. Every clause earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description appropriately summarizes return content without redundant specification. It adequately covers the tool's scope for a data retrieval operation, though it could be strengthened by noting edge cases (e.g., compounds with no bioactivity data).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. Description reinforces the outcomeFilter parameter's purpose ('Filter by outcome to focus on active results'), aligning with schema guidance, but does not add substantial semantic meaning beyond what the well-documented schema already provides for cid or maxResults.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description provides specific verb ('Get') and resource ('compound's bioactivity profile'), and clearly distinguishes from siblings by detailing unique content: assays tested, activity outcomes, target information, and quantitative values (IC50, EC50, Ki), which none of the other compound tools (details, image, safety, xrefs) provide.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implied usage guidance by highlighting the outcome filter ('Filter by outcome to focus on active results'), but lacks explicit when-to-use/when-not-to-use distinctions versus siblings like pubchem_search_assays (which searches for assays by criteria) or pubchem_get_compound_details (general metadata).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubchem_get_compound_details: Get Compound Details (A, Read-only, Idempotent)
Get detailed compound information by CID. Returns physicochemical properties (molecular weight, SMILES, InChIKey, XLogP, TPSA, etc.), optionally with a textual description (pharmacology, mechanism, therapeutic use), all known synonyms, drug-likeness assessment (Lipinski/Veber rules), and/or pharmacological classification (FDA classes, MeSH classes, ATC codes). Efficiently batches up to 100 CIDs.
| Name | Required | Description | Default |
|---|---|---|---|
| cids | Yes | PubChem Compound IDs to fetch (1-100). Batched efficiently. | |
| properties | No | Properties to retrieve. Defaults to a core set: MolecularFormula, MolecularWeight, IUPACName, CanonicalSMILES, IsomericSMILES, InChIKey, XLogP, TPSA, HBondDonorCount, HBondAcceptorCount, RotatableBondCount, HeavyAtomCount, Charge, Complexity. | |
| includeSynonyms | No | Fetch all known names and synonyms (trade names, systematic names, registry numbers). | |
| maxDescriptions | No | Max number of distinct description entries per compound (1-20). PubChem returns near-duplicate summaries from many depositors; we dedup and cap to keep responses focused. Default: 3. | 3 |
| includeDescription | No | Include textual descriptions (pharmacology, mechanism, therapeutic use) attributed by source. Well-studied compounds have many overlapping summaries — capped via maxDescriptions. Slower when enabled — prefer small CID batches. | |
| includeDrugLikeness | No | Compute drug-likeness assessment: Lipinski Rule of Five (MW, XLogP, HBD, HBA) and Veber rules (TPSA, rotatable bonds). No extra API calls — computed from properties. | |
| includeClassification | No | Include pharmacological classification: FDA Established Pharmacologic Classes, mechanisms of action, MeSH classes, and ATC codes. Slower when enabled — prefer small CID batches. |
Output Schema
| Name | Required | Description |
|---|---|---|
| compounds | Yes | Compound detail records. |
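A payload builder for the batching and range rules in the table above might look like the following sketch. Field names come from the schema; the builder function, the property subset, and the example CIDs are assumptions.

```python
CORE_PROPS = ["MolecularFormula", "MolecularWeight", "IUPACName", "InChIKey"]

def build_details_args(cids, include_description=False, max_descriptions=3):
    """Assemble pubchem_get_compound_details arguments, enforcing documented limits."""
    if not 1 <= len(cids) <= 100:        # batched efficiently, 1-100 CIDs
        raise ValueError("cids must contain 1-100 IDs")
    if not 1 <= max_descriptions <= 20:  # 1-20, default 3
        raise ValueError("maxDescriptions must be 1-20")
    return {
        "cids": list(cids),
        "properties": CORE_PROPS,                   # subset of the default core set
        "includeDescription": include_description,  # slower: prefer small batches
        "maxDescriptions": max_descriptions,
    }

payload = build_details_args([2244, 3672])  # assumed example CIDs
assert payload["cids"] == [2244, 3672]
```

Keeping `includeDescription` and `includeClassification` off for large batches avoids the per-flag slowdown the schema warns about.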
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare read-only/idempotent safety; the description adds crucial cost/performance behavior missing from annotations: per-flag slowdown warnings ('Slower when enabled — prefer small CID batches') for description/classification flags, and 'Efficiently batches' for throughput optimization. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two dense sentences with zero waste. Front-loaded with core purpose ('Get detailed compound information by CID'), followed by parenthetical enumeration of return types, and closes with operational constraint ('Efficiently batches up to 100 CIDs'). Every clause earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists and annotations cover the safety profile, the description provides complete operational context: batch limits, default property sets, performance implications for expensive flags, and content scope. No gaps remain for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% so baseline is 3. Description adds semantic grouping (physicochemical vs pharmacological vs classification) and concrete examples (XLogP, TPSA, Lipinski rules) that help the agent map user requests to the correct boolean flags and property selections beyond raw schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action ('Get detailed compound information by CID') and distinguishes from siblings: contrasts with search_compounds (retrieval by ID vs search), get_compound_image (data vs media), and get_compound_safety (general properties vs safety). Specific resource and verb are clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear operational context through 'by CID' (implies prerequisite IDs) and 'Efficiently batches up to 100 CIDs' (usage constraint). Lists optional flags with their content domains, implicitly guiding when to enable each. Lacks explicit contrast with pubchem_get_summary but has strong implicit guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubchem_get_compound_image: Get Compound Image (A, Read-only, Idempotent)
Fetch a 2D structure diagram (PNG image) for a compound by CID.
| Name | Required | Description | Default |
|---|---|---|---|
| cid | Yes | PubChem Compound ID. | |
| size | No | Image size: "small" (100x100) or "large" (300x300). Default: "large". | large |
Output Schema
| Name | Required | Description |
|---|---|---|
| cid | Yes | PubChem Compound ID. |
| width | Yes | Image width in pixels. |
| height | Yes | Image height in pixels. |
| mimeType | Yes | Image MIME type. |
| imageBase64 | Yes | Base64-encoded PNG image data. |
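Since the output schema delivers the diagram as base64 text, a consumer must decode it before use. A minimal decoding sketch; the response dict below is a fabricated stand-in, not real PubChem data.

```python
import base64
import os
import tempfile

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # the 8-byte signature every PNG file starts with

def save_structure_image(response: dict, path: str) -> int:
    """Decode imageBase64, sanity-check the PNG signature, write to disk."""
    raw = base64.b64decode(response["imageBase64"])
    if not raw.startswith(PNG_MAGIC):
        raise ValueError("payload is not a PNG")
    with open(path, "wb") as fh:
        fh.write(raw)
    return len(raw)

# Fabricated stand-in; a real call returns an actual 100x100 or 300x300 diagram.
fake = {"cid": 2244, "mimeType": "image/png",
        "imageBase64": base64.b64encode(PNG_MAGIC + b"\x00" * 8).decode()}
out = os.path.join(tempfile.gettempdir(), "cid2244.png")
assert save_structure_image(fake, out) == 16
```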
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, idempotentHint=true, and openWorldHint=true. The description adds valuable context beyond these by specifying the output is a PNG format 2D diagram, which helps the agent understand the binary/image nature of the response.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It front-loads the action (Fetch) and precisely defines the deliverable (PNG image) and identifier (CID) without filler words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter fetch operation with an output schema present, the description is complete. It adequately explains what the tool returns (PNG image) without needing to detail return values, given the schema covers inputs completely.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema fully documents both parameters (CID and size with dimensions). The description mentions 'by CID', reinforcing the required parameter, but adds no additional semantic detail beyond the comprehensive schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb (Fetch), resource (2D structure diagram), format (PNG), and lookup method (by CID). It clearly distinguishes this image-fetching tool from siblings that retrieve 'details', 'safety', 'bioactivity', or perform searches.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While it doesn't explicitly name alternatives or 'when-not-to-use' clauses, the description provides clear context by specifying '2D structure diagram (PNG image)', which implicitly distinguishes it from data-retrieval siblings like get_compound_details or get_compound_safety.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubchem_get_compound_safety: Get Compound Safety (A, Read-only, Idempotent)
Get GHS (Globally Harmonized System) hazard classification and safety data for a compound. Returns signal word, pictograms, hazard statements (H-codes), and precautionary statements (P-codes). Data sourced from PubChem depositors — source attribution included.
| Name | Required | Description | Default |
|---|---|---|---|
| cid | Yes | PubChem Compound ID. |
Output Schema
| Name | Required | Description |
|---|---|---|
| cid | Yes | PubChem Compound ID. |
| ghs | No | GHS classification data. |
| source | No | Data source attribution. |
| hasData | Yes | Whether GHS safety data is available for this compound. |
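Because `ghs` and `source` are optional while `hasData` is required, a consumer should branch on `hasData` before reading hazard fields. A sketch under assumed response shapes; the nested field names (`signalWord`, `hazardStatements`) are assumptions, not documented in the output schema.

```python
def summarize_safety(response: dict) -> str:
    """One-line summary of a pubchem_get_compound_safety response."""
    if not response["hasData"]:
        return f"CID {response['cid']}: no GHS data deposited"
    ghs = response.get("ghs", {})
    h_codes = ghs.get("hazardStatements", [])  # H-codes (field name assumed)
    word = ghs.get("signalWord", "?")
    return f"CID {response['cid']}: {word}, {len(h_codes)} H-codes"

assert summarize_safety({"cid": 999, "hasData": False}).endswith("no GHS data deposited")
```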
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations establish read-only, idempotent, and open-world traits. The description adds valuable behavioral context not present in annotations: data provenance ('sourced from PubChem depositors') and the inclusion of source attribution. It also previews the return structure without contradicting the output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured in three clauses: purpose declaration, return value enumeration, and data source attribution. Every clause earns its place; there is no redundancy or unnecessary verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input schema (one well-documented parameter), presence of annotations, and existence of an output schema, the description is complete. It appropriately focuses on value-add elements (data source, specific safety data types) rather than repeating structured metadata.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage for the single 'cid' parameter, the schema carries the semantic burden. The description neither repeats nor extends the parameter documentation, which is acceptable given the complete schema coverage, meeting the baseline expectation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Get') and clearly identifies the resource (GHS hazard classification and safety data). It distinguishes itself from sibling tools like get_compound_details and get_bioactivity by explicitly targeting safety-specific data.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by enumerating specific return values (signal word, pictograms, H-codes, P-codes), suggesting when to use the tool. However, it lacks explicit guidance on when to prefer this over get_compound_details or other siblings, and states no prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubchem_get_compound_xrefs: Get Compound Cross-References (A, Read-only, Idempotent)
Get external database cross-references for a compound: PubMed citations, patent IDs, gene/protein associations, registry numbers, and taxonomy IDs. Results are capped per type with total counts reported.
| Name | Required | Description | Default |
|---|---|---|---|
| cid | Yes | PubChem Compound ID. | |
| xrefTypes | Yes | Cross-reference types to retrieve. Options: RegistryID, RN (CAS numbers), PubMedID, PatentID, GeneID, ProteinGI, TaxonomyID. | |
| maxPerType | No | Max IDs to return per xref type (1-500). A compound may have thousands of PubMed references — this cap prevents bloat. Total count always reported. Default: 50. | 50 |
Output Schema
| Name | Required | Description |
|---|---|---|
| cid | Yes | PubChem Compound ID. |
| xrefs | Yes | Cross-references grouped by type. |
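The xrefTypes enum and the maxPerType cap can be enforced client-side before the call. The enum members below are verbatim from the schema; the builder function and example CID are assumptions.

```python
XREF_TYPES = {"RegistryID", "RN", "PubMedID", "PatentID",
              "GeneID", "ProteinGI", "TaxonomyID"}

def build_xrefs_args(cid: int, xref_types, max_per_type: int = 50) -> dict:
    """Assemble pubchem_get_compound_xrefs arguments with documented limits."""
    unknown = set(xref_types) - XREF_TYPES
    if unknown:
        raise ValueError(f"unknown xref types: {sorted(unknown)}")
    if not 1 <= max_per_type <= 500:  # 1-500, default 50
        raise ValueError("maxPerType must be 1-500")
    return {"cid": cid, "xrefTypes": list(xref_types), "maxPerType": max_per_type}

args = build_xrefs_args(2244, ["RN", "PubMedID"], max_per_type=100)
assert args["maxPerType"] == 100
```

The cap matters: since the total count is always reported, a low `maxPerType` still tells you whether more references exist.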
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare read-only, idempotent, and open-world traits. The description adds valuable behavioral context not in annotations: the capping mechanism ('capped per type with total counts reported') and explicitly maps xref types to human-readable categories (e.g., 'RN' to 'registry numbers').
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste: first sentence defines purpose and scope with specific examples; second sentence discloses the critical capping limitation. Information is front-loaded and dense.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (which handles return value documentation) and 100% parameter coverage, the description adequately covers the tool's purpose, xref types, and result-limiting behavior without redundancy.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description reinforces the xrefTypes by listing them in prose, translating technical enum values (e.g., 'RN', 'ProteinGI') to domain concepts ('registry numbers', 'gene/protein associations'), but does not add significant semantic depth beyond the comprehensive schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool 'Get[s] external database cross-references for a compound' and enumerates specific xref types (PubMed citations, patent IDs, gene/protein associations, registry numbers, taxonomy IDs), distinguishing it from sibling tools that retrieve bioactivity, images, or general compound details.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While it lacks explicit 'use this instead of X' comparisons, the description provides clear contextual scope (external database cross-references only) and explains the capping behavior ('Results are capped per type'), which implicitly guides usage for large result sets.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubchem_get_summary: Get Entity Summary (A, Read-only, Idempotent)
Get descriptive summaries for PubChem entities by ID. Supports assays (AID), genes (Gene ID), proteins (UniProt accession), and taxonomy (Tax ID). Up to 10 per call.
| Name | Required | Description | Default |
|---|---|---|---|
| entityType | Yes | Entity type. Determines ID format and returned fields. | |
| identifiers | Yes | Entity identifiers (1-10). Type depends on entityType: assay takes an AID (number), e.g. [1000]; gene takes a Gene ID (number), e.g. [1956]; protein takes a UniProt accession (string), e.g. ["P00533"]; taxonomy takes a Tax ID (number), e.g. [9606]. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| summaries | Yes | Summary results. |
| entityType | Yes | Entity type queried. |
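The identifier type changes with entityType (numbers for assay, gene, and taxonomy; strings for protein), which is easy to encode as a small guard. The mapping follows the schema examples; the guard function itself is a hypothetical helper.

```python
ID_TYPE = {"assay": int, "gene": int, "taxonomy": int, "protein": str}

def build_summary_args(entity_type: str, identifiers: list) -> dict:
    """Assemble pubchem_get_summary arguments, checking the ID type per entity."""
    if not 1 <= len(identifiers) <= 10:  # up to 10 per call
        raise ValueError("identifiers must contain 1-10 entries")
    expected = ID_TYPE[entity_type]
    if not all(isinstance(i, expected) for i in identifiers):
        raise TypeError(f"{entity_type} identifiers must be {expected.__name__}")
    return {"entityType": entity_type, "identifiers": identifiers}

assert build_summary_args("protein", ["P00533"])["entityType"] == "protein"
assert build_summary_args("gene", [1956])["identifiers"] == [1956]
```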
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint, openWorldHint, and idempotentHint. The description adds valuable behavioral context beyond these: the batch limit constraint (10 per call) and clarifies that returned data consists of 'descriptive summaries.' No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences total with zero waste. First sentence covers purpose and entity scope; second covers batch limit. Information is front-loaded and dense.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter read operation with 100% schema coverage and existing output schema, the description is complete. It covers tool purpose, supported entity scope, and batch constraints without needing to detail return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with detailed type mapping for identifiers. Description reinforces this with examples in prose (AID, Gene ID, etc.) but adds no significant semantic information beyond what the schema already provides. Baseline 3 appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool 'Get[s] descriptive summaries for PubChem entities by ID' with specific verb and resource. It explicitly distinguishes from compound-focused siblings by listing supported entity types: assays, genes, proteins, and taxonomy.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context by enumerating supported entity types (AID, Gene ID, UniProt, Tax ID) and the 'Up to 10 per call' constraint. Lacks explicit 'when not to use' guidance regarding compounds, though the enum and sibling tool names make this implicitly clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubchem_search_assays: Search Assays (A, Read-only, Idempotent)
Find PubChem bioassays associated with a biological target. Search by gene symbol (e.g. "EGFR"), protein name, NCBI Gene ID, or UniProt accession. Returns assay IDs (AIDs) which can be explored further with pubchem_get_summary.
| Name | Required | Description | Default |
|---|---|---|---|
| maxResults | No | Max AIDs to return (1-200). Popular targets may have thousands of assays. Default: 50. | 50 |
| targetType | Yes | Target identifier type. "genesymbol" and "proteinname" accept text names. "geneid" accepts NCBI Gene IDs. "proteinaccession" accepts UniProt accessions. | |
| targetQuery | Yes | Target identifier. Examples: "EGFR" (genesymbol), "Epidermal growth factor receptor" (proteinname), "1956" (geneid), "P00533" (proteinaccession). |
Output Schema
| Name | Required | Description |
|---|---|---|
| aids | Yes | PubChem Assay IDs. |
| targetType | Yes | Target identifier type used. |
| totalFound | Yes | Total AIDs found. |
| targetQuery | Yes | Target identifier searched. |
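Pairing targetType with a well-formed targetQuery, using the examples from the schema, might look like this sketch. The numeric check for "geneid" is an assumption drawn from the "1956" example; the builder is a hypothetical helper.

```python
def build_assay_search(target_type: str, target_query: str, max_results: int = 50) -> dict:
    """Assemble pubchem_search_assays arguments with documented limits."""
    if not 1 <= max_results <= 200:  # 1-200, default 50
        raise ValueError("maxResults must be 1-200")
    if target_type == "geneid" and not target_query.isdigit():
        raise ValueError("geneid queries must be numeric strings, e.g. '1956'")
    return {"targetType": target_type, "targetQuery": target_query,
            "maxResults": max_results}

assert build_assay_search("genesymbol", "EGFR")["targetQuery"] == "EGFR"
# The returned AIDs would then feed pubchem_get_summary with entityType="assay".
```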
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnly/idempotent/openWorld traits, so the description appropriately focuses on adding return-value context (AIDs) and search scope constraints. It clarifies what the tool produces without repeating safety annotations, though it could mention handling of unmatched targets.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three well-structured sentences: purpose declaration, input specification with examples, and output/next-step guidance. No redundant information; every sentence earns its place. Front-loaded with the core action 'Find PubChem bioassays'.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 100% schema coverage and existence of output schema, the description provides sufficient context by explaining the biological target search paradigm and the AID return format. It appropriately avoids duplicating detailed parameter documentation already present in the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema fully documents all three parameters including examples and ranges. The description reinforces the target types (gene symbol, protein name, etc.) but primarily provides conceptual framing rather than new semantic details beyond the structured schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description explicitly states 'Find PubChem bioassays associated with a biological target' with specific verb and resource. It clearly distinguishes from compound-focused siblings (pubchem_search_compounds) by specifying 'biological target' and from getter tools by describing the search functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear workflow guidance by noting that returned AIDs 'can be explored further with pubchem_get_summary', establishing the tool's position in the chain. However, it lacks explicit 'when not to use' guidance regarding similar tools like pubchem_get_bioactivity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubchem_search_compounds: Search Compounds (A, Read-only, Idempotent)
Search PubChem for chemical compounds. Five search modes:
- identifier: Resolve compound names, SMILES, or InChIKeys to CIDs (batch up to 25)
- formula: Find compounds by molecular formula (Hill notation, e.g. "C6H12O6")
- substructure: Find compounds containing a substructure (SMILES or CID)
- superstructure: Find compounds that are substructures of the query
- similarity: Find structurally similar compounds by 2D Tanimoto similarity
Optionally hydrate results with properties to avoid a follow-up details call.
| Name | Required | Description | Default |
|---|---|---|---|
| query | No | Required for substructure/superstructure/similarity searches. A SMILES string or PubChem CID (as string) for the query structure. | |
| formula | No | Required for formula search. Molecular formula in Hill notation (e.g. "C6H12O6", "CaH2O2"). | |
| queryType | No | Required for structure/similarity searches. Format of the query: "smiles" or "cid". | |
| threshold | No | Similarity search only. Minimum Tanimoto similarity (70-100). 90+ for close analogs, 70-80 for scaffold hops. Default: 90. | 90 |
| maxResults | No | Maximum CIDs to return (1-200). Default: 20. | |
| properties | No | Optional: fetch these properties for each result, avoiding a follow-up details call. E.g. ["MolecularFormula", "MolecularWeight", "CanonicalSMILES"]. | |
| searchType | Yes | Search strategy: "identifier" (name/SMILES/InChIKey lookup), "formula", "substructure", "superstructure", or "similarity". | |
| identifiers | No | Required for identifier search. Array of identifiers to resolve (1-25). Examples: ["aspirin", "ibuprofen"] for name, ["CC(=O)OC1=CC=CC=C1C(=O)O"] for SMILES. | |
| identifierType | No | Required for identifier search. Type of chemical identifier: "name", "smiles", or "inchikey". | |
| allowOtherElements | No | Formula search only. When true, includes compounds with additional elements beyond the formula. | |
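Because the required parameters vary by searchType, the constraints in the table above can be checked client-side before issuing a call. A minimal sketch, assuming the table's field names and ranges (the helper and its error strings are illustrative, not part of the server):

```python
# Mode-specific required parameters, per the table above.
REQUIRED_BY_MODE = {
    "identifier": ["identifiers", "identifierType"],
    "formula": ["formula"],
    "substructure": ["query", "queryType"],
    "superstructure": ["query", "queryType"],
    "similarity": ["query", "queryType"],
}

def validate_search_args(args: dict) -> list[str]:
    """Return a list of problems with a pubchem_search_compounds payload."""
    problems = []
    mode = args.get("searchType")
    if mode not in REQUIRED_BY_MODE:
        return [f"unknown searchType: {mode!r}"]
    for field in REQUIRED_BY_MODE[mode]:
        if field not in args:
            problems.append(f"{mode} search requires {field!r}")
    # Range constraints from the schema descriptions.
    if mode == "similarity" and not 70 <= args.get("threshold", 90) <= 100:
        problems.append("threshold must be 70-100")
    if not 1 <= args.get("maxResults", 20) <= 200:
        problems.append("maxResults must be 1-200")
    if mode == "identifier" and not 1 <= len(args.get("identifiers", [])) <= 25:
        problems.append("identifiers accepts 1-25 items")
    return problems
```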
Output Schema
| Name | Required | Description |
|---|---|---|
| results | Yes | Matching compounds. |
| searchType | Yes | The search strategy used. |
| totalFound | Yes | Total CIDs found (before maxResults cap). |
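Since totalFound reports the count before the maxResults cap, a caller can detect truncated result sets and decide whether to raise maxResults or narrow the query. A minimal sketch assuming the output shape above:

```python
def is_truncated(response: dict) -> bool:
    """True when the server found more CIDs than it returned."""
    return response["totalFound"] > len(response["results"])
```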
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true and idempotentHint=true. The description adds valuable behavioral context by explaining batch constraints ('up to 25' for identifiers), the Hill notation requirement for formulas, and the hydration behavior. It does not contradict the annotations, though it would need to mention pagination or rate limiting to merit a 5.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient structure: one sentence framing, bulleted list of five modes with parenthetical examples, and a final sentence on hydration. No redundant words; every phrase conveys specific search behavior or constraints. Excellent use of formatting for readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (10 parameters, 5 distinct search modes) and presence of an output schema, the description adequately covers search strategy selection and result optimization. It appropriately omits return value details (covered by output schema) but could briefly mention pagination behavior for completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description elevates this by grouping the searchType modes conceptually (explaining each of the five search strategies) and noting the 'batch up to 25' constraint for identifiers, adding strategic context beyond the schema's individual field descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
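The 'batch up to 25' constraint means larger identifier lists must be split client-side into multiple calls. A hedged sketch, with the payload shape assumed from the parameter table (the helper name is hypothetical):

```python
def chunk_identifiers(names: list[str], batch_size: int = 25) -> list[dict]:
    """Split a long identifier list into valid pubchem_search_compounds payloads."""
    return [
        {
            "searchType": "identifier",
            "identifierType": "name",
            "identifiers": names[i:i + batch_size],
        }
        for i in range(0, len(names), batch_size)
    ]
```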
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Search PubChem for chemical compounds' and distinguishes five specific search modes (identifier, formula, substructure, superstructure, similarity). It clearly differentiates from sibling 'get' tools by focusing on search functionality and discovery rather than retrieval of specific records.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance on when to use each of the five search modes (e.g., 'Resolve compound names...', 'Find compounds by molecular formula'). Critically, it notes the hydration option to 'avoid a follow-up details call,' directly referencing the sibling pubchem_get_compound_details tool and guiding optimization decisions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.