
Chemspace MCP Server

by lukasmki

search_similarity

Find chemically similar compounds by SMILES string to identify building blocks and screening compounds for research and synthesis.

Instructions

Similarity search by SMILES

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| smiles | Yes | | |
| shipToCountry | No | The country you want your order to be shipped to, as a two-letter ISO country code, e.g. DE, US, FR | US |
| count | No | Maximum number of results on a page | |
| page | No | Number of the page | |
| categories | No | A list of product categories to search: CSSB (in-stock building blocks), CSSS (in-stock screening compounds), CSMB (make-on-demand building blocks), CSMS (make-on-demand screening compounds), CSCS (custom request) | |
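
The constraints in the input schema can be checked before the tool is called. The following is a minimal sketch using only the standard library. The helper name `build_search_args` is hypothetical and not part of the server; the lower bound of 1 for `count` and `page` and the default values for `count`, `page`, and `categories` are taken from the handler and schema definitions in the Implementation Reference.

```python
from typing import Any, Dict, List, Optional

# Category codes listed in the input schema.
VALID_CATEGORIES = {"CSSB", "CSSS", "CSMB", "CSMS", "CSCS"}


def build_search_args(
    smiles: str,
    ship_to_country: str = "US",
    count: int = 10,
    page: int = 1,
    categories: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Validate inputs and return an argument dict for search_similarity.

    Hypothetical helper for illustration; it is not part of the server.
    """
    if not smiles or not smiles.strip():
        raise ValueError("smiles is required and must not be empty")
    if (
        len(ship_to_country) != 2
        or not ship_to_country.isascii()
        or not ship_to_country.isalpha()
    ):
        raise ValueError("shipToCountry must be a two-letter ISO country code")
    if count < 1:
        raise ValueError("count must be at least 1")
    if page < 1:
        raise ValueError("page must be at least 1")

    # Default mirrors the handler signature: both building-block categories.
    selected = list(categories) if categories is not None else ["CSSB", "CSMB"]
    if not selected:
        raise ValueError("categories must contain at least one entry")
    unknown = [c for c in selected if c not in VALID_CATEGORIES]
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")

    return {
        "smiles": smiles.strip(),
        "shipToCountry": ship_to_country.upper(),
        "count": count,
        "page": page,
        "categories": selected,
    }
```

For example, `build_search_args("c1ccccc1", ship_to_country="de")` normalizes the country code to `DE` and fills in the remaining defaults, while `page=0` raises a `ValueError`.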

Implementation Reference

  • Handler function for the search_similarity tool. It obtains an access token from ChemspaceTokenManager and sends a POST request to the Chemspace API. Note that the request targets the substructure endpoint (/v4/search/sub), although the tool is labeled as a similarity search.
    @mcp.tool(enabled=True)
    async def search_similarity(
        smiles: str,
        shipToCountry: Country = "US",
        count: ResultCount = 10,
        page: ResultPage = 1,
        categories: ProductCategories = ["CSSB", "CSMB"],
    ):
        """Similarity search by SMILES"""
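        # Obtain a valid access token; the manager refreshes it when it has expired.
        access_token = await mgr.get_token()
    
        async with httpx.AsyncClient() as client:
            r = await client.post(
                url="https://api.chem-space.com/v4/search/sub",
                headers={
                    "Accept": "application/json; version=4.1",
                    "Authorization": f"Bearer {access_token}",
                },
                # Paging and filtering options are sent as query parameters.
                params={
                    "shipToCountry": shipToCountry,
                    "count": count,
                    "page": page,
                    "categories": ",".join(categories),
                },
                # The query structure is sent as a multipart form field;
                # (None, value) means the field has no filename.
                files={
                    "SMILES": (None, smiles),
                },
            )
        # Raise httpx.HTTPStatusError on 4xx/5xx responses.
        r.raise_for_status()
        data = r.json()
    
        return data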
  • Input schema definitions (Pydantic Annotated types) used for parameters in search_similarity and other search tools.
    from typing import Annotated, List, Literal
    
    from pydantic import Field
    
    Country = Annotated[
        str,
        Field(
            description="The country you want your order to be shipped to as two-letter country ISO code, e.g. DE, US, FR"
        ),
    ]
    
    ResultCount = Annotated[
        int, Field(description="Maximum number of results on a page", ge=1)
    ]
    
    ResultPage = Annotated[int, Field(description="Number of the page", ge=1)]
    
    ProductCategory = Literal["CSSB", "CSSS", "CSMB", "CSMS", "CSCS"]
    
    ProductCategories = Annotated[
        List[ProductCategory],
        Field(
            # Adjacent string literals are concatenated without separators,
            # so each line needs an explicit newline.
            description=(
                "A list of product categories to search:\n"
                "CSSB - In-stock building blocks\n"
                "CSSS - In-stock screening compounds\n"
                "CSMB - Make-on-demand building blocks\n"
                "CSMS - Make-on-demand screening compounds\n"
                "CSCS - Custom request"
            ),
            min_length=1,
        ),
    ]
  • Creates MCP instance and token manager, then calls register_tools to register the search_similarity tool (and others).
    mgr = ChemspaceTokenManager()
    mcp = FastMCP(
        "Chemspace MCP",
        instructions="Tools for retrieving synthesizable building blocks via the Chemspace API",
    )
    register_tools(mcp, mgr)
  • ChemspaceTokenManager class providing get_token() method used by search_similarity handler for authentication.
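    import json
    import os
    import pathlib
    import tempfile
    import time
    from typing import Optional
    
    import httpx
    
    
    class ChemspaceTokenManager:
        def __init__(
            self,
            api_key: Optional[str] = None,
            base_url: str = "https://api.chem-space.com/",
        ):
            # Read the key at construction time rather than at import time,
            # so a variable set after the module is imported is still picked up.
            self.api_key = api_key or os.environ.get("CHEMSPACE_API_KEY")
            self.base_url = base_url
            self.auth_url = base_url + "auth/token"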
    
            self.access_token = None
            self.expires_at = 0
    
            self.token_cache = (
                pathlib.Path(tempfile.gettempdir()) / ".token_cache"
            ).resolve()
            if self.token_cache.exists():
                # read from cache
                with open(self.token_cache, "r") as fp:
                    data = json.load(fp)
                self.access_token = data["access_token"]
                self.expires_at = float(data["expires_at"])
    
        async def refresh_token(self):
            """Exchange API key for short-lived access token."""
            async with httpx.AsyncClient() as client:
                r = await client.get(
                    self.auth_url,
                    headers={
                        "Authorization": f"Bearer {self.api_key}",
                        "Accept": "application/json",
                    },
                    timeout=10,
                )
    
            r.raise_for_status()
            data = r.json()
    
            self.access_token = data["access_token"]
            self.expires_at = time.time() + data["expires_in"] - 30  # refresh early
    
            # write to token cache
            with open(self.token_cache, "w") as fp:
                json.dump(
                    {"access_token": self.access_token, "expires_at": self.expires_at},
                    fp,
                )
    
        async def get_token(self):
            """Return a valid token, refreshing if needed."""
            if time.time() >= self.expires_at:
                await self.refresh_token()
            return self.access_token
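
The refresh logic in `get_token()` and `refresh_token()` can be exercised without network access. The sketch below is a simplified, synchronous stand-in and not the server's code: `ExpiringToken`, the injected `fetch` callable, and the injected `clock` are hypothetical names introduced so that the expiry behavior can be tested deterministically. It keeps the 30-second early-refresh margin from the class above and omits the file cache.

```python
import time
from typing import Callable, List, Optional, Tuple


class ExpiringToken:
    """Cache a token and refresh it shortly before it expires.

    Hypothetical, synchronous simplification of the token manager.
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.time,
        margin: float = 30.0,
    ) -> None:
        self._fetch = fetch    # returns (access_token, expires_in_seconds)
        self._clock = clock    # injectable so tests can control time
        self._margin = margin  # refresh this many seconds early
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one once it is stale."""
        if self._token is None or self._clock() >= self._expires_at:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = self._clock() + expires_in - self._margin
        return self._token


# Usage with a fake clock and a fake token endpoint.
now = [1000.0]
issued: List[str] = []


def fake_fetch() -> Tuple[str, float]:
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1], 60.0  # each token is valid for 60 seconds


cache = ExpiringToken(fake_fetch, clock=lambda: now[0])
first = cache.get()   # fetches token-1, considered stale from t = 1030
now[0] = 1029.0
second = cache.get()  # still fresh, no new fetch
now[0] = 1030.0
third = cache.get()   # stale, fetches token-2
```

Injecting the clock keeps the example deterministic; the original class calls `time.time()` directly, which is simpler but harder to test.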
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'search' but doesn't clarify if this is a read-only operation, what the output format might be, potential rate limits, authentication needs, or any side effects. For a search tool with 5 parameters and no annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
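
One way to close part of this gap is to attach MCP tool annotations. The sketch below writes the relevant hint fields from the MCP specification as a plain dictionary. The values are assumptions about this tool (a search that only queries a remote API; the local token cache written by the token manager is not counted as a modification), and the dictionary name is hypothetical. How the hints are passed to the server framework is not shown, because that depends on the FastMCP version in use.

```python
# Tool annotation hints as named in the MCP specification.
# The values are assumptions for a search-only tool, not taken from the server.
SEARCH_SIMILARITY_ANNOTATIONS = {
    "title": "Chemspace similarity search",
    # Assumed: the call only queries data and does not modify the remote service.
    "readOnlyHint": True,
    # The tool talks to an external service whose contents can change.
    "openWorldHint": True,
    # destructiveHint and idempotentHint are omitted: the specification treats
    # them as meaningful only when readOnlyHint is false.
}
```

Annotations are hints for clients, so the description should still state authentication requirements and the shape of the result in prose.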

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence with no wasted words, making it easy to parse and front-loaded with the core purpose. It's appropriately sized for the tool's complexity, though it could benefit from more detail without sacrificing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns, how results are structured, or provide context for parameters like 'shipToCountry' that seem unrelated to similarity search, leaving gaps in understanding the tool's full behavior and use case.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no parameter semantics beyond the input schema, which covers 80% of the parameters (4 of 5 have descriptions). At that coverage level the baseline score is 3: the schema documents most parameters adequately, but the description does not compensate by explaining the undocumented 'smiles' parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Similarity search by SMILES' names the action (search) and the query format (SMILES), but it is vague about what is being searched (e.g., chemical compounds, databases) and does not differentiate the tool from siblings such as 'search_exact' or 'search_substructure', which likely perform different types of chemical searches. It provides a basic purpose but lacks specificity and distinction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description offers no guidance on when to use this tool versus alternatives like 'search_exact' or 'search_substructure', nor does it mention any prerequisites or contextual cues for its application. It's a standalone statement with no usage instructions, leaving the agent to infer based on the tool name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
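
A description that addresses the gaps noted in this review could look like the following sketch. The wording is illustrative only. The statements about authentication, the token cache, and the return value follow the implementation shown earlier; the phrase "Chemspace catalog" and the guidance about the sibling tools search_exact and search_substructure are assumptions based on their names. The function body is a stub.

```python
from typing import List, Optional


async def search_similarity(
    smiles: str,
    shipToCountry: str = "US",
    count: int = 10,
    page: int = 1,
    categories: Optional[List[str]] = None,
):
    """Search the Chemspace catalog for compounds similar to a query structure.

    Use this tool when you have a molecule as a SMILES string and want
    purchasable analogs. For other query types, consider the sibling tools
    search_exact and search_substructure.

    Authentication: requires the CHEMSPACE_API_KEY environment variable.
    Side effects: caches the access token in a temporary file; no writes
    to the remote service are assumed.
    Returns: the JSON body of the API response, unchanged.

    Args:
        smiles: Query structure as a SMILES string, e.g. "c1ccccc1" for benzene.
        shipToCountry: Two-letter ISO country code for the shipping destination.
        count: Maximum number of results per page (at least 1).
        page: Page number, starting at 1.
        categories: Product category codes to search, e.g. ["CSSB", "CSMB"].
    """
    # Documentation sketch only; the real handler performs the API request.
    raise NotImplementedError("documentation sketch only")
```

The docstring front-loads the purpose, then states when to use the tool, so an agent scanning only the first lines still gets the selection criteria.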

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/lukasmki/chemspace-mcp'
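
The same request can be made from Python. This is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state explicitly.

```python
import json
import urllib.request

BASE_URL = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server entry."""
    return f"{BASE_URL}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server entry and decode the body as JSON (assumed format)."""
    request = urllib.request.Request(
        server_url(owner, name),
        headers={"Accept": "application/json"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("lukasmki", "chemspace-mcp")` requests the URL used in the curl command above.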

If you have feedback or need assistance with the MCP directory API, please join our Discord server.