Glama
biocontext-ai

BioContextAI Knowledgebase MCP

Official

bc_get_biorxiv_preprint_details

Retrieve detailed preprint metadata from bioRxiv or medRxiv using DOI to access title, authors, abstract, publication date, version, category, and license information.

Instructions

Get detailed preprint metadata by DOI. Retrieves title, authors, abstract, date, version, category, license, and publication status.

Returns: dict: Preprint metadata (doi, title, authors, abstract, date, version, category, license, publication status), or a dict containing an error message.
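Because the tool returns either a metadata dict or a dict with an "error" key, a caller has to branch on the result. The following is a minimal sketch of that branching; `describe_result` is a hypothetical helper, not part of the server, and the title used in the usage lines is a placeholder rather than real data.

```python
from typing import Any, Dict


def describe_result(result: Dict[str, Any]) -> str:
    """Return a one-line summary of a tool result, or the error text."""
    # The tool reports failure through an "error" key instead of raising.
    if "error" in result:
        return f"error: {result['error']}"
    # Fall back to placeholders when a field is missing or empty.
    title = result.get("title") or "(untitled)"
    version = result.get("version") or "?"
    return f"{result.get('doi', '')} v{version}: {title}"


# Placeholder values for illustration only.
print(describe_result({"doi": "10.1101/2020.09.09.20191205",
                       "title": "Placeholder title", "version": "1"}))
print(describe_result({"error": "Server must be 'biorxiv' or 'medrxiv'"}))
```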

Input Schema

Name | Required | Description | Default
doi | Yes | Preprint DOI (e.g., '10.1101/2020.09.09.20191205') | (none)
server | No | 'biorxiv' or 'medrxiv' | biorxiv
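The two parameters map directly onto the details URL that the handler in the Implementation Reference requests. The sketch below mirrors that normalization under one stated difference: `build_details_url` is a hypothetical helper that raises `ValueError` for an unknown server, whereas the actual handler returns an error dict.

```python
VALID_SERVERS = ("biorxiv", "medrxiv")


def build_details_url(doi: str, server: str = "biorxiv") -> str:
    """Normalize the inputs and build the details-endpoint URL."""
    server = server.lower()
    if server not in VALID_SERVERS:
        # Assumption of this sketch: raise instead of returning an error dict.
        raise ValueError("Server must be 'biorxiv' or 'medrxiv'")
    # Strip one leading URL or "doi:" prefix, as the handler does.
    for prefix in ("https://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return f"https://api.biorxiv.org/details/{server}/{doi}/na/json"


print(build_details_url("doi:10.1101/2020.09.09.20191205", "medRxiv"))
```

The server name is case-insensitive, and the DOI may be passed bare, with a `doi:` prefix, or as a `https://doi.org/` URL.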

Output Schema


No arguments

Implementation Reference

  • The handler function get_biorxiv_preprint_details, decorated with @core_mcp.tool(), implements the core logic for retrieving preprint details from the bioRxiv/medRxiv details endpoint. It covers input validation, DOI cleaning, the API request, and structured response parsing.
    @core_mcp.tool()
    def get_biorxiv_preprint_details(
        doi: Annotated[str, Field(description="Preprint DOI (e.g., '10.1101/2020.09.09.20191205')")],
        server: Annotated[str, Field(description="'biorxiv' or 'medrxiv'")] = "biorxiv",
    ) -> Dict[str, Any]:
        """Get detailed preprint metadata by DOI. Retrieves title, authors, abstract, date, version, category, license, and publication status.
    
        Returns:
            dict: Preprint metadata including doi, title, authors, abstract, date, version, category, license, publication status or error message.
        """
        # Validate server
        if server.lower() not in ["biorxiv", "medrxiv"]:
            return {"error": "Server must be 'biorxiv' or 'medrxiv'"}
    
        server = server.lower()
    
        # Clean DOI - strip only a leading URL or "doi:" prefix (Python 3.9+)
        if doi.startswith("https://doi.org/"):
            doi = doi.removeprefix("https://doi.org/")
        elif doi.startswith("doi:"):
            doi = doi.removeprefix("doi:")
    
        try:
            # Build URL for single DOI lookup
            url = f"https://api.biorxiv.org/details/{server}/{doi}/na/json"
    
            # Make request
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            data = response.json()
    
            # Check if paper was found
            collection = data.get("collection", [])
            if not collection:
                return {"error": f"No preprint found with DOI {doi} on {server}", "messages": data.get("messages", [])}
    
            # Get the paper details
            paper = collection[0]
    
            # Return structured paper information
            result = {
                "doi": paper.get("doi", ""),
                "title": paper.get("title", ""),
                "authors": paper.get("authors", ""),
                "corresponding_author": paper.get("author_corresponding", ""),
                "corresponding_institution": paper.get("author_corresponding_institution", ""),
                "date": paper.get("date", ""),
                "version": paper.get("version", ""),
                "type": paper.get("type", ""),
                "license": paper.get("license", ""),
                "category": paper.get("category", ""),
                "abstract": paper.get("abstract", ""),
                "published": paper.get("published", ""),
                "server": paper.get("server", server),
                "jats_xml_path": paper.get("jats", ""),
            }
    
            return result
    
        except requests.exceptions.RequestException as e:
            logger.error(f"Error retrieving preprint {doi} from {server}: {e}")
            return {"error": f"Failed to retrieve preprint from {server}: {e!s}"}
        except Exception as e:
            logger.error(f"Unexpected error retrieving preprint {doi}: {e}")
            return {"error": f"Unexpected error: {e!s}"}
  • Defines the core_mcp FastMCP server instance named 'BC', on which all core tools are registered via @core_mcp.tool() decorators. This server is later imported into the main app under the 'bc' prefix.
    from fastmcp import FastMCP
    
    core_mcp = FastMCP(  # type: ignore
        "BC",
        instructions="Provides access to biomedical knowledge bases.",
    )
  • Package init file that imports the get_biorxiv_preprint_details function, triggering its decorator-based registration in core_mcp when the package is imported.
    """bioRxiv and medRxiv preprint search tools.
    
    These tools provide access to bioRxiv and medRxiv preprint servers for searching
    and retrieving preprint metadata. bioRxiv focuses on biological sciences while
    medRxiv focuses on medical sciences.
    """
    
    from ._get_preprint_details import get_biorxiv_preprint_details
    from ._get_recent_biorxiv_preprints import get_recent_biorxiv_preprints
    
    __all__ = [
        "get_biorxiv_preprint_details",
        "get_recent_biorxiv_preprints",
    ]
  • The setup function imports the core_mcp server (including the bioRxiv tools) into the main 'BioContextAI' MCP app by calling import_server with the slugified server name 'bc' as the prefix, so every core tool is registered with a 'bc_' prefix (e.g., 'bc_get_biorxiv_preprint_details'). It then runs a check of the registered tools via get_mcp_tools.
    async def setup(mcp_app: FastMCP):
        """Setup function to initialize the MCP server."""
        logger.info("Environment: %s", os.environ.get("MCP_ENVIRONMENT"))
    
        logger.info("Setting up MCP server...")
        for mcp in [core_mcp, *(await get_openapi_mcps())]:
            await mcp_app.import_server(
                mcp,
                slugify(mcp.name),
            )
        logger.info("MCP server setup complete.")
    
        logger.info("Checking MCP server for valid tools...")
        await get_mcp_tools(mcp_app)
        logger.info("MCP server tools check complete.")
  • Imports core_mcp which loads/registers all core tools (including biorxiv ones) via their decorators when modules are imported.
    from biocontext_kb.core import core_mcp
    from biocontext_kb.openapi import get_openapi_mcps
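The snippets above combine two mechanisms: decorator-based registration at import time, and a prefixed import of one server into another. The sketch below illustrates both with a hypothetical `ToolRegistry` class. It is not FastMCP's implementation: it is synchronous, `simple_slugify` is a simplified stand-in for the slugify call used in setup, and the tool body is a placeholder.

```python
import re
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in for an MCP server; not FastMCP's implementation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable] = {}

    def tool(self) -> Callable[[Callable], Callable]:
        """Decorator factory that registers a function under its own name."""
        def register(func: Callable) -> Callable:
            self.tools[func.__name__] = func
            return func
        return register

    def import_server(self, other: "ToolRegistry", prefix: str) -> None:
        """Copy another registry's tools, prefixing each name."""
        for name, func in other.tools.items():
            self.tools[f"{prefix}_{name}"] = func


def simple_slugify(text: str) -> str:
    """Simplified slug: lowercase, runs of other characters become hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


core = ToolRegistry("BC")


@core.tool()  # Registration happens when this module is executed.
def get_biorxiv_preprint_details(doi: str, server: str = "biorxiv") -> dict:
    return {"doi": doi, "server": server}  # Placeholder body.


app = ToolRegistry("BioContextAI")
app.import_server(core, simple_slugify(core.name))
print(sorted(app.tools))  # ['bc_get_biorxiv_preprint_details']
```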
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It describes the return format ('dict: Preprint metadata... or error message'), which adds useful context beyond the input schema. However, it does not mention potential limitations like rate limits, authentication needs, or error conditions beyond the generic 'error message' reference.
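The handler shown in the Implementation Reference produces four distinct error messages, which the description leaves undocumented. As a rough sketch, a client could tell them apart by message prefix. `classify_result` and the kind labels are hypothetical, and matching on message text is an assumption that would break if the server's wording changes.

```python
from typing import Any, Dict

# Message prefixes taken from the handler source; the labels are invented here.
ERROR_KINDS = (
    ("Server must be", "invalid_server"),
    ("No preprint found", "not_found"),
    ("Failed to retrieve preprint", "request_failed"),
    ("Unexpected error", "unexpected"),
)


def classify_result(result: Dict[str, Any]) -> str:
    """Map a tool result to 'ok' or a coarse error kind."""
    message = result.get("error")
    if message is None:
        return "ok"
    for prefix, kind in ERROR_KINDS:
        if str(message).startswith(prefix):
            return kind
    return "unknown_error"
```

A "not_found" result also carries a "messages" list passed through from the API response, which may help diagnose a mistyped DOI or the wrong server.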

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose and lists retrieved fields, and the second specifies the return format. Every sentence adds value without redundancy, making it front-loaded and appropriately sized for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema (implied by 'Has output schema: true'), the description need not detail return values, and it adequately covers the purpose and usage. With no annotations, it could benefit from more behavioral context like error handling, but the presence of an output schema reduces the burden. The description is mostly complete for a read-only metadata retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with clear documentation for both parameters (doi and server). The description does not add any parameter-specific details beyond what the schema already provides, such as format examples or usage nuances. With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate with extra semantic value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get detailed preprint metadata by DOI') and resource ('preprint'), distinguishing it from sibling tools like 'bc_get_recent_biorxiv_preprints' which retrieves recent preprints rather than details by DOI. It explicitly lists the metadata fields retrieved, making the purpose highly specific and differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying it retrieves metadata 'by DOI', but does not explicitly state when to use this tool versus alternatives like 'bc_get_recent_biorxiv_preprints' or other preprint-related tools. It provides context for the DOI parameter but lacks explicit guidance on when-not-to-use or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/biocontext-ai/knowledgebase-mcp'
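For scripts, the same request can be sketched with Python's standard library. This assumes the endpoint returns JSON, which the page does not state; `build_request` and `fetch_server_info` are illustrative names, and the Accept header is an addition not present in the curl command.

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/biocontext-ai/knowledgebase-mcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = API_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be used to inspect the URL and headers first.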

If you have feedback or need assistance with the MCP directory API, please join our Discord server.