biocontext-ai

BioContextAI Knowledgebase MCP

Official

bc_get_recent_biorxiv_preprints

Retrieve recent bioRxiv/medRxiv preprints by date range, days, or count to access current biomedical research findings.

Instructions

Search bioRxiv/medRxiv preprints by date range or recent count. Specify one search method: date range, days, or recent_count.

Returns: dict: Search results with server, search_params, total_returned, papers list (each with title, authors, abstract, metadata), pagination info or error message.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| server | No | 'biorxiv' or 'medrxiv' | biorxiv |
| start_date | No | Start date (YYYY-MM-DD) | |
| end_date | No | End date (YYYY-MM-DD) | |
| days | No | Search last N days (1-365, alternative to date range) | |
| recent_count | No | Most recent N preprints (1-1000, alternative to date range) | |
| category | No | Filter by subject (e.g., 'cell biology', 'neuroscience') | |
| cursor | No | Pagination: starting position | 0 |
| max_results | No | Max results per page (1-500) | 100 |
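
The mutual exclusivity of the three search methods can be illustrated with hypothetical argument sets (the parameter names match the schema above; the specific values are made up for illustration):

```python
# Illustrative argument sets for bc_get_recent_biorxiv_preprints.
# Exactly one search method must be supplied: a full date range,
# a number of days, or a recent_count.
args_by_range = {
    "server": "biorxiv",
    "start_date": "2024-01-01",
    "end_date": "2024-01-31",
    "category": "neuroscience",
}
args_by_days = {"server": "medrxiv", "days": 7, "max_results": 50}
args_by_count = {"server": "biorxiv", "recent_count": 100}

def count_methods(args):
    """Count how many of the mutually exclusive search methods are set."""
    return sum(
        bool(m)
        for m in (
            args.get("start_date") and args.get("end_date"),
            args.get("days"),
            args.get("recent_count"),
        )
    )

# Each valid call sets exactly one method; setting zero or two is rejected.
for args in (args_by_range, args_by_days, args_by_count):
    assert count_methods(args) == 1
```

Note that supplying only `start_date` without `end_date` counts as zero methods, so the tool returns an error in that case.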

Output Schema

No fields are defined in the output schema; see the Returns description above for the shape of the returned dict.

Implementation Reference

  • The handler function `get_recent_biorxiv_preprints`, decorated with `@core_mcp.tool()`, implements the core logic for retrieving recent bioRxiv or medRxiv preprints by date range, days, or recent count. The input schema is defined via Pydantic `Annotated` fields.
    # Imports used by the excerpt (not shown in the original listing):
    from datetime import datetime, timedelta
    from typing import Annotated, Any, Dict, Optional

    import requests
    from pydantic import Field

    # `core_mcp` (the FastMCP instance) and `logger` are defined elsewhere in the package.
    @core_mcp.tool()
    def get_recent_biorxiv_preprints(
        server: Annotated[str, Field(description="'biorxiv' or 'medrxiv'")] = "biorxiv",
        start_date: Annotated[Optional[str], Field(description="Start date (YYYY-MM-DD)")] = None,
        end_date: Annotated[Optional[str], Field(description="End date (YYYY-MM-DD)")] = None,
        days: Annotated[
            Optional[int], Field(description="Search last N days (1-365, alternative to date range)", ge=1, le=365)
        ] = None,
        recent_count: Annotated[
            Optional[int], Field(description="Most recent N preprints (1-1000, alternative to date range)", ge=1, le=1000)
        ] = None,
        category: Annotated[
            Optional[str], Field(description="Filter by subject (e.g., 'cell biology', 'neuroscience')")
        ] = None,
        cursor: Annotated[int, Field(description="Pagination: starting position", ge=0)] = 0,
        max_results: Annotated[int, Field(description="Max results per page (1-500)", ge=1, le=500)] = 100,
    ) -> Dict[str, Any]:
        """Search bioRxiv/medRxiv preprints by date range or recent count. Specify one search method: date range, days, or recent_count.
    
        Returns:
            dict: Search results with server, search_params, total_returned, papers list (each with title, authors, abstract, metadata), pagination info or error message.
        """
        # Validate server
        if server.lower() not in ["biorxiv", "medrxiv"]:
            return {"error": "Server must be 'biorxiv' or 'medrxiv'"}
    
        server = server.lower()
    
        # Validate input parameters - only one search method should be specified
        search_methods = [start_date and end_date, days, recent_count]
        if sum(bool(method) for method in search_methods) != 1:
            return {"error": "Specify exactly one of: date range (start_date + end_date), days, or recent_count"}
    
        try:
            # Build the interval parameter
            interval = ""
            if start_date and end_date:
                # Validate date format
                try:
                    start_date_obj = datetime.strptime(start_date, "%Y-%m-%d")
                    end_date_obj = datetime.strptime(end_date, "%Y-%m-%d")
                except ValueError:
                    return {"error": "Dates must be in YYYY-MM-DD format"}
    
                # Validate date range (start should be before or equal to end)
                if start_date_obj > end_date_obj:
                    return {"error": "Start date must be before or equal to end date"}
    
                interval = f"{start_date}/{end_date}"
            elif days:
                # Convert days to actual date range
                end_date_obj = datetime.now()
                start_date_obj = end_date_obj - timedelta(days=days)
                start_date_str = start_date_obj.strftime("%Y-%m-%d")
                end_date_str = end_date_obj.strftime("%Y-%m-%d")
                interval = f"{start_date_str}/{end_date_str}"
            elif recent_count:
                interval = str(recent_count)
    
            # Build URL
            base_url = f"https://api.biorxiv.org/details/{server}/{interval}/{cursor}/json"
    
            # Add category filter if specified
            params = {}
            if category and ((start_date and end_date) or days):  # Category works with date ranges
                params["category"] = category.replace(" ", "_")
    
            # Make request
            response = requests.get(base_url, params=params, timeout=30)
            response.raise_for_status()
            data = response.json()
    
            # Extract and limit results
            collection = data.get("collection", [])
            limited_results = collection[:max_results]
    
            # Clean up the results for better LLM consumption
            processed_results = []
            for paper in limited_results:
                processed_paper = {
                    "doi": paper.get("doi", ""),
                    "title": paper.get("title", ""),
                    "authors": paper.get("authors", ""),
                    "corresponding_author": paper.get("author_corresponding", ""),
                    "corresponding_institution": paper.get("author_corresponding_institution", ""),
                    "date": paper.get("date", ""),
                    "version": paper.get("version", ""),
                    "type": paper.get("type", ""),
                    "license": paper.get("license", ""),
                    "category": paper.get("category", ""),
                    "abstract": paper.get("abstract", ""),
                    "published": paper.get("published", ""),
                    "server": paper.get("server", server),
                }
                processed_results.append(processed_paper)
    
            # Get pagination info from messages
            messages = data.get("messages", [])
            pagination_info = {}
            for message in messages:
                if "cursor" in message.get("text", "").lower():
                    pagination_info["cursor_info"] = message.get("text", "")
                if "count" in message.get("text", "").lower():
                    pagination_info["count_info"] = message.get("text", "")
    
            return {
                "server": server,
                "search_params": {
                    "interval": interval,
                    "category": category,
                    "cursor": cursor,
                    "original_days": days if days else None,
                },
                "total_returned": len(processed_results),
                "papers": processed_results,
                "pagination": pagination_info,
                "messages": messages,
            }
    
        except requests.exceptions.RequestException as e:
            logger.error(f"Error searching {server}: {e}")
            return {"error": f"Failed to search {server}: {e!s}"}
        except Exception as e:
            logger.error(f"Unexpected error searching {server}: {e}")
            return {"error": f"Unexpected error: {e!s}"}
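    The interval segment of the request URL built by the handler above can be sketched in isolation (a simplified extract for illustration, not the packaged module):

    ```python
    from datetime import datetime, timedelta

    def build_interval(start_date=None, end_date=None, days=None, recent_count=None):
        """Simplified extract of the handler's interval construction."""
        if start_date and end_date:
            # A date range maps to 'YYYY-MM-DD/YYYY-MM-DD'.
            return f"{start_date}/{end_date}"
        if days:
            # N days is converted to an explicit date range ending today.
            end = datetime.now()
            start = end - timedelta(days=days)
            return f"{start:%Y-%m-%d}/{end:%Y-%m-%d}"
        if recent_count:
            # A recent_count maps to a bare number.
            return str(recent_count)
        raise ValueError("Specify one of: date range, days, or recent_count")

    # The interval is then spliced into the details endpoint URL with the cursor.
    url = f"https://api.biorxiv.org/details/biorxiv/{build_interval(recent_count=30)}/0/json"
    ```

    For example, `build_interval(start_date="2024-01-01", end_date="2024-01-31")` yields `"2024-01-01/2024-01-31"`, while `build_interval(recent_count=30)` yields `"30"`.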
  • Imports the `get_recent_biorxiv_preprints` function in the package __init__.py, which triggers the tool registration via its decorator when the module is imported.
    from ._get_preprint_details import get_biorxiv_preprint_details
    from ._get_recent_biorxiv_preprints import get_recent_biorxiv_preprints
    
    __all__ = [
        "get_biorxiv_preprint_details",
        "get_recent_biorxiv_preprints",
    ]
  • Defines `core_mcp`, the FastMCP instance named 'BC'; its slugified name supplies the `bc_` prefix for registered tools such as `bc_get_recent_biorxiv_preprints`.
    from fastmcp import FastMCP
    
    core_mcp = FastMCP(  # type: ignore
        "BC",
        instructions="Provides access to biomedical knowledge bases.",
    )
  • Imports the `core_mcp` server (containing the tool) into the main `mcp_app` FastMCP instance, finalizing the tool registration for the MCP server.
    for mcp in [core_mcp, *(await get_openapi_mcps())]:
        await mcp_app.import_server(
            mcp,
            slugify(mcp.name),
        )
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that the tool returns search results with a papers list and pagination info, which adds behavioral context beyond the input schema. However, it does not mention rate limits, authentication requirements, error conditions beyond a generic 'error message', or whether the tool is read-only or destructive, leaving gaps in its behavioral coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with the first sentence stating the core functionality and the second detailing the return format. Every sentence earns its place by providing essential information without waste, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (8 parameters, no annotations, but with output schema), the description is mostly complete. It explains the search methods and return structure, and since an output schema exists, it needn't detail return values. However, it could improve by addressing behavioral aspects like rate limits or error handling to fully compensate for the lack of annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 8 parameters thoroughly. The description adds marginal value by emphasizing the exclusive choice among search methods (date range, days, or recent_count), but doesn't provide additional syntax, format, or meaning beyond what's in the schema. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches bioRxiv/medRxiv preprints by date range or recent count, which is a specific verb+resource combination. However, it doesn't explicitly differentiate from sibling tools like 'bc_get_biorxiv_preprint_details' or 'bc_search_google_scholar_publications', which might have overlapping search functionality. The purpose is clear but lacks sibling differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context by specifying 'Specify one search method: date range, days, or recent_count,' which helps guide parameter usage. However, it doesn't mention when to use this tool versus alternatives like 'bc_get_biorxiv_preprint_details' for specific preprint details or other search tools in the sibling list, so it lacks explicit exclusions or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
