bc_get_recent_biorxiv_preprints
Retrieve recent bioRxiv/medRxiv preprints by date range, days, or count to access current biomedical research findings.
Instructions
Search bioRxiv/medRxiv preprints by date range or recent count. Specify one search method: date range, days, or recent_count.
Returns: dict: Search results with server, search_params, total_returned, papers list (each with title, authors, abstract, metadata), pagination info or error message.
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| server | No | 'biorxiv' or 'medrxiv' | biorxiv |
| start_date | No | Start date (YYYY-MM-DD) | |
| end_date | No | End date (YYYY-MM-DD) | |
| days | No | Search last N days (1-365, alternative to date range) | |
| recent_count | No | Most recent N preprints (1-1000, alternative to date range) | |
| category | No | Filter by subject (e.g., 'cell biology', 'neuroscience') | |
| cursor | No | Pagination: starting position | |
| max_results | No | Max results per page (1-500) |
Implementation Reference
- The handler function `get_recent_biorxiv_preprints` decorated with `@core_mcp.tool()`, which implements the core logic for retrieving recent bioRxiv or medRxiv preprints based on date range, days, or recent count. Includes input schema via Pydantic Annotated Fields.@core_mcp.tool() def get_recent_biorxiv_preprints( server: Annotated[str, Field(description="'biorxiv' or 'medrxiv'")] = "biorxiv", start_date: Annotated[Optional[str], Field(description="Start date (YYYY-MM-DD)")] = None, end_date: Annotated[Optional[str], Field(description="End date (YYYY-MM-DD)")] = None, days: Annotated[ Optional[int], Field(description="Search last N days (1-365, alternative to date range)", ge=1, le=365) ] = None, recent_count: Annotated[ Optional[int], Field(description="Most recent N preprints (1-1000, alternative to date range)", ge=1, le=1000) ] = None, category: Annotated[ Optional[str], Field(description="Filter by subject (e.g., 'cell biology', 'neuroscience')") ] = None, cursor: Annotated[int, Field(description="Pagination: starting position", ge=0)] = 0, max_results: Annotated[int, Field(description="Max results per page (1-500)", ge=1, le=500)] = 100, ) -> Dict[str, Any]: """Search bioRxiv/medRxiv preprints by date range or recent count. Specify one search method: date range, days, or recent_count. Returns: dict: Search results with server, search_params, total_returned, papers list (each with title, authors, abstract, metadata), pagination info or error message. """ # Validate server if server.lower() not in ["biorxiv", "medrxiv"]: return {"error": "Server must be 'biorxiv' or 'medrxiv'"} server = server.lower() # Validate input parameters - only one search method should be specified search_methods = [start_date and end_date, days, recent_count] if sum(bool(method) for method in search_methods) != 1: return {"error": "Specify exactly one of: date range (start_date + end_date), days, or recent_count"} try: # Build the interval parameter interval = "" if start_date and end_date: # Validate date format try: start_date_obj = datetime.strptime(start_date, "%Y-%m-%d") end_date_obj = datetime.strptime(end_date, "%Y-%m-%d") except ValueError: return {"error": "Dates must be in YYYY-MM-DD format"} # Validate date range (start should be before or equal to end) if start_date_obj > end_date_obj: return {"error": "Start date must be before or equal to end date"} interval = f"{start_date}/{end_date}" elif days: # Convert days to actual date range end_date_obj = datetime.now() start_date_obj = end_date_obj - timedelta(days=days) start_date_str = start_date_obj.strftime("%Y-%m-%d") end_date_str = end_date_obj.strftime("%Y-%m-%d") interval = f"{start_date_str}/{end_date_str}" elif recent_count: interval = str(recent_count) # Build URL base_url = f"https://api.biorxiv.org/details/{server}/{interval}/{cursor}/json" # Add category filter if specified params = {} if category and ((start_date and end_date) or days): # Category works with date ranges params["category"] = category.replace(" ", "_") # Make request response = requests.get(base_url, params=params, timeout=30) response.raise_for_status() data = response.json() # Extract and limit results collection = data.get("collection", []) limited_results = collection[:max_results] # Clean up the results for better LLM consumption processed_results = [] for paper in limited_results: processed_paper = { "doi": paper.get("doi", ""), "title": paper.get("title", ""), "authors": paper.get("authors", ""), "corresponding_author": paper.get("author_corresponding", ""), "corresponding_institution": paper.get("author_corresponding_institution", ""), "date": paper.get("date", ""), "version": paper.get("version", ""), "type": paper.get("type", ""), "license": paper.get("license", ""), "category": paper.get("category", ""), "abstract": paper.get("abstract", ""), "published": paper.get("published", ""), "server": paper.get("server", server), } processed_results.append(processed_paper) # Get pagination info from messages messages = data.get("messages", []) pagination_info = {} for message in messages: if "cursor" in message.get("text", "").lower(): pagination_info["cursor_info"] = message.get("text", "") if "count" in message.get("text", "").lower(): pagination_info["count_info"] = message.get("text", "") return { "server": server, "search_params": { "interval": interval, "category": category, "cursor": cursor, "original_days": days if days else None, }, "total_returned": len(processed_results), "papers": processed_results, "pagination": pagination_info, "messages": messages, } except requests.exceptions.RequestException as e: logger.error(f"Error searching {server}: {e}") return {"error": f"Failed to search {server}: {e!s}"} except Exception as e: logger.error(f"Unexpected error searching {server}: {e}") return {"error": f"Unexpected error: {e!s}"}
- src/biocontext_kb/core/biorxiv/__init__.py:8-14 (registration)Imports the `get_recent_biorxiv_preprints` function in the package __init__.py, which triggers the tool registration via its decorator when the module is imported.from ._get_preprint_details import get_biorxiv_preprint_details from ._get_recent_biorxiv_preprints import get_recent_biorxiv_preprints __all__ = [ "get_biorxiv_preprint_details", "get_recent_biorxiv_preprints", ]
- src/biocontext_kb/core/_server.py:1-6 (registration)Defines `core_mcp`, the FastMCP instance prefixed with 'BC' used to register all core tools including `bc_get_recent_biorxiv_preprints` via decorators.from fastmcp import FastMCP core_mcp = FastMCP( # type: ignore "BC", instructions="Provides access to biomedical knowledge bases.", )
- src/biocontext_kb/app.py:35-39 (registration)Imports the `core_mcp` server (containing the tool) into the main `mcp_app` FastMCP instance, finalizing the tool registration for the MCP server.for mcp in [core_mcp, *(await get_openapi_mcps())]: await mcp_app.import_server( mcp, slugify(mcp.name), )