Paper Search MCP

by openags

download_with_fallback

Download academic papers by attempting source-native access first, then open-access repositories, with optional Sci-Hub fallback when other methods fail.

Instructions

Try source-native download, OA repositories, Unpaywall, then optional Sci-Hub.

Args:
  source: Source name (arxiv, biorxiv, medrxiv, iacr, semantic, crossref, pubmed, pmc, core, europepmc, citeseerx, doaj, base, zenodo, hal, ssrn).
  paper_id: Source-native paper identifier.
  doi: Optional DOI used for repository/Unpaywall/Sci-Hub fallback.
  title: Optional title used for repository/Sci-Hub fallback when DOI is unavailable.
  save_path: Directory to save downloaded files.
  use_scihub: Whether to fall back to Sci-Hub after OA attempts fail.
  scihub_base_url: Sci-Hub mirror URL for fallback.

Returns:
  Download path on success or explanatory error message.
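As a rough illustration of how these arguments fit together, the sketch below builds a JSON-RPC `tools/call` request for this tool. The parameter names come from the description above; the `build_tool_call` helper, the request `id`, and the sample `paper_id` are hypothetical placeholders, and how the request is actually sent depends on the MCP client in use.

```python
import json
from typing import Any, Dict


def build_tool_call(source: str, paper_id: str, **optional: Any) -> Dict[str, Any]:
    """Build a JSON-RPC 'tools/call' payload for download_with_fallback.

    Hypothetical helper: only the tool name and parameter names are taken
    from the tool description; everything else is illustrative.
    """
    arguments: Dict[str, Any] = {"source": source, "paper_id": paper_id}
    arguments.update(optional)  # e.g. doi, title, save_path, use_scihub
    return {
        "jsonrpc": "2.0",
        "id": 1,  # placeholder request id
        "method": "tools/call",
        "params": {"name": "download_with_fallback", "arguments": arguments},
    }


# Placeholder identifier; substitute a real source-native ID.
request = build_tool_call("arxiv", "0000.00000", use_scihub=False)
print(json.dumps(request, indent=2))
```

Passing `use_scihub=False` here keeps the attempt limited to the source-native, repository, and Unpaywall steps.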

Input Schema

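| Name | Required | Description | Default |
| --- | --- | --- | --- |
| source | Yes | | |
| paper_id | Yes | | |
| doi | No | | |
| title | No | | |
| save_path | No | | ./downloads |
| use_scihub | No | | |
| scihub_base_url | No | | https://sci-hub.se |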
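A client may want to check required fields and fill in defaults before calling the tool. The following is a minimal sketch under stated assumptions: `normalize_arguments` is a hypothetical helper, the defaults for `save_path` and `scihub_base_url` come from the schema, and the defaults for `doi`, `title`, and `use_scihub` are taken from the function signature in the implementation reference rather than from the schema table, which leaves them blank.

```python
from typing import Any, Dict

REQUIRED = ("source", "paper_id")

# save_path and scihub_base_url defaults appear in the input schema;
# the remaining defaults are assumed from the implementation's signature.
DEFAULTS: Dict[str, Any] = {
    "doi": "",
    "title": "",
    "save_path": "./downloads",
    "use_scihub": True,
    "scihub_base_url": "https://sci-hub.se",
}


def normalize_arguments(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Validate required fields, reject unknown keys, and apply defaults."""
    missing = [name for name in REQUIRED if not str(raw.get(name, "")).strip()]
    if missing:
        raise ValueError("missing required argument(s): " + ", ".join(missing))
    unknown = set(raw) - set(DEFAULTS) - set(REQUIRED)
    if unknown:
        raise ValueError("unexpected argument(s): " + ", ".join(sorted(unknown)))
    args = {**DEFAULTS, **raw}
    # The implementation also strips and lowercases the source name.
    args["source"] = str(args["source"]).strip().lower()
    return args
```

An unrecognized `source` is deliberately not rejected here, because the implementation records it as a primary-download error and still tries the later fallback steps.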

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| result | Yes | | |
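Because `result` is a single string that holds either a download path or an error message, a caller has to tell the two apart. The sketch below is one heuristic, not part of the tool: `interpret_result` is a hypothetical helper, and it assumes failures keep the "Download failed" prefix used by the messages in the implementation reference.

```python
from typing import Tuple

# Assumed prefix, based on the failure messages in the implementation reference.
FAILURE_PREFIX = "Download failed"


def interpret_result(result: str) -> Tuple[bool, str]:
    """Return (succeeded, detail), where detail is a path or an error message."""
    text = (result or "").strip()
    if not text:
        return False, "empty result"
    if text.startswith(FAILURE_PREFIX):
        return False, text
    return True, text
```

Checking the path with `os.path.exists` is a stricter alternative, but it only works when the client shares a filesystem with the server.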

Implementation Reference

  • The implementation of the 'download_with_fallback' tool, which attempts to download a paper from a primary source, then tries repositories, Unpaywall, and optionally Sci-Hub.
    async def download_with_fallback(
        source: str,
        paper_id: str,
        doi: str = "",
        title: str = "",
        save_path: str = "./downloads",
        use_scihub: bool = True,
        scihub_base_url: str = "https://sci-hub.se",
    ) -> str:
        """Try source-native download, OA repositories, Unpaywall, then optional Sci-Hub.
    
        Args:
            source: Source name (arxiv, biorxiv, medrxiv, iacr, semantic, crossref, pubmed, pmc, core, europepmc, citeseerx, doaj, base, zenodo, hal, ssrn).
            paper_id: Source-native paper identifier.
            doi: Optional DOI used for repository/unpaywall/Sci-Hub fallback.
            title: Optional title used for repository/Sci-Hub fallback when DOI is unavailable.
            save_path: Directory to save downloaded files.
            use_scihub: Whether to fall back to Sci-Hub after OA attempts fail.
            scihub_base_url: Sci-Hub mirror URL for fallback.
        Returns:
            Download path on success or explanatory error message.
        """
        source_name = source.strip().lower()
    
        primary_downloaders = {
            "arxiv": arxiv_searcher.download_pdf,
            "biorxiv": biorxiv_searcher.download_pdf,
            "medrxiv": medrxiv_searcher.download_pdf,
            "iacr": iacr_searcher.download_pdf,
            "semantic": semantic_searcher.download_pdf,
            "pubmed": pubmed_searcher.download_pdf,
            "crossref": crossref_searcher.download_pdf,
            "pmc": pmc_searcher.download_pdf,
            "core": core_searcher.download_pdf,
            "europepmc": europepmc_searcher.download_pdf,
            "citeseerx": citeseerx_searcher.download_pdf,
            "doaj": doaj_searcher.download_pdf,
            "base": base_searcher.download_pdf,
            "zenodo": zenodo_searcher.download_pdf,
            "hal": hal_searcher.download_pdf,
            "ssrn": ssrn_searcher.download_pdf,
        }
    
        attempt_errors: List[str] = []
        primary_error = ""
        if source_name in primary_downloaders:
            try:
                primary_result = await asyncio.to_thread(primary_downloaders[source_name], paper_id, save_path)
                if isinstance(primary_result, str) and os.path.exists(primary_result):
                    return primary_result
                if isinstance(primary_result, str) and primary_result:
                    primary_error = primary_result
            except Exception as exc:
                primary_error = str(exc)
                logger.warning("Primary download failed for %s/%s: %s", source_name, paper_id, exc)
        else:
            primary_error = f"Unsupported source '{source_name}' for primary download."
    
        if primary_error:
            attempt_errors.append(f"primary: {primary_error}")
    
        repository_result, repository_error = await _try_repository_fallback(doi, title, save_path)
        if repository_result:
            return repository_result
        if repository_error:
            attempt_errors.append(f"repositories: {repository_error}")
    
        normalized_doi = (doi or "").strip()
        if normalized_doi:
            unpaywall_url = await asyncio.to_thread(unpaywall_resolver.resolve_best_pdf_url, normalized_doi)
            if unpaywall_url:
                unpaywall_result = await _download_from_url(unpaywall_url, save_path, f"unpaywall_{normalized_doi}")
                if unpaywall_result:
                    return unpaywall_result
                attempt_errors.append("unpaywall: resolved OA URL but download failed")
            else:
                attempt_errors.append("unpaywall: no OA URL found (or PAPER_SEARCH_MCP_UNPAYWALL_EMAIL/UNPAYWALL_EMAIL missing)")
        else:
            attempt_errors.append("unpaywall: DOI not provided")
    
        if not use_scihub:
            return "Download failed after OA fallback chain. Details: " + " | ".join(attempt_errors)
    
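        # Sci-Hub step: prefer the DOI, then the title, then the source-native ID.
        fallback_identifier = (doi or "").strip() or (title or "").strip() or paper_id
        fetcher = SciHubFetcher(base_url=scihub_base_url, output_dir=save_path)
        fallback_result = await asyncio.to_thread(fetcher.download_pdf, fallback_identifier)
        if fallback_result:
            return fallback_result
        # Record the Sci-Hub failure so the final message lists every attempt.
        attempt_errors.append("scihub: no PDF retrieved")
    
        return "Download failed after OA fallback chain and Sci-Hub fallback. Details: " + " | ".join(attempt_errors)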
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure and does so effectively. It describes the multi-step fallback behavior, explains what happens on success (returns download path) versus failure (returns error message), and mentions the optional Sci-Hub fallback with configurable mirror URL. It doesn't cover rate limits or authentication needs, but provides substantial operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly structured and concise. The first sentence states the core functionality, followed by clear sections for Args and Returns. Every sentence earns its place by providing essential information without redundancy. The parameter explanations are terse but complete, and the overall length is appropriate for a 7-parameter tool with complex behavior.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, multi-step fallback behavior) and the presence of an output schema (which handles return value documentation), the description is complete. It explains the fallback sequence, parameter purposes, and success/failure outcomes. With no annotations, it provides all necessary operational context for the agent to understand when and how to use this tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description provides excellent parameter semantics. It explains the purpose of each parameter: 'source' specifies the source name with enumerated examples, 'paper_id' is the source-native identifier, 'doi' and 'title' are for fallback lookups, 'save_path' is the download directory, and the Sci-Hub parameters control optional fallback behavior. This fully compensates for the lack of schema descriptions.
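One parameter interaction worth spelling out is how `doi`, `title`, and `paper_id` combine in the Sci-Hub step. The sketch below mirrors the expression used in the implementation reference; the function name `pick_fallback_identifier` is a hypothetical wrapper added for illustration.

```python
def pick_fallback_identifier(doi: str, title: str, paper_id: str) -> str:
    """Prefer a non-blank DOI, then a non-blank title, then the paper ID."""
    return (doi or "").strip() or (title or "").strip() or paper_id


# A blank or whitespace-only DOI falls through to the title.
print(pick_fallback_identifier("  ", "Some Paper Title", "abc123"))
```

Supplying a DOI therefore matters twice: it is the preferred Sci-Hub identifier, and it is the only way to reach the Unpaywall step, which is skipped when no DOI is given.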

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Try source-native download, OA repositories, Unpaywall, then optional Sci-Hub') and distinguishes it from siblings by describing a multi-source fallback approach rather than single-source downloads like 'download_arxiv' or 'download_biorxiv'. It explicitly mentions the fallback hierarchy which differentiates it from simpler download tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool versus alternatives by stating it tries multiple sources in sequence (source-native → OA repositories → Unpaywall → Sci-Hub). This clearly indicates it should be used when you want comprehensive download attempts with fallbacks, rather than the single-source sibling tools. The Sci-Hub fallback is explicitly marked as optional with a parameter.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/openags/paper-search-mcp'
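The same request can be made from Python with the standard library. This is a minimal sketch: the URL is the one from the curl command, while `build_request`, `fetch_server_info`, the `Accept` header, and the assumption that the endpoint returns a JSON object are illustrative and not confirmed by this page.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/openags/paper-search-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build the GET request equivalent to the curl command."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be used to inspect the URL and headers without sending anything.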

If you have feedback or need assistance with the MCP directory API, please join our Discord server.