paper_download
Download research paper PDFs using DOI identifiers from multiple academic sources including Unpaywall, arXiv, and Sci-Hub.
Instructions
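Download a paper PDF by DOI (local multi-source download: Unpaywall → arXiv → Sci-Hub).
Args:
- doi: The paper's DOI, e.g. "10.1109/tim.2021.3106677"
- output_dir: Directory in which to save the PDF; defaults to the current directory

Returns: A JSON string with the download result, containing fields such as success, doi, path, size_mb, and source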
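Because the tool returns a JSON string rather than a dict, a caller has to decode it before reading any field. The following is a minimal sketch of that step. The sample payload is modeled on the fields listed above, but its values (in particular the path and size) are made up for illustration, and `summarize_result` is a hypothetical helper, not part of the project.

```python
import json

# Hypothetical payload modeled on the documented fields; values are illustrative.
sample = json.dumps(
    {
        "success": True,
        "doi": "10.1109/tim.2021.3106677",
        "path": "/tmp/papers/10.1109_tim.2021.3106677.pdf",
        "size_mb": 1.23,
        "source": "cached",
    },
    ensure_ascii=False,
    indent=2,
)


def summarize_result(raw: str) -> str:
    """Decode the tool's JSON string and return a one-line summary."""
    result = json.loads(raw)
    if not result.get("success"):
        # The shape of a failure payload is not shown in the source,
        # so only the `success` flag and an optional `doi` are used here.
        return f"download failed for {result.get('doi', '<unknown DOI>')}"
    return (
        f"{result['doi']} -> {result['path']} "
        f"({result['size_mb']} MB, via {result['source']})"
    )


print(summarize_result(sample))
```

Checking `success` first keeps the caller from assuming that `path` or `size_mb` exist when a download did not succeed.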
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| doi | Yes | | |
| output_dir | No | | `.` |
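As a rough illustration of how these two parameters travel to the server, the sketch below builds a JSON-RPC `tools/call` request of the kind MCP clients send. `build_call_request` and its `request_id` parameter are hypothetical names introduced here; a real client library would normally construct this message for you.

```python
import json
from typing import Optional


def build_call_request(
    doi: str, output_dir: Optional[str] = None, request_id: int = 1
) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the paper_download tool."""
    if not doi or not doi.strip():
        raise ValueError("doi is required")
    arguments = {"doi": doi.strip()}
    if output_dir is not None:
        # Only send output_dir when overriding the server-side default of ".".
        arguments["output_dir"] = output_dir
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "paper_download", "arguments": arguments},
    }


request = build_call_request("10.1109/tim.2021.3106677", output_dir="./papers")
print(json.dumps(request, indent=2))
```

Leaving `output_dir` out of the arguments lets the tool fall back to its default, the current directory of the server process.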
Implementation Reference
- downloader.py:244-270 (handler): The actual logic for paper_download is implemented here in download_paper, which coordinates multiple download sources.

      def download_paper(doi: str, output_dir: str = ".") -> dict:
          """Main entry point for downloading a paper; tries multiple sources in priority order.

          Priority: Unpaywall (legal OA) → arXiv → Sci-Hub (custom) → Sci-Hub (scidownl)
          """
          os.makedirs(output_dir, exist_ok=True)
          output = os.path.join(output_dir, _safe_name(doi))

          # Skip the download if the file already exists
          if os.path.exists(output):
              size_mb = os.path.getsize(output) / 1024 / 1024
              return {
                  "success": True,
                  "doi": doi,
                  "path": os.path.abspath(output),
                  "size_mb": round(size_mb, 2),
                  "source": "cached",
              }

          # 1. Unpaywall (legal OA)
          oa_url = _try_unpaywall(doi)
          if oa_url:
              try:
                  r = requests.get(oa_url, timeout=30, allow_redirects=True)
                  if r.content[:5] == b"%PDF-":
                      with open(output, "wb") as f:
                          f.write(r.content)
          # ... (excerpt truncated)

- scholar_mcp_server.py:37-49 (registration): The paper_download tool is registered as an MCP tool here.

      @mcp.tool()
      def paper_download(doi: str, output_dir: str = ".") -> str:
          """Download a paper PDF by DOI (local multi-source download: Unpaywall → arXiv → Sci-Hub).

          Args:
              doi: The paper's DOI, e.g. "10.1109/tim.2021.3106677"
              output_dir: Directory in which to save the PDF; defaults to the current directory

          Returns:
              A JSON string with the download result, containing fields such as success, doi, path, size_mb, and source
          """
          result = download_paper(doi, output_dir)
          return json.dumps(result, ensure_ascii=False, indent=2)
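The handler excerpt shows two ideas worth isolating: sources are tried in priority order, and a response only counts as a hit if it starts with the `%PDF-` signature. The sketch below illustrates that pattern without any network access. It is not the project's code: `looks_like_pdf`, `first_valid_pdf`, the stand-in fetchers, and the source labels are all hypothetical names chosen for this example.

```python
from typing import Callable, Iterable, Optional, Tuple

PDF_MAGIC = b"%PDF-"  # the same 5-byte signature the handler checks

Fetcher = Callable[[], Optional[bytes]]


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the payload starts with the PDF signature."""
    return data[:5] == PDF_MAGIC


def first_valid_pdf(
    sources: Iterable[Tuple[str, Fetcher]],
) -> Optional[Tuple[str, bytes]]:
    """Try each (name, fetch) pair in order and return the first real PDF."""
    for name, fetch in sources:
        try:
            data = fetch()
        except Exception:
            # Any failure in one source simply means "try the next one".
            continue
        if data and looks_like_pdf(data):
            return name, data
    return None


# Stand-in fetchers (hypothetical) that simulate typical outcomes.
def fake_unpaywall() -> bytes:
    return b"<html>landing page, not a PDF</html>"


def fake_arxiv() -> bytes:
    raise TimeoutError("simulated network timeout")


def fake_scihub() -> bytes:
    return b"%PDF-1.7 fake body"


hit = first_valid_pdf(
    [("unpaywall", fake_unpaywall), ("arxiv", fake_arxiv), ("scihub", fake_scihub)]
)
print(hit[0] if hit else "no source succeeded")
```

The signature check matters because a link that should point at a PDF can return an HTML landing page or error page with a 200 status; checking the first bytes rejects those before anything is written to disk.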