
Paper Search MCP Server

by h-lu

download_arxiv

Download PDFs from arXiv using paper IDs to access academic papers for research and study purposes.

Instructions

Download PDF from arXiv (always free and available).

Args:
    paper_id: arXiv ID (e.g., '2106.12345', '2312.00001v2').
    save_path: Directory to save PDF (default: ~/paper_downloads).

Returns:
    Path to downloaded PDF file.

Example:
    download_arxiv("2106.12345")

Input Schema

Name        Required  Description                                     Default
paper_id    Yes       arXiv ID (e.g., '2106.12345', '2312.00001v2')
save_path   No        Directory to save the PDF                       ~/paper_downloads
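As a sketch, the arguments object a client would send when invoking this tool, with field names taken from the schema above and illustrative values:

```python
# Illustrative arguments payload for the download_arxiv tool.
# Field names come from the input schema; values are examples only.
args = {
    "paper_id": "2106.12345",          # required: arXiv ID
    "save_path": "~/paper_downloads",  # optional: defaults to ~/paper_downloads
}
```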

Implementation Reference

  • MCP tool handler and registration for 'download_arxiv'. This is the entry point decorated with @mcp.tool(), defining input schema via args/docstring and delegating to the generic _download helper using the 'arxiv' searcher.
    @mcp.tool()
    async def download_arxiv(paper_id: str, save_path: Optional[str] = None) -> str:
        """Download PDF from arXiv (always free and available).
        
        Args:
            paper_id: arXiv ID (e.g., '2106.12345', '2312.00001v2').
            save_path: Directory to save PDF (default: ~/paper_downloads).
        
        Returns:
            Path to downloaded PDF file.
        
        Example:
            download_arxiv("2106.12345")
        """
        return await _download('arxiv', paper_id, save_path)
  • Generic download helper function used by all platform-specific download tools, including download_arxiv. Retrieves the searcher instance and calls its download_pdf method.
    async def _download(
        searcher_name: str, 
        paper_id: str, 
        save_path: Optional[str] = None
    ) -> str:
        """通用下载函数"""
        if save_path is None:
            save_path = get_download_path()
        
        searcher = SEARCHERS.get(searcher_name)
        if not searcher:
            return f"Error: Unknown searcher {searcher_name}"
        
        try:
            return searcher.download_pdf(paper_id, save_path)
        except NotImplementedError as e:
            return str(e)
        except Exception as e:
            logger.error(f"Download failed for {searcher_name}: {e}")
            return f"Error downloading: {str(e)}"
  • Core implementation of PDF download in ArxivSearcher.download_pdf method. Downloads from https://arxiv.org/pdf/{paper_id}.pdf, handles caching, sanitizes filename, and saves to specified directory.
    def download_pdf(self, paper_id: str, save_path: str) -> str:
        """下载 arXiv 论文 PDF
        
        Args:
            paper_id: arXiv 论文 ID (例如 '2106.12345')
            save_path: 保存目录
            
        Returns:
            str: PDF 文件路径
            
        Raises:
            RuntimeError: 下载失败时抛出
        """
        # 确保目录存在
        os.makedirs(save_path, exist_ok=True)
        
        # 构建文件路径
        # 处理带版本号的 ID (例如 2106.12345v2)
        safe_id = paper_id.replace('/', '_').replace(':', '_')
        output_file = os.path.join(save_path, f"{safe_id}.pdf")
        
        # 检查文件是否已存在
        if os.path.exists(output_file):
            logger.info(f"PDF already exists: {output_file}")
            return output_file
        
        # 下载 PDF
        pdf_url = f"https://arxiv.org/pdf/{paper_id}.pdf"
        
        try:
            response = requests.get(pdf_url, timeout=60)
            response.raise_for_status()
            
            with open(output_file, 'wb') as f:
                f.write(response.content)
            
            logger.info(f"PDF downloaded: {output_file}")
            return output_file
            
        except requests.RequestException as e:
            raise RuntimeError(f"Failed to download PDF: {e}")
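The ID-sanitization and cache-check steps above can be verified offline. This sketch extracts just that logic into a hypothetical pdf_path helper (not part of the server's API) and checks it against an old-style arXiv ID, which contains a '/':

```python
# Sketch of the ID-sanitization and cache-check logic from download_pdf,
# runnable without network access. pdf_path is a hypothetical helper
# mirroring the path construction above.
import os
import tempfile

def pdf_path(paper_id: str, save_path: str) -> str:
    # '/' and ':' are unsafe in filenames; old-style arXiv IDs such as
    # 'math/0211159' contain a slash.
    safe_id = paper_id.replace('/', '_').replace(':', '_')
    return os.path.join(save_path, f"{safe_id}.pdf")

with tempfile.TemporaryDirectory() as d:
    target = pdf_path('math/0211159', d)
    # The file does not exist yet, so a real download would proceed here.
    exists_before_download = os.path.exists(target)
```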
  • Registration of searcher instances, including 'arxiv': ArxivSearcher(), which provides the download_pdf implementation used by download_arxiv.
    SEARCHERS = {
        'arxiv': ArxivSearcher(),
        'pubmed': PubMedSearcher(),
        'biorxiv': BioRxivSearcher(),
        'medrxiv': MedRxivSearcher(),
        'google_scholar': GoogleScholarSearcher(),
        'iacr': IACRSearcher(),
        'semantic': SemanticSearcher(),
        'crossref': CrossRefSearcher(),
        'repec': RePECSearcher(),
    }
