download_medrxiv
Downloads the PDF of a medRxiv preprint, identified by its DOI, and saves it to a specified directory for local access.
Instructions
Download PDF of a medRxiv paper.
Args:
- paper_id: medRxiv DOI.
- save_path: Directory to save the PDF (default: './downloads').

Returns: Path to the downloaded PDF file.
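To illustrate how these arguments reach the tool, the sketch below builds the JSON-RPC `tools/call` request that an MCP client would send for it. `build_call_request` is a hypothetical helper, not part of paper_search_mcp, and the DOI is a made-up placeholder.

```python
import json


def build_call_request(paper_id: str, save_path: str = "./downloads", request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request for download_medrxiv (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "download_medrxiv",
            "arguments": {"paper_id": paper_id, "save_path": save_path},
        },
    }


# Placeholder DOI, for illustration only.
request = build_call_request("10.1101/2024.01.01.24300001")
print(json.dumps(request, indent=2))
```

Omitting `save_path` leaves the default of `./downloads` in place, matching the tool's signature.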
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| paper_id | Yes | medRxiv DOI. | |
| save_path | No | Directory to save the PDF. | ./downloads |
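Expressed as JSON Schema, the table corresponds roughly to the structure below. This is a sketch inferred from the handler's signature (`paper_id: str`, `save_path: str = "./downloads"`), not a copy of the schema the server actually emits. `apply_defaults` is a hypothetical helper that shows how a missing `save_path` resolves to its default.

```python
# Assumed schema, reconstructed from the table above.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "paper_id": {"type": "string"},
        "save_path": {"type": "string", "default": "./downloads"},
    },
    "required": ["paper_id"],
}


def apply_defaults(arguments: dict, schema: dict = INPUT_SCHEMA) -> dict:
    """Check required arguments and fill in schema defaults (hypothetical helper)."""
    missing = [name for name in schema["required"] if name not in arguments]
    if missing:
        raise ValueError(f"Missing required argument(s): {', '.join(missing)}")
    # Start from the declared defaults, then let the caller's values override them.
    resolved = {
        name: spec["default"]
        for name, spec in schema["properties"].items()
        if "default" in spec
    }
    resolved.update(arguments)
    return resolved


# Placeholder DOI, for illustration only.
print(apply_defaults({"paper_id": "10.1101/2024.01.01.24300001"}))
```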
Implementation Reference
- paper_search_mcp/server.py:494-504 (handler): MCP tool definition and handler for download_medrxiv in the server.

      @mcp.tool()
      async def download_medrxiv(paper_id: str, save_path: str = "./downloads") -> str:
          """Download PDF of a medRxiv paper.

          Args:
              paper_id: medRxiv DOI.
              save_path: Directory to save the PDF (default: './downloads').

          Returns:
              Path to the downloaded PDF file.
          """
          return medrxiv_searcher.download_pdf(paper_id, save_path)

- The actual implementation of the medRxiv PDF download logic within MedRxivSearcher.

      def download_pdf(self, paper_id: str, save_path: str) -> str:
          """
          Download a PDF for a given paper ID from medRxiv.

          Args:
              paper_id: The DOI of the paper.
              save_path: Directory to save the PDF.

          Returns:
              Path to the downloaded PDF file.
          """
          if not paper_id:
              raise ValueError("Invalid paper_id: paper_id is empty")
          pdf_url = f"https://www.medrxiv.org/content/{paper_id}v1.full.pdf"
          tries = 0
          while tries < self.max_retries:
              try:
                  # Add User-Agent to avoid potential 403 errors
                  headers = {
                      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
                  }
                  response = self.session.get(pdf_url, timeout=self.timeout, headers=headers)
                  response.raise_for_status()
                  os.makedirs(save_path, exist_ok=True)
                  output_file = f"{save_path}/{paper_id.replace('/', '_')}.pdf"
                  with open(output_file, 'wb') as f:
                      f.write(response.content)
                  return output_file
              # (the excerpt ends here; the except clause and retry handling are not shown)
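The two values that `download_pdf` derives, the PDF URL and the output filename, can be checked without touching the network. The sketch below restates them as standalone functions and wraps the download in a retry loop. The helper names, the injected `fetch` callable, and the error raised after the last attempt are assumptions for illustration, since the excerpt does not show how MedRxivSearcher handles failures. The URL pattern is taken from the excerpt and always requests version 1 (`v1`) of the preprint.

```python
import os
import tempfile
from typing import Callable


def build_pdf_url(paper_id: str) -> str:
    """Return the medRxiv PDF URL for a DOI, using the pattern from the excerpt."""
    if not paper_id:
        raise ValueError("Invalid paper_id: paper_id is empty")
    return f"https://www.medrxiv.org/content/{paper_id}v1.full.pdf"


def build_output_path(paper_id: str, save_path: str) -> str:
    """Return the target path; '/' in the DOI becomes '_' to give a flat filename."""
    return f"{save_path}/{paper_id.replace('/', '_')}.pdf"


def download_with_retries(
    paper_id: str,
    save_path: str,
    fetch: Callable[[str], bytes],
    max_retries: int = 3,
) -> str:
    """Fetch the PDF bytes via `fetch`, retrying on I/O errors, and save them."""
    url = build_pdf_url(paper_id)
    last_error = None
    for _ in range(max_retries):
        try:
            content = fetch(url)
        except OSError as exc:  # requests' RequestException derives from OSError
            last_error = exc
            continue
        os.makedirs(save_path, exist_ok=True)
        output_file = build_output_path(paper_id, save_path)
        with open(output_file, "wb") as f:
            f.write(content)
        return output_file
    # Assumed behaviour once every attempt has failed.
    raise RuntimeError(f"Failed to download {url} after {max_retries} attempts") from last_error


# Demo with a stand-in fetcher so nothing is downloaded; the DOI is a placeholder.
with tempfile.TemporaryDirectory() as tmp_dir:
    saved = download_with_retries(
        "10.1101/2024.01.01.24300001", tmp_dir, fetch=lambda url: b"%PDF-1.4"
    )
    print(saved)
```

In real use, `fetch` would wrap an HTTP client such as `requests`, calling `raise_for_status()` before returning `response.content`, as the excerpt does. Passing the fetcher in as a parameter keeps the URL and filename logic testable offline.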