read_medrxiv_paper

Extract full text from medRxiv papers by providing a DOI, converting PDFs to Markdown format for analysis.

Instructions

Download and extract the full text of a medRxiv paper.

Args:
    paper_id: medRxiv DOI.
    save_path: Directory to save PDF.

Returns:
    Full paper text in Markdown format.

Input Schema


Name	Required	Description	Default
`paper_id`	Yes	medRxiv DOI (string).	
`save_path`	No	Directory to save the PDF (string).	None (falls back to get_download_path())
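As a rough illustration of the schema above, the two arguments could be checked on the client side before the tool is called. This is only a sketch: the `INPUT_SCHEMA` dictionary and the `validate_args` helper are hypothetical and not part of the server, and the DOI is a made-up placeholder. Only the parameter names, their string types, and the optional `save_path` come from the handler signature.

```python
from typing import Any, Dict, List

# Assumed shape of the tool's input schema, inferred from the handler signature
# (paper_id: str, save_path: Optional[str] = None). Not copied from the server.
INPUT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "paper_id": {"type": "string"},
        "save_path": {"type": ["string", "null"], "default": None},
    },
    "required": ["paper_id"],
}


def validate_args(args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems: List[str] = []
    # Every required argument must be present.
    for name in INPUT_SCHEMA["required"]:
        if name not in args:
            problems.append(f"missing required argument: {name}")
    # Arguments outside the schema are reported rather than silently dropped.
    for name in args:
        if name not in INPUT_SCHEMA["properties"]:
            problems.append(f"unknown argument: {name}")
    if "paper_id" in args and not isinstance(args["paper_id"], str):
        problems.append("paper_id must be a string")
    if args.get("save_path") is not None and not isinstance(args["save_path"], str):
        problems.append("save_path must be a string or None")
    return problems


# Placeholder DOI, for illustration only.
print(validate_args({"paper_id": "10.1101/2024.01.01.24300001"}))
print(validate_args({"save_path": "downloads"}))
```

The first call prints an empty list; the second reports the missing `paper_id`.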

Implementation Reference

paper_find_mcp/server.py:415-426 (handler)
MCP tool handler for 'read_medrxiv_paper'. Decorated with @mcp.tool() for registration and execution. Delegates to the generic _read function using the 'medrxiv' searcher instance.
@mcp.tool()
async def read_medrxiv_paper(paper_id: str, save_path: Optional[str] = None) -> str:
    """Download and extract full text from medRxiv paper.

    Args:
        paper_id: medRxiv DOI.
        save_path: Directory to save PDF.
    Returns:
        Full paper text in Markdown format.
    """
    return await _read('medrxiv', paper_id, save_path)
paper_find_mcp/academic_platforms/medrxiv.py:213-238 (helper)
Core implementation of the paper reading logic in MedRxivSearcher.read_paper(). Downloads the PDF from medRxiv.org if it is not already saved locally, then extracts the full text as Markdown using pymupdf4llm.to_markdown().
def read_paper(self, paper_id: str, save_path: str) -> str:
    """Download and extract the paper text

    Args:
        paper_id: medRxiv DOI
        save_path: Save directory

    Returns:
        Extracted Markdown text
    """
    pdf_path = os.path.join(save_path, f"{paper_id.replace('/', '_')}.pdf")

    if not os.path.exists(pdf_path):
        result = self.download_pdf(paper_id, save_path)
        if result.startswith("Error"):
            return result
        pdf_path = result

    try:
        text = pymupdf4llm.to_markdown(pdf_path, show_progress=False)
        logger.info(f"Extracted {len(text)} characters from {pdf_path}")
        return text
    except Exception as e:
        logger.error(f"Failed to extract text: {e}")
        return f"Error extracting text: {e}"
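The file-name rule that read_paper() uses to decide whether a PDF is already cached can be tried in isolation. The sketch below only mirrors the `os.path.join` expression from the excerpt; the `local_pdf_path` helper name, the DOI, and the `downloads` directory are hypothetical placeholders.

```python
import os


def local_pdf_path(paper_id: str, save_path: str) -> str:
    """Mirror the file-name rule from read_paper: '/' in the DOI becomes '_'."""
    return os.path.join(save_path, f"{paper_id.replace('/', '_')}.pdf")


# Hypothetical DOI and directory, for illustration only.
path = local_pdf_path("10.1101/2024.01.01.24300001", "downloads")
print(os.path.basename(path))  # 10.1101_2024.01.01.24300001.pdf
print(os.path.exists(path))    # True would mean the download step is skipped
```

Because the slash in the DOI is replaced, the PDF lands directly in `save_path` rather than in a nested `10.1101` subdirectory.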
paper_find_mcp/server.py:137-157 (helper)
Generic _read helper function called by all read_*_paper tools. Retrieves the platform-specific searcher from SEARCHERS dict and invokes its read_paper method.
async def _read(
    searcher_name: str,
    paper_id: str,
    save_path: Optional[str] = None
) -> str:
    """Generic read function"""
    if save_path is None:
        save_path = get_download_path()

    searcher = SEARCHERS.get(searcher_name)
    if not searcher:
        return f"Error: Unknown searcher {searcher_name}"

    try:
        return searcher.read_paper(paper_id, save_path)
    except NotImplementedError as e:
        return str(e)
    except Exception as e:
        logger.error(f"Read failed for {searcher_name}: {e}")
        return f"Error reading paper: {str(e)}"
paper_find_mcp/server.py:75-85 (registration)
Global SEARCHERS dictionary where MedRxivSearcher instance is registered under 'medrxiv' key, enabling the generic _read function to dispatch to the correct implementation.
SEARCHERS = {
    'arxiv': ArxivSearcher(),
    'pubmed': PubMedSearcher(),
    'biorxiv': BioRxivSearcher(),
    'medrxiv': MedRxivSearcher(),
    'google_scholar': GoogleScholarSearcher(),
    'iacr': IACRSearcher(),
    'semantic': SemanticSearcher(),
    'crossref': CrossRefSearcher(),
    'repec': RePECSearcher(),
}

Paper Search MCP Server
