by h-lu

read_medrxiv_paper

Extract full text from medRxiv papers by providing a DOI, converting PDFs to Markdown format for analysis.

Instructions

Download and extract full text from medRxiv paper.

Args:
    paper_id: medRxiv DOI.
    save_path: Directory to save PDF.

Returns:
    Full paper text in Markdown format.

Input Schema

Name        Required    Description    Default
paper_id    Yes
save_path   No
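The page does not show the JSON Schema itself, so the following is only a sketch of what the input contract implies: paper_id is a required string and save_path is optional. The INPUT_SCHEMA dict and the check_arguments helper are hypothetical names used for illustration; the schema shape is an assumption derived from the handler signature, not copied from the server.

```python
from typing import Any, Dict, List

# Assumed schema shape, inferred from the signature
# read_medrxiv_paper(paper_id: str, save_path: Optional[str] = None).
INPUT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "paper_id": {"type": "string"},
        "save_path": {"type": ["string", "null"], "default": None},
    },
    "required": ["paper_id"],
}


def check_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    problems: List[str] = []
    required = INPUT_SCHEMA["required"]
    for name in required:
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        if name not in INPUT_SCHEMA["properties"]:
            problems.append(f"unexpected argument: {name}")
        elif value is None:
            # Only optional arguments may be null.
            if name in required:
                problems.append(f"{name} must not be null")
        elif not isinstance(value, str):
            problems.append(f"{name} must be a string")
    return problems


if __name__ == "__main__":
    # The DOI below is a placeholder, not a real paper.
    print(check_arguments({"paper_id": "10.1101/2024.01.01.24300001"}))
    print(check_arguments({"save_path": "./papers"}))
```

A client could run a check like this before invoking the tool, so that a missing paper_id is caught locally rather than surfacing as a server-side error.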

Implementation Reference

  • MCP tool handler for 'read_medrxiv_paper'. Decorated with @mcp.tool() for registration and execution. Delegates to the generic _read function using the 'medrxiv' searcher instance.
    @mcp.tool()
    async def read_medrxiv_paper(paper_id: str, save_path: Optional[str] = None) -> str:
        """Download and extract full text from medRxiv paper.

        Args:
            paper_id: medRxiv DOI.
            save_path: Directory to save PDF.

        Returns:
            Full paper text in Markdown format.
        """
        return await _read('medrxiv', paper_id, save_path)
  • Core implementation of the paper-reading logic in MedRxivSearcher.read_paper(). Downloads the PDF from medRxiv.org if it is not already present in save_path, then extracts the full text as Markdown using pymupdf4llm.to_markdown().
    def read_paper(self, paper_id: str, save_path: str) -> str:
        """Download the paper and extract its text.

        Args:
            paper_id: medRxiv DOI
            save_path: Directory to save to

        Returns:
            Extracted Markdown text
        """
        pdf_path = os.path.join(save_path, f"{paper_id.replace('/', '_')}.pdf")

        # Download only if the PDF is not already cached on disk.
        if not os.path.exists(pdf_path):
            result = self.download_pdf(paper_id, save_path)
            if result.startswith("Error"):
                return result
            pdf_path = result

        try:
            text = pymupdf4llm.to_markdown(pdf_path, show_progress=False)
            logger.info(f"Extracted {len(text)} characters from {pdf_path}")
            return text
        except Exception as e:
            logger.error(f"Failed to extract text: {e}")
            return f"Error extracting text: {e}"
  • Generic _read helper function called by all read_*_paper tools. Retrieves the platform-specific searcher from SEARCHERS dict and invokes its read_paper method.
    async def _read(
        searcher_name: str,
        paper_id: str,
        save_path: Optional[str] = None
    ) -> str:
        """Generic read function."""
        if save_path is None:
            save_path = get_download_path()

        searcher = SEARCHERS.get(searcher_name)
        if not searcher:
            return f"Error: Unknown searcher {searcher_name}"

        try:
            return searcher.read_paper(paper_id, save_path)
        except NotImplementedError as e:
            return str(e)
        except Exception as e:
            logger.error(f"Read failed for {searcher_name}: {e}")
            return f"Error reading paper: {str(e)}"
  • Global SEARCHERS dictionary where MedRxivSearcher instance is registered under 'medrxiv' key, enabling the generic _read function to dispatch to the correct implementation.
    SEARCHERS = {
        'arxiv': ArxivSearcher(),
        'pubmed': PubMedSearcher(),
        'biorxiv': BioRxivSearcher(),
        'medrxiv': MedRxivSearcher(),
        'google_scholar': GoogleScholarSearcher(),
        'iacr': IACRSearcher(),
        'semantic': SemanticSearcher(),
        'crossref': CrossRefSearcher(),
        'repec': RePECSearcher(),
    }
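The dispatch pattern described in the references above can be tried in isolation with stub searchers. This is a minimal sketch, not the project's code: StubSearcher, SearchOnlySearcher, the "search_only" key, the "downloads" default directory (standing in for get_download_path()), and the placeholder DOI are all assumptions made for illustration. Only the control flow mirrors the excerpts: look up the searcher by name, call read_paper, and turn failures into returned strings rather than raised exceptions.

```python
import asyncio
import os
from typing import Optional


class StubSearcher:
    """Hypothetical stand-in for a searcher that can read full text."""

    def read_paper(self, paper_id: str, save_path: str) -> str:
        # Same naming rule as the read_paper excerpt: '/' in the DOI becomes '_'.
        pdf_path = os.path.join(save_path, f"{paper_id.replace('/', '_')}.pdf")
        return f"# Markdown extracted from {pdf_path}"


class SearchOnlySearcher:
    """Hypothetical searcher that does not support full-text reading."""

    def read_paper(self, paper_id: str, save_path: str) -> str:
        raise NotImplementedError("Full-text reading is not supported here")


# Registry keyed by platform name, in the style of the SEARCHERS dict.
SEARCHERS = {"medrxiv": StubSearcher(), "search_only": SearchOnlySearcher()}


async def read(searcher_name: str, paper_id: str,
               save_path: Optional[str] = None) -> str:
    if save_path is None:
        save_path = "downloads"  # assumption: placeholder default directory
    searcher = SEARCHERS.get(searcher_name)
    if not searcher:
        return f"Error: Unknown searcher {searcher_name}"
    try:
        return searcher.read_paper(paper_id, save_path)
    except NotImplementedError as e:
        # The message is returned as-is, so it need not start with "Error".
        return str(e)
    except Exception as e:
        return f"Error reading paper: {e}"


if __name__ == "__main__":
    doi = "10.1101/2024.01.01.24300001"  # placeholder DOI
    print(asyncio.run(read("medrxiv", doi, "papers")))
    print(asyncio.run(read("search_only", doi)))
    print(asyncio.run(read("nope", doi)))
```

One consequence of this design is that every outcome is a string, so a caller has to inspect the text to tell success from failure. Most failures begin with "Error", but the NotImplementedError branch returns the raw exception message.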

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/h-lu/paper-search-mcp'
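The same request can be issued from Python with the standard library. This is a sketch only: build_server_request and fetch_server_info are hypothetical helper names, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(owner: str, name: str) -> urllib.request.Request:
    """Build the GET request for one server entry without sending it."""
    return urllib.request.Request(f"{API_BASE}/{owner}/{name}", method="GET")


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> Any:
    """Send the request and decode the body.

    Assumption: the response body is JSON; adjust if the API differs.
    """
    request = build_server_request(owner, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_server_request("h-lu", "paper-search-mcp")
    print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the URL logic checkable without network access; fetch_server_info is the only part that touches the network.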

If you have feedback or need assistance with the MCP directory API, please join our Discord server.