read_citeseerx_paper
Extract text content from CiteSeerX academic papers by providing a paper identifier, enabling access to research material for analysis and reference.
Instructions
Read and extract text content from a CiteSeerX paper.
Args:
- paper_id: CiteSeerX paper identifier.
- save_path: Directory where the PDF is/will be saved (default: './downloads').

Returns:
- str: Extracted text or fallback abstract/error message.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| paper_id | Yes | CiteSeerX paper identifier. | |
| save_path | No | Directory where the PDF is/will be saved. | ./downloads |
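
To make the schema concrete, here is a minimal sketch of how a client might assemble a call to this tool. It assumes the standard MCP JSON-RPC `tools/call` request shape; the helper name `build_tool_call`, the non-empty check on `paper_id`, and the identifier `10.1.1.1.1234` are illustrative placeholders, not part of the server.

```python
import json

TOOL_NAME = "read_citeseerx_paper"
DEFAULT_SAVE_PATH = "./downloads"  # default taken from the schema table


def build_tool_call(paper_id: str,
                    save_path: str = DEFAULT_SAVE_PATH,
                    request_id: int = 1) -> dict:
    """Build a JSON-RPC request dict for the tool (hypothetical helper)."""
    # paper_id is the only required argument; rejecting empty strings
    # is an assumption of this sketch, not documented server behavior.
    if not isinstance(paper_id, str) or not paper_id.strip():
        raise ValueError("paper_id is required and must be a non-empty string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": TOOL_NAME,
            "arguments": {"paper_id": paper_id, "save_path": save_path},
        },
    }


if __name__ == "__main__":
    # "10.1.1.1.1234" is a made-up identifier used only for illustration.
    print(json.dumps(build_tool_call("10.1.1.1.1234"), indent=2))
```

Omitting `save_path` sends the documented default, so the request carries the same arguments the server would otherwise fill in itself.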
Implementation Reference
- The actual implementation of the read_paper logic for CiteSeerX.

      def read_paper(self, paper_id: str, save_path: str = "./downloads") -> str:
          """
          Download and extract text from a CiteSeerX paper.

          Note: CiteSeerX provides abstracts but not always full text.
          This method tries to download PDF and extract text if available.

          Args:
              paper_id: CiteSeerX paper identifier
              save_path: Directory where PDF is/will be saved

          Returns:
              Extracted text content of the paper (abstract if PDF not available)

          Raises:
              Exception: If paper reading fails
          """
          try:
              # First get paper details
              paper = self.get_paper_details(paper_id)
              if not paper:
                  raise Exception(f"Paper {paper_id} not found")

- paper_search_mcp/server.py:1107-1117 (registration): Registration and tool definition for read_citeseerx_paper in the MCP server.

      @mcp.tool()
      async def read_citeseerx_paper(paper_id: str, save_path: str = "./downloads") -> str:
          """Read and extract text content from a CiteSeerX paper.

          Args:
              paper_id: CiteSeerX paper identifier.
              save_path: Directory where the PDF is/will be saved (default: './downloads').

          Returns:
              str: Extracted text or fallback abstract/error message.
          """
          return citeseerx_searcher.read_paper(paper_id, save_path)
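
The docstrings describe a fallback: return the extracted PDF text when it is available, otherwise the abstract, otherwise an error message. The sketch below illustrates that pattern only. The `Paper` dataclass, the `read_with_fallback` function, the injected `extract_pdf_text` callable, and the error strings are all hypothetical stand-ins, since the excerpt above does not show how the real `read_paper` handles these steps.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Paper:
    # Hypothetical record; the real paper object may have other fields.
    paper_id: str
    abstract: str = ""
    pdf_url: Optional[str] = None


def read_with_fallback(paper: Optional[Paper],
                       extract_pdf_text: Callable[[str], str]) -> str:
    """Return PDF text if possible, else the abstract, else an error string."""
    if paper is None:
        return "Error: paper not found"

    if paper.pdf_url:
        try:
            text = extract_pdf_text(paper.pdf_url)
            if text.strip():
                return text
        except Exception:
            # Download or extraction failed; fall through to the abstract.
            pass

    if paper.abstract:
        return paper.abstract
    return f"Error: no text available for paper {paper.paper_id}"
```

Passing the extractor in as a callable keeps the sketch free of network and PDF dependencies, so the fallback order can be exercised with simple stubs.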