Paper Search MCP

by openags

download_citeseerx

Download academic paper PDFs from CiteSeerX by paper identifier and save them to a directory you specify.

Instructions

Download PDF for a paper from CiteSeerX.

Args:
    paper_id: CiteSeerX paper identifier.
    save_path: Directory to save the PDF (default: './downloads').

Returns:
    str: Path to downloaded PDF or error message.

Input Schema

Name        Required   Description   Default
paper_id    Yes        -             -
save_path   No         -             ./downloads
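
Read together with the handler's type hints, the table above corresponds roughly to the JSON Schema below, written here as a Python dictionary. This is a sketch reconstructed from the table and the `str` annotations, not the schema the server actually emits; `INPUT_SCHEMA` and `check_arguments` are hypothetical names used only for illustration.

```python
# Hypothetical reconstruction of the input schema from the table and the
# handler's type hints (both parameters are annotated as str).
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "paper_id": {"type": "string"},
        "save_path": {"type": "string", "default": "./downloads"},
    },
    "required": ["paper_id"],
}


def check_arguments(arguments: dict) -> dict:
    """Apply defaults and run a minimal check against INPUT_SCHEMA.

    Illustrative helper only: it covers required keys, unknown keys and
    string types, not full JSON Schema validation.
    """
    properties = INPUT_SCHEMA["properties"]
    for name in INPUT_SCHEMA["required"]:
        if name not in arguments:
            raise ValueError(f"missing required argument: {name}")
    unknown = set(arguments) - set(properties)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    resolved = {}
    for name, spec in properties.items():
        if name in arguments:
            value = arguments[name]
        elif "default" in spec:
            value = spec["default"]
        else:
            continue
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string")
        resolved[name] = value
    return resolved
```

Calling `check_arguments({"paper_id": "some-id"})` fills in the default `save_path`, which mirrors what the handler's default argument does.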

Output Schema

Name     Required   Description   Default
result   Yes        -             -

Implementation Reference

  • The 'download_citeseerx' tool handler in the MCP server, which wraps the platform-specific implementation.
    @mcp.tool()
    async def download_citeseerx(paper_id: str, save_path: str = "./downloads") -> str:
        """Download PDF for a paper from CiteSeerX.
    
        Args:
            paper_id: CiteSeerX paper identifier.
            save_path: Directory to save the PDF (default: './downloads').
        Returns:
            str: Path to downloaded PDF or error message.
        """
        return citeseerx_searcher.download_pdf(paper_id, save_path)
  • The actual implementation of the PDF download logic for CiteSeerX papers.
    def download_pdf(self, paper_id: str, save_path: str) -> str:
        """
        Download PDF for a CiteSeerX paper.
    
        Args:
            paper_id: CiteSeerX paper identifier
            save_path: Directory to save the PDF
    
        Returns:
            Path to the saved PDF file
    
        Raises:
            Exception: If download fails or no PDF available
        """
        # First get paper details to find PDF URL
        paper = self.get_paper_details(paper_id)
        if not paper or not paper.pdf_url:
            raise Exception(f"No PDF available for paper {paper_id}")
    
        try:
            # Download PDF
            response = self._get(paper.pdf_url, stream=True)
            response.raise_for_status()
    
            # Create save directory if it doesn't exist
            os.makedirs(save_path, exist_ok=True)
    
            # Generate filename
            filename = f"{paper_id.replace('/', '_')}.pdf"
            if paper.doi:
                filename = f"{paper.doi.replace('/', '_')}.pdf"
            filepath = os.path.join(save_path, filename)
    
            # Save PDF
            with open(filepath, 'wb') as f:
                for chunk in response.iter_content(chunk_size=8192):
                    f.write(chunk)
    
            logger.info(f"Downloaded PDF to {filepath}")
            return filepath
    
        except requests.RequestException as e:
            logger.error(f"Error downloading PDF: {e}")
            raise Exception(f"Failed to download PDF: {e}")
        except Exception as e:
            logger.error(f"Unexpected error downloading PDF: {e}")
            raise
    
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions that the tool returns a path or error message, which is helpful, but doesn't cover critical aspects like whether it requires authentication, rate limits, network dependencies, file overwriting behavior, or error conditions. For a download operation, this leaves significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
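
One behavior that can be read off the implementation reference is how the output file is named and what happens when it already exists: the DOI, when present, takes precedence over `paper_id`, every `/` becomes `_`, and the file is opened in `'wb'` mode, which replaces an existing file. The sketch below mirrors that logic; `target_path` and `would_overwrite` are hypothetical helpers, not part of the server.

```python
import os
from typing import Optional


def target_path(paper_id: str, doi: Optional[str],
                save_path: str = "./downloads") -> str:
    """Mirror the filename logic of download_pdf: a non-empty DOI wins over
    paper_id, and every '/' is replaced with '_'."""
    stem = doi if doi else paper_id
    return os.path.join(save_path, f"{stem.replace('/', '_')}.pdf")


def would_overwrite(paper_id: str, doi: Optional[str],
                    save_path: str = "./downloads") -> bool:
    """Hypothetical pre-flight check: download_pdf opens the target with
    'wb', so a file already at that path would be replaced."""
    return os.path.exists(target_path(paper_id, doi, save_path))
```

A description that stated these two facts (DOI-based naming and silent overwrite) would address part of the gap noted in this dimension.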

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear purpose statement followed by Args and Returns sections. It's efficient with minimal waste, though the 'Args' and 'Returns' labels could be integrated more smoothly. Every sentence adds value, making it appropriately concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (download operation with 2 parameters), no annotations, and an output schema (implied by 'Returns: str'), the description is partially complete. It covers the basic purpose and parameters but lacks behavioral details and usage context. The output schema helps by specifying the return type, but more guidance is needed for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the schema provides no parameter descriptions. The description adds some value by explaining 'paper_id' as a 'CiteSeerX paper identifier' and 'save_path' with its default, but it doesn't specify format requirements (e.g., paper_id structure, path validity) or constraints. This partially compensates but falls short of fully documenting the parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Download PDF for a paper from CiteSeerX.' It specifies the verb ('Download'), resource ('PDF for a paper'), and source ('from CiteSeerX'), making it easy to understand what the tool does. However, it does not explicitly differentiate itself from sibling tools like 'download_arxiv' or 'read_citeseerx_paper', which a score of 5 would require.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. With many sibling tools like 'download_arxiv', 'read_citeseerx_paper', and 'search_citeseerx', there's no indication of when this specific download tool is appropriate, what prerequisites might exist, or when other tools might be better suited.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/openags/paper-search-mcp'
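
The same request can be made from Python with the standard library. This is a minimal sketch that assumes the endpoint returns a JSON object, which the page does not state; `server_url` and `fetch_server_info` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server record; assumes the response body is a JSON object."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as resp:
        return json.load(resp)
```

For the server on this page the call would be `fetch_server_info("openags", "paper-search-mcp")`.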
