mcp-simple-pubmed

by andybrandt

Get a paper's full text

get_paper_fulltext
Read-only

Retrieve complete research paper text from PubMed Central using article IDs. Provides full content when available or explains access requirements if not.

Instructions

Get full text of a PubMed article using its ID.

This tool attempts to retrieve the complete text of the paper if available through PubMed Central. If the paper is not available in PMC, it will return a message explaining why and provide information about where the text might be available (e.g., through DOI).

Example usage: get_paper_fulltext(pmid="39661433")

Returns:

  • If successful: The complete text of the paper

  • If not available: A clear message explaining why (e.g., "not in PMC", "requires journal access")
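
Because both outcomes come back as a plain string, a caller has to tell them apart by content. The following is a minimal sketch, assuming the "not available" notice always begins with the sentence used in the handler listed under Implementation Reference. The helper name `is_fallback_message` is hypothetical and not part of the server.

```python
# Hypothetical caller-side helper; not part of mcp-simple-pubmed.
# Assumes the fallback notice starts with the sentence the handler emits.
FALLBACK_PREFIX = "Full text is not available in PubMed Central."


def is_fallback_message(result: str) -> bool:
    """Return True when the tool result is the 'not available' notice."""
    return result.startswith(FALLBACK_PREFIX)
```

Matching on a prefix is brittle if the server's wording changes, so treat this as a convenience check rather than a contract.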

Input Schema

Name | Required | Description | Default
pmid | Yes | (none) | (none)

Output Schema

Name | Required | Description | Default
result | Yes | (none) | (none)

Implementation Reference

  • The main async handler function implementing the get_paper_fulltext tool logic. It checks PMC availability, fetches full text if possible, or returns helpful alternatives using DOI/PubMed links.
    async def get_paper_fulltext(pmid: str) -> str:
        """Get full text of a PubMed article using its ID.
    
        This tool attempts to retrieve the complete text of the paper if available through PubMed Central.
        If the paper is not available in PMC, it will return a message explaining why and provide information
        about where the text might be available (e.g., through DOI).
    
        Example usage:
        get_paper_fulltext(pmid="39661433")
    
        Returns:
        - If successful: The complete text of the paper
        - If not available: A clear message explaining why (e.g., "not in PMC", "requires journal access")
        """
        try:
            logger.info(f"Attempting to get full text for PMID: {pmid}")
    
            # First check PMC availability
            available, pmc_id = await fulltext_client.check_full_text_availability(pmid)
            
            if available:
                full_text = await fulltext_client.get_full_text(pmid)
                if full_text:
                    logger.info(f"Successfully retrieved full text from PMC for PMID {pmid}")
                    return full_text
    
            # Get article details to provide alternative locations
            article = await pubmed_client.get_article_details(pmid)
            
            message = "Full text is not available in PubMed Central.\n\n"
            message += "The article may be available at these locations:\n"
            message += f"- PubMed page: https://pubmed.ncbi.nlm.nih.gov/{pmid}/\n"
            
            if article and "doi" in article:
                message += f"- Publisher's site (via DOI): https://doi.org/{article['doi']}\n"
                
            logger.info(f"Full text not available in PMC for PMID {pmid}, provided alternative locations")
            return message
            
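        except Exception as e:
            # logger.exception already records the traceback; no f-string needed
            logger.exception("Error in get_paper_fulltext")
            # Chain the original exception so the root cause is preserved
            raise ValueError(f"Error retrieving full text: {e}") from e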
  • MCP tool registration decorator that registers the get_paper_fulltext handler with metadata annotations including title and API hints.
    @app.tool(
        annotations={
            "title": "Get a paper's full text",
            "readOnlyHint": True,
            "openWorldHint": True  # Calls external PubMed API
        }
    )
  • Function signature (pmid: str -> str) and docstring defining input/output schema, description, and usage examples.
    async def get_paper_fulltext(pmid: str) -> str:
        """Get full text of a PubMed article using its ID.
    
        This tool attempts to retrieve the complete text of the paper if available through PubMed Central.
        If the paper is not available in PMC, it will return a message explaining why and provide information
        about where the text might be available (e.g., through DOI).
    
        Example usage:
        get_paper_fulltext(pmid="39661433")
    
        Returns:
        - If successful: The complete text of the paper
        - If not available: A clear message explaining why (e.g., "not in PMC", "requires journal access")
        """
  • Key helper method in FullTextClient that performs the actual full-text retrieval from PubMed Central (PMC) using the Entrez EFetch API, with a loop intended to handle large or truncated responses.
    async def get_full_text(self, pmid: str) -> Optional[str]:
        """Get full text of the article if available through PMC.
        
        Handles truncated responses by making additional requests.
        
        Args:
            pmid: PubMed ID of the article
            
        Returns:
            Full text content if available, None otherwise
        """
        try:
            # First check availability and get PMC ID
            available, pmc_id = await self.check_full_text_availability(pmid)
            if not available or pmc_id is None:
                logger.info(f"Full text not available in PMC for PMID {pmid}")
                return None
    
            logger.info(f"Fetching full text for PMC ID {pmc_id}")
            content = ""
            retstart = 0
            
            while True:
                full_text_handle = Entrez.efetch(
                    db="pmc", 
                    id=pmc_id, 
                    rettype="xml",
                    retstart=retstart
                )
                
                if not full_text_handle:
                    break
                    
                chunk = full_text_handle.read()
                full_text_handle.close()
                
                if isinstance(chunk, bytes):
                    chunk = chunk.decode('utf-8')
                
                content += chunk
                
                # Check if there might be more content
                if "[truncated]" not in chunk and "Result too long" not in chunk:
                    break
                    
                # Increment retstart for next chunk
                retstart += len(chunk)
                
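                # Add a small delay to respect API rate limits. asyncio.sleep
                # (requires `import asyncio`) avoids blocking the event loop,
                # which time.sleep would do inside an async method.
                await asyncio.sleep(0.5)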
                
            return content
            
        except Exception as e:
            logger.exception(f"Error getting full text for PMID {pmid}: {str(e)}")
            return None
  • Helper method that checks whether full text is available in PMC and retrieves the PMC ID using the Entrez ELink API.
    async def check_full_text_availability(self, pmid: str) -> Tuple[bool, Optional[str]]:
        """Check if full text is available in PMC and get PMC ID if it exists.
        
        Args:
            pmid: PubMed ID of the article
            
        Returns:
            Tuple of (availability boolean, PMC ID if available)
        """
        try:
            logger.info(f"Checking PMC availability for PMID {pmid}")
            handle = Entrez.elink(dbfrom="pubmed", db="pmc", id=pmid)
            
            if not handle:
                logger.info(f"No PMC link found for PMID {pmid}")
                return False, None
                
            xml_content = handle.read()
            handle.close()
            
            # Parse XML to get PMC ID
            root = ET.fromstring(xml_content)
            linksetdb = root.find(".//LinkSetDb")
            if linksetdb is None:
                logger.info(f"No PMC ID found for PMID {pmid}")
                return False, None
                
            id_elem = linksetdb.find(".//Id")
            if id_elem is None:
                logger.info(f"No PMC ID element found for PMID {pmid}")
                return False, None
                
            pmc_id = id_elem.text
            logger.info(f"Found PMC ID {pmc_id} for PMID {pmid}")
            return True, pmc_id
            
        except Exception as e:
            logger.exception(f"Error checking PMC availability for PMID {pmid}: {str(e)}")
            return False, None
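
The XML-parsing step in `check_full_text_availability` can be tried offline. The sketch below applies the same two `find` calls to a hand-written, ELink-style response. The sample XML and its IDs are invented for illustration, and `extract_pmc_id` is a hypothetical name; real ELink responses carry more fields.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Illustrative ELink-style response with made-up IDs (an assumption,
# not captured from the real API).
SAMPLE_ELINK_XML = """<eLinkResult>
  <LinkSet>
    <DbFrom>pubmed</DbFrom>
    <IdList><Id>12345</Id></IdList>
    <LinkSetDb>
      <DbTo>pmc</DbTo>
      <LinkName>pubmed_pmc</LinkName>
      <Link><Id>67890</Id></Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>"""


def extract_pmc_id(xml_content: str) -> Optional[str]:
    """Return the first linked ID under LinkSetDb, or None if absent."""
    root = ET.fromstring(xml_content)
    # Search inside LinkSetDb only: IdList/Id holds the input PMID,
    # not the linked PMC record.
    linksetdb = root.find(".//LinkSetDb")
    if linksetdb is None:
        return None
    id_elem = linksetdb.find(".//Id")
    return id_elem.text if id_elem is not None else None


print(extract_pmc_id(SAMPLE_ELINK_XML))  # prints 67890
```

Scoping the second lookup to `LinkSetDb` matters: a document-wide search for `Id` would return the queried PMID first.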

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true, but the description adds valuable behavioral context: it explains that retrieval depends on availability in PubMed Central, describes fallback behavior (returning messages with explanations), and mentions alternative sources like DOI. This enhances transparency beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
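
The fallback behavior described here can be isolated as a pure function. The sketch below reproduces the message the handler assembles when no PMC full text is found. The function name `build_fallback_message` and the explicit `doi` parameter are assumptions for illustration; the handler itself reads the DOI from an article-details dictionary.

```python
from typing import Optional


def build_fallback_message(pmid: str, doi: Optional[str] = None) -> str:
    """Assemble the 'not in PMC' notice with alternative locations."""
    lines = [
        "Full text is not available in PubMed Central.",
        "",
        "The article may be available at these locations:",
        f"- PubMed page: https://pubmed.ncbi.nlm.nih.gov/{pmid}/",
    ]
    # Only advertise the publisher link when a DOI is actually known.
    if doi:
        lines.append(f"- Publisher's site (via DOI): https://doi.org/{doi}")
    return "\n".join(lines) + "\n"
```

Keeping message construction separate from the network calls makes the fallback path easy to test without contacting PubMed.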

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the core purpose, followed by behavioral details and example usage. Every sentence adds value without redundancy, making it efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (fetching full text with fallbacks), the description is complete: it covers purpose, usage, behavior, and output scenarios. With an output schema present, it appropriately omits detailed return value explanations, focusing on high-level outcomes like success/failure messages.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the description compensates by explaining that the 'pmid' parameter is used to identify the PubMed article. However, it does not provide additional details like format constraints or examples beyond the basic usage. With one parameter and no schema descriptions, the baseline is met but not exceeded.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
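
Since neither the schema nor the description states a format for `pmid`, a caller may want to check the value before invoking the tool. The sketch below assumes a PMID is a positive integer passed as a string of digits; `normalize_pmid` is a hypothetical helper, and the source does not show whether the server performs any such validation itself.

```python
import re

# Assumption: a PMID is a positive integer written without leading zeros.
_PMID_PATTERN = re.compile(r"[1-9][0-9]*")


def normalize_pmid(raw: str) -> str:
    """Strip whitespace and check that the value looks like a PMID."""
    candidate = raw.strip()
    if not _PMID_PATTERN.fullmatch(candidate):
        raise ValueError(f"Not a valid PMID: {raw!r}")
    return candidate
```

This rejects PMC identifiers such as "PMC123" and DOIs early, before any request is made.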

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('retrieve the complete text') and resource ('PubMed article using its ID'), distinguishing it from the sibling tool 'search_pubmed' which likely searches rather than fetches full text. It explicitly mentions PubMed Central as the source, adding specificity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use this tool (to get full text of a PubMed article by ID) and implies when not to use it (if you need to search, use 'search_pubmed'). However, it does not explicitly name the alternative or detail exclusions, such as handling non-PubMed IDs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/andybrandt/mcp-simple-pubmed'