
NIH RePORTER MCP

by jbdamask

search_publications

Find publications associated with NIH-funded research projects using PubMed IDs or core project numbers to access relevant scientific literature.

Instructions

Search for publications linked to NIH projects

Args:
    pmids: Comma-separated list of PubMed IDs
    core_project_nums: Comma-separated list of NIH core project numbers
    limit: Maximum number of results to return (default: 10, max: 50)
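
As a rough illustration of how these arguments are interpreted, the sketch below mirrors the parsing done by the handler listed under Implementation Reference: comma-separated strings are split and trimmed, and `limit` is clamped to the range 1 to 50. The helper names `split_csv` and `clamp_limit` are hypothetical and not part of the server's code. Dropping empty items and treating a missing `limit` as the default of 10 are choices made for this sketch, and the identifiers in the usage lines are placeholders, not real records.

```python
from typing import List, Optional


def split_csv(value: Optional[str]) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty items."""
    if not value:
        return []
    # Strip surrounding quotes first, as the handler does for core_project_nums.
    cleaned = value.strip().strip('"').strip("'")
    return [item.strip() for item in cleaned.split(",") if item.strip()]


def clamp_limit(limit: Optional[int], default: int = 10, maximum: int = 50) -> int:
    """Clamp limit to 1..maximum, falling back to the default when it is None."""
    if limit is None:
        return default
    return min(max(1, limit), maximum)


# Placeholder identifiers, for illustration only.
print(split_csv(" 11111111, 22222222 "))
print(split_csv('"X00XX000000,Y00YY000000"'))
print(clamp_limit(500))
```

Keeping the parsing in small pure functions like these makes the bounds easy to check without calling the NIH RePORTER API.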

Input Schema

Name               Required  Description  Default
pmids              No
core_project_nums  No
limit              No

Implementation Reference

  • Main handler for the 'search_publications' tool. Processes input parameters to build API criteria, fetches data using NIHReporterClient.get_publications, and formats output with format_publication_results. The @mcp.tool() decorator registers it with the MCP server.
    @mcp.tool()
    async def search_publications(
        pmids: Optional[str] = None,
        core_project_nums: Optional[str] = None,
        limit: Optional[int] = 10
    ) -> str:
        """
        Search for publications linked to NIH projects
        
        Args:
            pmids: Comma-separated list of PubMed IDs
            core_project_nums: Comma-separated list of NIH core project numbers
            limit: Maximum number of results to return (default: 10, max: 50)
        """
        try:
            logger.info(f"Publication search request received with parameters: {locals()}")
            
            criteria = {}
            
            # Handle PMIDs
            if pmids:
                pmid_list = [pmid.strip() for pmid in pmids.split(",")]
                criteria["pmids"] = pmid_list
            
            # Handle core project numbers
            if core_project_nums:
                logger.info(f"Processing core_project_nums input: {core_project_nums}")
                # Clean the input string of any quotes
                clean_input = core_project_nums.strip().strip('"').strip("'")
                logger.info(f"Cleaned input: {clean_input}")
                proj_list = [num.strip() for num in clean_input.split(",")]
                logger.info(f"Created project list: {proj_list}")
                criteria["core_project_nums"] = proj_list
            
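            # Clamp limit to 1..50; fall back to the default of 10 if None is passed
            bounded_limit = min(max(1, limit if limit is not None else 10), 50)
            
            logger.info(f"Constructed publication search criteria: {json.dumps(criteria, indent=2)}")
            
            # get_publications reads "limit" from the top level of this dict, not from "criteria"
            results = await api_client.get_publications({"criteria": criteria, "limit": bounded_limit})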
            return api_client.format_publication_results(results)
            
        except Exception as e:
            logger.error(f"Publication search failed: {str(e)}", exc_info=True)
            return f"Publication search failed: {str(e)}\nPlease check the logs for more details."
  • Core helper method in the NIHReporterClient class that posts the search to the NIH RePORTER publications/search endpoint. When the results include PMIDs, it also fetches title, author, journal, and year details from PubMed via NCBI E-utilities and merges them into the returned data.
    async def get_publications(self, criteria: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        """Get publications from NIH RePORTER API"""
        # Guard against the default of None so the .get() calls below are safe
        criteria = criteria or {}
        logger.info(f"Fetching publications from NIH RePORTER with criteria: {criteria}")
        async with httpx.AsyncClient() as client:
            # Construct the payload according to API specification
            payload = {
                "criteria": criteria.get("criteria", {}),
                "limit": criteria.get("limit", 50),
                "offset": criteria.get("offset", 0),
                "sort_field": criteria.get("sort_field", "core_project_nums"),
                "sort_order": criteria.get("sort_order", "desc")
            }
            
            # Add publication years if specified
            if "publication_years" in criteria.get("criteria", {}):
                payload["criteria"]["publication_years"] = criteria["criteria"]["publication_years"]
            
            logger.debug(f"Sending payload to NIH Publications API: {json.dumps(payload, indent=2)}")
            
            try:
                response = await client.post(
                    f"{API_BASE}/publications/search",
                    headers=self.headers,
                    json=payload
                )
                response.raise_for_status()
                response_data = response.json()
                
                # If we got PMIDs, fetch the full publication details from PubMed
                if response_data.get("results"):
                    pmids = [str(result.get("pmid")) for result in response_data["results"] if result.get("pmid")]
                    if pmids:
                        async with httpx.AsyncClient() as pubmed_client:
                            # Use E-utilities to get full publication details
                            pubmed_url = f"https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id={','.join(pmids)}&retmode=json"
                            pubmed_response = await pubmed_client.get(pubmed_url)
                            pubmed_data = pubmed_response.json()
                            
                            # Update our results with PubMed data
                            for result in response_data["results"]:
                                if result.get("pmid"):
                                    pmid = str(result["pmid"])
                                    if pmid in pubmed_data.get("result", {}):
                                        pub_details = pubmed_data["result"][pmid]
                                        result.update({
                                            "title": pub_details.get("title", ""),
                                            "authors": [author.get("name", "") for author in pub_details.get("authors", [])],
                                            "journal_title": pub_details.get("fulljournalname", ""),
                                            "publication_year": pub_details.get("pubdate", "").split()[0] if pub_details.get("pubdate") else None
                                        })
                
                logger.debug(f"Received response: {json.dumps(response_data, indent=2)}")
                return response_data
            except httpx.HTTPStatusError as e:
                logger.error(f"HTTP error occurred: {e.response.status_code} - {e.response.text}")
                raise
            except json.JSONDecodeError as e:
                logger.error(f"Failed to parse API response: {e}")
                logger.error(f"Raw response: {response.text}")
                raise
            except Exception as e:
                logger.error(f"Unexpected error during API call: {str(e)}")
                raise
  • Helper method that converts raw publication API results into user-friendly Markdown formatted string, including links to PubMed and DOIs.
    def format_publication_results(self, results: Dict[str, Any], include_projects: bool = False) -> str:
        """Format publication results into markdown string with optional project links"""
        logger.debug(f"Formatting publication results: {json.dumps(results, indent=2)}")
        
        if not results.get("results"):
            logger.info("No publications found in API response")
            return "No publications found."
        
        try:
            formatted_results = []
            for pub in results["results"]:
                # Format authors safely
                authors = pub.get('authors', [])
                author_str = ", ".join(authors) if authors else "N/A"
                
                pub_info = [
                    f"### {pub.get('title', 'Untitled Publication')}",
                    "",
                    f"**Authors:** {author_str}",
                    f"**PMID:** `{pub.get('pmid', 'N/A')}`",
                    f"**Core Project Number:** `{pub.get('core_project_num', 'N/A')}`"
                ]
                
                # Add publication year if available
                if pub.get('publication_year'):
                    pub_info.append(f"**Publication Year:** {pub['publication_year']}")
                
                # Add journal info if available
                if pub.get('journal_title'):
                    pub_info.append(f"**Journal:** {pub['journal_title']}")
                
                # Add DOI if available
                if pub.get('doi'):
                    pub_info.append(f"**DOI:** [{pub['doi']}](https://doi.org/{pub['doi']})")
                
                # Add project links if available
                if pub.get('core_project_num'):
                    pub_info.extend([
                        "",
                        "#### Related NIH Projects",
                        f"- Core Project: `{pub['core_project_num']}`"
                    ])
                
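                pub_info.extend(["", "---", ""])
                # Keep the empty strings: they are the blank lines Markdown needs.
                # Without a blank line before "---", the preceding line renders as a heading.
                formatted_results.append("\n".join(pub_info))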
            
            total = f"# NIH RePORTER Publication Results\n\n**Total matching publications:** {results.get('meta', {}).get('total', 0)}"
            return f"{total}\n\n" + "\n".join(formatted_results)
            
        except Exception as e:
            logger.error(f"Error formatting publication results: {str(e)}")
            logger.error(f"Results that caused error: {json.dumps(results, indent=2)}")
            raise
  • Input schema defined in the tool's docstring, describing parameters and their types/usage. Function signature provides type hints (Optional[str], Optional[int]).
    """
    Search for publications linked to NIH projects
    
    Args:
        pmids: Comma-separated list of PubMed IDs
        core_project_nums: Comma-separated list of NIH core project numbers
        limit: Maximum number of results to return (default: 10, max: 50)
    """
  • The @mcp.tool() decorator registers the search_publications function as an MCP tool.
    @mcp.tool()
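
The PubMed enrichment step inside get_publications can be sketched in isolation as a pure function, which makes the merged fields easier to see. The field names (`title`, `authors`, `fulljournalname`, `pubdate`) are the ones the code above reads. The function name `merge_pubmed_details` is hypothetical, the sample data is made up, and the `pubdate` format shown is an assumption about what E-utilities returns.

```python
from typing import Any, Dict, List


def merge_pubmed_details(results: List[Dict[str, Any]], pubmed_data: Dict[str, Any]) -> None:
    """Copy title, authors, journal, and year from an esummary-style payload into results, in place."""
    summaries = pubmed_data.get("result", {})
    for result in results:
        pmid = result.get("pmid")
        if not pmid or str(pmid) not in summaries:
            continue  # no PubMed record for this row; leave it untouched
        details = summaries[str(pmid)]
        # Assumed format such as "2021 Mar 4"; keep only the leading year token.
        date_parts = details.get("pubdate", "").split()
        result.update({
            "title": details.get("title", ""),
            "authors": [a.get("name", "") for a in details.get("authors", [])],
            "journal_title": details.get("fulljournalname", ""),
            "publication_year": date_parts[0] if date_parts else None,
        })


# Made-up sample data shaped like the two responses; not real records.
results = [{"pmid": 11111111, "core_project_num": "X00XX000000"}, {"pmid": 22222222}]
pubmed_data = {
    "result": {
        "11111111": {
            "title": "An example title",
            "authors": [{"name": "Doe J"}, {"name": "Roe R"}],
            "fulljournalname": "Journal of Examples",
            "pubdate": "2021 Mar 4",
        }
    }
}
merge_pubmed_details(results, pubmed_data)
print(results[0]["publication_year"], results[0]["authors"])
```

Rows without a matching PubMed summary keep only the fields returned by NIH RePORTER, which is why the formatter falls back to defaults such as 'Untitled Publication'.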
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the search functionality but doesn't describe what the search returns (e.g., publication metadata, project links), whether results are paginated, error conditions, or performance characteristics. For a search tool with zero annotation coverage, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with a clear purpose statement followed by parameter explanations. Each sentence adds value, and there's no redundant information. The formatting with 'Args:' section helps readability, though it could be slightly more polished.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a search tool with 3 parameters and no annotations or output schema, the description adequately covers parameter semantics but lacks information about return values, error handling, and behavioral characteristics. It's minimally viable but has clear gaps in explaining what the tool actually returns and how it behaves.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description provides clear semantic information about all three parameters beyond what the schema offers. It explains that pmids are 'PubMed IDs', core_project_nums are 'NIH core project numbers', and limit controls 'maximum number of results to return' with default and maximum values. With 0% schema description coverage, this description fully compensates by explaining what each parameter means.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Search for publications linked to NIH projects'. It specifies the resource (publications) and scope (linked to NIH projects), which distinguishes it from sibling tools like search_projects. However, it doesn't explicitly differentiate from search_combined, which might be a more general search tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like search_combined or search_projects. It doesn't mention prerequisites, constraints, or typical use cases. The only implicit guidance is that it searches publications linked to NIH projects, but this doesn't help an agent choose between available search tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
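
One way to close the gaps noted in these reviews is to extend the docstring with return, error, and usage information. The sketch below is an illustrative rewrite, not the author's text. Every behavioral statement in it is taken from the handler and helpers listed under Implementation Reference, and the function body is a stub.

```python
from typing import Optional


async def search_publications(
    pmids: Optional[str] = None,
    core_project_nums: Optional[str] = None,
    limit: Optional[int] = 10,
) -> str:
    """
    Search for publications linked to NIH projects.

    Read-only. Queries the NIH RePORTER publications/search endpoint and, for
    results that carry a PMID, adds title, authors, journal, and year from
    PubMed. Use this tool when you already have PubMed IDs or core project
    numbers and want the associated publications.

    Args:
        pmids: Comma-separated list of PubMed IDs
        core_project_nums: Comma-separated list of NIH core project numbers
        limit: Maximum number of results to return (default: 10, max: 50)

    Returns:
        A Markdown string with a total count and one section per publication
        (title, authors, PMID, core project number, and year, journal, and DOI
        when available), or "No publications found." On failure, returns a
        string starting with "Publication search failed:" instead of raising.
    """
    # Stub only: the real body is the handler shown under Implementation Reference.
    raise NotImplementedError("Docstring sketch only")
```

Stating that the tool returns an error string rather than raising tells an agent to inspect the text of the response instead of relying on an exception.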

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jbdamask/mcp-nih-reporter'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.