PubMed MCP Server

by chrismannina

search_pubmed

Search PubMed for scientific articles using advanced filters like date ranges, article types, authors, journals, and MeSH terms to find relevant research.

Instructions

Search PubMed for articles with advanced filtering options

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search query using PubMed syntax | |
| max_results | No | Maximum number of results to return | |
| sort_order | No | Sort order for results | relevance |
| date_from | No | Start date (YYYY/MM/DD, YYYY/MM, or YYYY) | |
| date_to | No | End date (YYYY/MM/DD, YYYY/MM, or YYYY) | |
| date_range | No | Predefined date range | |
| article_types | No | Filter by article types | |
| authors | No | Filter by author names | |
| journals | No | Filter by journal names | |
| mesh_terms | No | Filter by MeSH terms | |
| language | No | Language filter (e.g., 'eng', 'fre', 'ger') | |
| has_abstract | No | Only include articles with abstracts | |
| has_full_text | No | Only include articles with full text available | |
| humans_only | No | Only include human studies | |
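
As an illustration of how these parameters fit together, the sketch below wraps a set of arguments in a JSON-RPC 2.0 `tools/call` request, the message shape MCP clients use to invoke a tool. The helper name `build_tool_call` and the example query are assumptions made for this sketch; they are not part of the server.

```python
import json


def build_tool_call(arguments: dict, request_id: int = 1) -> str:
    """Wrap search_pubmed arguments in a JSON-RPC 2.0 `tools/call` request.

    Hypothetical helper for illustration; only `query` is required.
    """
    if not arguments.get("query"):
        raise ValueError("'query' is the only required parameter")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_pubmed", "arguments": arguments},
    }
    return json.dumps(request, indent=2)


# Illustrative arguments: every key comes from the input schema,
# the values are made up for the example.
example_arguments = {
    "query": "asthma AND inhaled corticosteroids",
    "max_results": 10,
    "date_range": "5y",
    "article_types": ["Randomized Controlled Trial"],
    "language": "eng",
    "has_abstract": True,
}

if __name__ == "__main__":
    print(build_tool_call(example_arguments))
```

Optional parameters that are left out are simply absent from `arguments`; the handler reads them with `arguments.get(...)`, so they arrive as `None`.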

Implementation Reference

  • The main handler function, _handle_search_pubmed, which validates the tool-call arguments, calls PubMedClient.search_articles with them, and formats the search results into an MCPResponse containing a summary and per-article details.
    async def _handle_search_pubmed(self, arguments: Dict[str, Any]) -> MCPResponse:
        """Handle PubMed search with advanced filtering."""
        try:
            # Parse arguments
            query = arguments.get("query", "")
            if not query:
                return MCPResponse(
                    content=[{"type": "text", "text": "Query parameter is required"}], is_error=True
                )
    
            max_results = arguments.get("max_results", 20)
            # Handle negative max_results
            if max_results < 0:
                max_results = 0
    
            sort_order = SortOrder(arguments.get("sort_order", "relevance"))
            date_from = arguments.get("date_from")
            date_to = arguments.get("date_to")
            date_range = (
                DateRange(arguments.get("date_range")) if arguments.get("date_range") else None
            )
    
            # Parse article types
            article_types = None
            if arguments.get("article_types"):
                article_types = [ArticleType(at) for at in arguments["article_types"]]
    
            # Perform search
            search_result = await self.pubmed_client.search_articles(
                query=query,
                max_results=max_results,
                sort_order=sort_order,
                date_from=date_from,
                date_to=date_to,
                date_range=date_range,
                article_types=article_types,
                authors=arguments.get("authors"),
                journals=arguments.get("journals"),
                mesh_terms=arguments.get("mesh_terms"),
                language=arguments.get("language"),
                has_abstract=arguments.get("has_abstract"),
                has_full_text=arguments.get("has_full_text"),
                humans_only=arguments.get("humans_only"),
                cache=self.cache,
            )
    
            # Format response
            content = []
    
            # Summary
            content.append(
                {
                    "type": "text",
                    "text": f"**PubMed Search Results**\n\n"
                    f"Query: {query}\n"
                    f"Total Results: {search_result.total_results:,}\n"
                    f"Returned: {search_result.returned_results}\n"
                    f"Search Time: {search_result.search_time:.2f}s\n",
                }
            )
    
            # Articles
            if search_result.articles:
                for i, article_data in enumerate(search_result.articles, 1):
                    article_text = self._format_article_summary(article_data, i)
                    content.append({"type": "text", "text": article_text})
            else:
                content.append({"type": "text", "text": "No articles found for this query."})
    
            return MCPResponse(
                content=content,
                metadata={
                    "total_results": search_result.total_results,
                    "search_time": search_result.search_time,
                },
            )
    
        except Exception as e:
            logger.error(f"Error in search_pubmed: {e}")
            return MCPResponse(
                content=[{"type": "text", "text": f"Search error: {str(e)}"}], is_error=True
            )
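
The response assembled by the handler can be pictured with the following minimal sketch. `SketchResponse` is a hypothetical stand-in for the server's MCPResponse, and the `"{i}. {title}"` line replaces `_format_article_summary`, whose real output is not shown on this page.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SketchResponse:
    """Hypothetical stand-in for MCPResponse (same three fields as used above)."""

    content: List[Dict[str, Any]]
    metadata: Optional[Dict[str, Any]] = None
    is_error: bool = False


def build_search_response(
    query: str, total_results: int, titles: List[str], search_time: float
) -> SketchResponse:
    """Mirror the handler's layout: one summary block, then one block per article."""
    if not query:
        return SketchResponse(
            content=[{"type": "text", "text": "Query parameter is required"}],
            is_error=True,
        )
    summary = (
        "**PubMed Search Results**\n\n"
        f"Query: {query}\n"
        f"Total Results: {total_results:,}\n"
        f"Returned: {len(titles)}\n"
        f"Search Time: {search_time:.2f}s\n"
    )
    content: List[Dict[str, Any]] = [{"type": "text", "text": summary}]
    if titles:
        for i, title in enumerate(titles, 1):
            # Placeholder formatting; the real helper is _format_article_summary.
            content.append({"type": "text", "text": f"{i}. {title}"})
    else:
        content.append({"type": "text", "text": "No articles found for this query."})
    return SketchResponse(
        content=content,
        metadata={"total_results": total_results, "search_time": search_time},
    )
```

A caller can therefore treat the first content item as the summary and every later item as one article, or as the single "no articles" message.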
  • The schema definition for the 'search_pubmed' tool, including inputSchema with all parameters, types, descriptions, and required fields. Part of the TOOL_DEFINITIONS list used by get_tools().
    {
        "name": "search_pubmed",
        "description": ("Search PubMed for articles with advanced filtering options"),
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query using PubMed syntax"},
                "max_results": {
                    "type": "integer",
                    "minimum": 1,
                    "maximum": 200,
                    "default": 20,
                    "description": "Maximum number of results to return",
                },
                "sort_order": {
                    "type": "string",
                    "enum": ["relevance", "pub_date", "author", "journal", "title"],
                    "default": "relevance",
                    "description": "Sort order for results",
                },
                "date_from": {
                    "type": "string",
                    "description": "Start date (YYYY/MM/DD, YYYY/MM, or YYYY)",
                },
                "date_to": {
                    "type": "string",
                    "description": "End date (YYYY/MM/DD, YYYY/MM, or YYYY)",
                },
                "date_range": {
                    "type": "string",
                    "enum": ["1y", "5y", "10y", "all"],
                    "description": "Predefined date range",
                },
                "article_types": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        "enum": [
                            "Journal Article",
                            "Review",
                            "Systematic Review",
                            "Meta-Analysis",
                            "Clinical Trial",
                            "Randomized Controlled Trial",
                            "Case Reports",
                            "Letter",
                            "Editorial",
                            "Comment",
                        ],
                    },
                    "description": "Filter by article types",
                },
                "authors": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Filter by author names",
                },
                "journals": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Filter by journal names",
                },
                "mesh_terms": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Filter by MeSH terms",
                },
                "language": {
                    "type": "string",
                    "description": ("Language filter (e.g., 'eng', 'fre', 'ger')"),
                },
                "has_abstract": {
                    "type": "boolean",
                    "description": "Only include articles with abstracts",
                },
                "has_full_text": {
                    "type": "boolean",
                    "description": ("Only include articles with full text available"),
                },
                "humans_only": {"type": "boolean", "description": "Only include human studies"},
            },
            "required": ["query"],
        },
    },
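
The constraints declared in this schema can be checked on the client side before a call is sent. The following is a minimal standard-library sketch, not the server's own validation; the function name `check_arguments` and the wording of the messages are assumptions.

```python
from typing import Any, Dict, List

# Enum values copied from the inputSchema above.
SORT_ORDERS = {"relevance", "pub_date", "author", "journal", "title"}
DATE_RANGES = {"1y", "5y", "10y", "all"}
ARTICLE_TYPES = {
    "Journal Article", "Review", "Systematic Review", "Meta-Analysis",
    "Clinical Trial", "Randomized Controlled Trial", "Case Reports",
    "Letter", "Editorial", "Comment",
}


def check_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments pass."""
    problems: List[str] = []

    query = arguments.get("query")
    if not isinstance(query, str) or not query:
        problems.append("query is required and must be a non-empty string")

    max_results = arguments.get("max_results", 20)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        problems.append("max_results must be an integer")
    elif not 1 <= max_results <= 200:
        problems.append("max_results must be between 1 and 200")

    if arguments.get("sort_order", "relevance") not in SORT_ORDERS:
        problems.append("sort_order is not one of the allowed values")
    if "date_range" in arguments and arguments["date_range"] not in DATE_RANGES:
        problems.append("date_range is not one of the allowed values")

    for article_type in arguments.get("article_types") or []:
        if article_type not in ARTICLE_TYPES:
            problems.append(f"unknown article type: {article_type}")

    for name in ("authors", "journals", "mesh_terms"):
        value = arguments.get(name)
        if value is not None and not (
            isinstance(value, list) and all(isinstance(v, str) for v in value)
        ):
            problems.append(f"{name} must be a list of strings")
    return problems
```

Note that the handler shown earlier only resets a negative max_results to 0 and does not enforce the declared 1 to 200 range itself, so a check like this is one place where that range could be applied.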
  • The handler_map dictionary in the handle_tool_call method, which registers each tool name and routes 'search_pubmed' calls to _handle_search_pubmed.
    handler_map = {
        "search_pubmed": self._handle_search_pubmed,
        "get_article_details": self._handle_get_article_details,
        "search_by_author": self._handle_search_by_author,
        "find_related_articles": self._handle_find_related_articles,
        "export_citations": self._handle_export_citations,
        "search_mesh_terms": self._handle_search_mesh_terms,
        "search_by_journal": self._handle_search_by_journal,
        "get_trending_topics": self._handle_get_trending_topics,
        "analyze_research_trends": self._handle_analyze_research_trends,
        "compare_articles": self._handle_compare_articles,
        "get_journal_metrics": self._handle_get_journal_metrics,
        "advanced_search": self._handle_advanced_search,
    }
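
Routing through such a dictionary might look like the sketch below. The rest of handle_tool_call is not shown on this page, so the unknown-tool branch, the plain-dict return value, and the names `dispatch` and `handle_search` are assumptions made for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Handler = Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]


async def handle_search(arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical handler standing in for _handle_search_pubmed."""
    return {"is_error": False, "text": f"searched for {arguments.get('query', '')}"}


# Tool name -> coroutine function, in the same spirit as handler_map.
HANDLER_MAP: Dict[str, Handler] = {"search_pubmed": handle_search}


async def dispatch(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Look up the handler by tool name and await it."""
    handler = HANDLER_MAP.get(name)
    if handler is None:
        # Assumed behavior: report unknown tools as an error response.
        return {"is_error": True, "text": f"Unknown tool: {name}"}
    return await handler(arguments)


if __name__ == "__main__":
    print(asyncio.run(dispatch("search_pubmed", {"query": "sepsis"})))
```

Keeping the mapping in one dictionary means adding a tool is a one-line change, and the lookup doubles as the check for unsupported tool names.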
  • The core helper method PubMedClient.search_articles, called directly by the tool handler. It checks the cache, runs the search through ESearch, fetches article details with EFetch, caches the result, and returns a SearchResult.
    async def search_articles(
        self,
        query: str,
        max_results: int = 20,
        sort_order: SortOrder = SortOrder.RELEVANCE,
        date_from: Optional[str] = None,
        date_to: Optional[str] = None,
        date_range: Optional[DateRange] = None,
        article_types: Optional[List[ArticleType]] = None,
        authors: Optional[List[str]] = None,
        journals: Optional[List[str]] = None,
        mesh_terms: Optional[List[str]] = None,
        language: Optional[str] = None,
        has_abstract: Optional[bool] = None,
        has_full_text: Optional[bool] = None,
        humans_only: Optional[bool] = None,
        cache: Optional[CacheManager] = None,
    ) -> SearchResult:
        """
        Search PubMed with advanced filtering.
    
        Args:
            query: Search query
            max_results: Maximum results to return
            sort_order: Sort order for results
            date_from: Start date filter
            date_to: End date filter
            date_range: Predefined date range
            article_types: Article type filters
            authors: Author filters
            journals: Journal filters
            mesh_terms: MeSH term filters
            language: Language filter
            has_abstract: Only articles with abstracts
            has_full_text: Only articles with full text
            humans_only: Only human studies
            cache: Cache manager instance
    
        Returns:
            SearchResult containing articles and metadata
        """
        start_time = time.time()
    
        # Check cache first
        if cache:
            cache_key = cache.generate_key(
                "search",
                query=query,
                max_results=max_results,
                sort_order=sort_order.value,
                date_from=date_from,
                date_to=date_to,
                date_range=date_range.value if date_range else None,
                article_types=[at.value for at in article_types] if article_types else None,
                authors=authors,
                journals=journals,
                mesh_terms=mesh_terms,
                language=language,
                has_abstract=has_abstract,
                has_full_text=has_full_text,
                humans_only=humans_only,
            )
            cached_result = cache.get(cache_key)
            if cached_result:
                # Convert cached article dicts back to Article objects
                cached_articles = [
                    Article(**article_data) for article_data in cached_result["articles"]
                ]
                cached_result["articles"] = cached_articles
                return SearchResult(**cached_result)
    
        # Handle date range shortcuts
        if date_range and not (date_from or date_to):
            date_to = datetime.now().strftime("%Y/%m/%d")
            if date_range == DateRange.LAST_YEAR:
                date_from = (datetime.now() - timedelta(days=365)).strftime("%Y/%m/%d")
            elif date_range == DateRange.LAST_5_YEARS:
                date_from = (datetime.now() - timedelta(days=365 * 5)).strftime("%Y/%m/%d")
            elif date_range == DateRange.LAST_10_YEARS:
                date_from = (datetime.now() - timedelta(days=365 * 10)).strftime("%Y/%m/%d")
    
        # Build complex search query
        search_query = build_search_query(
            query,
            authors=authors,
            journals=journals,
            mesh_terms=mesh_terms,
            article_types=([at.value for at in article_types] if article_types else None),
            date_from=date_from,
            date_to=date_to,
            language=language,
            has_abstract=has_abstract,
            has_full_text=has_full_text,
            humans_only=humans_only,
        )
    
        logger.info(f"Executing PubMed search: {search_query}")
    
        # Search for article IDs
        search_params = self._build_params(
            db="pubmed",
            term=search_query,
            retmax=str(max_results),
            retmode="json",
            sort=sort_order.value,
        )
    
        search_response = await self._make_request("esearch.fcgi", search_params)
        search_data = search_response.json()
    
        search_result = search_data.get("esearchresult", {})
        pmids = search_result.get("idlist", [])
        total_results = int(search_result.get("count", 0))
    
        # Get detailed article information
        articles = []
        if pmids:
            articles = await self._fetch_article_details(pmids, include_full_details=True)
    
        # Build result
        result_data = {
            "query": query,
            "total_results": total_results,
            "returned_results": len(articles),
            "articles": articles,  # Store Article objects directly
            "search_time": time.time() - start_time,
            "suggestions": [],  # Could implement spelling suggestions
        }
    
        # Cache the result (store as dicts for serialization)
        if cache:
            cache_data = {**result_data, "articles": [article.model_dump() for article in articles]}
            cache.set(cache_key, cache_data)
    
        return SearchResult(**result_data)
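
The date-range shortcut step in search_articles can be isolated as in the following sketch. It assumes that DateRange.LAST_YEAR, LAST_5_YEARS, and LAST_10_YEARS correspond to the schema values "1y", "5y", and "10y" (the enum definition is not shown here), and it takes `today` as a parameter so the result is reproducible; `resolve_dates` is a hypothetical name.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

# Assumed mapping from schema values to a number of years.
YEARS_BY_RANGE = {"1y": 1, "5y": 5, "10y": 10}


def resolve_dates(
    date_range: Optional[str],
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
    today: Optional[date] = None,
) -> Tuple[Optional[str], Optional[str]]:
    """Expand a predefined range into (date_from, date_to) strings.

    As in search_articles, explicit dates win: the shortcut applies only
    when neither date_from nor date_to was supplied.
    """
    if not date_range or date_from or date_to:
        return date_from, date_to
    today = today or date.today()
    date_to = today.strftime("%Y/%m/%d")
    years = YEARS_BY_RANGE.get(date_range)
    if years is not None:
        # Same arithmetic as the original: 365 days per year, leap days ignored.
        date_from = (today - timedelta(days=365 * years)).strftime("%Y/%m/%d")
    return date_from, date_to
```

Because each year is counted as 365 days, longer ranges drift by one day per leap day crossed: with `today` set to 2023-06-15, "5y" resolves to a start date of 2018/06/16 rather than 2018/06/15. With "all", only date_to is set and date_from stays None.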
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure but offers minimal information. It mentions 'advanced filtering options' but doesn't describe critical behaviors such as rate limits, authentication needs, pagination, error handling, or what the output looks like (e.g., article metadata). This is inadequate for a tool with 14 parameters and no output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded and wastes no space, making it easy for an agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (14 parameters, no annotations, no output schema, multiple sibling tools), the description is incomplete. It doesn't explain the tool's behavior, output format, or usage context, leaving significant gaps for an agent to understand how to invoke it effectively compared to alternatives.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds no additional semantic context beyond implying filtering capabilities, which is already covered by the schema. This meets the baseline of 3 when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'Search PubMed for articles with advanced filtering options,' which specifies the verb (search), resource (PubMed articles), and scope (advanced filtering). However, it doesn't explicitly differentiate from sibling tools like 'search_by_author' or 'search_by_journal,' which are more specific variants.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'advanced_search' or 'search_by_author.' It lacks context about use cases, prerequisites, or exclusions, leaving the agent to infer usage from the tool name and parameters alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
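
To make the gaps noted in this review concrete, here is one possible expanded description, written as a Python tool-definition fragment. It is a sketch only: the wording is not the author's, it uses only behavior visible in the code on this page (summary fields, the 1 to 200 limit with a default of 20, error reporting via is_error), and the guidance about sibling tools is inferred from their names.

```python
# Hypothetical revision of the tool description; not the server's actual text.
REVISED_SEARCH_PUBMED = {
    "name": "search_pubmed",
    "description": (
        "Search PubMed for articles with advanced filtering options. "
        "Returns a text summary (total results, number returned, search time) "
        "followed by one text entry per article. "
        "max_results accepts 1 to 200 and defaults to 20. "
        "Failures are returned as a text message with is_error set. "
        "For lookups limited to a single author or journal, the more specific "
        "search_by_author and search_by_journal tools may be a better fit."
    ),
}

if __name__ == "__main__":
    print(REVISED_SEARCH_PUBMED["description"])
```

The extra sentences address the output-format and usage-guidance points raised above, at the cost of some of the brevity that earned the original its conciseness score.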

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/chrismannina/pubmed-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.