Skip to main content
Glama
openags

Paper Search MCP

by openags

search_unpaywall

Find open access academic papers by DOI using Unpaywall to retrieve metadata and availability information for scholarly articles.

Instructions

Lookup a DOI via Unpaywall and return OA metadata.

Unpaywall is DOI-centric and does not support generic keyword search. This tool extracts the first DOI from query and returns at most one record.

Args: query: DOI string or text containing a DOI. max_results: Kept for API consistency; Unpaywall returns max 1 record. Returns: List with one paper metadata dict when DOI is resolvable, else empty list.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
queryYes
max_resultsNo

Output Schema

TableJSON Schema
NameRequiredDescriptionDefault
resultYes

Implementation Reference

  • The MCP tool handler function 'search_unpaywall' which uses 'async_search' with the 'unpaywall_searcher' instance to perform the lookup.
    async def search_unpaywall(query: str, max_results: int = 10) -> List[Dict]:
        """Lookup a DOI via Unpaywall and return OA metadata.
    
        Unpaywall is DOI-centric and does not support generic keyword search.
        This tool extracts the first DOI from `query` and returns at most one record.
    
        Args:
            query: DOI string or text containing a DOI.
            max_results: Kept for API consistency; Unpaywall returns max 1 record.
        Returns:
            List with one paper metadata dict when DOI is resolvable, else empty list.
        """
        papers = await async_search(unpaywall_searcher, query, max_results)
        return papers if papers else []
  • The 'UnpaywallSearcher.search' method, which contains the core implementation for searching Unpaywall via DOI.
    def search(self, query: str, max_results: int = 10, **kwargs) -> List[Paper]:
        """Lookup a DOI in Unpaywall and return at most one Paper.
    
        Args:
            query: DOI string or text containing DOI.
            max_results: Kept for interface compatibility; Unpaywall returns max 1.
            **kwargs: Reserved for future use.
        """
        if not self.resolver.has_api_access():
            logger.warning(
                "Unpaywall search skipped: missing PAPER_SEARCH_MCP_UNPAYWALL_EMAIL/UNPAYWALL_EMAIL."
            )
            return []
    
        doi = extract_doi(query) or (query.strip() if query.strip().startswith("10.") else "")
        if not doi:
            return []
    
        paper = self.resolver.get_paper_by_doi(doi)
        if not paper:
            return []
    
        return [paper]
  • The 'UnpaywallResolver.get_paper_by_doi' helper method, which fetches metadata and maps it to a 'Paper' object.
    def get_paper_by_doi(self, doi: str) -> Optional[Paper]:
        """Fetch Unpaywall metadata by DOI and map it to a Paper object.
    
        Args:
            doi: DOI string.
    
        Returns:
            Paper instance when record exists, otherwise None.
        """
        if not self.email:
            return None
    
        normalized_doi = (doi or "").strip()
        if not normalized_doi:
            return None
    
        data = self._fetch_doi_record(normalized_doi)
        if not data:
            return None
    
        title = (data.get("title") or "").strip()
        if not title:
            title = normalized_doi
    
        authors: List[str] = []
        for author in data.get("z_authors") or []:
            if not isinstance(author, dict):
                continue
            given = (author.get("given") or "").strip()
            family = (author.get("family") or "").strip()
            full_name = f"{given} {family}".strip()
            if full_name:
                authors.append(full_name)
    
        published_date = None
        published_date_str = (data.get("published_date") or "").strip()
        if published_date_str:
            try:
                if len(published_date_str) == 10:
                    published_date = datetime.strptime(published_date_str, "%Y-%m-%d")
                elif len(published_date_str) == 4:
                    published_date = datetime.strptime(published_date_str, "%Y")
            except ValueError:
                published_date = None
    
        best_location = data.get("best_oa_location") or {}
        landing_url = (
            best_location.get("url")
            or data.get("doi_url")
            or f"https://doi.org/{normalized_doi}"
        )
        pdf_url = best_location.get("url_for_pdf") or self.resolve_best_pdf_url(normalized_doi) or ""
    
        abstract = ""
        is_oa = bool(data.get("is_oa"))
    
        return Paper(
            paper_id=f"unpaywall:{normalized_doi}",
            title=title,
            authors=authors,
            abstract=abstract,
            doi=normalized_doi,
            published_date=published_date,
            pdf_url=pdf_url,
            url=landing_url,
            source="unpaywall",
            extra={
                "is_oa": is_oa,
                "oa_status": data.get("oa_status", ""),
                "journal_name": data.get("journal_name", ""),
                "publisher": data.get("publisher", ""),
                "host_type": best_location.get("host_type", ""),
                "license": best_location.get("license", ""),
                "version": best_location.get("version", ""),
            },
        )
  • The initialization of 'unpaywall_searcher' used by the MCP tool.
    unpaywall_searcher = UnpaywallSearcher(resolver=unpaywall_resolver)
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and does well. It discloses key behavioral traits: extracts first DOI from query, returns at most one record, returns empty list for unresolvable DOIs, and mentions API consistency for max_results. It doesn't cover rate limits or authentication needs, but provides substantial operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Perfectly structured with clear sections: purpose statement, important constraint, behavioral details, and parameter explanations. Every sentence earns its place with zero waste. The information is front-loaded with the most critical constraint (DOI-centric nature) stated immediately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 2 parameters with 0% schema coverage and no annotations, the description provides complete operational context. It explains what the tool does, its limitations, how parameters work, and return behavior. With an output schema presumably covering return format, the description focuses appropriately on usage constraints and parameter semantics.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so description must compensate fully. It provides excellent parameter semantics: explains that 'query' can be DOI string or text containing DOI, clarifies that 'max_results' is kept for API consistency but Unpaywall returns max 1 record. This adds crucial meaning beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('lookup a DOI via Unpaywall'), resource ('OA metadata'), and scope ('DOI-centric, does not support generic keyword search'). It distinguishes from sibling tools by explicitly mentioning Unpaywall's DOI-centric nature, unlike many other search/download tools in the list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('lookup a DOI') and when not to use ('does not support generic keyword search'). It provides clear alternatives by implication - for generic searches, use other sibling tools like search_arxiv, search_crossref, etc. The guidance is specific and actionable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/openags/paper-search-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server