search_unpaywall
Look up a scholarly article by DOI in Unpaywall and retrieve its metadata and open access availability.
Instructions
Look up a DOI via Unpaywall and return open access (OA) metadata.
Unpaywall is DOI-centric and does not support generic keyword search.
This tool extracts the first DOI from `query` and returns at most one record.

Args:
- query: DOI string or text containing a DOI.
- max_results: Kept for API consistency; Unpaywall returns at most one record.

Returns: A list with one paper metadata dict when the DOI is resolvable, otherwise an empty list.
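The "first DOI in the query" behavior can be illustrated with a small regular-expression sketch. This is not the project's `extract_doi` helper, whose implementation is not shown here; the function name `extract_first_doi`, the pattern, and the sample DOI are assumptions made for illustration only.

```python
import re
from typing import Optional

# Hypothetical pattern: "10.", a 4-9 digit registrant code, "/", then a run of
# non-whitespace characters. The project's real extract_doi may use other rules.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_first_doi(text: str) -> Optional[str]:
    """Return the first DOI-like substring in text, or None if there is none."""
    match = DOI_PATTERN.search(text or "")
    if not match:
        return None
    # Trim punctuation that often trails a DOI inside a sentence.
    return match.group(0).rstrip(".,;:)]}>\"'")


# The DOI below is a made-up placeholder, not a real article.
print(extract_first_doi("See https://doi.org/10.1234/example.5678 for details."))
print(extract_first_doi("open access machine learning"))
```

Under these assumptions, a free-text keyword query yields `None`, which corresponds to the empty list the tool returns when no DOI is present.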
Input Schema
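| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | DOI string or text containing a DOI. | |
| max_results | No | Kept for API consistency; Unpaywall returns at most one record. | 10 |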
Implementation Reference
- paper_search_mcp/server.py:1033-1046 (handler) The MCP tool handler function 'search_unpaywall', which uses 'async_search' with the 'unpaywall_searcher' instance to perform the lookup.

```python
async def search_unpaywall(query: str, max_results: int = 10) -> List[Dict]:
    """Lookup a DOI via Unpaywall and return OA metadata.

    Unpaywall is DOI-centric and does not support generic keyword search.
    This tool extracts the first DOI from `query` and returns at most one record.

    Args:
        query: DOI string or text containing a DOI.
        max_results: Kept for API consistency; Unpaywall returns max 1 record.

    Returns:
        List with one paper metadata dict when DOI is resolvable, else empty list.
    """
    papers = await async_search(unpaywall_searcher, query, max_results)
    return papers if papers else []
```

- The 'UnpaywallSearcher.search' method, which contains the core implementation for searching Unpaywall via DOI.

```python
def search(self, query: str, max_results: int = 10, **kwargs) -> List[Paper]:
    """Lookup a DOI in Unpaywall and return at most one Paper.

    Args:
        query: DOI string or text containing DOI.
        max_results: Kept for interface compatibility; Unpaywall returns max 1.
        **kwargs: Reserved for future use.
    """
    if not self.resolver.has_api_access():
        logger.warning(
            "Unpaywall search skipped: missing PAPER_SEARCH_MCP_UNPAYWALL_EMAIL/UNPAYWALL_EMAIL."
        )
        return []

    doi = extract_doi(query) or (query.strip() if query.strip().startswith("10.") else "")
    if not doi:
        return []

    paper = self.resolver.get_paper_by_doi(doi)
    if not paper:
        return []
    return [paper]
```

- The 'UnpaywallResolver.get_paper_by_doi' helper method, which fetches metadata and maps it to a 'Paper' object.

```python
def get_paper_by_doi(self, doi: str) -> Optional[Paper]:
    """Fetch Unpaywall metadata by DOI and map it to a Paper object.

    Args:
        doi: DOI string.

    Returns:
        Paper instance when record exists, otherwise None.
    """
    if not self.email:
        return None

    normalized_doi = (doi or "").strip()
    if not normalized_doi:
        return None

    data = self._fetch_doi_record(normalized_doi)
    if not data:
        return None

    title = (data.get("title") or "").strip()
    if not title:
        title = normalized_doi

    authors: List[str] = []
    for author in data.get("z_authors") or []:
        if not isinstance(author, dict):
            continue
        given = (author.get("given") or "").strip()
        family = (author.get("family") or "").strip()
        full_name = f"{given} {family}".strip()
        if full_name:
            authors.append(full_name)

    published_date = None
    published_date_str = (data.get("published_date") or "").strip()
    if published_date_str:
        try:
            if len(published_date_str) == 10:
                published_date = datetime.strptime(published_date_str, "%Y-%m-%d")
            elif len(published_date_str) == 4:
                published_date = datetime.strptime(published_date_str, "%Y")
        except ValueError:
            published_date = None

    best_location = data.get("best_oa_location") or {}
    landing_url = (
        best_location.get("url")
        or data.get("doi_url")
        or f"https://doi.org/{normalized_doi}"
    )
    pdf_url = best_location.get("url_for_pdf") or self.resolve_best_pdf_url(normalized_doi) or ""

    abstract = ""
    is_oa = bool(data.get("is_oa"))

    return Paper(
        paper_id=f"unpaywall:{normalized_doi}",
        title=title,
        authors=authors,
        abstract=abstract,
        doi=normalized_doi,
        published_date=published_date,
        pdf_url=pdf_url,
        url=landing_url,
        source="unpaywall",
        extra={
            "is_oa": is_oa,
            "oa_status": data.get("oa_status", ""),
            "journal_name": data.get("journal_name", ""),
            "publisher": data.get("publisher", ""),
            "host_type": best_location.get("host_type", ""),
            "license": best_location.get("license", ""),
            "version": best_location.get("version", ""),
        },
    )
```

- paper_search_mcp/server.py:60-60 (registration) The initialization of 'unpaywall_searcher' used by the MCP tool.

```python
unpaywall_searcher = UnpaywallSearcher(resolver=unpaywall_resolver)
```
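
The excerpts above do not show `_fetch_doi_record`, so the following is only a sketch of the pieces around it, under stated assumptions. It assumes the public Unpaywall v2 endpoint (`https://api.unpaywall.org/v2/<doi>?email=<address>`) and mirrors the date and author handling from `get_paper_by_doi`. The helper names `build_unpaywall_url`, `parse_published_date`, and `author_names` are hypothetical, and the DOI and email are placeholders.

```python
from datetime import datetime
from typing import Any, List, Optional
from urllib.parse import quote, urlencode

# Assumed base URL of the public Unpaywall v2 REST API.
UNPAYWALL_BASE = "https://api.unpaywall.org/v2"


def build_unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL for one DOI (hypothetical helper)."""
    # Keep "/" unescaped so the DOI stays readable in the path.
    path = quote(doi.strip(), safe="/")
    return f"{UNPAYWALL_BASE}/{path}?{urlencode({'email': email})}"


def parse_published_date(value: Optional[str]) -> Optional[datetime]:
    """Parse 'YYYY-MM-DD' or 'YYYY', as in the excerpt; anything else gives None."""
    text = (value or "").strip()
    fmt = {10: "%Y-%m-%d", 4: "%Y"}.get(len(text))
    if fmt is None:
        return None
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


def author_names(z_authors: Optional[List[Any]]) -> List[str]:
    """Join 'given' and 'family' per author, skipping malformed entries."""
    names: List[str] = []
    for author in z_authors or []:
        if not isinstance(author, dict):
            continue
        given = (author.get("given") or "").strip()
        family = (author.get("family") or "").strip()
        full_name = f"{given} {family}".strip()
        if full_name:
            names.append(full_name)
    return names


# Placeholder DOI and email, for illustration only.
print(build_unpaywall_url("10.1234/example.5678", "me@example.org"))
print(parse_published_date("2020"))
print(author_names([{"given": "Ada", "family": "Lovelace"}, "junk"]))
```

The sketch deliberately makes no network request. Fetching the URL and handling HTTP errors is left to whatever client the project actually uses.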