papers_by_author
Search for academic papers by a specific author using the OpenAlex API, sorted by citation count or publication date.
Instructions
Searches for academic papers by a particular author using the OpenAlex API.
Args:
- author_id: An OpenAlex Author ID of the target author, e.g., "https://openalex.org/A123456789".
- sort_by: The sorting criterion ("cited_by_count" or "publication_date").
- page: The page number of the results to retrieve (default: 1).

Returns: A JSON object containing a list of papers, with their IDs, by the specified author, or an error message if the search fails.
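To make the three arguments concrete, the sketch below shows how they might translate into an OpenAlex `/works` query string. It reuses the parameter names from the handler listed under Implementation Reference; the helper name `build_works_url` is hypothetical, and the function only builds a URL without sending any request.

```python
from typing import Literal
from urllib.parse import urlencode

SortBy = Literal["cited_by_count", "publication_date"]


def build_works_url(author_id: str, sort_by: SortBy = "cited_by_count", page: int = 1) -> str:
    """Hypothetical helper: build the OpenAlex /works URL for one page of results."""
    params = {
        "filter": f"authorships.author.id:{author_id}",
        "sort": f"{sort_by}:desc",  # results are sorted in descending order
        "page": page,
        "per_page": 10,  # page size used by the handler
    }
    # Keep ':' and '/' readable instead of percent-encoding them.
    return "https://api.openalex.org/works?" + urlencode(params, safe=":/")


url = build_works_url("https://openalex.org/A123456789", sort_by="publication_date", page=2)
print(url)
```

The real handler also attaches a `mailto` default parameter, which is left out here because its value is configuration-specific.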
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
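| author_id | Yes | OpenAlex Author ID of the target author, e.g. "https://openalex.org/A123456789". | |
| page | No | Page number of the results to retrieve. | 1 |
| sort_by | No | Sorting criterion: "cited_by_count" or "publication_date". | cited_by_count |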
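The `page` input works together with the `has_next` flag of the returned PageResult (see Implementation Reference) to walk through an author's full list of works. The following is a minimal client-side sketch, not part of the tool itself. It assumes a hypothetical `fetch_page` callable that stands in for a real tool call and returns a PageResult-shaped dict. The stub serves fake data and reproduces the handler's rule that `has_next` is True only while `total_count` exceeds `per_page * page`.

```python
from typing import Callable, Dict, List


def collect_all(fetch_page: Callable[[int], Dict], max_pages: int = 50) -> List[dict]:
    """Request pages 1, 2, ... until has_next is missing/None or max_pages is hit."""
    items: List[dict] = []
    page = 1
    while page <= max_pages:
        result = fetch_page(page)
        items.extend(result.get("data", []))
        if not result.get("has_next"):
            break
        page += 1
    return items


# Stand-in data for illustration only: 25 fake works, served 10 per page.
FAKE_WORKS = [{"title": f"Paper {i}"} for i in range(25)]


def fake_fetch(page: int, per_page: int = 10) -> Dict:
    """Hypothetical stub that mimics the shape of a PageResult."""
    start = (page - 1) * per_page
    # Same rule as the handler: True while more results remain, otherwise None.
    has_next = True if len(FAKE_WORKS) > per_page * page else None
    return {
        "data": FAKE_WORKS[start:start + per_page],
        "total_count": len(FAKE_WORKS),
        "per_page": per_page,
        "page": page,
        "has_next": has_next,
    }


all_works = collect_all(fake_fetch)
print(len(all_works))  # 25
```

The `max_pages` cap is a defensive choice for the sketch, so that a misbehaving data source cannot cause an unbounded loop.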
Implementation Reference
- src/server.py:246-306 (handler): Handler function implementing the papers_by_author tool logic. Queries OpenAlex API for author's papers, processes with Work models, handles pagination and errors, returns PageResult. Registered via @mcp.tool decorator.

  ```python
  @mcp.tool
  async def papers_by_author(
      author_id: str,
      sort_by: Literal["cited_by_count", "publication_date"] = "cited_by_count",
      page: int = 1,
  ) -> PageResult:
      """
      Searches for academic papers by a particular author using the OpenAlex API.

      Args:
          author_id: An OpenAlex Author ID of target author.
              e.g., "https://openalex.org/A123456789"
          sort_by: The sorting criteria ("cited_by_count", or "publication_date").
          page: The page number of the results to retrieve (default: 1).

      Returns:
          A JSON object containing a list of papers+ids by the specified author,
          or an error message if the search fails.
      """
      params = {
          "filter": f"authorships.author.id:{author_id}",
          "sort": f"{sort_by}:desc",
          "page": page,
          "per_page": 10,
      }

      # Fetches search results from the OpenAlex API
      async with RequestAPI("https://api.openalex.org", default_params={"mailto": OPENALEX_MAILTO}) as api:
          logger.info(f"Searching for papers using: author_id={author_id}, sort_by={sort_by}, page={page}")
          try:
              result = await api.aget("/works", params=params)

              # Returns a message for when the search results are empty
              if result is None or len(result.get("results", []) or []) == 0:
                  error_message = f"No works found for author_id={author_id}."
                  logger.info(error_message)
                  raise ToolError(error_message)

              # Successfully returns the searched papers
              works = Work.from_list(result.get("results", []) or [])
              success_message = f"Found {len(works)} papers by author_id={author_id}."
              logger.info(success_message)

              total_count = (result.get("meta", {}) or {}).get("count")
              if total_count and total_count > params["per_page"] * params["page"]:
                  has_next = True
              else:
                  has_next = None

              return PageResult(
                  data=Work.list_to_json(works),
                  total_count=total_count,
                  per_page=params["per_page"],
                  page=params["page"],
                  has_next=has_next
              )
          except httpx.HTTPStatusError as e:
              error_message = f"Request failed with status: {e.response.status_code}"
              logger.error(error_message)
              raise ToolError(error_message)
          except httpx.RequestError as e:
              error_message = f"Network error: {str(e)}"
              logger.error(error_message)
              raise ToolError(error_message)
  ```
- src/schemas.py:133-139 (schema): Pydantic model defining the PageResult schema returned by the papers_by_author tool, including paginated data, counts, and navigation flags.

  ```python
  class PageResult(BaseModel):
      data: List[Union[Institution, Author, Work, dict]] = Field(default_factory=list)
      total_count: Optional[int] = None
      per_page: int
      page: int
      has_next: Optional[bool] = None
  ```
- src/schemas.py:82-131 (schema): Pydantic model for Work used to parse and serialize paper data in the papers_by_author tool response.

  ```python
  class Work(BaseModel):
      model_config = ConfigDict(
          frozen=False,  # set True for immutability
          validate_assignment=True,  # runtime type safety on attribute set
          str_strip_whitespace=True,  # trims incoming strings
      )

      title: str = None
      ids: Dict[str, str] = Field(default_factory=dict)
      cited_by_count: Optional[int] = None
      authors: List[Author] = Field(default_factory=list)
      publication_date: Optional[str] = None
      preferred_fulltext_url: Optional[str] = None

      @classmethod
      def from_json(cls, json_obj: Dict[str, Any]) -> "Work":
          # Gets title and page urls
          title = json_obj.get("title") or json_obj.get("display_name") or ""

          # Prioritize Open Access url
          preferred_fulltext_url = (json_obj.get("best_oa_location", {}) or {}).get("pdf_url")
          if preferred_fulltext_url is None:
              preferred_fulltext_url = (json_obj.get("best_oa_location", {}) or {}).get("landing_page_url")
          if preferred_fulltext_url is None:
              preferred_fulltext_url = (json_obj.get("primary_location", {}) or {}).get("pdf_url")
          if preferred_fulltext_url is None:
              preferred_fulltext_url = (json_obj.get("primary_location", {}) or {}).get("landing_page_url")

          # Gets individual authors of the work
          authors = Author.from_list(json_obj.get("authorships", []) or [])

          return cls(
              title=title,
              ids=json_obj.get("ids", {}) or {},
              authors=authors,
              cited_by_count=json_obj.get("cited_by_count"),
              publication_date=json_obj.get("publication_date"),
              preferred_fulltext_url=preferred_fulltext_url
          )

      @classmethod
      def from_list(cls, json_list: List[dict]) -> List["Work"]:
          return [cls.from_json(item) for item in json_list]

      @staticmethod
      def list_to_json(works: List["Work"]) -> List[dict]:
          return [work.model_dump(exclude_none=True) for work in works]

      def __str__(self) -> str:
          return self.model_dump_json(exclude_none=True)
  ```
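The URL selection inside `Work.from_json` follows a fixed priority: the open-access PDF, then the open-access landing page, then the primary location's PDF, then its landing page. The sketch below restates that fallback chain as a standalone function over plain dicts so the order is easy to check. The name `pick_fulltext_url` and the sample records are hypothetical; the field names are those read by the code above.

```python
from typing import Any, Dict, Optional


def pick_fulltext_url(work: Dict[str, Any]) -> Optional[str]:
    """Return the first available URL, preferring open-access locations."""
    for location_key in ("best_oa_location", "primary_location"):
        location = work.get(location_key) or {}  # the field may be missing or null
        for url_key in ("pdf_url", "landing_page_url"):
            url = location.get(url_key)
            if url is not None:
                return url
    return None


# Illustrative records only (not real OpenAlex data).
oa_work = {
    "best_oa_location": {"pdf_url": None, "landing_page_url": "https://example.org/oa"},
    "primary_location": {"pdf_url": "https://example.org/paper.pdf"},
}
closed_work = {
    "best_oa_location": None,
    "primary_location": {"pdf_url": None, "landing_page_url": "https://example.org/landing"},
}

print(pick_fulltext_url(oa_work))      # https://example.org/oa
print(pick_fulltext_url(closed_work))  # https://example.org/landing
```

Because open-access locations are checked first, an open-access landing page wins over a PDF at the primary location, as in the first sample record.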