semantic_search_papers_on_huggingface
Search HuggingFace papers using semantic queries to find relevant academic research based on meaning rather than keywords.
Instructions
Search for papers on HuggingFace using semantic search.
Args:
query (str): The query term to search for. It will automatically determine if it should use keywords or a natural language query, so format your queries accordingly.
top_n (int): The number of papers to return. Default is 10, but you can set it to any number.
Returns:
str: A list of papers with the title, summary, ID, and upvotes.
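As a rough illustration of the returned string, the sketch below mirrors the formatting logic listed under Implementation Reference. It uses a standard-library dataclass in place of the Pydantic model so it runs without dependencies. The class name `Paper` and the sample values are placeholders for illustration, not real search results.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """Stand-in for the tool's HuggingFacePaper model (hypothetical name)."""
    title: str
    summary: str
    arxiv_id: str
    upvotes: int

    def __str__(self) -> str:
        return (
            f"Title: {self.title}\nSummary: {self.summary}\n"
            f"ID: {self.arxiv_id}\nUpvotes: {self.upvotes}"
        )


def stringify_papers(papers: list[Paper]) -> str:
    """Join papers with '---' separators, as the tool's helper does."""
    papers_str = "\n---\n".join(str(paper) for paper in papers)
    return f"List of papers:\n---\n{papers_str}\n---\n"


# Placeholder data for illustration only.
sample = [Paper("Example title", "Example summary.", "0000.00000", 3)]
print(stringify_papers(sample))
```

Each paper occupies four labelled lines, and entries are separated by `---`, so a caller can split the result on that separator if it needs individual records.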
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | The query term to search for. | |
| top_n | No | The number of papers to return. | 10 |
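To make the schema concrete, here is a minimal sketch of how a client might assemble the arguments for this tool. It assumes the standard MCP `tools/call` JSON-RPC request shape. The helper `build_call_request` and the `request_id` value are hypothetical and only illustrate that `query` is required while `top_n` falls back to 10.

```python
import json


def build_call_request(query: str, top_n: int = 10, request_id: int = 1) -> dict:
    """Build a JSON-RPC payload for the tool (hypothetical helper)."""
    if not query:
        raise ValueError("query is required")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "semantic_search_papers_on_huggingface",
            "arguments": {"query": query, "top_n": top_n},
        },
    }


# Omitting top_n relies on the default of 10.
print(json.dumps(build_call_request("efficient attention for long contexts"), indent=2))
```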
Implementation Reference
- paperpal.py:18-31 (handler): The main handler function for the 'semantic_search_papers_on_huggingface' tool. It is registered via the @mcp.tool() decorator, performs the semantic search using a helper function, formats the list of papers, and returns it as a string.

  ```python
  @mcp.tool()
  async def semantic_search_papers_on_huggingface(query: str, top_n: int = 10) -> str:
      """Search for papers on HuggingFace using semantic search.

      Args:
          query (str): The query term to search for. It will automatically determine if it should use keywords or a natural language query, so format your queries accordingly.
          top_n (int): The number of papers to return. Default is 10, but you can set it to any number.

      Returns:
          str: A list of papers with the title, summary, ID, and upvotes.
      """
      papers: list[HuggingFacePaper] = semantic_search_huggingface_papers(query, top_n)
      return stringify_papers(papers)
  ```
- paperpal.py:11-15 (helper): Helper utility that converts a list of paper objects (Arxiv or HuggingFace) into the formatted string output used by the tool.

  ```python
  def stringify_papers(papers: list[ArxivPaper | HuggingFacePaper]) -> str:
      """Format a list of papers into a string."""
      papers_str = "\n---\n".join([str(paper) for paper in papers])
      return f"List of papers:\n---\n{papers_str}\n---\n"
  ```
- huggingface.py:4-11 (schema): Pydantic BaseModel schema defining the structure of HuggingFace paper data, used in the tool's output.

  ```python
  class HuggingFacePaper(BaseModel):
      title: str
      summary: str
      arxiv_id: str
      upvotes: int

      def __str__(self) -> str:
          return f"Title: {self.title}\nSummary: {self.summary}\nID: {self.arxiv_id}\nUpvotes: {self.upvotes}"
  ```
- huggingface.py:26-40 (helper): Core helper function that queries the HuggingFace papers API with the given query and parses the first top_n results into HuggingFacePaper models. On any exception it returns a single-element list containing an error message string instead of paper objects.

  ```python
  def semantic_search_huggingface_papers(query: str, top_n: int) -> list[HuggingFacePaper]:
      """Search for papers on HuggingFace."""
      url = f"https://huggingface.co/api/papers/search?q={query}"
      try:
          response = httpx.get(url)
          response.raise_for_status()
          papers_json = response.json()
          papers: list[HuggingFacePaper] = [parse_paper(paper) for paper in papers_json[:top_n]]
          return papers
      except Exception as e:
          return [f"Error fetching papers from HuggingFace. Try again later. {e}"]
  ```
- huggingface.py:13-20 (helper): Utility helper that parses a raw JSON dict from the HuggingFace API into a HuggingFacePaper model instance.

  ```python
  def parse_paper(paper: dict) -> HuggingFacePaper:
      """Parse a paper from the HuggingFace API response."""
      return HuggingFacePaper(
          title=paper['paper']["title"],
          summary=paper['paper']["summary"],
          arxiv_id=paper['paper']["id"],
          upvotes=paper['paper']["upvotes"],
      )
  ```
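The search helper above interpolates `query` directly into the URL, so characters such as spaces, `&`, or `#` may not reach the API as intended. As a hedged alternative, not the project's actual code, the sketch below percent-encodes the query with the standard library and parses a response of the nested shape that `parse_paper` expects. The names `build_search_url`, `parse_results`, `Paper`, and the sample payload are hypothetical, and no network request is made.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Endpoint taken from the helper listed above.
SEARCH_ENDPOINT = "https://huggingface.co/api/papers/search"


@dataclass
class Paper:
    """Dependency-free stand-in for HuggingFacePaper (hypothetical name)."""
    title: str
    summary: str
    arxiv_id: str
    upvotes: int


def build_search_url(query: str) -> str:
    """Percent-encode the query so spaces, '&' and '#' survive intact."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query})}"


def parse_results(papers_json: list[dict], top_n: int) -> list[Paper]:
    """Convert the first top_n raw entries into Paper objects."""
    papers = []
    for entry in papers_json[:top_n]:
        raw = entry["paper"]
        papers.append(Paper(raw["title"], raw["summary"], raw["id"], raw["upvotes"]))
    return papers


# Placeholder payload mirroring the nested {"paper": {...}} shape.
sample_json = [
    {"paper": {"title": "Example A", "summary": "First.", "id": "0000.00001", "upvotes": 5}},
    {"paper": {"title": "Example B", "summary": "Second.", "id": "0000.00002", "upvotes": 1}},
]

print(build_search_url("graph neural networks & attention"))
print(parse_results(sample_json, top_n=1))
```

Keeping URL construction and parsing in separate functions lets each be checked without network access. A real client would pass the built URL to its HTTP library of choice.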