search_papers
Search arXiv papers by title, abstract, author, or category using advanced query syntax to find relevant academic research.
Instructions
Search for papers on arXiv by title and abstract content.
You can use advanced search syntax:
- Search in title: ti:"search terms"
- Search in abstract: abs:"search terms"
- Search by author: au:"author name"
- Combine terms with: AND, OR, ANDNOT
- Filter by category: cat:cs.AI (use list_categories tool to see available categories)

Examples:
- "machine learning" (searches all fields)
- ti:"neural networks" AND cat:cs.AI (title with category)
- au:bengio AND ti:"deep learning" (author and title)
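The field prefixes and operators above compose mechanically, so a small helper can assemble a query string before it is passed to search_papers. The following is a minimal sketch and not part of mcp_simple_arxiv: build_query and its parameter names are hypothetical, and it only covers AND-combined clauses.

```python
from typing import Optional


def build_query(
    title: Optional[str] = None,
    abstract: Optional[str] = None,
    author: Optional[str] = None,
    category: Optional[str] = None,
) -> str:
    """Join the supplied fields into one AND-combined arXiv query string.

    Hypothetical helper for illustration; the tool itself takes a raw string.
    """
    clauses = []
    if title:
        clauses.append(f'ti:"{title}"')
    if abstract:
        clauses.append(f'abs:"{abstract}"')
    if author:
        clauses.append(f'au:"{author}"')
    if category:
        # Category codes such as cs.AI contain no spaces, so no quotes are added.
        clauses.append(f"cat:{category}")
    if not clauses:
        raise ValueError("at least one search field is required")
    return " AND ".join(clauses)


print(build_query(title="neural networks", category="cs.AI"))
# prints: ti:"neural networks" AND cat:cs.AI
```

Clauses are emitted in a fixed order (title, abstract, author, category). OR and ANDNOT combinations are left to the caller, since they usually need explicit grouping.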
Input Schema
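| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search query string; supports the advanced search syntax described above | |
| max_results | No | Maximum number of papers to return; the handler caps this at 50 | 10 |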
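To make the max_results behavior concrete: according to the handler and client code quoted on this page, the handler limits the value to 50 and ArxivClient.search limits it again to 2000, so a tool call never yields more than 50 papers. The sketch below mirrors those two min() calls. The name effective_max_results and the sample argument dicts are hypothetical and used only for illustration.

```python
TOOL_CAP = 50    # limit applied in the search_papers handler
API_CAP = 2000   # limit applied in ArxivClient.search


def effective_max_results(requested: int = 10) -> int:
    """Return the result count that actually reaches the arXiv API."""
    # The handler cap is applied first, then the client cap.
    return min(min(requested, TOOL_CAP), API_CAP)


# Hypothetical argument payloads for the tool.
calls = [
    {"query": 'ti:"neural networks" AND cat:cs.AI'},    # uses the default of 10
    {"query": "au:bengio", "max_results": 500},         # clamped to 50
]
for arguments in calls:
    limit = effective_max_results(arguments.get("max_results", 10))
    print(f"{arguments['query']} -> {limit}")
```

Note that min() only enforces an upper bound. The quoted code does not show any check for zero or negative values, so this sketch does not add one either.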
Implementation Reference
- mcp_simple_arxiv/server.py:43-80 (handler) The main handler function for the 'search_papers' tool. It calls ArxivClient.search and formats the results into a readable string with title, authors, ID, categories, published date, and abstract preview.

      async def search_papers(query: str, max_results: int = 10) -> str:
          """
          Search for papers on arXiv by title and abstract content.

          You can use advanced search syntax:
          - Search in title: ti:"search terms"
          - Search in abstract: abs:"search terms"
          - Search by author: au:"author name"
          - Combine terms with: AND, OR, ANDNOT
          - Filter by category: cat:cs.AI (use list_categories tool to see available categories)

          Examples:
          - "machine learning" (searches all fields)
          - ti:"neural networks" AND cat:cs.AI (title with category)
          - au:bengio AND ti:"deep learning" (author and title)
          """
          max_results = min(max_results, 50)
          papers = await arxiv_client.search(query, max_results)

          # Format results in a readable way
          result = "Search Results:\n\n"
          for i, paper in enumerate(papers, 1):
              result += f"{i}. {paper['title']}\n"
              result += f" Authors: {', '.join(paper['authors'])}\n"
              result += f" ID: {paper['id']}\n"
              result += f" Categories: "
              if paper['primary_category']:
                  result += f"Primary: {paper['primary_category']}"
              if paper['categories']:
                  result += f", Additional: {', '.join(paper['categories'])}"
              result += f"\n Published: {paper['published']}\n"

              # Add first sentence of abstract
              abstract_preview = get_first_sentence(paper['summary'])
              result += f" Preview: {abstract_preview}\n"
              result += "\n"

          return result

- mcp_simple_arxiv/server.py:36-42 (registration) Registers the search_papers tool with the FastMCP app, providing a title and hints for the tool schema.

      @app.tool(
          annotations={
              "title": "Search arXiv Papers",
              "readOnlyHint": True,
              "openWorldHint": True
          }
      )

- mcp_simple_arxiv/server.py:43-58 (schema) Type annotations and docstring defining the input parameters (query: str, max_results: int = 10), the advanced search syntax, and the output type (str).

      async def search_papers(query: str, max_results: int = 10) -> str:
          """
          Search for papers on arXiv by title and abstract content.

          You can use advanced search syntax:
          - Search in title: ti:"search terms"
          - Search in abstract: abs:"search terms"
          - Search by author: au:"author name"
          - Combine terms with: AND, OR, ANDNOT
          - Filter by category: cat:cs.AI (use list_categories tool to see available categories)

          Examples:
          - "machine learning" (searches all fields)
          - ti:"neural networks" AND cat:cs.AI (title with category)
          - au:bengio AND ti:"deep learning" (author and title)
          """

- mcp_simple_arxiv/server.py:19-29 (helper) Helper function used to generate abstract previews in search results.

      def get_first_sentence(text: str, max_len: int = 200) -> str:
          """Extract first sentence from text, limiting length."""
          # Look for common sentence endings
          for end in ['. ', '! ', '? ']:
              pos = text.find(end)
              if pos != -1 and pos < max_len:
                  return text[:pos + 1]
          # If no sentence ending found, just take first max_len chars
          if len(text) > max_len:
              return text[:max_len].rstrip() + '...'
          return text

- Core helper method in the ArxivClient class. It performs the actual arXiv API search, respects rate limits, parses the Atom feed response, and returns the list of paper dicts used by the tool handler.

      async def search(self, query: str, max_results: int = 10) -> List[Dict[str, Any]]:
          """
          Search arXiv papers.

          The query string supports arXiv's advanced search syntax:
          - Search in title: ti:"search terms"
          - Search in abstract: abs:"search terms"
          - Search by author: au:"author name"
          - Combine terms with: AND, OR, ANDNOT
          - Filter by category: cat:cs.AI

          Examples:
          - "machine learning" (searches all fields)
          - ti:"neural networks" AND cat:cs.AI (title with category)
          - au:bengio AND ti:"deep learning" (author and title)
          """
          await self._wait_for_rate_limit()

          # Ensure max_results is within API limits
          max_results = min(max_results, 2000)  # API limit: 2000 per request

          params = {
              "search_query": query,
              "max_results": max_results,
              "sortBy": "submittedDate",  # Default to newest papers first
              "sortOrder": "descending",
          }

          async with httpx.AsyncClient(timeout=20.0) as client:
              try:
                  response = await client.get(self.base_url, params=params)
                  response.raise_for_status()  # Raise an exception for bad status codes

                  # Parse the Atom feed response
                  feed = feedparser.parse(response.text)

                  if not isinstance(feed, dict) or 'entries' not in feed:
                      logger.error("Invalid response from arXiv API")
                      logger.debug(f"Response text: {response.text[:1000]}...")
                      raise ValueError("Invalid response from arXiv API")

                  if not feed.get('entries'):
                      # Empty results are ok - return empty list
                      return []

                  return [self._parse_entry(entry) for entry in feed.entries]

              except httpx.HTTPError as e:
                  logger.error(f"HTTP error while searching: {e}")
                  raise ValueError(f"arXiv API HTTP error: {str(e)}")
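
To see what request the search method above roughly sends, the sketch below builds the query URL from the same four parameters using only the standard library. It is an offline illustration under stated assumptions: BASE_URL is assumed to be the public arXiv API endpoint (the excerpt only refers to self.base_url), build_search_url is a hypothetical name, and httpx performs its own parameter encoding, which may differ in detail from urlencode.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the public arXiv API endpoint; the excerpt does not show base_url.
BASE_URL = "http://export.arxiv.org/api/query"


def build_search_url(query: str, max_results: int = 10) -> str:
    """Build a search URL with the same parameters as the search method."""
    params = {
        "search_query": query,
        "max_results": min(max_results, 2000),  # same upper bound as the client
        "sortBy": "submittedDate",              # newest papers first
        "sortOrder": "descending",
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_search_url('ti:"neural networks" AND cat:cs.AI', max_results=5)
print(url)

# Decoding the query string recovers the original parameters.
decoded = parse_qs(urlsplit(url).query)
print(decoded["search_query"][0])
```

Because results are sorted by submission date in descending order, a small max_results returns the most recent matches rather than the most relevant ones.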