
Paper Search MCP

by openags

search_arxiv

Search academic papers from arXiv to find relevant research publications using specific queries and return paper metadata.

Instructions

Search academic papers from arXiv.

Args:
    query: Search query string (e.g., 'machine learning').
    max_results: Maximum number of papers to return (default: 10).

Returns: List of paper metadata in dictionary format.

Input Schema

Name          Required  Description                                      Default
query         Yes       Search query string (e.g., 'machine learning')   (none)
max_results   No        Maximum number of papers to return               10
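As an illustration (not shown on this page), a client invokes this tool with a standard MCP `tools/call` JSON-RPC request; the `arguments` object follows the schema above:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_arxiv",
    "arguments": {
      "query": "machine learning",
      "max_results": 5
    }
  }
}
```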

Implementation Reference

  • The MCP tool wrapper that calls the async_search helper with the arxiv_searcher instance.
    async def search_arxiv(query: str, max_results: int = 10) -> List[Dict]:
        """Search academic papers from arXiv.
    
        Args:
            query: Search query string (e.g., 'machine learning').
            max_results: Maximum number of papers to return (default: 10).
        Returns:
            List of paper metadata in dictionary format.
        """
        papers = await async_search(arxiv_searcher, query, max_results)
        return papers if papers else []
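The `async_search` helper is not shown on this page; a minimal sketch, assuming it offloads the searcher's blocking `search` call to a worker thread and that the `Paper` model exposes a `to_dict()` method (both assumptions), might look like:

```python
import asyncio
from typing import Any, Dict, List

async def async_search(searcher: Any, query: str, max_results: int) -> List[Dict]:
    # Run the blocking searcher.search() in a worker thread so the MCP
    # server's event loop stays responsive during the HTTP round trip.
    papers = await asyncio.to_thread(searcher.search, query, max_results)
    # Serialize each Paper object into a plain dictionary for the client.
    return [paper.to_dict() for paper in papers]
```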
  • The core implementation of the Arxiv search logic using feedparser to parse arXiv API response.
    def search(self, query: str, max_results: int = 10) -> List[Paper]:
        params = {
            'search_query': f'all:{query}',
            'max_results': max_results,
            'sortBy': 'submittedDate',
            'sortOrder': 'descending'
        }
        response = None
        # Retry up to 3 times, backing off on network errors and on
        # transient HTTP statuses (rate limiting, server errors).
        for attempt in range(3):
            try:
                response = self.session.get(self.BASE_URL, params=params, timeout=30)
            except requests.RequestException:
                time.sleep((attempt + 1) * 1.5)
                continue
            if response.status_code == 200:
                break
            if response.status_code in (429, 500, 502, 503, 504):
                time.sleep((attempt + 1) * 1.5)
                continue
            # Non-retryable status code: give up immediately.
            break
    
        if response is None or response.status_code != 200:
            return []
    
        feed = feedparser.parse(response.content)
        papers = []
        for entry in feed.entries:
            try:
                authors = [author.name for author in entry.authors]
                published = datetime.strptime(entry.published, '%Y-%m-%dT%H:%M:%SZ')
                updated = datetime.strptime(entry.updated, '%Y-%m-%dT%H:%M:%SZ')
                pdf_url = next((link.href for link in entry.links if link.type == 'application/pdf'), '')
                
                # Try to extract DOI from entry.doi or links or summary
                doi = entry.get('doi', '') or extract_doi(entry.summary) or extract_doi(entry.id)
                for link in entry.links:
                    if link.get('title') == 'doi':
                        doi = doi or extract_doi(link.href)
    
                papers.append(Paper(
                    paper_id=entry.id.split('/')[-1],
                    title=entry.title,
                    authors=authors,
                    abstract=entry.summary,
                    url=entry.id,
                    pdf_url=pdf_url,
                    published_date=published,
                    updated_date=updated,
                    source='arxiv',
                    categories=[tag.term for tag in entry.tags],
                    keywords=[],
                    doi=doi
                ))
            except Exception as e:
                print(f"Error parsing arXiv entry: {e}")
        return papers
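`extract_doi` is referenced in the loop above but not shown here; a plausible sketch, assuming it returns the first DOI-shaped substring in a string (or an empty string when none is found), could be:

```python
import re

# Crossref-style DOI: "10." + a 4-9 digit registrant code + "/" + suffix.
_DOI_RE = re.compile(r'10\.\d{4,9}/[-._;()/:a-zA-Z0-9]+')

def extract_doi(text: str) -> str:
    """Return the first DOI found in `text`, or '' if there is none."""
    match = _DOI_RE.search(text or '')
    return match.group(0) if match else ''
```

An empty-string return matches the call sites above, which chain results with `or` and expect a falsy value when no DOI is present.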
