
Paper Search MCP Server

by h-lu

search_crossref

Search academic papers across CrossRef's extensive database to find publications by DOI, metadata, or keywords, accessing comprehensive citation information from thousands of publishers.

Instructions

Search academic papers in CrossRef - the largest DOI citation database.

USE THIS TOOL WHEN:
- You need to find papers by DOI or citation metadata
- You want to search across all academic publishers (not just preprints)
- You need publication metadata like journal, volume, issue, citations
- You want to verify if a DOI exists or get its metadata

CrossRef indexes 150M+ scholarly works from thousands of publishers.
Results include DOI, authors, title, abstract, citations, and publisher info.

Args:
    query: Search terms (e.g., 'machine learning', 'CRISPR gene editing').
    max_results: Number of results (default: 10, max: 1000).
    **kwargs: Optional filters:
        - filter: 'has-full-text:true,from-pub-date:2020'
        - sort: 'relevance' | 'published' | 'cited'
        - order: 'asc' | 'desc'

Returns:
    List of paper metadata dicts with keys: paper_id (DOI), title, 
    authors, abstract, doi, published_date, citations, url.

Example:
    search_crossref("attention mechanism transformer", max_results=5)

Input Schema

Name          Required  Description                                         Default
query         Yes       Search terms (e.g., 'machine learning')             —
max_results   No        Number of results (max: 1000)                       10
kwargs        Yes       Optional filters: filter, sort, order               —
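
An invocation of this tool over MCP might look like the following (hypothetical JSON-RPC `tools/call` payload; the argument names follow the schema above, and the filter values come from the tool's docstring):

```json
{
  "method": "tools/call",
  "params": {
    "name": "search_crossref",
    "arguments": {
      "query": "attention mechanism transformer",
      "max_results": 5,
      "kwargs": {
        "filter": "from-pub-date:2020",
        "sort": "cited",
        "order": "desc"
      }
    }
  }
}
```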

Implementation Reference

  • Main handler function for the search_crossref tool. Registered via @mcp.tool() decorator. Includes schema via type hints and docstring. Delegates to _search helper using 'crossref' searcher.
    @mcp.tool()
    async def search_crossref(
        query: str, 
        max_results: int = 10,
        **kwargs
    ) -> List[Dict]:
        """Search academic papers in CrossRef - the largest DOI citation database.
        
        USE THIS TOOL WHEN:
        - You need to find papers by DOI or citation metadata
        - You want to search across all academic publishers (not just preprints)
        - You need publication metadata like journal, volume, issue, citations
        - You want to verify if a DOI exists or get its metadata
        
        CrossRef indexes 150M+ scholarly works from thousands of publishers.
        Results include DOI, authors, title, abstract, citations, and publisher info.
    
        Args:
            query: Search terms (e.g., 'machine learning', 'CRISPR gene editing').
            max_results: Number of results (default: 10, max: 1000).
            **kwargs: Optional filters:
                - filter: 'has-full-text:true,from-pub-date:2020'
                - sort: 'relevance' | 'published' | 'cited'
                - order: 'asc' | 'desc'
        
        Returns:
            List of paper metadata dicts with keys: paper_id (DOI), title, 
            authors, abstract, doi, published_date, citations, url.
        
        Example:
            search_crossref("attention mechanism transformer", max_results=5)
        """
        return await _search('crossref', query, max_results, **kwargs)
  • Global SEARCHERS dictionary registering the CrossRefSearcher instance under 'crossref' key, used by the generic _search function.
    SEARCHERS = {
        'arxiv': ArxivSearcher(),
        'pubmed': PubMedSearcher(),
        'biorxiv': BioRxivSearcher(),
        'medrxiv': MedRxivSearcher(),
        'google_scholar': GoogleScholarSearcher(),
        'iacr': IACRSearcher(),
        'semantic': SemanticSearcher(),
        'crossref': CrossRefSearcher(),
        'repec': RePECSearcher(),
    }
  • Generic _search helper function that retrieves the searcher instance and calls its search method, converting results to dicts.
    async def _search(
        searcher_name: str, 
        query: str, 
        max_results: int = 10,
        **kwargs
    ) -> List[Dict]:
        """Generic search function."""
        searcher = SEARCHERS.get(searcher_name)
        if not searcher:
            logger.error(f"Unknown searcher: {searcher_name}")
            return []
        
        try:
            papers = searcher.search(query, max_results=max_results, **kwargs)
            return [paper.to_dict() for paper in papers]
        except Exception as e:
            logger.error(f"Search failed for {searcher_name}: {e}")
            return []
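
The `_search` helper converts each result via `Paper.to_dict()`. The actual `Paper` class lives in `paper.py` and is not shown here; a minimal sketch of what such a model might look like, assuming fields matching the documented return keys:

```python
from dataclasses import dataclass, field, asdict
from typing import List

@dataclass
class Paper:
    """Hypothetical paper record mirroring the documented return keys."""
    paper_id: str          # DOI used as the canonical identifier
    title: str
    authors: List[str] = field(default_factory=list)
    abstract: str = ""
    doi: str = ""
    published_date: str = ""
    citations: int = 0
    url: str = ""

    def to_dict(self) -> dict:
        # dataclasses.asdict flattens the record into a plain dict
        return asdict(self)

p = Paper(paper_id="10.1000/xyz123", title="Example", doi="10.1000/xyz123")
print(p.to_dict()["paper_id"])  # -> 10.1000/xyz123
```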
  • Core search implementation in CrossRefSearcher class: makes API request to CrossRef /works endpoint, handles retries, parses responses into Paper objects.
    def search(self, query: str, max_results: int = 10, **kwargs) -> List[Paper]:
        """
        Search CrossRef database for papers.
        
        Args:
            query: Search query string
            max_results: Maximum number of results to return (default: 10)
            **kwargs: Additional parameters like filters, sort, etc.
            
        Returns:
            List of Paper objects
        """
        try:
            params = {
                'query': query,
                'rows': min(max_results, 1000),  # CrossRef API max is 1000
                'sort': 'relevance',
                'order': 'desc'
            }
            
            # Add any additional filters from kwargs
            if 'filter' in kwargs:
                params['filter'] = kwargs['filter']
            if 'sort' in kwargs:
                params['sort'] = kwargs['sort']
            if 'order' in kwargs:
                params['order'] = kwargs['order']
                
            # Polite Pool parameter (identifies the caller to CrossRef)
            if self.mailto:
                params['mailto'] = self.mailto
            
            url = f"{self.BASE_URL}/works"
            response = self._make_request(url, params)
            
            if not response:
                return []
            data = response.json()
            
            papers = []
            items = data.get('message', {}).get('items', [])
            
            for item in items:
                try:
                    paper = self._parse_crossref_item(item)
                    if paper:
                        papers.append(paper)
                except Exception as e:
                    logger.warning(f"Error parsing CrossRef item: {e}")
                    continue
                    
            return papers
            
        except requests.RequestException as e:
            logger.error(f"Error searching CrossRef: {e}")
            return []
        except Exception as e:
            logger.error(f"Unexpected error in CrossRef search: {e}")
            return []
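
`_parse_crossref_item` is referenced above but not shown. A sketch of what parsing one CrossRef `message.items` entry involves (hypothetical helper returning a plain dict; field names follow the CrossRef REST API's JSON shape, where `title` is a list and each author carries `given`/`family` keys):

```python
from typing import Optional

def parse_crossref_item(item: dict) -> Optional[dict]:
    """Flatten one CrossRef /works item into the documented result shape."""
    doi = item.get("DOI")
    if not doi:
        return None  # skip records without a DOI
    # CrossRef returns 'title' as a list of strings (usually length 1)
    title = (item.get("title") or [""])[0]
    authors = [
        " ".join(p for p in (a.get("given"), a.get("family")) if p)
        for a in item.get("author", [])
    ]
    # Publication date arrives as nested 'date-parts': [[year, month, day]]
    parts = (item.get("issued", {}).get("date-parts") or [[None]])[0]
    published = "-".join(str(p) for p in parts if p is not None)
    return {
        "paper_id": doi,
        "title": title,
        "authors": authors,
        "abstract": item.get("abstract", ""),
        "doi": doi,
        "published_date": published,
        "citations": item.get("is-referenced-by-count", 0),
        "url": item.get("URL", f"https://doi.org/{doi}"),
    }
```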
  • Import of CrossRefSearcher class used by the server.
    import logging
    import os

    from .academic_platforms.crossref import CrossRefSearcher
    from .academic_platforms.repec import RePECSearcher
    from .academic_platforms.sci_hub import SciHubFetcher, check_paper_year
    from .paper import Paper
    
    # ============================================================
    # Configuration
    # ============================================================
    # PDF download directory; configurable via the PAPER_DOWNLOAD_PATH environment variable.
    # Defaults to paper_downloads under the user's home directory (cross-platform):
    # - macOS: ~/paper_downloads
    # - Linux: ~/paper_downloads
    # - Windows: C:\Users\<username>\paper_downloads
    from pathlib import Path
    
    def get_download_path() -> str:
        """Get the download path (cross-platform).

        Note: this function recomputes the path on every call, so that:
        1. Changes to the PAPER_DOWNLOAD_PATH environment variable take effect
        2. The HOME directory is resolved correctly when MCP runs in different environments
        """
        env_path = os.environ.get("PAPER_DOWNLOAD_PATH")
        if env_path:
            return env_path
        # Use Path.home() to get the user's home directory on any platform
        return str(Path.home() / "paper_downloads")
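
The environment-variable override can be exercised like this (standalone sketch that replicates `get_download_path` so it runs on its own):

```python
import os
from pathlib import Path

def get_download_path() -> str:
    """Recomputed on every call so PAPER_DOWNLOAD_PATH changes take effect."""
    env_path = os.environ.get("PAPER_DOWNLOAD_PATH")
    if env_path:
        return env_path
    return str(Path.home() / "paper_downloads")

# Without the variable, fall back to the home-directory default
os.environ.pop("PAPER_DOWNLOAD_PATH", None)
default = get_download_path()

# With the variable set, the override wins on the very next call
os.environ["PAPER_DOWNLOAD_PATH"] = "/tmp/papers"
overridden = get_download_path()
print(overridden)  # -> /tmp/papers
```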
    
    # ============================================================
    # Logging configuration
    # ============================================================
    logger = logging.getLogger(__name__)
    
    # ============================================================
    # MCP server initialization
    # ============================================================
    mcp = FastMCP("paper_search_server")
    
    # ============================================================
    # Searcher instances (singletons)
    # ============================================================
    SEARCHERS = {
        'arxiv': ArxivSearcher(),
        'pubmed': PubMedSearcher(),
        'biorxiv': BioRxivSearcher(),
        'medrxiv': MedRxivSearcher(),
        'google_scholar': GoogleScholarSearcher(),
        'iacr': IACRSearcher(),
        'semantic': SemanticSearcher(),
        'crossref': CrossRefSearcher(),
        'repec': RePECSearcher(),
    }
