Paper Search MCP Server

by h-lu

search_google_scholar

Search academic papers across all disciplines on Google Scholar, returning citation counts and covering topics not available in specialized databases.

Instructions

Search academic papers on Google Scholar (broad coverage).

USE THIS TOOL WHEN:
- You need broad academic search across ALL disciplines
- You want citation counts and "cited by" information
- Other specialized tools don't cover your topic

COVERAGE: All academic disciplines, books, theses, patents.

LIMITATIONS:
- Uses web scraping (may be rate-limited)
- Does NOT support PDF download

FOR FULL TEXT (try in order):
1. download_arxiv(id) - if arXiv preprint
2. download_scihub(doi) - if published before 2023
3. download_semantic(id) - last resort

Args:
    query: Search terms (any academic topic).
    max_results: Number of results (default: 10, keep small to avoid blocks).

Returns:
    List of paper dicts with: title, authors, abstract snippet,
    citations count, url, source.

Example:
    search_google_scholar("climate change economic impact", max_results=5)
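
Based on the Returns description above, a single entry in the returned list might have the following shape. This is an illustration only: the keys mirror the fields listed, and every value is a placeholder, not real search output.

```python
# Illustrative result entry; keys follow the Returns section, values are placeholders.
result = {
    "title": "Example paper title",
    "authors": ["A. Author", "B. Author"],
    "abstract": "Snippet of the abstract as shown on the results page...",
    "citations": 0,
    "url": "https://example.org/paper",
    "source": "google_scholar",
}
```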

Input Schema

Name          Required  Description                                           Default
query         Yes       Search terms (any academic topic).                    (none)
max_results   No        Number of results (keep small to avoid rate limits).  10

Implementation Reference

  • MCP tool handler for search_google_scholar. Decorated with @mcp.tool() for registration. Delegates to generic _search using GoogleScholarSearcher.
    @mcp.tool()
    async def search_google_scholar(query: str, max_results: int = 10) -> List[Dict]:
        """Search academic papers on Google Scholar (broad coverage).
        
        USE THIS TOOL WHEN:
        - You need broad academic search across ALL disciplines
        - You want citation counts and "cited by" information
        - Other specialized tools don't cover your topic
        
        COVERAGE: All academic disciplines, books, theses, patents.
        
        LIMITATIONS:
        - Uses web scraping (may be rate-limited)
        - Does NOT support PDF download
        
        FOR FULL TEXT (try in order):
        1. download_arxiv(id) - if arXiv preprint
        2. download_scihub(doi) - if published before 2023
        3. download_semantic(id) - last resort
        
        Args:
            query: Search terms (any academic topic).
            max_results: Number of results (default: 10, keep small to avoid blocks).
        
        Returns:
            List of paper dicts with: title, authors, abstract snippet,
            citations count, url, source.
        
        Example:
            search_google_scholar("climate change economic impact", max_results=5)
        """
        return await _search('google_scholar', query, max_results)
  • Core implementation of the search logic in GoogleScholarSearcher class, performing HTTP requests to Google Scholar, parsing HTML with BeautifulSoup, and extracting paper metadata.
    def search(self, query: str, max_results: int = 10) -> List[Paper]:
        """
        Search Google Scholar with custom parameters
        """
        papers = []
        start = 0
        results_per_page = min(10, max_results)
    
        while len(papers) < max_results:
            try:
                # Construct search parameters
                params = {
                    'q': query,
                    'start': start,
                    'hl': 'en',
                    'as_sdt': '0,5'  # Include articles and citations
                }
    
                # Make request with random delay
                time.sleep(random.uniform(1.0, 3.0))
                response = self.session.get(self.SCHOLAR_URL, params=params)
                
                if response.status_code != 200:
                    logger.error(f"Search failed with status {response.status_code}")
                    break
    
                # Parse results
                soup = BeautifulSoup(response.text, 'html.parser')
                results = soup.find_all('div', class_='gs_ri')
    
                if not results:
                    break
    
                # Process each result
                for item in results:
                    if len(papers) >= max_results:
                        break
                        
                    paper = self._parse_paper(item)
                    if paper:
                        papers.append(paper)
    
                start += results_per_page
    
            except Exception as e:
                logger.error(f"Search error: {e}")
                break
    
        return papers[:max_results]
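The loop above delegates per-result extraction to `_parse_paper`, which is not shown on this page. The sketch below is a hypothetical reconstruction of that step, assuming Google Scholar's usual markup (`gs_rt` for the title, `gs_rs` for the snippet, and a "Cited by N" link in `gs_fl`); the real method returns a `Paper` object rather than a plain dict.

```python
import re
from typing import Optional
from bs4 import BeautifulSoup

def parse_scholar_result(item) -> Optional[dict]:
    """Hypothetical sketch: extract metadata from one 'gs_ri' result div."""
    title_tag = item.find('h3', class_='gs_rt')
    if title_tag is None:
        return None
    link = title_tag.find('a')
    title = title_tag.get_text(' ', strip=True)
    url = link['href'] if link else ''

    # Abstract snippet, when Scholar renders one.
    snippet_tag = item.find('div', class_='gs_rs')
    snippet = snippet_tag.get_text(' ', strip=True) if snippet_tag else ''

    # "Cited by N" appears as one of the footer links.
    citations = 0
    for a in item.select('div.gs_fl a'):
        m = re.match(r'Cited by (\d+)', a.get_text(strip=True))
        if m:
            citations = int(m.group(1))
            break

    return {'title': title, 'url': url, 'abstract': snippet,
            'citations': citations, 'source': 'google_scholar'}
```

Because this path depends on scraping Scholar's current HTML, any class name above may drift; treat it as a starting point, not the server's exact implementation.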
  • Instantiation and registration of GoogleScholarSearcher() in the global SEARCHERS dictionary, referenced by the _search helper function.
    SEARCHERS = {
        'arxiv': ArxivSearcher(),
        'pubmed': PubMedSearcher(),
        'biorxiv': BioRxivSearcher(),
        'medrxiv': MedRxivSearcher(),
        'google_scholar': GoogleScholarSearcher(),
        'iacr': IACRSearcher(),
        'semantic': SemanticSearcher(),
        'crossref': CrossRefSearcher(),
        'repec': RePECSearcher(),
    }
  • Generic _search helper function that invokes the specific searcher.search() method and converts Paper objects to dicts for the tool response.
    async def _search(
        searcher_name: str, 
        query: str, 
        max_results: int = 10,
        **kwargs
    ) -> List[Dict]:
        """Generic search helper."""
        searcher = SEARCHERS.get(searcher_name)
        if not searcher:
            logger.error(f"Unknown searcher: {searcher_name}")
            return []
        
        try:
            papers = searcher.search(query, max_results=max_results, **kwargs)
            return [paper.to_dict() for paper in papers]
        except Exception as e:
            logger.error(f"Search failed for {searcher_name}: {e}")
            return []
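The `Paper` class and its `to_dict()` method are referenced here but not shown on this page. A minimal container consistent with the fields the tool returns could look like the sketch below; this is an assumption about its shape, and the real class likely carries more metadata (DOI, year, PDF url, etc.).

```python
# Hypothetical minimal Paper container; field names mirror the tool's
# documented return dict, not necessarily the project's actual class.
from dataclasses import dataclass, asdict, field
from typing import Dict, List

@dataclass
class Paper:
    title: str
    authors: List[str] = field(default_factory=list)
    abstract: str = ''
    citations: int = 0
    url: str = ''
    source: str = 'google_scholar'

    def to_dict(self) -> Dict:
        # _search calls to_dict() on each result before returning it.
        return asdict(self)
```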

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/h-lu/paper-search-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.