Paper Search MCP

by openags

search_citeseerx

Search academic papers from the CiteSeerX digital library to find relevant research publications for your query.

Instructions

Search academic papers from the CiteSeerX digital library.

Args:
  • query: Search query string (e.g., 'machine learning').
  • max_results: Maximum number of papers to return (default: 10).

Returns: List of paper metadata in dictionary format.

Input Schema

Name         Required  Description                                       Default
query        Yes       Search query string (e.g., 'machine learning').   -
max_results  No        Maximum number of papers to return.               10
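
The two parameters above could be written as a JSON Schema roughly as follows. This is a sketch inferred from the handler signature shown in the implementation reference, not the schema the server actually publishes; `INPUT_SCHEMA` and `check_arguments` are hypothetical names used only for illustration.

```python
from typing import Any, Dict

# Hypothetical schema inferred from the handler signature
# search_citeseerx(query: str, max_results: int = 10).
INPUT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "max_results": {"type": "integer", "default": 10},
    },
    "required": ["query"],
}


def check_arguments(args: Dict[str, Any]) -> Dict[str, Any]:
    """Validate tool arguments by hand and fill in the default for max_results."""
    if not isinstance(args.get("query"), str):
        raise ValueError("'query' is required and must be a string")
    default = INPUT_SCHEMA["properties"]["max_results"]["default"]
    max_results = args.get("max_results", default)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("'max_results' must be an integer")
    return {"query": args["query"], "max_results": max_results}
```

Calling `check_arguments({"query": "machine learning"})` fills in `max_results` with the default of 10, mirroring the optional parameter in the table.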

Implementation Reference

  • Tool definition and handler for 'search_citeseerx'. It uses 'async_search' with 'citeseerx_searcher'.
    async def search_citeseerx(query: str, max_results: int = 10) -> List[Dict]:
        """Search academic papers from CiteSeerX digital library.
    
        Args:
            query: Search query string (e.g., 'machine learning').
            max_results: Maximum number of papers to return (default: 10).
        Returns:
            List of paper metadata in dictionary format.
        """
        papers = await async_search(citeseerx_searcher, query, max_results)
        return papers if papers else []
  • The actual implementation of the CiteSeerX search logic, within the 'CiteSeerXSearcher' class.
    def search(self, query: str, max_results: int = 10, **kwargs) -> List[Paper]:
        """
        Search CiteSeerX for computer science papers.
    
        Args:
            query: Search query string
            max_results: Maximum results to return (default: 10)
            **kwargs: Additional parameters:
                - year: Filter by publication year
                - author: Filter by author name
                - venue: Filter by conference/journal venue
                - min_citations: Minimum citation count
                - sort: Sort by 'relevance', 'date', 'citations'
    
        Returns:
            List of Paper objects
        """
        papers = []
    
        try:
            # Prepare parameters for CiteSeerX API
            params = {
                'q': query,
                'max': min(max_results, 100),  # request at most 100 results
                'start': 0,
                'sort': kwargs.get('sort', 'relevance')
            }
    
            # Add filters
            if 'year' in kwargs:
                year = kwargs['year']
                if isinstance(year, str) and '-' in year:
                    # Handle year range
                    year_range = year.split('-')
                    if len(year_range) == 2:
                        params['year'] = f"{year_range[0]}-{year_range[1]}"
                else:
                    params['year'] = str(year)
    
            if 'author' in kwargs:
                params['author'] = kwargs['author']
    
            if 'venue' in kwargs:
                params['venue'] = kwargs['venue']
    
            if 'min_citations' in kwargs:
                params['minCitations'] = kwargs['min_citations']
    
            logger.debug(f"Searching CiteSeerX with params: {params}")
    
            response = self._get(self.SEARCH_API, params=params)
            response.raise_for_status()
    
            data = response.json()
    
            # Hits are read from the nested result -> hits -> hit field
            results = data.get('result', {}).get('hits', {}).get('hit', [])
    
            # Handle single result (API returns dict instead of list for single result)
            if isinstance(results, dict):
                results = [results]
    
            for result in results:
                try:
                    paper = self._parse_citeseerx_result(result)
                    if paper:
                        papers.append(paper)
                        if len(papers) >= max_results:
                            break
                except Exception as e:
                    logger.warning(f"Error parsing CiteSeerX result: {e}")
                    continue
    
            logger.info(f"Found {len(papers)} papers from CiteSeerX for query: {query}")
    
        except json.JSONDecodeError as e:
            # Checked first: recent versions of requests raise a JSON decode error
            # that is also a RequestException, so the order of these clauses matters.
            logger.error(f"Failed to parse CiteSeerX JSON response: {e}")
        except requests.RequestException as e:
            logger.error(f"CiteSeerX API request error: {e}")
            if hasattr(e, 'response') and e.response is not None:
                logger.error(f"Response status: {e.response.status_code}")
                if e.response.status_code == 429:
                    logger.warning("CiteSeerX rate limit exceeded")
        except Exception as e:
            logger.error(f"Unexpected error in CiteSeerX search: {e}")
    
        return papers
  • Task registration mapping 'search_citeseerx' in the server.
    task_map[source] = search_citeseerx(query, max_results_per_source)
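
The handler awaits `async_search`, and the server stores an un-awaited coroutine in `task_map`, which suggests that sources are queried concurrently. The page shows neither `async_search` nor how `task_map` is consumed, so the following is only a sketch under assumptions: `async_search` is assumed to run the blocking `search` method in a worker thread (via `asyncio.to_thread`, Python 3.9+), `FakeSearcher` stands in for `CiteSeerXSearcher` so that nothing touches the network, and `search_fake` and `run_all` are hypothetical helpers.

```python
import asyncio
from typing import Any, Awaitable, Dict, List


class FakeSearcher:
    """Stand-in for CiteSeerXSearcher; returns canned results without network access."""

    def search(self, query: str, max_results: int = 10) -> List[Dict[str, Any]]:
        return [{"title": f"{query} paper {i}"} for i in range(max_results)]


async def async_search(searcher: Any, query: str, max_results: int) -> List[Dict[str, Any]]:
    # Assumption: the blocking search call is moved off the event loop.
    results = await asyncio.to_thread(searcher.search, query, max_results)
    return list(results)


async def search_fake(query: str, max_results: int = 10) -> List[Dict[str, Any]]:
    # Same shape as the handler above: fall back to an empty list.
    papers = await async_search(FakeSearcher(), query, max_results)
    return papers if papers else []


async def run_all(
    task_map: Dict[str, Awaitable[List[Dict[str, Any]]]]
) -> Dict[str, List[Dict[str, Any]]]:
    # Await every stored coroutine concurrently; a failing source yields an empty list.
    outcomes = await asyncio.gather(*task_map.values(), return_exceptions=True)
    return {
        source: ([] if isinstance(outcome, BaseException) else outcome)
        for source, outcome in zip(task_map.keys(), outcomes)
    }


async def main() -> Dict[str, List[Dict[str, Any]]]:
    # Calling the coroutine function without await stores a coroutine object,
    # as in the task registration line above.
    task_map = {"citeseerx": search_fake("machine learning", 3)}
    return await run_all(task_map)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Passing `return_exceptions=True` to `asyncio.gather` is a design choice in this sketch, not something the source confirms: it keeps one failing source from cancelling the others.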

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/openags/paper-search-mcp'
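
The same request can be sketched in Python using only the standard library. The URL is the one from the curl command above; `server_url` and `fetch_server` are hypothetical helper names, and the code assumes the endpoint returns JSON without assuming anything about the fields in the response.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> Any:
    """GET the server record and decode it as JSON (assumed response format)."""
    request = urllib.request.Request(
        server_url(owner, name),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Performs a real network request when run as a script.
    print(fetch_server("openags", "paper-search-mcp"))
```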

If you have feedback or need assistance with the MCP directory API, please join our Discord server.