search_papers

Search and filter arXiv papers by query, date range, category, and citation count. Results can be returned directly or saved to a file for research and analysis.

Instructions

Search for papers on arXiv with advanced filtering

Input Schema

| Name          | Required | Description                        | Default |
| ------------- | -------- | ---------------------------------- | ------- |
| categories    | No       |                                    |         |
| date_from     | No       |                                    |         |
| date_to       | No       |                                    |         |
| max_results   | No       |                                    |         |
| min_citations | No       | Minimum citation count filter      |         |
| query         | Yes      |                                    |         |
| save_to_file  | No       | Optional file path to save results |         |

Input Schema (JSON Schema)

{ "properties": { "categories": { "items": { "type": "string" }, "type": "array" }, "date_from": { "type": "string" }, "date_to": { "type": "string" }, "max_results": { "type": "integer" }, "min_citations": { "description": "Minimum citation count filter", "type": "integer" }, "query": { "type": "string" }, "save_to_file": { "description": "Optional file path to save results", "type": "string" } }, "required": [ "query" ], "type": "object" }

Implementation Reference

  • The primary handler function for the 'search_papers' tool. It parses the input arguments, builds an advanced arXiv search query with automatic field specifiers, applies category and date filters, retrieves and processes results with the arxiv library, optionally saves them to a file, and returns the formatted JSON as text content.
    async def handle_search(arguments: Dict[str, Any]) -> List[types.TextContent]:
        """Handle paper search requests.

        Automatically adds field specifiers to plain queries for better relevance.
        This fixes issue #33 where queries sorted by date returned irrelevant results.
        """
        try:
            client = arxiv.Client()
            max_results = min(int(arguments.get("max_results", 10)), settings.MAX_RESULTS)

            # Build search query with category filtering
            query = arguments["query"]

            # Add field specifier if not already present
            # This ensures the query actually searches the content
            if not any(field in query for field in ["all:", "ti:", "abs:", "au:", "cat:"]):
                # Convert plain query to use all: field for better results
                # Handle quoted phrases
                if '"' in query:
                    # Keep quoted phrases intact
                    query = f"all:{query}"
                else:
                    # For unquoted multi-word queries, use AND operator
                    terms = query.split()
                    if len(terms) > 1:
                        query = " AND ".join(f"all:{term}" for term in terms)
                    else:
                        query = f"all:{query}"

            if categories := arguments.get("categories"):
                category_filter = " OR ".join(f"cat:{cat}" for cat in categories)
                query = f"({query}) AND ({category_filter})"

            # Parse dates for query construction
            date_from = None
            date_to = None
            try:
                if "date_from" in arguments:
                    date_from = parser.parse(arguments["date_from"]).replace(tzinfo=timezone.utc)
                if "date_to" in arguments:
                    date_to = parser.parse(arguments["date_to"]).replace(tzinfo=timezone.utc)
            except (ValueError, TypeError) as e:
                return [
                    types.TextContent(
                        type="text", text=f"Error: Invalid date format - {str(e)}"
                    )
                ]

            # Add date range to query if specified
            # Note: arXiv API date filtering is limited, so we rely mainly on post-processing
            # We can try to use lastUpdatedDate format but it's not always reliable
            if date_from or date_to:
                # For now, we'll rely on post-processing filtering
                # The arXiv API doesn't have reliable date range queries in search
                pass

            search = arxiv.Search(
                query=query,
                max_results=max_results,
                sort_by=arxiv.SortCriterion.SubmittedDate,
            )

            # Process results
            results = []
            for paper in client.results(search):
                # Additional date filtering for edge cases (API date query might not be precise)
                if date_from or date_to:
                    if not _is_within_date_range(paper.published, date_from, date_to):
                        continue

                results.append(_process_paper(paper))
                if len(results) >= max_results:
                    break

            response_data = {"total_results": len(results), "papers": results}

            # Save to file if requested
            if save_file := arguments.get("save_to_file"):
                save_success = save_results_to_file(response_data, save_file)
                if save_success:
                    response_data["saved_to"] = save_file
                else:
                    response_data["save_error"] = f"Failed to save to {save_file}"

            return [
                types.TextContent(type="text", text=json.dumps(response_data, indent=2))
            ]
        except Exception as e:
            return [types.TextContent(type="text", text=f"Error: {str(e)}")]
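
    For instance, the field-specifier logic turns a plain query like quantum computing into all:quantum AND all:computing, while a quoted query like "quantum computing" becomes all:"quantum computing". The handler also relies on two helpers not shown on this page: _process_paper, which converts an arxiv.Result into a plain dictionary, and _is_within_date_range, which performs the post-processing date filter. A minimal sketch of the latter, assuming the timezone-aware datetimes produced above (not the project's actual implementation):

    from datetime import datetime
    from typing import Optional

    def _is_within_date_range(
        published: datetime,
        date_from: Optional[datetime],
        date_to: Optional[datetime],
    ) -> bool:
        """Return True if published falls inside the optional [date_from, date_to] window."""
        # arxiv.Result.published is timezone-aware (UTC), matching the
        # tzinfo=timezone.utc applied to the parsed bounds in handle_search.
        if date_from and published < date_from:
            return False
        if date_to and published > date_to:
            return False
        return True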
  • Defines the input schema and metadata for the 'search_papers' tool, including query (required), max_results, date ranges, categories, save_to_file, and min_citations. Note that min_citations is declared in the schema but is not read anywhere in the handle_search handler shown above.
    search_tool = types.Tool(
        name="search_papers",
        description="Search for papers on arXiv with advanced filtering",
        inputSchema={
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "max_results": {"type": "integer"},
                "date_from": {"type": "string"},
                "date_to": {"type": "string"},
                "categories": {"type": "array", "items": {"type": "string"}},
                "save_to_file": {
                    "type": "string",
                    "description": "Optional file path to save results",
                },
                "min_citations": {
                    "type": "integer",
                    "description": "Minimum citation count filter",
                },
            },
            "required": ["query"],
        },
    )
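
    Because inputSchema is plain JSON Schema, argument payloads can be checked before dispatch. A sketch using the third-party jsonschema package (an assumption; the server shown here does not do this itself):

    from jsonschema import ValidationError, validate

    try:
        # Validate a candidate argument payload against the tool's declared schema.
        validate(
            instance={"query": "graph neural networks", "max_results": 5},
            schema=search_tool.inputSchema,
        )
    except ValidationError as e:
        print(f"Invalid arguments: {e.message}")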
  • Registers the 'search_papers' tool (as search_tool) in the MCP server's list of available tools.
    @server.list_tools()
    async def list_tools() -> List[types.Tool]:
        """List available arXiv research tools."""
        return [search_tool, download_tool, list_tool, read_tool]
  • The MCP server tool dispatcher that routes calls to 'search_papers' to the handle_search function.
    @server.call_tool()
    async def call_tool(name: str, arguments: Dict[str, Any]) -> List[types.TextContent]:
        """Handle tool calls for arXiv research functionality."""
        logger.debug(f"Calling tool {name} with arguments {arguments}")
        try:
            if name == "search_papers":
                return await handle_search(arguments)
            elif name == "download_paper":
                return await handle_download(arguments)
            elif name == "list_papers":
                return await handle_list_papers(arguments)
            elif name == "read_paper":
                return await handle_read_paper(arguments)
            else:
                return [types.TextContent(type="text", text=f"Error: Unknown tool {name}")]
        except Exception as e:
            logger.error(f"Tool error: {str(e)}")
            return [types.TextContent(type="text", text=f"Error: {str(e)}")]
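
    The dispatcher can also be exercised directly, which is handy in tests; a minimal sketch with illustrative arguments:

    import asyncio

    async def demo() -> None:
        # Route a search through the dispatcher exactly as the MCP server would.
        contents = await call_tool(
            "search_papers",
            {"query": "diffusion models", "max_results": 3},
        )
        for content in contents:
            print(content.text)

    asyncio.run(demo())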
  • Helper function to save search results to a JSON file, used optionally by the handler.
    def save_results_to_file(results: Dict[str, Any], file_path: str) -> bool:
        """Save search results to a JSON file."""
        # Annotated as Dict rather than List: handle_search passes response_data,
        # which is a dictionary of the form {"total_results": ..., "papers": [...]}.
        try:
            output_path = Path(file_path)
            output_path.parent.mkdir(parents=True, exist_ok=True)
            with open(output_path, 'w', encoding='utf-8') as f:
                json.dump(results, f, indent=2, ensure_ascii=False)
            return True
        except Exception as e:
            print(f"Error saving to file: {e}")
            return False
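
    Used standalone, the helper creates any missing parent directories before writing; a quick illustration (the path is hypothetical):

    ok = save_results_to_file(
        {"total_results": 1, "papers": [{"title": "Example"}]},
        "output/search/results.json",  # parent directories are created on demand
    )
    print("saved" if ok else "save failed")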


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/wr-web/APR'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.