search_arxiv

Search arXiv for academic papers by query. Each result carries the paper's title, summary, authors, publication dates, categories, and links.

Instructions

Search arXiv for academic papers

Input Schema

Name	Required	Description	Default
`query`	Yes	Search query	none
`maxResults`	No	Maximum number of results to return; the handler caps values above 50 at 50	10 (applied by the handler)
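
The handler listed under Implementation Reference falls back to 10 results when `maxResults` is omitted and never returns more than 50. A minimal sketch of that normalization, assuming a hypothetical `normalizeArxivArgs` helper and `SearchArxivArgs` type that are not part of the project:

```typescript
// Hypothetical helper (not in the project) mirroring how the handler treats
// its arguments: query falls back to '', maxResults to 10, capped at 50.
interface SearchArxivArgs {
  query?: string;
  maxResults?: number;
}

function normalizeArxivArgs(args: SearchArxivArgs): { query: string; maxResults: number } {
  const query = args.query || '';
  // `||` means 0 (or any other falsy value) also falls back to the default of 10.
  const maxResults = Math.min(args.maxResults || 10, 50);
  return { query, maxResults };
}

console.log(normalizeArxivArgs({ query: 'graph neural networks' }));
console.log(normalizeArxivArgs({ query: 'diffusion models', maxResults: 200 }));
```

The first call yields `maxResults: 10` and the second `maxResults: 50`. Because the fallback uses `||`, passing `maxResults: 0` behaves like omitting it.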

Implementation Reference

src/tools/academic/simple.ts:14-194 (handler)
Complete tool registration block containing the handler (execute function), schema, and core implementation of search_arxiv. Handles arXiv API calls with retry logic, XML parsing, error handling, search engine fallback, and result formatting.
registry.registerTool({
  name: 'search_arxiv',
  description: 'Search arXiv for academic papers',
  category: 'academic',
  source: 'arxiv.org',
  inputSchema: {
    type: 'object',
    properties: {
      query: { type: 'string', description: 'Search query' },
      maxResults: { type: 'number', description: 'Maximum results to return' }
    },
    required: ['query']
  },
  execute: async (args: ToolInput): Promise<ToolOutput> => {
    const query = args.query || '';
    const maxResults = Math.min(args.maxResults || 10, 50); // Limit to 50 results

    // Declare lastError at function scope
    let lastError: any = null;

    try {
      const startTime = Date.now();

      // Try arXiv API with enhanced retry mechanism
      let results = [];
      let apiSuccess = false;

      // Try multiple endpoints with different configurations
      const apiConfigs = [
        {
          url: 'https://export.arxiv.org/api/query',
          timeout: 20000,
          headers: {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
            'Accept': 'application/atom+xml'
          }
        },
        {
          url: 'http://export.arxiv.org/api/query',
          timeout: 15000,
          headers: {
            'User-Agent': 'Open-Search-MCP/1.0',
            'Accept': 'application/atom+xml'
          }
        },
        {
          url: 'https://arxiv.org/api/query',
          timeout: 10000,
          headers: {
            'User-Agent': 'Open-Search-MCP/1.0',
            'Accept': 'application/atom+xml'
          }
        }
      ];

      for (const config of apiConfigs) {
        for (let attempt = 0; attempt < 3; attempt++) {
          try {
            const params = {
              search_query: `all:${encodeURIComponent(query)}`,
              start: 0,
              max_results: maxResults,
              sortBy: 'relevance',
              sortOrder: 'descending'
            };

            const response = await axios.get(config.url, {
              params,
              timeout: config.timeout,
              headers: config.headers,
              maxRedirects: 5,
              validateStatus: (status) => status < 500 // Accept 4xx but retry on 5xx
            });

            if (response.status === 200 && response.data) {
              // Parse XML response
              const xmlData = response.data;
              results = parseArxivXML(xmlData);
              if (results.length > 0) {
                apiSuccess = true;
                break;
              }
            }
          } catch (apiError) {
            lastError = apiError;
            // Wait before retry
            if (attempt < 2) {
              await new Promise(resolve => setTimeout(resolve, 1000 * (attempt + 1)));
            }
          }
        }
        if (apiSuccess) break;
      }

      // If API fails, try search engine as fallback
      if (!apiSuccess || results.length === 0) {
        try {
          console.log('arXiv API failed, trying search engine fallback...');
          const searchQuery = `site:arxiv.org "${query}" filetype:pdf`;
          const searchEngine = await import('../../engines/search-engine-manager.js');
          const searchResults = await searchEngine.SearchEngineManager.getInstance().search(searchQuery, {
            maxResults: maxResults * 2,
            timeout: 10000
          });

          if (searchResults && searchResults.results && searchResults.results.length > 0) {
            results = extractArxivResultsFromSearch(searchResults.html || '', query);
            console.log(`Found ${results.length} results from search engine fallback`);
          }
        } catch (searchError) {
          console.log('Search engine fallback also failed:', searchError);
        }
      }

      const searchTime = Date.now() - startTime;

      // If no results found, provide helpful error message
      if (results.length === 0) {
        return {
          success: false,
          error: 'No arXiv papers found for this query',
          data: {
            source: 'arXiv',
            query,
            results: [],
            totalResults: 0,
            searchTime,
            apiUsed: apiSuccess,
            suggestions: [
              'Try broader search terms',
              'Check spelling of technical terms',
              'Use different keywords or synonyms',
              'Try searching without quotes'
            ],
            lastError: lastError ? (lastError instanceof Error ? lastError.message : String(lastError)) : null
          }
        };
      }

      return {
        success: true,
        data: {
          source: apiSuccess ? 'arXiv API' : 'arXiv (Search Engine)',
          query,
          results: results.slice(0, maxResults),
          totalResults: results.length,
          searchTime,
          apiUsed: apiSuccess,
          fallbackUsed: !apiSuccess
        },
        metadata: {
          totalResults: results.length,
          searchTime,
          sources: ['arxiv.org'],
          cached: false,
          apiSuccess,
          fallbackUsed: !apiSuccess
        }
      };
    } catch (error) {
      return {
        success: false,
        error: `arXiv search failed: ${error instanceof Error ? error.message : String(error)}`,
        data: {
          source: 'arXiv',
          query,
          results: [],
          totalResults: 0,
          apiUsed: false,
          lastError: lastError ? (lastError instanceof Error ? lastError.message : String(lastError)) : null,
          suggestions: [
            'Check your internet connection',
            'Try again in a few moments',
            'Use different search terms',
            'Contact support if the problem persists'
          ]
        }
      };
    }
  }
});
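
The handler retries each endpoint up to three times and, after a thrown error, waits 1 s and then 2 s before the next attempt. The following is a sketch of that linear-backoff pattern in isolation, assuming a hypothetical `retryWithLinearBackoff` helper that does not exist in the project. Unlike the handler, which also moves on when a response parses to zero results, this sketch retries only on thrown errors.

```typescript
// Hypothetical helper: retry an async operation with linear backoff.
// The delay before the next try is baseDelayMs * (attempt + 1), and there is
// no wait after the final attempt, matching the pattern in the handler.
async function retryWithLinearBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000
): Promise<T> {
  let lastError: unknown = null;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await new Promise<void>(resolve => setTimeout(resolve, baseDelayMs * (attempt + 1)));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}

// Usage: an operation that fails twice, then succeeds (short delays for the demo).
let attempts = 0;
retryWithLinearBackoff(async () => {
  attempts++;
  if (attempts < 3) throw new Error('transient failure');
  return `succeeded after ${attempts} attempts`;
}, 3, 10).then(message => console.log(message));
```

Rethrowing the last error keeps the failure reason available to the caller, much as the handler carries `lastError` into its error payload.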
src/index.ts:229-229 (registration)
Server initialization calls registerAcademicTools, which registers the search_arxiv tool in the ToolRegistry.
registerAcademicTools(this.toolRegistry); // 1 tool: search_arxiv
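
To illustrate the registration step, here is a sketch of how a registry of this kind might store and look up tools. `SimpleToolRegistry`, `ToolDefinition`, and `registerAcademicToolsSketch` are hypothetical stand-ins; the project's actual ToolRegistry is not shown on this page and may differ.

```typescript
// Hypothetical, simplified registry used only for illustration.
interface ToolDefinition {
  name: string;
  description: string;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}

class SimpleToolRegistry {
  private tools = new Map<string, ToolDefinition>();

  registerTool(tool: ToolDefinition): void {
    // Rejecting duplicates is a design choice of this sketch, not a known
    // behavior of the real registry.
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  getTool(name: string): ToolDefinition | undefined {
    return this.tools.get(name);
  }
}

// Stand-in for registerAcademicTools: registers a stub search_arxiv tool.
function registerAcademicToolsSketch(registry: SimpleToolRegistry): void {
  registry.registerTool({
    name: 'search_arxiv',
    description: 'Search arXiv for academic papers',
    execute: async (args) => ({ success: true, data: { query: args.query, results: [] } }),
  });
}

const demoRegistry = new SimpleToolRegistry();
registerAcademicToolsSketch(demoRegistry);
console.log(demoRegistry.getTool('search_arxiv')?.description);
```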
src/tools/academic/simple.ts:19-25 (schema)
Input schema defined in the tool registration for validating query and optional maxResults parameters.
inputSchema: {
  type: 'object',
  properties: {
    query: { type: 'string', description: 'Search query' },
    maxResults: { type: 'number', description: 'Maximum results to return' }
  },
  required: ['query']
src/tools/academic/simple.ts:277-305 (helper)
Helper function to parse arXiv XML API response into structured paper results.
function parseArxivXML(xmlData: string): any[] {
  const results: any[] = [];

  try {
    // Simple XML parsing for arXiv entries
    const entryRegex = /<entry>(.*?)<\/entry>/gs;
    const entries = xmlData.match(entryRegex) || [];

    for (const entry of entries) {
      const result = {
        id: extractXMLValue(entry, 'id'),
        title: extractXMLValue(entry, 'title')?.replace(/\s+/g, ' ').trim(),
        summary: extractXMLValue(entry, 'summary')?.replace(/\s+/g, ' ').trim(),
        authors: extractAuthors(entry),
        published: extractXMLValue(entry, 'published'),
        updated: extractXMLValue(entry, 'updated'),
        categories: extractCategories(entry),
        url: extractXMLValue(entry, 'id'),
        pdfUrl: extractPdfUrl(entry)
      };

      if (result.title && result.summary) {
        results.push(result);
      }
    }
  } catch (error) {}

  return results;
}
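
parseArxivXML relies on an extractXMLValue helper that is not shown here. The sketch below shows one way such a helper could work, applied to a made-up Atom fragment. `extractXMLValueSketch` is a hypothetical stand-in, and the sample entry (including its arXiv ID) is invented for illustration.

```typescript
// Hypothetical stand-in for extractXMLValue: returns the trimmed text of the
// first <tag>...</tag> element, or undefined when the tag is absent.
function extractXMLValueSketch(xml: string, tag: string): string | undefined {
  const match = xml.match(new RegExp(`<${tag}(?:\\s[^>]*)?>([\\s\\S]*?)</${tag}>`));
  return match ? match[1].trim() : undefined;
}

// Invented sample data; not a real arXiv record.
const sampleFeed = `
<feed>
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>An Example
      Title</title>
    <summary>  A made-up abstract used only for illustration. </summary>
  </entry>
</feed>`;

// Equivalent to the /<entry>(.*?)<\/entry>/gs pattern in parseArxivXML, written
// with [\s\S] so it spans newlines without needing the `s` flag.
const entries = sampleFeed.match(/<entry>([\s\S]*?)<\/entry>/g) || [];

// Collapse internal whitespace the same way parseArxivXML does for titles.
const titles = entries.map(entry =>
  extractXMLValueSketch(entry, 'title')?.replace(/\s+/g, ' ').trim()
);
console.log(titles); // [ 'An Example Title' ]
```

Regex-based extraction like this is adequate for the flat, predictable structure of arXiv's Atom feed, but the entry pattern matches only a bare `<entry>` tag without attributes, and a real XML parser would be more robust to format changes.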
src/utils/input-validator.ts:221-221 (schema)
Maps search_arxiv to the shared academicSearch Zod schema for runtime input validation.
'search_arxiv': ToolSchemas.academicSearch,
