
Jina AI Remote MCP Server

by wlmwwx

search_arxiv

Search academic papers and preprints on arXiv to find research papers, scientific studies, and technical literature across fields like AI, physics, and mathematics.

Instructions

Search academic papers and preprints on arXiv repository. Perfect for finding research papers, scientific studies, technical papers, and academic literature. Use this when researching scientific topics, looking for papers by specific authors, or finding the latest research in fields like AI, physics, mathematics, computer science, etc.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| query | Yes | Academic search terms, author names, or research topics (e.g., 'transformer neural networks', 'Einstein relativity', 'machine learning optimization'). Can be a single query string or an array of queries for parallel search. | |
| num | No | Maximum number of academic papers to return, between 1 and 100 | 30 |
| tbs | No | Time-based search parameter, e.g., 'qdr:h' for the past hour; one of qdr:h, qdr:d, qdr:w, qdr:m, qdr:y | |
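The constraints in the table can be checked client-side before issuing a call. A minimal sketch, assuming the documented 1-100 range for num and the five listed tbs values; validateArgs is a hypothetical helper, not part of the server:

```typescript
// Allowed values for the time-based search parameter, per the schema above.
const TBS_VALUES = ["qdr:h", "qdr:d", "qdr:w", "qdr:m", "qdr:y"];

interface SearchArxivArgs {
  query: string;
  num?: number;
  tbs?: string;
}

// Hypothetical pre-flight check mirroring the documented constraints.
function validateArgs(args: SearchArxivArgs): string[] {
  const errors: string[] = [];
  if (!args.query.trim()) {
    errors.push("query must be a non-empty string");
  }
  if (args.num !== undefined && (args.num < 1 || args.num > 100)) {
    errors.push("num must be between 1 and 100");
  }
  if (args.tbs !== undefined && !TBS_VALUES.includes(args.tbs)) {
    errors.push(`tbs must be one of: ${TBS_VALUES.join(", ")}`);
  }
  return errors;
}
```

Note that the server itself does not advertise these checks; validating before the call simply avoids a round trip on obviously malformed input.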

Implementation Reference

  • Handler function that orchestrates the search_arxiv tool: checks bearer token, calls executeArxivSearch helper, formats results using formatSingleSearchResultToContentItems, and handles errors.
    async ({ query, num, tbs }: SearchArxivArgs) => {
    	try {
    		const props = getProps();
    
    		const tokenError = checkBearerToken(props.bearerToken);
    		if (tokenError) {
    			return tokenError;
    		}
    
    		const searchResult = await executeArxivSearch({ query, num, tbs }, props.bearerToken);
    
    		return {
    			content: formatSingleSearchResultToContentItems(searchResult),
    		};
    	} catch (error) {
    		return createErrorResponse(`Error: ${error instanceof Error ? error.message : String(error)}`);
    	}
    },
  • Registers the search_arxiv tool with McpServer, including description, Zod input schema, and inline handler function.
    server.tool(
    	"search_arxiv",
    	"Search academic papers and preprints on arXiv repository. Perfect for finding research papers, scientific studies, technical papers, and academic literature. Use this when researching scientific topics, looking for papers by specific authors, or finding the latest research in fields like AI, physics, mathematics, computer science, etc. 💡 Tip: Use `parallel_search_arxiv` if you need to run multiple arXiv searches simultaneously.",
    	{
    		query: z.string().describe("Academic search terms, author names, or research topics (e.g., 'transformer neural networks', 'Einstein relativity', 'machine learning optimization')"),
    		num: z.number().default(30).describe("Maximum number of academic papers to return, between 1-100"),
    		tbs: z.string().optional().describe("Time-based search parameter, e.g., 'qdr:h' for past hour, can be qdr:h, qdr:d, qdr:w, qdr:m, qdr:y")
    	},
    	async ({ query, num, tbs }: SearchArxivArgs) => {
    		try {
    			const props = getProps();
    
    			const tokenError = checkBearerToken(props.bearerToken);
    			if (tokenError) {
    				return tokenError;
    			}
    
    			const searchResult = await executeArxivSearch({ query, num, tbs }, props.bearerToken);
    
    			return {
    				content: formatSingleSearchResultToContentItems(searchResult),
    			};
    		} catch (error) {
    			return createErrorResponse(`Error: ${error instanceof Error ? error.message : String(error)}`);
    		}
    	},
    );
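The checkBearerToken helper used in the handler is not shown in the extract. A hedged sketch of what it might look like, assuming it returns an MCP-style error response when the token is missing and null otherwise; the real implementation may differ:

```typescript
// Hypothetical shape of an MCP tool error response.
interface ErrorResponse {
  content: { type: "text"; text: string }[];
  isError: true;
}

// Hypothetical sketch: reject empty or missing tokens before any API call.
function checkBearerToken(bearerToken: string | undefined): ErrorResponse | null {
  if (!bearerToken || bearerToken.trim() === "") {
    return {
      content: [
        { type: "text", text: "Error: a Jina AI bearer token is required" },
      ],
      isError: true,
    };
  }
  return null;
}
```

Returning an error object rather than throwing lets the handler short-circuit with `if (tokenError) return tokenError;`, as in the code above.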
  • Core helper function that makes the HTTP request to Jina's search API (svip.jina.ai) specifically for arXiv domain, handling response and errors.
    export async function executeArxivSearch(
        searchArgs: SearchArxivArgs,
        bearerToken: string
    ): Promise<SearchResultOrError> {
        try {
            const response = await fetch('https://svip.jina.ai/', {
                method: 'POST',
                headers: {
                    'Accept': 'application/json',
                    'Content-Type': 'application/json',
                    'Authorization': `Bearer ${bearerToken}`,
                },
                body: JSON.stringify({
                    q: searchArgs.query,
                    domain: 'arxiv',
                    num: searchArgs.num || 30,
                    ...(searchArgs.tbs && { tbs: searchArgs.tbs })
                }),
            });
    
            if (!response.ok) {
                return { error: `arXiv search failed for query "${searchArgs.query}": ${response.statusText}` };
            }
    
            const data = await response.json() as any;
            return { query: searchArgs.query, results: data.results || [] };
        } catch (error) {
            return { error: `arXiv search failed for query "${searchArgs.query}": ${error instanceof Error ? error.message : String(error)}` };
        }
    }
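The SearchResultOrError union and the formatSingleSearchResultToContentItems helper referenced by the handler are not included in the extract. A minimal sketch of plausible shapes, assuming each result carries optional title/url/description fields; the real types may differ:

```typescript
// Assumed result shapes, inferred from how the handler consumes them.
type SearchResultOrError =
  | { query: string; results: { title?: string; url?: string; description?: string }[] }
  | { error: string };

// Hypothetical formatter: one text content item per result, or one for errors.
function formatSingleSearchResultToContentItems(result: SearchResultOrError) {
  if ("error" in result) {
    return [{ type: "text" as const, text: result.error }];
  }
  if (result.results.length === 0) {
    return [{ type: "text" as const, text: `No arXiv results for "${result.query}"` }];
  }
  return result.results.map((r) => ({
    type: "text" as const,
    text: `${r.title ?? "Untitled"}\n${r.url ?? ""}\n${r.description ?? ""}`.trim(),
  }));
}
```

Because executeArxivSearch returns `{ error: ... }` instead of throwing on HTTP failures, the formatter is the single place where both success and failure become MCP content items.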
  • TypeScript interface defining the input arguments for search_arxiv tool.
    export interface SearchArxivArgs {
        query: string;
        num?: number;
        tbs?: string;
    }
  • src/index.ts:20-22 (registration): Top-level initialization that calls registerJinaTools to register all tools, including search_arxiv.
    	// Register all Jina AI tools
    	registerJinaTools(this.server, () => this.props);
    }
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While it mentions the tool is 'perfect for finding research papers' and lists use cases, it lacks critical behavioral details such as rate limits, authentication requirements, pagination behavior, error handling, or what the output looks like (e.g., format of returned papers). For a search tool with no annotation coverage, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, starting with the core purpose. The second sentence elaborates on use cases efficiently. However, the phrase 'Perfect for finding research papers, scientific studies, technical papers, and academic literature' is slightly redundant with the first sentence, and it could be more structured (e.g., separating purpose from guidelines).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description is incomplete for a search tool with 3 parameters. It covers purpose and usage context well but lacks behavioral transparency (e.g., output format, limitations) and doesn't compensate for the missing output schema. The schema handles parameters, but overall completeness is only adequate with clear gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters (query, num, tbs) with good descriptions. The description adds no additional parameter semantics beyond what's in the schema—it doesn't explain parameter interactions, provide examples beyond those in the schema, or clarify edge cases. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('search academic papers and preprints') and resource ('arXiv repository'), distinguishing it from siblings like search_web or search_images by specifying the academic/scientific domain. It explicitly mentions what it searches for (research papers, scientific studies, etc.), making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('when researching scientific topics, looking for papers by specific authors, or finding the latest research in fields like AI, physics, etc.'). However, it does not explicitly state when NOT to use it or mention alternatives like parallel_search_arxiv or search_ssrn, which are relevant sibling tools for similar purposes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
