
Jina AI Remote MCP Server

by acchuang

search_arxiv

Search academic papers and preprints on arXiv to find research papers, scientific studies, and technical literature across fields like AI, physics, and mathematics.

Instructions

Search academic papers and preprints on arXiv repository. Perfect for finding research papers, scientific studies, technical papers, and academic literature. Use this when researching scientific topics, looking for papers by specific authors, or finding the latest research in fields like AI, physics, mathematics, computer science, etc.

Input Schema

Name | Required | Description | Default
--- | --- | --- | ---
query | Yes | Academic search terms, author names, or research topics (e.g., 'transformer neural networks', 'Einstein relativity', 'machine learning optimization'). Can be a single query string or an array of queries for parallel search. | —
num | No | Maximum number of academic papers to return, between 1 and 100 | 30
tbs | No | Time-based search parameter, e.g., 'qdr:h' for past hour; can be qdr:h, qdr:d, qdr:w, qdr:m, or qdr:y | —
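As a sketch, a well-formed set of arguments for this tool might look like the following; the values are illustrative, and only the parameter names and constraints come from the schema above.

```typescript
// Illustrative arguments for a search_arxiv call; the values are made up.
const exampleArgs = {
  query: "transformer neural networks", // required: search terms, authors, or topics
  num: 10,                              // optional: 1-100, defaults to 30
  tbs: "qdr:y",                         // optional: restrict results to the past year
};

console.log(JSON.stringify(exampleArgs));
```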

Implementation Reference

  • Full registration of the search_arxiv MCP tool, including name, description, Zod input schema (query, optional num), and the complete async handler function that calls Jina's search API with domain='arxiv' to fetch and return academic paper results in YAML format.
    server.tool(
    	"search_arxiv",
    	"Search academic papers and preprints on arXiv repository. Perfect for finding research papers, scientific studies, technical papers, and academic literature. Use this when researching scientific topics, looking for papers by specific authors, or finding the latest research in fields like AI, physics, mathematics, computer science, etc. Returns academic papers with URLs, titles, abstracts, and metadata.",
    	{
    		query: z.string().describe("Academic search terms, author names, or research topics (e.g., 'transformer neural networks', 'Einstein relativity', 'machine learning optimization')"),
    		num: z.number().optional().describe("Maximum number of academic papers to return, between 1-100 (default: 30)")
    	},
    	async ({ query, num = 30 }: { query: string; num?: number }) => {
    		try {
    			const props = getProps();
    
    			const tokenError = checkBearerToken(props.bearerToken);
    			if (tokenError) {
    				return tokenError;
    			}
    
    			const response = await fetch('https://svip.jina.ai/', {
    				method: 'POST',
    				headers: {
    					'Accept': 'application/json',
    					'Content-Type': 'application/json',
    					'Authorization': `Bearer ${props.bearerToken}`,
    				},
    				body: JSON.stringify({
    					q: query,
    					domain: 'arxiv',
    					num
    				}),
    			});
    
    			if (!response.ok) {
    				return handleApiError(response, "arXiv search");
    			}
    
    			const data = await response.json() as any;
    
    			return {
    				content: [
    					{
    						type: "text" as const,
    						text: yamlStringify(data.results),
    					},
    				],
    			};
    		} catch (error) {
    			return {
    				content: [
    					{
    						type: "text" as const,
    						text: `Error: ${error instanceof Error ? error.message : String(error)}`,
    					},
    				],
    				isError: true,
    			};
    		}
    	},
    );
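The request the handler issues is a plain JSON POST. A minimal, self-contained sketch of how the tool arguments map onto that body (the helper name here is hypothetical, not part of the server):

```typescript
// Hypothetical helper mirroring the JSON body the handler POSTs to https://svip.jina.ai/.
function buildArxivSearchBody(query: string, num: number = 30): string {
	return JSON.stringify({ q: query, domain: "arxiv", num });
}

console.log(buildArxivSearchBody("Einstein relativity", 5));
// {"q":"Einstein relativity","domain":"arxiv","num":5}
```

Note that the tool's `query` argument is renamed to `q` on the wire, and `domain: 'arxiv'` is what scopes the generic Jina search endpoint to arXiv.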
  • src/index.ts:21-22 (registration)
    Calls registerJinaTools to register all tools including search_arxiv on the MCP server during initialization.
    	registerJinaTools(this.server, () => this.props);
    }
  • handleApiError helper function used in search_arxiv handler to return standardized error responses for common API errors like 401, 402, 429.
    export function handleApiError(response: Response, context: string = "API request") {
    	if (response.status === 401) {
    		return {
    			content: [
    				{
    					type: "text" as const,
    					text: "Authentication failed. Please set your API key in the Jina AI MCP settings. You can get a free API key by visiting https://jina.ai and signing up for an account.",
    				},
    			],
    			isError: true,
    		};
    	}
    	if (response.status === 402) {
    		return {
    			content: [
    				{
    					type: "text" as const,
    					text: "This key is out of quota. Please top up this key at https://jina.ai",
    				},
    			],
    			isError: true,
    		};
    	}
    	
    	if (response.status === 429) {
    		return {
    			content: [
    				{
    					type: "text" as const,
    					text: "Rate limit exceeded. Please upgrade your API key to get higher rate limits. Visit https://jina.ai to manage your subscription and increase your usage limits.",
    				},
    			],
    			isError: true,
    		};
    	}
    	
    	// Default error message for other HTTP errors
    	return {
    		content: [
    			{
    				type: "text" as const,
    				text: `Error: ${context} failed - ${response.status} ${response.statusText}`,
    			},
    		],
    		isError: true,
    	};
    }
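Since only `status` (and, for the fallback, `statusText`) is read from the response, the branching above can be exercised with a plain object. A condensed sketch, assuming a classifier that is illustrative rather than the server's code:

```typescript
// Condensed sketch of handleApiError's branching; only status/statusText are read,
// so a plain object stands in for a fetch Response.
type ResponseLike = { status: number; statusText: string };

function classifyApiError(res: ResponseLike): string {
	if (res.status === 401) return "auth";       // invalid or missing API key
	if (res.status === 402) return "quota";      // key is out of quota
	if (res.status === 429) return "rate-limit"; // rate limit exceeded
	return `other: ${res.status} ${res.statusText}`; // generic fallback
}

console.log(classifyApiError({ status: 429, statusText: "Too Many Requests" }));
// rate-limit
```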
  • checkBearerToken helper function used in search_arxiv handler to validate the presence of Jina API key (bearer token) before making API calls.
    export function checkBearerToken(bearerToken: string | undefined) {
    	if (!bearerToken) {
    		return {
    			content: [
    				{
    					type: "text" as const,
    					text: "Please set your API key in the Jina AI MCP settings. You can get a free API key by visiting https://jina.ai and signing up for an account.",
    				},
    			],
    			isError: true,
    		};
    	}
    	return null; // No error, token is available
    }
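Together, the two helpers give the handler a simple early-return contract: a non-null result is a ready-to-return error payload, and `null` means proceed. A minimal sketch of that pattern, with illustrative names that are not the server's code:

```typescript
// Illustrative early-return pattern: validate the token, short-circuit on error.
type ToolError = { isError: true; message: string };

function checkToken(token: string | undefined): ToolError | null {
	return token ? null : { isError: true, message: "Please set your API key." };
}

function runTool(token: string | undefined): string {
	const err = checkToken(token);
	if (err) return err.message; // short-circuit before any API call
	return "ok: token present";
}

console.log(runTool(undefined));
console.log(runTool("jina_abc123"));
```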
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the tool is 'perfect for finding research papers' but doesn't describe rate limits, authentication needs, pagination behavior, error conditions, or what the response format looks like. For a search tool with zero annotation coverage, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized with two sentences. The first sentence states the purpose clearly, and the second provides usage examples. While efficient, the second sentence could be slightly more concise by reducing the list of examples, but overall it's well-structured and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description provides adequate basic information about what the tool does and when to use it, but lacks details about behavioral traits, response format, and explicit differentiation from sibling tools. For a search tool with 3 parameters and no structured output documentation, this is minimally viable but has clear gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema. According to scoring rules, when schema coverage is high (>80%), the baseline is 3 even with no param info in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches academic papers on arXiv with specific examples (research papers, scientific studies, etc.), but doesn't explicitly differentiate from sibling tools like parallel_search_arxiv or search_ssrn. It provides a specific verb ('search') and resource ('academic papers and preprints on arXiv repository'), making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidelines with examples ('when researching scientific topics, looking for papers by specific authors, or finding the latest research'), but doesn't explicitly state when to use this tool versus alternatives like parallel_search_arxiv or search_ssrn. It gives context but lacks explicit comparison or exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
