arxiv-mcp-server
Server Details
Search arXiv, fetch paper metadata, and read full-text content. STDIO & Streamable HTTP.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: cyanheads/arxiv-mcp-server
- GitHub Stars: 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.1/5 across 4 of 4 tools scored.
Each tool has a clearly distinct purpose with no overlap: get_metadata retrieves known IDs, list_categories provides category info, read_paper fetches full text, and search performs queries. An agent can easily tell them apart based on their specific functions.
All tool names use snake_case and start with the 'arxiv_' server prefix. Three follow an 'arxiv_verb_noun' pattern (arxiv_get_metadata, arxiv_list_categories, arxiv_read_paper), and arxiv_search is verb-only. This predictability makes the set easy to navigate and understand.
With 4 tools, this server is well-scoped for its arXiv domain, covering core operations: searching, retrieving metadata, accessing full text, and discovering categories. Each tool earns its place without bloat or thinness, fitting typical use cases efficiently.
The tool surface is nearly complete for arXiv interactions, covering search, metadata retrieval, full text access, and category discovery. A minor gap exists in lacking explicit update or delete operations, but these are not typical for arXiv's read-heavy domain, and agents can work around this with the provided tools.
Available Tools
4 tools
arxiv_get_metadata
Arxiv Get Metadata, A, Read-only
Get full metadata for one or more arXiv papers by ID. Use when you have known IDs from citations, prior search results, or memory.
| Name | Required | Description | Default |
|---|---|---|---|
| paper_ids | Yes | arXiv paper ID or array of up to 10 IDs. Format: "2401.12345" or "2401.12345v2" (with version). Also accepts legacy IDs like "hep-th/9901001". | |
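The ID formats listed above can be pre-checked on the client before a call is made. The following is a minimal sketch, not the server's own validation: the two regular expressions and the function name `looks_like_arxiv_id` are assumptions based on the three example IDs in the table and on arXiv's published identifier scheme.

```python
import re

# Hypothetical client-side pre-check. The server may accept or reject
# inputs differently; these patterns are assumptions, not its actual rules.
NEW_STYLE = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")              # e.g. 2401.12345, 2401.12345v2
LEGACY = re.compile(r"[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?")     # e.g. hep-th/9901001


def looks_like_arxiv_id(paper_id: str) -> bool:
    """Return True if paper_id matches a new-style or legacy arXiv ID shape."""
    return bool(NEW_STYLE.fullmatch(paper_id) or LEGACY.fullmatch(paper_id))
```

A shape check like this only filters out obvious typos. An ID that passes may still be absent from arXiv, in which case the tool reports it under not_found.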
Output Schema
| Name | Required | Description |
|---|---|---|
| papers | Yes | Papers found. May be fewer than requested if some IDs are invalid. |
| not_found | No | Per-input explanations for inputs that could not be returned. Absent when nothing failed. |
| totalSucceeded | Yes | Number of successful items in 'papers' |
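Because paper_ids accepts at most 10 IDs per call, a longer list has to be split into batches. The sketch below illustrates one way to do that and to collect partial failures. `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and the assumption that not_found is a list is not confirmed by the schema summary above.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical signature: call_tool(tool_name, arguments) -> structured result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

MAX_IDS_PER_CALL = 10  # limit stated in the paper_ids description


def fetch_metadata(call_tool: ToolCaller, paper_ids: List[str]) -> Tuple[list, list]:
    """Fetch metadata in batches of up to 10 IDs, gathering papers and failures."""
    papers: list = []
    not_found: list = []
    for i in range(0, len(paper_ids), MAX_IDS_PER_CALL):
        batch = paper_ids[i:i + MAX_IDS_PER_CALL]
        result = call_tool("arxiv_get_metadata", {"paper_ids": batch})
        papers.extend(result["papers"])
        # not_found is absent when nothing failed, so default to an empty list.
        not_found.extend(result.get("not_found", []))
    return papers, not_found
```

Comparing the number of returned papers with the number of requested IDs is a quick way to detect that some inputs were invalid.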
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description mentions 'full metadata' implying comprehensive data retrieval, and 'one or more' matching the schema's array support. However, with readOnlyHint=true already declared in annotations, the description adds minimal behavioral context regarding error handling (e.g., invalid IDs), rate limits, or the 10-item maximum constraint documented in the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences total with zero waste. The first sentence front-loads the core capability (what), while the second provides usage context (when). Every word earns its place with no redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (covering return values), simple single-parameter input, and readOnlyHint annotation, the description adequately covers the essential context. Minor gap: it doesn't mention the 10-item limit or error behavior, though these are partially covered by the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the parameter paper_ids is already well-documented with format examples and constraints. The description references 'by ID' but does not add semantic meaning, examples, or usage patterns beyond what the schema already provides, warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get full metadata'), resource ('arXiv papers'), and access method ('by ID'). The 'by ID' qualifier effectively distinguishes this tool from siblings like arxiv_search (which finds papers) and arxiv_read_paper (which likely retrieves content).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The second sentence provides explicit contextual guidance: 'Use when you have known IDs from citations, prior search results, or memory.' This clearly indicates when to use this tool versus searching, though it doesn't explicitly name the sibling search tool as the alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arxiv_list_categories
Arxiv List Categories, A, Read-only
List arXiv category codes and names. Useful for discovering valid category filters for arxiv_search.
| Name | Required | Description | Default |
|---|---|---|---|
| group | No | Filter by top-level group (e.g., "cs", "math", "physics"). Returns all categories if omitted. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| categories | Yes | arXiv categories matching the filter. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The readOnlyHint annotation already establishes this is a safe read operation. The description adds workflow context (that results are used as filters for arxiv_search) but does not disclose additional behavioral traits like rate limits, pagination, or caching behavior that might be relevant given it queries an external API.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two efficient sentences with zero waste: the first establishes purpose, the second establishes usage context. Every word earns its place and the information is front-loaded appropriately.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple enumeration tool with one optional parameter and an output schema present, the description is complete. It explains what the tool returns (category codes and names) and the practical purpose (discovering filters), which is sufficient given the structured fields handle the rest.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents the optional 'group' parameter and its behavior when omitted. The description adds no parameter-specific details beyond what the schema provides, which warrants the baseline score of 3 for high-coverage schemas.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('List') and resource ('arXiv category codes and names') and explicitly distinguishes itself from sibling tools by stating it provides 'valid category filters for arxiv_search', clearly differentiating it from the search and paper-reading tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('Useful for discovering valid category filters') and references the sibling tool 'arxiv_search' by name, establishing the clear workflow that users should call this first to obtain valid filter values before searching.
arxiv_read_paper
Arxiv Read Paper, A, Read-only
Fetch the full text of an arXiv paper as HTML. Tries arxiv.org/html first; falls back to ar5iv.labs.arxiv.org when the native render is unavailable. PDF-only papers (no HTML render on either source) return an html_unavailable error with the pdf_url for direct download. Page through long papers with the start and max_characters parameters.
| Name | Required | Description | Default |
|---|---|---|---|
| start | No | Character offset into the cleaned body to begin reading from. Defaults to 0. Use with max_characters to page through long papers — e.g., start=100000 with max_characters=100000 returns chars 100,000–199,999. The total length is reported as body_characters in the response. | |
| paper_id | Yes | arXiv paper ID (e.g., "2401.12345" or "2401.12345v2"). | |
| max_characters | No | Maximum characters of paper body content to return. Defaults to 100,000. HTML head/boilerplate is stripped before counting. When truncated, a notice and total character count are included. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| start | Yes | Character offset of the first character in content within the cleaned body. |
| title | Yes | Paper title (from metadata, not parsed from HTML). |
| source | Yes | Which HTML source the content was fetched from. |
| content | Yes | Cleaned paper body HTML for the requested slice. Empty when start is past body_characters. |
| pdf_url | Yes | Direct PDF download URL. |
| paper_id | Yes | arXiv paper ID. |
| truncated | Yes | True when more body content exists past this slice (start + content.length < body_characters). |
| abstract_url | Yes | arXiv abstract page URL for attribution. |
| body_characters | Yes | Character count of the full cleaned body HTML. Use with start and max_characters to page. Typically 3-4× smaller than total_characters for math-heavy papers. |
| total_characters | Yes | Character count of the original unprocessed HTML body. |
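The start, max_characters, and truncated fields described above support a simple paging loop. This is a minimal sketch under stated assumptions: `call_tool` is a hypothetical callable for the MCP client in use, the result is the structured output as a dict, and offsets are assumed to count the same units as Python's `len()` (the schema expresses the rule as content.length, so the two could differ for some characters).

```python
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> structured result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def read_full_body(call_tool: ToolCaller, paper_id: str, page_size: int = 100_000) -> str:
    """Concatenate every slice of a paper's cleaned body HTML."""
    parts = []
    start = 0
    while True:
        result = call_tool(
            "arxiv_read_paper",
            {"paper_id": paper_id, "start": start, "max_characters": page_size},
        )
        content = result["content"]
        parts.append(content)
        # Stop when the server reports no more content, or when an empty
        # slice comes back (start is past body_characters).
        if not result["truncated"] or not content:
            break
        # Assumption: offsets are measured in the units len() counts.
        start = result["start"] + len(content)
    return "".join(parts)
```

The sketch does not handle the html_unavailable error returned for PDF-only papers. A caller would need to catch that case and fall back to pdf_url.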
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With readOnlyHint=true in annotations, the safety profile is covered. The description adds valuable behavioral context not present in structured fields: the fallback mechanism ('Tries native arXiv HTML first, falls back to ar5iv') and the return format ('Returns raw HTML'). It does not mention rate limits or error states for invalid paper IDs, preventing a perfect score.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences efficiently structured: purpose first, implementation detail second, output format third. Zero redundancy. Every sentence conveys distinct information (what it fetches, how it fetches, what it returns) without wasting tokens.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the existence of an output schema (which handles return value documentation) and 100% input schema coverage, the description provides sufficient context. It acknowledges the output format (raw HTML), explains the fetching strategy (fallback), and covers the tool's scope completely for a 2-parameter read operation.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The main description does not explicitly discuss the parameters (paper_id, max_characters) or add semantic context beyond what the schema already provides. The detail about 'HTML head/boilerplate is stripped' appears in the schema's max_characters description, not the main description text.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Fetch') and resource ('full text content of an arXiv paper') and distinguishes from siblings by specifying 'full text content' versus metadata (arxiv_get_metadata) or search results (arxiv_search). The HTML rendering scope further clarifies the specific format retrieved.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear functional context by specifying 'full text content,' which implicitly distinguishes it from the metadata-focused sibling (arxiv_get_metadata) and search sibling (arxiv_search). However, it does not explicitly state 'use arxiv_get_metadata instead for bibliographic data only' or similar explicit alternative guidance.
arxiv_search
Arxiv Search, A, Read-only
Search arXiv papers by query with category and sort filters. Returns paper metadata including title, authors, abstract, categories, and links.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search query. Field prefixes: ti: (title), au: (author — token-based; quote multi-token names like au:"hinton g" or pair with a topical clause to disambiguate common surnames), abs: (abstract), cat: (category — exact code match, not fuzzy), co: (comment), jr: (journal ref), all: (all fields). Boolean operators: AND, OR, ANDNOT. Examples: "au:bengio AND ti:attention", "all:transformer AND cat:cs.CL". | |
| start | No | Pagination offset (0-10000). Use with max_results to page through results. E.g., start=10 with max_results=10 returns results 11-20. | |
| sort_by | No | Sort criterion. Use "submitted" for newest papers, "relevance" for best query matches. | relevance |
| category | No | Filter results to a specific arXiv category (e.g., "cs.CL", "math.AG"). Use arxiv_list_categories to discover valid codes. | |
| sort_order | No | Sort direction. "descending" returns newest/most relevant first. | descending |
| max_results | No | Maximum results to return (1-50). Default 10. Each result includes title, authors, abstract, and metadata, so keep this low to limit response size. | |
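The field prefixes and boolean operators in the query row can be assembled programmatically. The helper below is an illustrative sketch: `build_query` and its keyword names are hypothetical, it only joins clauses with AND, and quoting every multi-token value (not just author names) is an assumption that turns the value into a phrase match.

```python
from typing import Optional


def _clause(prefix: str, value: str) -> str:
    """Format one field clause, quoting values that contain whitespace."""
    value = value.strip()
    if any(ch.isspace() for ch in value):
        return f'{prefix}:"{value}"'
    return f"{prefix}:{value}"


def build_query(
    *,
    all_fields: Optional[str] = None,
    author: Optional[str] = None,
    title: Optional[str] = None,
    abstract: Optional[str] = None,
    category: Optional[str] = None,
) -> str:
    """Join the supplied fields into a single AND query string."""
    pairs = [
        ("all", all_fields),
        ("au", author),
        ("ti", title),
        ("abs", abstract),
        ("cat", category),
    ]
    clauses = [_clause(prefix, value) for prefix, value in pairs if value]
    if not clauses:
        raise ValueError("at least one field is required")
    return " AND ".join(clauses)
```

For example, `build_query(author="bengio", title="attention")` produces the first example query from the table, `au:bengio AND ti:attention`.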
Output Schema
| Name | Required | Description |
|---|---|---|
| start | Yes | Pagination offset of this result set. |
| papers | Yes | Matching papers with full metadata. |
| total_results | Yes | Total matching papers (may exceed returned count due to pagination). |
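The start and total_results fields make it possible to walk through a result set page by page. The generator below is a sketch under assumptions: `call_tool` is a hypothetical callable for the MCP client in use, and `limit` is an illustrative cap added here so that a broad query does not page indefinitely.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical signature: call_tool(tool_name, arguments) -> structured result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

MAX_PAGE_SIZE = 50   # max_results range is 1-50
MAX_START = 10_000   # start range is 0-10000


def iter_search(
    call_tool: ToolCaller, query: str, page_size: int = 10, limit: int = 100
) -> Iterator[dict]:
    """Yield up to `limit` papers for a query, one page at a time."""
    page_size = max(1, min(page_size, MAX_PAGE_SIZE))
    start = 0
    while start < limit and start <= MAX_START:
        result = call_tool(
            "arxiv_search",
            {"query": query, "start": start, "max_results": min(page_size, limit - start)},
        )
        papers = result["papers"]
        yield from papers
        start += len(papers)
        # Stop on an empty page or once every matching paper has been seen.
        if not papers or start >= result["total_results"]:
            break
```

Keeping page_size small follows the guidance in the max_results description, since each result carries a full abstract.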
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With readOnlyHint=true in annotations, the description appropriately doesn't reiterate safety. It adds value by specifying the exact metadata fields returned (title, authors, abstract, categories, links), but omits details about pagination limits, empty result handling, or rate limiting.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste: first covers the search action and available filters, second covers the return payload. Every word earns its place; appropriately front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and comprehensive input schema, the description provides sufficient context for an agent to understand the tool's function. It appropriately summarizes the return values without redundant detail.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Given 100% schema description coverage, the description meets the baseline by referencing 'category and sort filters' generally. It does not add semantic meaning beyond the detailed schema descriptions (e.g., it doesn't explain the query syntax that the schema already documents).
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches arXiv papers using queries with category and sort filters, and specifies it returns metadata (not full text). This implicitly distinguishes it from sibling arxiv_read_paper, though it doesn't explicitly contrast with arxiv_get_metadata.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage guidance by specifying it returns paper metadata (title, authors, abstract, etc.), suggesting use for discovery rather than full-text retrieval. However, it lacks explicit when-to-use guidance or direct comparison with sibling tools.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.