arxiv-mcp-server
Server Details
Search arXiv, fetch paper metadata, and read full-text content. STDIO & Streamable HTTP.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: cyanheads/arxiv-mcp-server
- GitHub Stars: 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Score is being calculated. Check back soon.
Available Tools
4 tools
arxiv_get_metadata · Arxiv Get Metadata · A · Read-only
Get full metadata for one or more arXiv papers by ID. Use when you have known IDs from citations, prior search results, or memory.
| Name | Required | Description | Default |
|---|---|---|---|
| paper_ids | Yes | arXiv paper ID or array of up to 10 IDs. Format: "2401.12345" or "2401.12345v2" (with version). Also accepts legacy IDs like "hep-th/9901001". | |
Output Schema
| Name | Required | Description |
|---|---|---|
| papers | Yes | Papers found. May be fewer than requested if some IDs are invalid. |
| not_found | No | Paper IDs that returned no results. |
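A client can catch malformed input before the call leaves the machine. The sketch below builds a JSON-RPC `tools/call` request for `arxiv_get_metadata` and enforces the ID formats and the 10-ID limit from the table above. The regular expressions and the helper name `build_get_metadata_call` are assumptions written for this example, not part of the server.

```python
import re

# Approximate patterns for the two ID styles named in the schema
# (assumptions for this example; the server's own validation may differ).
NEW_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")                    # "2401.12345", "2401.12345v2"
LEGACY_ID = re.compile(r"^[a-z\-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")    # "hep-th/9901001"

MAX_IDS = 10  # limit stated in the paper_ids description


def build_get_metadata_call(paper_ids, request_id=1):
    """Return a JSON-RPC `tools/call` request dict for arxiv_get_metadata."""
    ids = [paper_ids] if isinstance(paper_ids, str) else list(paper_ids)
    if not 1 <= len(ids) <= MAX_IDS:
        raise ValueError(f"expected 1 to {MAX_IDS} IDs, got {len(ids)}")
    bad = [i for i in ids if not (NEW_ID.match(i) or LEGACY_ID.match(i))]
    if bad:
        raise ValueError(f"malformed arXiv IDs: {bad}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "arxiv_get_metadata", "arguments": {"paper_ids": ids}},
    }
```

After the call returns, comparing the requested IDs against `not_found` shows which ones the server could not resolve.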
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description mentions 'full metadata' implying comprehensive data retrieval, and 'one or more' matching the schema's array support. However, with readOnlyHint=true already declared in annotations, the description adds minimal behavioral context regarding error handling (e.g., invalid IDs), rate limits, or the 10-item maximum constraint documented in the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences total with zero waste. The first sentence front-loads the core capability (what), while the second provides usage context (when). Every word earns its place with no redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (covering return values), simple single-parameter input, and readOnlyHint annotation, the description adequately covers the essential context. Minor gap: it doesn't mention the 10-item limit or error behavior, though these are partially covered by the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the parameter paper_ids is already well-documented with format examples and constraints. The description references 'by ID' but does not add semantic meaning, examples, or usage patterns beyond what the schema already provides, warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get full metadata'), resource ('arXiv papers'), and access method ('by ID'). The 'by ID' qualifier effectively distinguishes this tool from siblings like arxiv_search (which finds papers) and arxiv_read_paper (which likely retrieves content).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The second sentence provides explicit contextual guidance: 'Use when you have known IDs from citations, prior search results, or memory.' This clearly indicates when to use this tool versus searching, though it doesn't explicitly name the sibling search tool as the alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
arxiv_list_categories · Arxiv List Categories · A · Read-only
List arXiv category codes and names. Useful for discovering valid category filters for arxiv_search.
| Name | Required | Description | Default |
|---|---|---|---|
| group | No | Filter by top-level group (e.g., "cs", "math", "physics"). Returns all categories if omitted. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| categories | Yes | arXiv categories matching the filter. |
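One way to use this output is to check a category code before passing it to `arxiv_search`. The sketch below assumes each entry in `categories` exposes its code under a `code` key; the listing does not show the entry shape, so treat that key and both helper names as hypothetical.

```python
def category_codes(list_result):
    """Collect category codes from an arxiv_list_categories result."""
    # ASSUMPTION: each entry carries its code under a "code" key.
    return {entry["code"] for entry in list_result["categories"]}


def pick_category(wanted, list_result):
    """Return `wanted` if the server lists it, else raise with similar codes."""
    codes = category_codes(list_result)
    if wanted in codes:
        return wanted
    prefix = wanted.split(".")[0]
    similar = sorted(c for c in codes if c.startswith(prefix))
    raise ValueError(f"unknown category {wanted!r}; similar codes: {similar}")
```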
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The readOnlyHint annotation already establishes this is a safe read operation. The description adds workflow context (that results are used as filters for arxiv_search) but does not disclose additional behavioral traits like rate limits, pagination, or caching behavior that might be relevant given it queries an external API.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two efficient sentences with zero waste: the first establishes purpose, the second establishes usage context. Every word earns its place and the information is front-loaded appropriately.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple enumeration tool with one optional parameter and an output schema present, the description is complete. It explains what the tool returns (category codes and names) and the practical purpose (discovering filters), which is sufficient given the structured fields handle the rest.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents the optional 'group' parameter and its behavior when omitted. The description adds no parameter-specific details beyond what the schema provides, which warrants the baseline score of 3 for high-coverage schemas.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('List') and resource ('arXiv category codes and names') and explicitly distinguishes itself from sibling tools by stating it provides 'valid category filters for arxiv_search', clearly differentiating it from the search and paper-reading tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('Useful for discovering valid category filters') and references the sibling tool 'arxiv_search' by name, establishing the clear workflow that users should call this first to obtain valid filter values before searching.
arxiv_read_paper · Arxiv Read Paper · A · Read-only
Fetch the full text content of an arXiv paper from its HTML rendering. Tries native arXiv HTML first, falls back to ar5iv. Returns raw HTML for direct interpretation.
| Name | Required | Description | Default |
|---|---|---|---|
| paper_id | Yes | arXiv paper ID (e.g., "2401.12345" or "2401.12345v2"). | |
| max_characters | No | Maximum characters of paper body content to return. Defaults to 100,000. HTML head/boilerplate is stripped before counting. When truncated, a notice and total character count are included. | 100000 |
Output Schema
| Name | Required | Description |
|---|---|---|
| title | Yes | Paper title (from metadata, not parsed from HTML). |
| source | Yes | Which HTML source the content was fetched from. |
| content | Yes | Raw HTML content of the paper. |
| pdf_url | Yes | Direct PDF download URL. |
| paper_id | Yes | arXiv paper ID. |
| truncated | Yes | Whether content was truncated due to max_characters. |
| abstract_url | Yes | arXiv abstract page URL for attribution. |
| total_characters | Yes | Total character count of the full (untruncated) content. |
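Because `content` is raw HTML and may be cut off at `max_characters`, a caller usually wants plain text plus a signal that more is available. The following is a minimal sketch using only the Python standard library; `summarize_read_result` and the `retry_max_characters` key are names invented for this example, and the input dict is assumed to follow the output schema above.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def summarize_read_result(result):
    """Reduce an arxiv_read_paper result to plain text and a retry hint."""
    parser = _TextExtractor()
    parser.feed(result["content"])
    parser.close()
    return {
        "title": result["title"],
        "text": " ".join(parser.chunks),
        # If the body was cut off, total_characters tells the caller what
        # max_characters value would be needed to receive all of it.
        "retry_max_characters": result["total_characters"] if result["truncated"] else None,
    }
```

Requesting the full `total_characters` in one call can be expensive for an agent's context window, so a caller may prefer to work with the truncated text when that is enough.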
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With readOnlyHint=true in annotations, the safety profile is covered. The description adds valuable behavioral context not present in structured fields: the fallback mechanism ('Tries native arXiv HTML first, falls back to ar5iv') and the return format ('Returns raw HTML'). It does not mention rate limits or error states for invalid paper IDs, preventing a perfect score.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences efficiently structured: purpose first, implementation detail second, output format third. Zero redundancy. Every sentence conveys distinct information (what it fetches, how it fetches, what it returns) without wasting tokens.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the existence of an output schema (which handles return value documentation) and 100% input schema coverage, the description provides sufficient context. It acknowledges the output format (raw HTML), explains the fetching strategy (fallback), and covers the tool's scope completely for a 2-parameter read operation.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The main description does not explicitly discuss the parameters (paper_id, max_characters) or add semantic context beyond what the schema already provides. The detail about 'HTML head/boilerplate is stripped' appears in the schema's max_characters description, not the main description text.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Fetch') and resource ('full text content of an arXiv paper') and distinguishes from siblings by specifying 'full text content' versus metadata (arxiv_get_metadata) or search results (arxiv_search). The HTML rendering scope further clarifies the specific format retrieved.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear functional context by specifying 'full text content,' which implicitly distinguishes it from the metadata-focused sibling (arxiv_get_metadata) and search sibling (arxiv_search). However, it does not explicitly state 'use arxiv_get_metadata instead for bibliographic data only' or similar explicit alternative guidance.
arxiv_search · Arxiv Search · A · Read-only
Search arXiv papers by query with category and sort filters. Returns paper metadata including title, authors, abstract, categories, and links.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search query. Supports field prefixes: ti: (title), au: (author), abs: (abstract), cat: (category), co: (comment), jr: (journal ref), all: (all fields). Boolean operators: AND, OR, ANDNOT. Examples: "au:bengio AND ti:attention", "all:transformer AND cat:cs.CL". | |
| start | No | Pagination offset. Use with max_results to page through results. E.g., start=10 with max_results=10 returns results 11-20. | |
| sort_by | No | Sort criterion. Use "submitted" for newest papers, "relevance" for best query matches. | relevance |
| category | No | Filter by arXiv category (e.g., "cs.CL", "math.AG"). Prepended as "AND cat:{category}" to the query. Use arxiv_list_categories to discover valid codes. | |
| sort_order | No | Sort direction. "descending" returns newest/most relevant first. | descending |
| max_results | No | Maximum results to return (1-50). Default 10. Each result includes title, authors, abstract, and metadata; keep low to manage context budget. | 10 |
Output Schema
| Name | Required | Description |
|---|---|---|
| start | Yes | Pagination offset of this result set. |
| papers | Yes | Matching papers with full metadata. |
| total_results | Yes | Total matching papers (may exceed returned count due to pagination). |
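Paging through results combines `start` and `max_results` on the way in with `start` and `total_results` on the way out. The sketch below shows that arithmetic; `build_search_arguments` and `has_more` are hypothetical helper names, and only the parameter names and the 1-50 range come from the tables above.

```python
def build_search_arguments(query, category=None, page=0, page_size=10,
                           sort_by="relevance", sort_order="descending"):
    """Build the `arguments` object for an arxiv_search call.

    `page` is zero-based, so page=1 with page_size=10 gives start=10,
    which the schema describes as results 11-20.
    """
    if not 1 <= page_size <= 50:
        raise ValueError("max_results must be between 1 and 50")
    if page < 0:
        raise ValueError("page must be non-negative")
    args = {
        "query": query,
        "start": page * page_size,
        "max_results": page_size,
        "sort_by": sort_by,
        "sort_order": sort_order,
    }
    if category:
        args["category"] = category
    return args


def has_more(result):
    """True if results remain beyond the page described by `result`."""
    return result["start"] + len(result["papers"]) < result["total_results"]
```

A caller would loop, incrementing `page` while `has_more` returns true, and stop early once enough relevant papers have been collected.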
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With readOnlyHint=true in annotations, the description appropriately doesn't reiterate safety. It adds value by specifying the exact metadata fields returned (title, authors, abstract, categories, links), but omits details about pagination limits, empty result handling, or rate limiting.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste: first covers the search action and available filters, second covers the return payload. Every word earns its place; appropriately front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and comprehensive input schema, the description provides sufficient context for an agent to understand the tool's function. It appropriately summarizes the return values without redundant detail.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Given 100% schema description coverage, the description meets the baseline by referencing 'category and sort filters' generally. It does not add semantic meaning beyond the detailed schema descriptions (e.g., it doesn't explain the query syntax that the schema already documents).
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches arXiv papers using queries with category and sort filters, and specifies it returns metadata (not full text). This implicitly distinguishes it from sibling arxiv_read_paper, though it doesn't explicitly contrast with arxiv_get_metadata.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage guidance by specifying it returns paper metadata (title, authors, abstract, etc.), suggesting use for discovery rather than full-text retrieval. However, it lacks explicit when-to-use guidance or direct comparison with sibling tools.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
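Before publishing, the file can be checked locally for the two fields shown in the structure above. This is a rough sketch: `check_glama_json` is a hypothetical helper, the email test is deliberately loose, and Glama's actual verification rules are not documented here.

```python
import json

EXPECTED_SCHEMA = "https://glama.ai/mcp/schemas/connector.json"


def check_glama_json(text):
    """Return a list of problems found in a glama.json document (empty if none)."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]
    problems = []
    if doc.get("$schema") != EXPECTED_SCHEMA:
        problems.append("$schema does not match the connector schema URL")
    maintainers = doc.get("maintainers")
    if not isinstance(maintainers, list) or not maintainers:
        problems.append("maintainers must be a non-empty array")
    else:
        for i, entry in enumerate(maintainers):
            email = entry.get("email") if isinstance(entry, dict) else None
            if not isinstance(email, str) or "@" not in email:
                problems.append(f"maintainers[{i}] needs an email string")
    return problems
```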
- Control your server's listing on Glama, including description and metadata
- Access analytics and receive server usage reports
- Get monitoring and health status updates for your server
- Feature your server to boost visibility and reach more users
For users:
- Full audit trail: every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control: enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management: store and rotate API keys and OAuth tokens in one place
- Change alerts: get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
- Proven adoption: public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics: see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback: users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.