arXiv MCP Server
Search arXiv by keywords, authors, categories, and dates; extract full text from PDFs; manage a local library with collections and tags; generate summaries and compare papers.
Build citation networks and discover cited and citing papers for arXiv papers.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@arXiv MCP Server Search for recent papers on large language models".
That's it! The server will respond to your query, and you can continue using it as needed.
I built this MCP server to access 2.4M+ arXiv papers directly in Claude Desktop. It uses GROBID for academic PDF extraction and builds citation networks to track research connections.
What It Does
Search arXiv by keywords, authors, categories, and dates
Extract full text from PDFs using GROBID (handles equations and references)
Build citation networks using Semantic Scholar integration
Manage a local library with collections and tags
Generate summaries and compare papers side-by-side
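As an illustration of the search feature, here is a minimal sketch of how keyword, author, and category filters could be combined into a request against the public arXiv API. The function name build_query_url and its parameters are hypothetical and are not taken from this server's code; only the endpoint and the all:, au:, and cat: field prefixes come from the arXiv API itself.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"  # public arXiv API endpoint


def build_query_url(keywords=None, author=None, category=None, max_results=10):
    """Compose an arXiv API URL from optional filters (hypothetical helper)."""
    clauses = []
    if keywords:
        clauses.append(f'all:"{keywords}"')  # search all fields
    if author:
        clauses.append(f'au:"{author}"')     # author field
    if category:
        clauses.append(f"cat:{category}")    # subject category, e.g. cs.CL
    if not clauses:
        raise ValueError("at least one filter is required")
    params = {
        "search_query": " AND ".join(clauses),
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",      # newest submissions first
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


url = build_query_url(keywords="large language models", category="cs.CL")
print(url)
```

The response to such a request is an Atom feed, which a client would still need to parse.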
PDF Extraction
I implemented three extraction tiers that adapt to document complexity:
FAST: pdfplumber for simple documents (~1s)
SMART: GROBID for academic papers (~5s) - preserves equations and references
PREMIUM: Mistral OCR for complex layouts (~2s) - requires API key
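The tier choice above can be pictured as a small decision function. This is only a sketch under stated assumptions: the complexity signals in PdfProfile and the order of the checks are hypothetical, since the server's actual heuristics are not described here.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FAST = "pdfplumber"
    SMART = "grobid"
    PREMIUM = "mistral-ocr"


@dataclass
class PdfProfile:
    # Hypothetical complexity signals, not the server's real ones.
    has_equations: bool
    has_references: bool
    is_scanned: bool


def choose_tier(profile: PdfProfile, mistral_key_set: bool, grobid_up: bool) -> Tier:
    """Pick the most capable tier that is both needed and available."""
    if profile.is_scanned and mistral_key_set:
        return Tier.PREMIUM  # OCR is only possible with an API key
    if (profile.has_equations or profile.has_references) and grobid_up:
        return Tier.SMART    # academic structure benefits from GROBID
    return Tier.FAST         # simple document, or nothing better is available


paper = PdfProfile(has_equations=True, has_references=True, is_scanned=False)
print(choose_tier(paper, mistral_key_set=False, grobid_up=True))  # Tier.SMART
```

Falling through to FAST when a service is missing mirrors the fallback behaviour the README describes.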
🚀 Quick Start
Installation
Option 1: Install via npm (Recommended)
# Install globally
npm install -g arxiv-mcp-server
# Or install locally in a project
npm install arxiv-mcp-server
Option 2: Install from source
# Clone the repository
git clone https://github.com/r-uben/arxiv-mcp-server.git
cd arxiv-mcp-server
# Install dependencies with Poetry
poetry install
# Test the server
poetry run arxiv-mcp-server
Claude Desktop Integration
For npm installation:
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "arxiv": {
      "command": "npx",
      "args": ["arxiv-mcp-server"],
      "cwd": "/path/to/your/project"
    }
  }
}
Or for global installation:
{
  "mcpServers": {
    "arxiv": {
      "command": "arxiv-mcp-server"
    }
  }
}
For Poetry installation:
{
  "mcpServers": {
    "arxiv": {
      "command": "poetry",
      "args": ["run", "arxiv-mcp-server"],
      "cwd": "/path/to/arxiv-mcp-server"
    }
  }
}
Restart Claude Desktop and you're ready to go!
Examples
"Search for recent papers on large language models in the last 6 months"
"Find all papers by Geoffrey Hinton on deep learning"
"Build a citation network around paper 2301.00001"
"Save paper 2301.00001 to my 'Transformers' collection"
"Summarize the key findings from paper 2301.00001"
⚙️ Configuration
API Keys (Optional)
For enhanced features, set these environment variables:
# For premium PDF extraction (Mistral OCR)
export MISTRAL_API_KEY="your-mistral-api-key"
# For faster citation lookups (Semantic Scholar)
export SEMANTIC_SCHOLAR_API_KEY="your-semantic-scholar-api-key"
External Services (Optional)
GROBID Server - For enhanced academic paper processing:
docker run --rm -it --init -p 8070:8070 lfoppiano/grobid:0.8.0
Configuration Options
Variable | Purpose | Default |
MISTRAL_API_KEY | Premium OCR extraction | None |
SEMANTIC_SCHOLAR_API_KEY | Citation discovery API | None |
 | GROBID server URL | |
 | Always use SMART tier for academic papers | |
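The optional settings above could be read at startup roughly as follows. MISTRAL_API_KEY and SEMANTIC_SCHOLAR_API_KEY are the names given in this README; GROBID_URL and its default of http://localhost:8070 are assumptions, chosen only to match the port in the Docker command, and the Settings class is hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    mistral_api_key: Optional[str]
    semantic_scholar_api_key: Optional[str]
    grobid_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read optional configuration; unset or empty keys become None."""
    return Settings(
        mistral_api_key=env.get("MISTRAL_API_KEY") or None,
        semantic_scholar_api_key=env.get("SEMANTIC_SCHOLAR_API_KEY") or None,
        # Assumed variable name and default, not confirmed by the README.
        grobid_url=env.get("GROBID_URL", "http://localhost:8070"),
    )


settings = load_settings({"MISTRAL_API_KEY": "your-mistral-api-key"})
print(settings.mistral_api_key is not None)  # premium tier would be available
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.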
Available Tools
I've implemented 25 tools across four categories:
Search & Discovery: search papers, find by author, get recent papers, find similar papers
Library Management: save papers, manage collections, track reading status, search library
Citation Analysis: extract references, find citing papers, build citation networks
Content Analysis: extract PDFs, summarize papers, compare papers, extract key findings
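To make the citation-network tools more concrete, here is a minimal sketch of a breadth-first walk over references. It assumes a caller-supplied fetch_references function (hypothetical; in practice it might wrap a Semantic Scholar lookup) and is not the server's actual implementation.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Set


def build_citation_network(
    seed: str,
    fetch_references: Callable[[str], Iterable[str]],
    max_depth: int = 2,
) -> Dict[str, Set[str]]:
    """Return an adjacency map: paper id -> ids of the papers it cites."""
    graph: Dict[str, Set[str]] = {}
    queue = deque([(seed, 0)])
    while queue:
        paper, depth = queue.popleft()
        if paper in graph:
            continue  # already visited at a shallower or equal depth
        if depth >= max_depth:
            graph[paper] = set()  # frontier node: recorded but not expanded
            continue
        refs = set(fetch_references(paper))
        graph[paper] = refs
        for ref in refs:
            if ref not in graph:
                queue.append((ref, depth + 1))
    return graph


# Toy data standing in for a real citation source.
toy = {"A": ["B", "C"], "B": ["C", "D"], "C": [], "D": ["A"]}
network = build_citation_network("A", lambda pid: toy.get(pid, []), max_depth=2)
print(sorted(network))
```

The depth limit bounds how many lookups a single request can trigger, which matters when the upstream API is rate limited.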
How It Works
The server automatically:
Analyzes PDF complexity and selects the best extraction method
Caches papers locally to reduce API calls
Respects rate limits (arXiv: 3 req/s, Semantic Scholar: 1-4 req/s)
Falls back gracefully when services are unavailable
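Two of these behaviours, local caching and rate limiting, can be sketched together. The class names below are hypothetical, the cache is in-memory for brevity (the server may well persist to disk), and the interval is a parameter rather than a statement about any service's current terms.

```python
import time
from typing import Callable, Dict


class MinIntervalLimiter:
    """Allow at most one call per `interval` seconds."""

    def __init__(self, interval: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # first call is never delayed

    def wait(self) -> None:
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


class CachedFetcher:
    """Call `fetch` only on a cache miss, and only as fast as the limiter allows."""

    def __init__(self, fetch: Callable[[str], str], limiter: MinIntervalLimiter):
        self._fetch = fetch
        self._limiter = limiter
        self._cache: Dict[str, str] = {}

    def get(self, paper_id: str) -> str:
        if paper_id not in self._cache:
            self._limiter.wait()
            self._cache[paper_id] = self._fetch(paper_id)
        return self._cache[paper_id]


# Stand-in fetch function; a real one would perform the network request.
fetcher = CachedFetcher(lambda pid: f"metadata for {pid}", MinIntervalLimiter(1 / 3))
print(fetcher.get("2301.00001"))
print(fetcher.get("2301.00001"))  # served from the cache, no wait
```

Checking the cache before consulting the limiter means repeated requests for the same paper cost neither time nor API quota.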
Development
# Development setup
poetry install
poetry run pytest # Run tests
poetry run black . # Format code
poetry run ruff check . # Lint code
# Testing individual components
poetry run python -m pytest tests/ # Full test suite
poetry run arxiv-mcp-server # Start server manually
arXiv Categories
Field | Popular Categories |
Computer Science | |
Mathematics | |
Physics | |
Biology | |
License
MIT License © 2025 Ruben Fernández-Fuertes