Enables searching and downloading research papers from arXiv.org, providing access to paper metadata and full PDF versions based on academic search queries.
Provides tools for citation analysis and academic impact metrics, allowing users to search for papers and retrieve detailed metadata including citation counts and influence graphs.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Research Paper Ingestion MCP Server search arXiv for 'recursive self-improvement' and extract key insights"
That's it! The server will respond to your query, and you can continue using it as needed.
Research Paper Ingestion MCP Server
Autonomous knowledge acquisition from academic research papers for AGI self-improvement.
Part of the Agentic System - a 24/7 autonomous AI framework with persistent memory.
Features
Paper Discovery
arXiv Integration: Search and download from arXiv.org
Semantic Scholar: Citation analysis and academic impact metrics
PDF Download: Automatic paper retrieval and storage
Knowledge Extraction
Insight Extraction: Identify key findings and contributions
Citation Analysis: Understand paper influence and relationships
Technique Identification: Extract novel methods and approaches
Memory Integration
Enhanced Memory: Store extracted knowledge for AGI learning
Structured Entities: Create searchable memory representations
Citation Graphs: Track knowledge lineage
Installation
Configuration
Add to ~/.claude.json:
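The exact entry depends on where the server is installed. As a rough sketch, assuming `~/.claude.json` follows the common `mcpServers` layout and that the server is launched as a Python script, an entry could be generated as shown here. The server key `research-paper-ingestion`, the `python3` command, and the `/path/to/server.py` placeholder are illustrative assumptions, not values taken from this project.

```python
import json


def build_server_entry(server_script: str, agentic_path: str = "/opt/agentic") -> dict:
    """Return a hypothetical mcpServers entry for this server."""
    return {
        "mcpServers": {
            # Hypothetical server key; any unique name should work.
            "research-paper-ingestion": {
                "command": "python3",
                "args": [server_script],
                # AGENTIC_SYSTEM_PATH is the variable used in the Storage section.
                "env": {"AGENTIC_SYSTEM_PATH": agentic_path},
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path: replace with the real location of the server script.
    print(json.dumps(build_server_entry("/path/to/server.py"), indent=2))
```

The printed JSON would then be merged by hand into the existing file rather than overwriting it.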
Available Tools
search_arxiv
Search arXiv for research papers by query.
Parameters:
query (required): Search query (e.g., "recursive self-improvement AGI")
max_results: Maximum results (default: 10)
sort_by: Sort order (relevance, lastUpdatedDate, or submittedDate)
Example:
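The three `sort_by` values match the `sortBy` options of the public arXiv API. The sketch below shows how the documented parameters might map onto an arXiv API query URL. It uses only the standard library and does not reproduce the server's own code, which relies on the `arxiv` wrapper. The function name and the `all:` field prefix are assumptions made for illustration.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"
SORT_OPTIONS = {"relevance", "lastUpdatedDate", "submittedDate"}


def build_arxiv_query_url(query: str, max_results: int = 10, sort_by: str = "relevance") -> str:
    """Translate search_arxiv-style arguments into an arXiv API URL."""
    if not query.strip():
        raise ValueError("query is required")
    if sort_by not in SORT_OPTIONS:
        raise ValueError(f"sort_by must be one of {sorted(SORT_OPTIONS)}")
    params = {
        "search_query": f"all:{query}",  # assumption: search across all fields
        "start": 0,
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_arxiv_query_url("recursive self-improvement AGI", max_results=5))
```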
search_semantic_scholar
Search Semantic Scholar for papers with citation metrics.
Parameters:
query (required): Search query
fields: Metadata fields to retrieve
limit: Maximum results (default: 10)
Example:
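As an illustration, the parameters could translate into a request against the Semantic Scholar Graph API paper search endpoint roughly as follows. The default field list and the function name are assumptions, and the server itself issues the request through `aiohttp`, which is left out here to keep the sketch dependency-free.

```python
from urllib.parse import urlencode

S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"
# Assumed default; the server's actual default field list is not documented here.
DEFAULT_FIELDS = ["title", "year", "citationCount"]


def build_s2_search_url(query: str, fields=None, limit: int = 10) -> str:
    """Translate search_semantic_scholar-style arguments into a request URL."""
    if not query.strip():
        raise ValueError("query is required")
    params = {
        "query": query,
        # The API expects the requested fields as one comma-separated value.
        "fields": ",".join(fields or DEFAULT_FIELDS),
        "limit": limit,
    }
    return f"{S2_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_s2_search_url("self-improving agents", fields=["title", "citationCount"]))
```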
download_paper
Download research paper PDF from URL.
Parameters:
url (required): PDF URL
paper_id (required): Unique identifier for filename
Example:
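A minimal sketch of what such a download step could look like, assuming the PDF is written as `{paper_id}.pdf` into a target directory. It uses the standard library's `urllib` as a stand-in for the server's own HTTP client, and the filename sanitization is an added assumption (old-style arXiv IDs such as `cs/0101001` contain a slash).

```python
import re
import shutil
import urllib.request
from pathlib import Path


def pdf_path_for(paper_id: str, papers_dir: Path) -> Path:
    """Build the target filename, replacing characters unsafe in filenames."""
    safe_id = re.sub(r"[^A-Za-z0-9._-]", "_", paper_id)
    return papers_dir / f"{safe_id}.pdf"


def download_paper(url: str, paper_id: str, papers_dir: Path) -> Path:
    """Stream the PDF at `url` to disk and return the saved path."""
    target = pdf_path_for(paper_id, papers_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=30) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)  # copy in chunks, not all in memory
    return target
```

A call such as `download_paper("https://arxiv.org/pdf/<id>", "<id>", Path("research-papers"))` would then save the file under the given directory, with `<id>` standing in for a real arXiv identifier.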
extract_insights
Extract key insights and findings from paper text.
Parameters:
paper_text (required): Full paper text or abstract
focus_areas: Optional specific areas to focus on
Example:
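How the server extracts insights is not described here. Purely as an illustration of the input and output shape, the sketch below uses a simple cue-phrase heuristic: it keeps sentences that announce a contribution and, when `focus_areas` is given, also mention one of those areas. The cue phrases, the `max_insights` parameter, and the heuristic itself are assumptions, not the server's method.

```python
import re

# Hypothetical cue phrases that often introduce a contribution or result.
CUE_PHRASES = ("we propose", "we present", "we show", "we introduce", "our results")


def extract_insights(paper_text: str, focus_areas=None, max_insights: int = 5) -> list:
    """Return sentences that look like key findings, optionally filtered by topic."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", paper_text) if s.strip()]
    keywords = [area.lower() for area in (focus_areas or [])]
    insights = []
    for sentence in sentences:
        lowered = sentence.lower()
        has_cue = any(cue in lowered for cue in CUE_PHRASES)
        on_topic = not keywords or any(keyword in lowered for keyword in keywords)
        if has_cue and on_topic:
            insights.append(sentence)
    return insights[:max_insights]


if __name__ == "__main__":
    sample = "Prior work is broad. We propose a refinement loop. We show gains on reasoning tasks."
    print(extract_insights(sample, focus_areas=["reasoning"]))
```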
analyze_citations
Analyze citation relationships and paper influence.
Parameters:
paper_id (required): Semantic Scholar or arXiv paper ID
depth: Citation graph depth 1-3 (default: 1)
Example:
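The `depth` parameter suggests a bounded traversal of the citation graph. A minimal sketch of one way to do this is a breadth-first search that stops after `depth` hops. The `get_citing` callback is a hypothetical stand-in for whatever lookup the server performs (for example, a call to the Semantic Scholar citations endpoint), so the sketch can run without network access.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def citation_levels(
    paper_id: str,
    get_citing: Callable[[str], Iterable[str]],
    depth: int = 1,
) -> Dict[int, List[str]]:
    """Group papers by how many citation hops separate them from `paper_id`."""
    if not 1 <= depth <= 3:
        raise ValueError("depth must be between 1 and 3")
    seen = {paper_id}
    levels: Dict[int, List[str]] = {}
    queue = deque([(paper_id, 0)])
    while queue:
        current, level = queue.popleft()
        if level == depth:
            continue  # do not expand beyond the requested depth
        for citing in get_citing(current):
            if citing not in seen:  # skip cycles and duplicates
                seen.add(citing)
                levels.setdefault(level + 1, []).append(citing)
                queue.append((citing, level + 1))
    return levels
```

Capping the depth at 3 keeps the number of lookups manageable, since the set of citing papers can grow quickly with each additional hop.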
store_paper_knowledge
Store extracted knowledge in enhanced-memory for AGI learning.
Parameters:
paper_metadata (required): Paper metadata dict
insights (required): List of key insights
techniques: List of novel techniques
Example:
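As a rough sketch, the three parameters could be folded into a single memory entity before being handed to enhanced-memory. The entity shape used here (`name`, `entityType`, `observations`) mirrors the reference MCP memory server and is only assumed to apply to enhanced-memory-mcp. The `title` metadata key and the `research_paper` type are likewise hypothetical.

```python
from typing import List, Optional


def build_paper_entity(
    paper_metadata: dict,
    insights: List[str],
    techniques: Optional[List[str]] = None,
) -> dict:
    """Fold paper metadata, insights, and techniques into one entity dict."""
    title = paper_metadata.get("title")  # assumed metadata key
    if not title:
        raise ValueError("paper_metadata must include a title")
    observations = [f"Insight: {text}" for text in insights]
    observations += [f"Technique: {name}" for name in (techniques or [])]
    return {
        "name": title,
        "entityType": "research_paper",  # hypothetical entity type
        "observations": observations,
    }


if __name__ == "__main__":
    # Placeholder values only; not a real paper.
    print(build_paper_entity({"title": "Example Paper Title"}, ["example insight"]))
```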
Usage Patterns
Autonomous Research Workflow
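One plausible shape for this workflow chains the tools in order: search, download, extract, store. The sketch below takes a `call_tool` coroutine as a parameter so it can run against any client. The result shapes (a list of dicts with `id`, `pdf_url`, and `summary` keys) are assumptions, since the tools' return formats are not documented here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Any coroutine that takes a tool name and an arguments dict.
ToolCaller = Callable[[str, Dict[str, Any]], Awaitable[Any]]


async def research_topic(call_tool: ToolCaller, query: str, max_papers: int = 3) -> List[Any]:
    """Search, download, extract, and store for each paper found."""
    papers = await call_tool("search_arxiv", {"query": query, "max_results": max_papers})
    stored = []
    for paper in papers:
        # Assumed keys: "id", "pdf_url", "summary".
        await call_tool("download_paper", {"url": paper["pdf_url"], "paper_id": paper["id"]})
        insights = await call_tool("extract_insights", {"paper_text": paper["summary"]})
        result = await call_tool(
            "store_paper_knowledge",
            {"paper_metadata": paper, "insights": insights},
        )
        stored.append(result)
    return stored
```

With the MCP Python SDK, `call_tool` could wrap a client session's `call_tool` method, although the returned result objects would likely need unpacking before the keys assumed above are available. Pass the coroutine to `asyncio.run` to execute the workflow from synchronous code.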
Citation Network Analysis
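Once citation relationships have been collected for several papers, a simple way to spot influential work is to count how often each paper is cited within the collected network. The sketch below assumes the relationships are available as `(citing, cited)` ID pairs, a representation chosen for illustration and not confirmed as the output format of `analyze_citations`.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def most_cited(edges: Iterable[Tuple[str, str]], top_n: int = 3) -> List[Tuple[str, int]]:
    """Rank papers by the number of distinct papers in the network that cite them."""
    # set() removes duplicate edges so each citing paper counts once.
    counts = Counter(cited for _citing, cited in set(edges))
    # Sort by count (descending), then by ID for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top_n]
```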
Storage
Papers Directory: ${AGENTIC_SYSTEM_PATH:-/opt/agentic}/agentic-system/research-papers/
PDFs: Saved as {paper_id}.pdf
Memory Integration: Via enhanced-memory-mcp create_entities
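The directory uses the shell's `${VAR:-default}` form, which falls back to `/opt/agentic` when `AGENTIC_SYSTEM_PATH` is unset or empty. A minimal sketch of the same resolution in Python follows. The function names are hypothetical, and the `environ` parameter exists only to make the behavior easy to test.

```python
import os
from pathlib import Path


def papers_directory(environ=None) -> Path:
    """Resolve the papers directory the way the shell expression would."""
    env = os.environ if environ is None else environ
    # `or` also covers a variable that is set but empty, like ${VAR:-default}.
    base = env.get("AGENTIC_SYSTEM_PATH") or "/opt/agentic"
    return Path(base) / "agentic-system" / "research-papers"


def pdf_path(paper_id: str, environ=None) -> Path:
    """Return the expected location of a downloaded paper."""
    return papers_directory(environ) / f"{paper_id}.pdf"


if __name__ == "__main__":
    print(papers_directory())
```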
Dependencies
arxiv: arXiv API Python wrapper
aiohttp: Async HTTP client for Semantic Scholar API
mcp: Model Context Protocol SDK
Future Enhancements
PDF Text Extraction: Parse full paper text from PDFs
Figure/Diagram Analysis: Extract visual insights
Code Repository Links: Find implementation code
Related Papers: Automatic discovery of connected research
Trend Detection: Identify emerging research directions
LLM-Powered Insight Extraction: Use GPT-4 for deeper analysis
Integration with AGI System
This MCP server closes Gap #1 from AGI_GAP_ANALYSIS.md:
Knowledge Acquisition Infrastructure ✅
✓ Research Paper Ingestion (arXiv + Semantic Scholar)
⏳ Video Transcript Processing (separate MCP)
⏳ GitHub Repository Analysis (future)
⏳ Documentation Scraping (future)
⏳ Knowledge Graph Integration (future)
Impact: The system can now search for, download, and store knowledge from recent AI research papers without manual intervention.