Paperlib MCP is an academic literature management system that enables PDF import, intelligent search, knowledge graph construction, and automated literature review generation.
Document Management & Processing: Import PDFs with automatic text extraction, chunking, and vector embedding generation. List, retrieve, update metadata, delete, re-chunk, and regenerate embeddings for documents. Monitor ingestion status and troubleshoot errors.
Search & Retrieval: Perform hybrid searches combining full-text (FTS) and semantic vector search with configurable weighting, or use FTS-only/vector-only modes. Retrieve specific chunks, explain search results with detailed breakdowns, and build reusable evidence packs for research queries.
Knowledge Graph Construction: Extract entities (topics, methods, measures, identification strategies), relationships, and claims using LLM-driven analysis. Canonicalize and merge duplicate entities, lock entities to prevent auto-merging, and build topic communities using the Leiden clustering algorithm for macro/micro analysis. Group and cluster similar claims across papers.
Literature Review Generation: Generate structured review outlines with deterministic, reproducible templates. Draft complete literature reviews or specific sections (methodology, findings, gaps) based on topics or evidence packs. Build section-specific evidence packs, validate citation compliance, and export evidence matrices (paper-level and claim-level) in JSON/CSV formats.
Data Export & Analysis: Export compact relation views, grouped claim matrices with representative claims, section packets with complete writing inputs, and full markdown templates with placeholders for composition.
System Management: Perform health checks for database, S3/MinIO storage, and GraphRAG layer. Execute batch operations (bulk graph extraction, community rebuilds, parallel summarization), clear graph data selectively or globally, manage taxonomy rules for controlled vocabulary, and pre-calculate document frequency and topic features for performance optimization.
MinIO (or another S3-compatible service) provides object storage for PDF documents and related files in the academic literature management system.
PostgreSQL with the pgvector extension stores document metadata, vector embeddings, and knowledge graph entities and relations, and supports hybrid search that combines full-text search with pgvector semantic search.
Paperlib MCP
Academic literature management and retrieval MCP server - supporting PDF import, hybrid search, knowledge graph construction, and literature review generation.
✨ Features
| Feature | Description |
| --- | --- |
| PDF Import | Auto-extract text, chunk by page, generate vector embeddings |
| Hybrid Search | FTS full-text search + pgvector semantic search |
| Knowledge Graph | LLM-driven entity/relation/claim extraction, Leiden community detection |
| Review Generation | Structured literature review auto-generation based on evidence packs |
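
The hybrid search row combines a full-text score with a vector-similarity score under a configurable weight. The sketch below illustrates one common way to do that (min-max normalization followed by a weighted sum); it is not taken from the Paperlib MCP source. The function names `normalize` and `hybrid_rank`, the `alpha` parameter, and the sample scores are all assumptions made for illustration.

```python
def normalize(scores):
    """Min-max scale a {chunk_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every hit as equally relevant.
        return {cid: 1.0 for cid in scores}
    return {cid: (s - lo) / (hi - lo) for cid, s in scores.items()}


def hybrid_rank(fts_scores, vec_scores, alpha=0.5):
    """Blend FTS and vector scores (hypothetical helper).

    alpha is the weight on the vector side: 0.0 is FTS-only,
    1.0 is vector-only. A chunk missing from one side scores 0 there.
    """
    fts_n = normalize(fts_scores)
    vec_n = normalize(vec_scores)
    combined = {
        cid: alpha * vec_n.get(cid, 0.0) + (1.0 - alpha) * fts_n.get(cid, 0.0)
        for cid in set(fts_n) | set(vec_n)
    }
    # Sort by descending score, then by chunk id for a stable order.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative scores only; real values would come from the database.
fts = {"c1": 2.0, "c2": 1.0, "c3": 0.0}
vec = {"c1": 0.2, "c2": 0.9, "c4": 0.5}
ranked = hybrid_rank(fts, vec, alpha=0.5)
```

Setting `alpha` to `0.0` or `1.0` reproduces the FTS-only and vector-only modes, which is one reason a single weighted sum is a convenient design for this kind of search.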
📋 Prerequisites
PostgreSQL 16+ with pgvector extension
MinIO or S3-compatible storage
OpenRouter API Key
🚀 Installation & Usage
Method 1: Docker Compose (Recommended for Beginners)
Launch the complete environment (PostgreSQL + MinIO + MCP server) with a single command:
Configure in Cursor or Claude Desktop
Add to your MCP client configuration (for Claude Desktop, claude_desktop_config.json):
Method 2: uvx Install (Recommended)
Prerequisites: a running PostgreSQL instance (with pgvector) and a MinIO or S3-compatible storage service.
Configure in Cursor or Claude Desktop, adjusting the environment variables to match your actual service addresses:
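As a rough illustration of what such a client entry can look like, the Python sketch below builds an `mcpServers` entry that launches the server through `uvx` and prints it as JSON. The `mcpServers` / `command` / `args` / `env` layout is the usual shape of Claude Desktop and Cursor MCP configuration. The server label `paperlib`, the package name `paperlib-mcp`, and every environment variable name are assumptions, so check them against the project's own documentation before use.

```python
import json


def build_client_config(api_key, pg_host="localhost", minio_endpoint="localhost:9000"):
    """Return an MCP client config dict with one uvx-launched server entry."""
    return {
        "mcpServers": {
            "paperlib": {  # hypothetical server label
                "command": "uvx",
                "args": ["paperlib-mcp"],  # assumed package name
                "env": {
                    # Placeholder variable names; the real ones may differ.
                    "OPENROUTER_API_KEY": api_key,
                    "POSTGRES_HOST": pg_host,
                    "MINIO_ENDPOINT": minio_endpoint,
                },
            }
        }
    }


if __name__ == "__main__":
    # Paste the printed JSON into the client's MCP configuration file.
    print(json.dumps(build_client_config("your-openrouter-key"), indent=2))
```

Generating the file from code rather than editing it by hand makes it easier to keep service addresses consistent across several clients.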
Method 3: pip Install
Prerequisites: same as Method 2 (a running PostgreSQL instance with pgvector and a MinIO or S3-compatible storage service).
Configure your MCP client, adjusting the values to match your actual service addresses:
Method 4: Local Development
📖 Available Tools
Basic Tools
| Tool | Description |
| --- | --- |
| | System health check |
| | Import PDF documents |
| | Download PDF by title to local directory |
| | Hybrid search (recommended) |
| | Get document metadata |
| | List all documents |
Graph Tools
| Tool | Description |
| --- | --- |
| | Extract knowledge graph |
| | Build topic communities |
| | Generate community summaries |
Writing Tools
| Tool | Description |
| --- | --- |
| | Build evidence pack |
| | Generate review draft |
The full tool list (48+) is available at docs/MCP_TOOLS_REFERENCE.md.
💡 Usage Examples
📚 Documentation
| Document | Description |
| --- | --- |
| Deployment Guide | |
| System Architecture | |
| Embedding & Retrieval | |
| Knowledge Graph | |
| Database Schema | |
| Tools API Reference | |
🛠️ Tech Stack
| Component | Technology |
| --- | --- |
| MCP Protocol | FastMCP |
| Database | PostgreSQL 16 + pgvector |
| Object Storage | MinIO (S3 Compatible) |
| PDF Processing | PyMuPDF4LLM |
| Embedding Model | OpenRouter (text-embedding-3-small) |
| Graph Clustering | igraph + Leiden |
Environment Variables
| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| | ✅ | - | OpenRouter API key |
| | ❌ | | Database host |
| | ❌ | | Database user |
| | ❌ | | Database password |
| | ❌ | | Database name |
| | ❌ | | MinIO endpoint |
| | ❌ | | MinIO user |
| | ❌ | | MinIO password |
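
The table distinguishes one required variable from several optional ones that fall back to defaults. A minimal sketch of that pattern in Python follows. The variable names and default values here are placeholders chosen for illustration and are not the project's actual names or defaults; `load_settings` is likewise a hypothetical helper.

```python
import os

# Hypothetical names: the required key has no default.
REQUIRED = ("OPENROUTER_API_KEY",)

# Hypothetical names and placeholder defaults for the optional settings.
OPTIONAL_DEFAULTS = {
    "POSTGRES_HOST": "localhost",
    "POSTGRES_USER": "postgres",
    "POSTGRES_DB": "paperlib",
    "MINIO_ENDPOINT": "localhost:9000",
}


def load_settings(environ=None):
    """Read settings from a mapping (os.environ by default).

    Raises RuntimeError if a required variable is missing or empty;
    optional variables fall back to their defaults.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {name: env[name] for name in REQUIRED}
    for name, default in OPTIONAL_DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings
```

Failing fast on a missing API key at startup tends to give a clearer error than a failed embedding request later in an import job.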
📄 License
MIT