Searchless-ngx
Provides tools for searching, filtering, and retrieving documents from a Paperless-ngx instance using metadata and semantic search.
🪄 Searchless-ngx
Stop searching your documents. Start asking them. An Agentic RAG MCP Server for Paperless-ngx.
Less searching. More finding. Searchless-ngx transforms your Paperless-ngx instance from a static, keyword-based archive into an intelligent, conversational agent. By leveraging the Model Context Protocol (MCP) and Agentic RAG, it allows modern LLMs to natively understand, search, filter, and reason over your documents.
🤔 About the Name
If Paperless freed you from the burden of physical paper, Searchless frees you from the burden of manual searching.
Serverless means you don't manage servers.
Passwordless means you don't type passwords.
Searchless means you don't click through filters or skim 20-page PDFs anymore. You just ask your assistant a question, and it does the heavy lifting for you. The -ngx pays homage to the incredible Paperless-ngx project that makes all of this possible.
(Note: Under the hood, the technical service is named paperless-mcp-server to provide optimal contextual grounding for the AI).
This project assumes that your documents are properly parsed (OCR) and have high-quality tags assigned within Paperless-ngx. Searchless-ngx is a retrieval and reasoning layer, not an organization tool. If your library needs better metadata or automated tagging, please check out Paperless-GPT.
✨ Key Features
Agentic RAG: Equips your LLM with tools to query, filter, and summarize your personal documents.
Hybrid Search Strategy:
Exact Metadata API: Leverage Paperless-ngx's powerful filtering (tags, correspondents, dates) for precise retrieval.
Semantic Vector Search: Use ChromaDB and Gemini embeddings to find documents based on meaning and context (e.g., "software subscriptions", "food receipts").
Optimized for Open WebUI:
Strict JSON Schema: Zero anyOf or null types to ensure 100% compatibility with experimental MCP parsers.
Interactive Cards: Search results are presented as beautiful Markdown cards with clickable titles and metadata.
Read-Only: Zero destructive actions. It uses existing OCR text and never downloads binary PDFs.
Smart Sync: Startup sync uses a watermark to fetch only new/changed documents from Paperless in seconds. Periodic background sync (default: every 15 min, configurable) keeps the index continuously up to date. Webhook support for real-time ingestion of individual documents. Manual full-sync via POST /sync/all.
Search Resilience: Proactive fallback strategies ensure the LLM finds documents even when initial filters are too restrictive.
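The watermark idea behind Smart Sync can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Document` class, its `modified` field, and the `select_changed` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Document:
    """Hypothetical stand-in for a document record; not the real Paperless-ngx schema."""
    id: int
    modified: datetime


def select_changed(documents, watermark):
    """Return documents modified after the watermark, plus the advanced watermark."""
    changed = [doc for doc in documents if watermark is None or doc.modified > watermark]
    # Advance the watermark to the newest modification time seen, if any.
    new_watermark = max((doc.modified for doc in changed), default=watermark)
    return changed, new_watermark


docs = [
    Document(1, datetime(2024, 1, 10, tzinfo=timezone.utc)),
    Document(2, datetime(2024, 2, 5, tzinfo=timezone.utc)),
    Document(3, datetime(2024, 3, 1, tzinfo=timezone.utc)),
]

# First run: no watermark yet, so every document is selected for indexing.
changed, watermark = select_changed(docs, None)

# Later run: only documents newer than the stored watermark are selected.
docs.append(Document(4, datetime(2024, 3, 15, tzinfo=timezone.utc)))
changed_again, watermark = select_changed(docs, watermark)
print([doc.id for doc in changed_again])
```

Persisting the watermark between runs is what would let a startup sync skip documents that are already indexed; how the server actually stores it is not described here.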
🏗️ Architecture
graph TD
User([User]) -->|Chat| OWUI(Open WebUI)
OWUI -->|MCP Streamable HTTP| MCP(FastAPI MCP Server)
MCP -->|Gemini Embeddings| Chroma[(ChromaDB)]
MCP -->|"Read-Only API (metadata, content, sync)"| Paperless(Paperless-ngx)
MCP -->|Paginated Cache| Cache[(In-Memory Metadata)]
External([Paperless Workflow / External]) -->|POST /webhook/sync| MCP
Timer([Periodic Sync\nevery 15 min]) -->|bulk_sync_documents| MCP
🚀 Setup & Installation
1. Prerequisites
Docker & Docker Compose
Paperless-ngx instance
Google Gemini API Key (for embeddings and/or LLM)
2. Environment Configuration
Copy .env.example to .env and configure:
cp .env.example .env

| Variable | Description |
| --- | --- |
| | Your Paperless-ngx base URL. |
| | API Token from Paperless settings. |
| | Google GenAI key for embeddings. |
| | (Optional) URL used for clickable links in chat. Defaults to |
| | (Optional) Log verbosity: |
| | (Optional) Limit segments per document (Default: 100 ≈ 25 pages). |
| | (Optional) Cap initial ingestion to the X newest documents. |
| | (Optional) Periodic background sync interval in minutes (Default: 15). Set to |
3. Docker Compose
Start the agent and Open WebUI:
docker compose up -d
4. Connect to Open WebUI
Open http://localhost:8080.
Go to Settings > Connections > MCP Servers.
Preferred Method: Click the import button and select scripts/webui-connection.json.
Manual Method: Add a new server with type MCP Streamable HTTP and URL http://mcp-server:8001/mcp.
For detailed Open WebUI instructions, see WEBUI_SETUP.md.
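To check the endpoint outside Open WebUI, one could build an MCP `initialize` request by hand. The sketch below only constructs the request and does not send it. It assumes the server follows the standard MCP Streamable HTTP transport (JSON-RPC 2.0 over POST). The protocol version string, the client name, and the `build_initialize_request` helper are assumptions for illustration, and the `mcp-server` hostname likely resolves only inside the Docker Compose network.

```python
import json
import urllib.request

MCP_URL = "http://mcp-server:8001/mcp"  # URL from the manual setup step above


def build_initialize_request(url=MCP_URL):
    """Build (but do not send) a JSON-RPC 2.0 `initialize` request."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed; use the version your client supports
            "capabilities": {},
            "clientInfo": {"name": "manual-check", "version": "0.0.1"},  # hypothetical
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )


request = build_initialize_request()
print(request.get_method(), request.full_url)
# To actually send it from a host that can reach the server:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status, response.read().decode("utf-8"))
```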
💡 Usage Examples
Listing Documents
"List the last 5 documents from Amazon." (Uses exact metadata search)
"Show me my most recent invoices." (Uses empty query to fetch by date)
Conceptual Search (Semantic)
"Find all software subscriptions I have." (Finds "Netflix", "Adobe", "Microsoft" even if "subscription" isn't in the title)
"Where are my food receipts from my last trip to Berlin?" (Combines location context with document meaning)
Data Extraction
"How much did I spend on mobility in February 2024?" (LLM iterates through scooter/train invoices and calculates the sum)
"Summarize the cancellation terms for my gym contract." (LLM uses get_document_details to read the full OCR text)
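The exact-metadata and conceptual examples above can be combined along the following lines: filter by metadata first, then rank the remaining documents by embedding similarity. This is a toy sketch with hand-written three-dimensional vectors standing in for real embeddings. It does not call ChromaDB, Gemini, or the Paperless-ngx API, and all names (`Doc`, `cosine`, `hybrid_search`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    title: str
    correspondent: str
    embedding: tuple  # toy vector; a real system would store model-generated embeddings


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs, query_embedding, correspondent=None, limit=5):
    """Apply an exact metadata filter, then rank by semantic similarity."""
    candidates = [d for d in docs if correspondent is None or d.correspondent == correspondent]
    ranked = sorted(candidates, key=lambda d: cosine(d.embedding, query_embedding), reverse=True)
    return ranked[:limit]


docs = [
    Doc("Netflix invoice", "Netflix", (0.9, 0.1, 0.0)),
    Doc("Adobe renewal", "Adobe", (0.8, 0.2, 0.1)),
    Doc("Restaurant receipt", "Cafe Berlin", (0.0, 0.1, 0.9)),
]

subscription_query = (1.0, 0.0, 0.0)  # pretend embedding of "software subscriptions"
top = hybrid_search(docs, subscription_query, limit=2)
print([d.title for d in top])
```

Passing `correspondent="Adobe"` narrows the candidates before ranking, which mirrors how an exact filter and a semantic query might cooperate.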
🛠️ Development & Testing
Diagnostic Tools
Use the raw protocol checker to verify the server's output:
docker exec paperless-mcp-server python scripts/test_mcp_raw.py
Test Coverage
Run the test suite using uv:
uv run pytest
👤 Author
Developed and maintained by Dr. Henning Dickten (@hensing).
⚖️ License
Licensed under the GPLv3.