LongBook Verifier
An MCP-enabled evaluation and claim-grounding toolkit for long-document RAG systems.
LongBook Verifier measures whether retrieval methods and AI outputs are actually grounded in long narrative manuscripts. It retrieves evidence from the document and scores coverage deterministically, without external model APIs.
Use it three ways
Option | What you get | For |
Local MCP | A local stdio MCP server for Claude Code / Codex: long-document evaluation, retrieval, claim verification, and report tools. Local only (not hosted or remote). See docs/MCP.md | Using the verifier as tools inside your AI client |
Try BookProof online | The live hosted web product: upload a document + golden questions in the browser, no install. See tts.bedvibe.studio/bookproof/app | Trying it instantly |
Run locally | Clone and run the FastAPI web app + evaluation engine on your own machine. See docs/LOCAL_RUN.md | Developers / researchers inspecting or running the verifier |
What it does
Retrieval evaluation across five methods (naive_first_context, naive_last_context, flat_chunk_rag, chapter_summary_chain, hierarchical_book_rag) on book-length documents.
Claim / answer grounding: scores an AI output (or a set of claims/questions) against the source document using evidence-term coverage, answer-term coverage, and retrieval context precision/recall-like metrics.
Deterministic local embeddings (hashing_numpy): reproducible, no downloaded models and no Claude/OpenAI/Gemini calls.
Three ways to use the same engine: a CLI/eval pipeline, a local FastAPI web app, and a local stdio MCP server for AI coding clients.
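To illustrate why a hashing-based embedding is reproducible without any downloaded model, here is a minimal sketch of feature hashing plus cosine ranking. It is not the project's hashing_numpy backend: the function names, the 64-dimension vector size, and the use of SHA-256 with plain Python lists are all assumptions made for this example.

```python
import hashlib
import math
import re

DIM = 64  # hypothetical vector size, chosen only for this sketch


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def hash_embed(text: str, dim: int = DIM) -> list[float]:
    """Feature-hashing embedding: every token adds +1 or -1 to one bucket."""
    vec = [0.0] * dim
    for tok in tokenize(text):
        # A cryptographic digest is stable across runs and machines,
        # unlike Python's built-in hash(), which is salted per process.
        digest = hashlib.sha256(tok.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_chunks(query: str, chunks: list[str], top_k: int = 2) -> list[int]:
    """Return the indices of the top_k chunks most similar to the query."""
    q = hash_embed(query)
    scored = [(cosine(q, hash_embed(c)), i) for i, c in enumerate(chunks)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken by chunk order
    return [i for _, i in scored[:top_k]]
```

Because the bucket and sign depend only on the token bytes, the same text always yields the same vector. The trade-off is that similarity is purely lexical: paraphrases with no shared tokens score near zero.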
Why it exists
Short-answer correctness and evidence grounding can diverge: a model can give a plausible answer that the document doesn't actually support. LongBook Verifier separates those signals so you can audit whether outputs and retrieval are grounded in long manuscripts, which is useful for manuscript QA and reproducible long-document evaluation.
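The divergence can be made concrete with a small term-coverage sketch. The stopword list, the function names, and the sample sentences below are invented for illustration, and the project's actual metric definitions may differ; the point is only that the two scores are computed from different inputs and can disagree.

```python
import re

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "was", "at", "on"}


def content_terms(text: str) -> set[str]:
    """Lowercased alphanumeric tokens with stopwords removed."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def term_coverage(expected: str, observed: str) -> float:
    """Fraction of the expected content terms that appear in the observed text."""
    expected_terms = content_terms(expected)
    if not expected_terms:
        return 0.0
    return len(expected_terms & content_terms(observed)) / len(expected_terms)


# Invented example data: the answer matches the gold answer exactly,
# but the retrieved context never touches the supporting passage.
gold_answer = "Mara hid the letter in the lighthouse"
gold_evidence = "She climbed the lighthouse stairs and slid the letter behind the lamp housing"
model_answer = "Mara hid the letter in the lighthouse"
retrieved_context = "The harbor festival began at dawn with music and lanterns"

answer_cov = term_coverage(gold_answer, model_answer)
evidence_cov = term_coverage(gold_evidence, retrieved_context)
```

Here the answer-term score is perfect while the evidence-term score is zero, which is the case a single correctness number would hide.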
Live product
A hosted, public version of this evaluation runs as BookProof, which you can try online at tts.bedvibe.studio/bookproof/app.
BookProof is an existing, related public product. It is not required to run anything in this repository locally.
Architecture
One evaluation engine, three access surfaces, plus the hosted product:
Research / evaluation engine (src/): chunking, deterministic index build, retrieval, the five methods, metrics, and claim verification.
Local FastAPI web app (product_mvp/server_longbook_verifier.py): upload a document + golden questions in the browser and get scored locally.
Local stdio MCP server (product_mvp/mcp_longbook_server.py): exposes the engine to MCP clients (e.g. Claude Code / Codex) over stdio, locally only.
BookProof public product/API: a deployed instance offering a rate-limited public demo and a separate token-gated verification API (see BookProof API).
See docs/ARCHITECTURE.md for a diagram.
Research methods
The benchmark reports evidence-term coverage, answer-term coverage, retrieval context recall, and
task-completion behavior separately, because short-answer correctness and evidence grounding can
diverge. Two experiments are documented in paper/:
Experiment A: a pilot single-book benchmark (~64k words, 40 gold questions, 5 retrieval methods, 5 external consumer AI systems under a free-tier protocol).
Experiment B: an extended stress test on a 240,767-word corpus (~320,220 tokens, 80 gold questions, 5 retrieval methods).
These are a pilot plus stress-test package, not a universal model ranking or state-of-the-art
claim. Full methods and results are in paper/; the research package is archived at
DOI 10.5281/zenodo.20513116.
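As a rough illustration of a retrieval context precision/recall-style metric, the sketch below compares retrieved chunk IDs against gold evidence chunk IDs. Treating the metric as a set overlap on chunk IDs is an assumption made here; the definitions actually used in the experiments are the ones given in paper/.

```python
def context_precision_recall(
    retrieved_ids: list[str], gold_ids: list[str]
) -> tuple[float, float]:
    """Set-overlap precision and recall over chunk IDs (illustrative only)."""
    retrieved = list(dict.fromkeys(retrieved_ids))  # drop duplicates, keep order
    gold = set(gold_ids)
    hits = [cid for cid in retrieved if cid in gold]
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical chunk IDs: two of the four retrieved chunks are gold evidence,
# and one gold chunk ("c4") was never retrieved.
precision, recall = context_precision_recall(
    ["c3", "c7", "c9", "c1"], ["c7", "c1", "c4"]
)
```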
Quick start
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
Details: docs/LOCAL_RUN.md.
Run the local web app
python -m uvicorn product_mvp.server_longbook_verifier:app --host 127.0.0.1 --port 8078
Then open http://127.0.0.1:8078/ and upload a document + golden questions.
Use the local MCP server
python product_mvp\mcp_longbook_server.py
The MCP server runs locally over stdio; an MCP client launches it as a subprocess. It also
requires the mcp package: python -m pip install mcp. See docs/MCP.md.
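For readers unfamiliar with the stdio transport, the sketch below shows the general shape of the exchange: a parent process spawns a child and trades newline-delimited JSON-RPC 2.0 messages over its stdin and stdout. The child here is a stand-in echo script written for this example, not the LongBook server, and a real MCP client would normally use an MCP SDK and perform an initialize handshake before calling any tool.

```python
import json
import subprocess
import sys

# Stand-in child process for illustration only; it is NOT the LongBook server.
# It reads one JSON-RPC request per line and echoes the method name back.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": request["id"],
             "result": {"echo": request["method"]}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def call_once(method: str, request_id: int = 1) -> dict:
    """Spawn the stand-in child, send one request line, read one reply line."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        request = {"jsonrpc": "2.0", "id": request_id, "method": method}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        proc.stdin.close()  # end of input lets the child exit its read loop
        proc.wait(timeout=10)
        proc.stdout.close()
```

Nothing in this exchange opens a network socket, which is why a stdio server is reachable only from the process that launched it.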
MCP client configuration
Generic mcpServers entry (replace the path with the location of your cloned repo; see
mcp_config.example.json):
{
  "mcpServers": {
    "longbook-proof-local": {
      "command": "python",
      "args": ["EDIT_THIS_PATH/product_mvp/mcp_longbook_server.py"]
    }
  }
}
MCP tools
All tools are read-or-allowlisted, local-only:
Tool | Description |
| Read-only project summary (allowed roots, scripts, report/run counts, default backend). |
| List |
| List report-like files ( |
| Read a report-like file with truncation. |
| Chunk a book into a |
| Build a retrieval index ( |
| Return ranked chunks from an existing local index ( |
| Run a retrieval-evaluation method over a book + questions ( |
| Build summary CSV tables from run folders ( |
Repository structure
src/ evaluation engine (chunking, index, retrieval, methods, metrics, claim checks)
product_mvp/ local FastAPI web app + local stdio MCP server + site/ frontend
paper/ research write-ups (methods, results, limitations) + CITATION.cff
docs/ LOCAL_RUN, MCP, ARCHITECTURE
scripts/ Windows helpers (run_web.bat, run_mcp.bat)
Data policy
Copyrighted corpora, source manuscripts, private evaluation data, and user uploads are intentionally excluded from this repository. The tools operate on documents you provide.
Security model
Confirmed in product_mvp/mcp_longbook_server.py: the MCP server runs local stdio only and
calls an allowlisted set of local scripts. It uses no arbitrary shell commands (no
shell=True), enforces strict read/write path checks (reads confined to the project root;
writes confined to outputs/, reports/, and product_mvp/runs/), rejects paths containing
.env / secret / key / token / password, runs child scripts with stdin=DEVNULL, applies a
timeout, and makes no cloud or external model calls. It does not provide shell execution or
remote access.
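The path rules described above can be sketched as follows. This is an illustrative reimplementation, not the code in product_mvp/mcp_longbook_server.py: the project root location, the function names, and the choice of PermissionError are assumptions.

```python
from pathlib import Path

# Hypothetical project location, used only for this sketch.
PROJECT_ROOT = Path("/srv/longbook").resolve()
WRITE_ROOTS = [PROJECT_ROOT / p for p in ("outputs", "reports", "product_mvp/runs")]
BLOCKED_FRAGMENTS = (".env", "secret", "key", "token", "password")


def _inside(path: Path, root: Path) -> bool:
    """True if path is root itself or lies somewhere beneath it."""
    try:
        path.relative_to(root)
        return True
    except ValueError:
        return False


def check_path(candidate: str, *, write: bool = False) -> Path:
    """Resolve a requested path and reject it if it breaks the access rules."""
    # Resolving first collapses ".." segments, so traversal cannot escape.
    resolved = (PROJECT_ROOT / candidate).resolve()
    roots = WRITE_ROOTS if write else [PROJECT_ROOT]
    if not any(_inside(resolved, root) for root in roots):
        raise PermissionError(f"path outside allowed roots: {candidate}")
    relative = resolved.relative_to(PROJECT_ROOT).as_posix().lower()
    if any(fragment in relative for fragment in BLOCKED_FRAGMENTS):
        raise PermissionError(f"path looks sensitive: {candidate}")
    return resolved
```

Substring matching is deliberately blunt: it would also reject a harmless name such as monkey.txt, trading a few false positives for a simple rule.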
BookProof API
The hosted BookProof product exposes:
a public, rate-limited demo endpoint (capped document size, capped questions, one run per IP per day, inputs deleted after processing), and
a separate token-gated verification API (authenticated via an X-BookProof-Token header) with a machine-readable spec endpoint.
No token is included in this repository.
Limitations
Evaluation is lexical/retrieval-based and deterministic (hashing_numpy); it is not a semantic-embedding or model-graded benchmark.
The published results are a pilot + stress test, not a universal ranking or SOTA claim.
The MCP server expects the local project files and runs entirely on your machine.
License
MIT © 2026 Panos Gkilis. Contact via GitHub Security Advisories for security reports (see SECURITY.md).