LongBook Verifier

by Mormolykos

An MCP-enabled evaluation and claim-grounding toolkit for long-document RAG systems.

License: MIT | Python | DOI | Live demo: BookProof | MCP: local stdio

LongBook Verifier measures whether retrieval methods and AI outputs are actually grounded in long narrative manuscripts — by retrieving evidence from the document and scoring coverage, deterministically and without external model APIs.

Use it three ways

Mode	What you get	For
🔌 Local MCP	A local stdio MCP server for Claude Code / Codex — long-document evaluation, retrieval, claim verification, and report tools. Local only (not hosted or remote). → docs/MCP.md	Using the verifier as tools inside your AI client
🌐 Try BookProof online	The live hosted web product — upload a document + golden questions in the browser, no install. → tts.bedvibe.studio/bookproof/app	Trying it instantly
💻 Run locally	Clone and run the FastAPI web app + evaluation engine on your own machine. → docs/LOCAL_RUN.md	Developers / researchers inspecting or running the verifier

What it does

Retrieval evaluation across five methods — naive_first_context, naive_last_context, flat_chunk_rag, chapter_summary_chain, hierarchical_book_rag — on book-length documents.
Claim / answer grounding: scores an AI output (or a set of claims/questions) against the source document using evidence-term coverage, answer-term coverage, and precision- and recall-style retrieval-context metrics.
Deterministic local embeddings (hashing_numpy) — reproducible, no downloaded models and no Claude/OpenAI/Gemini calls.
Three ways to use the same engine: a CLI/eval pipeline, a local FastAPI web app, and a local stdio MCP server for AI coding clients.
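
To illustrate what "deterministic local embeddings" can mean in practice, here is a minimal sketch of the hashing trick using only the Python standard library. It is not the project's hashing_numpy backend: the function names, the SHA-256 digest, and the 64-dimension default are all assumptions chosen for the example.

```python
import hashlib
import math
import re


def hashing_embed(text: str, dim: int = 64) -> list[float]:
    """Map text to a fixed-size, L2-normalised vector via the hashing trick."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # A stable digest (unlike Python's salted hash()) gives the same
        # bucket on every run and every machine.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # An empty or fully cancelled vector is returned as all zeros.
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for normalised vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Because nothing is learned or downloaded, the same text always produces the same vector, which is what makes index builds and retrieval rankings reproducible. The trade-off is that similarity is lexical: two passages score as close only when they share tokens.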

Why it exists

Short-answer correctness and evidence grounding can diverge: a model can give a plausible answer that the document doesn't actually support. LongBook Verifier separates those signals so you can audit whether outputs and retrieval are grounded in long manuscripts — useful for manuscript QA and reproducible long-document evaluation.
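
The divergence described above can be shown with a toy term-coverage score. This is a simplified sketch, not the verifier's actual metric code: the tokenizer, the term_coverage helper, and the example sentences are invented for illustration.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def term_coverage(expected_terms: list[str], text: str) -> float:
    """Fraction of expected terms that appear in text (0.0 if none expected)."""
    wanted = {term.lower() for term in expected_terms}
    if not wanted:
        return 0.0
    return len(wanted & tokens(text)) / len(wanted)


# Made-up example: the answer contains the expected answer term,
# but the retrieved passage contains none of the evidence terms.
answer = "The captain hid the map in the lighthouse."
retrieved_context = "The storm lasted three days and the harbour stayed closed."

answer_score = term_coverage(["lighthouse"], answer)
evidence_score = term_coverage(["map", "lighthouse", "hid"], retrieved_context)
print(answer_score, evidence_score)  # prints: 1.0 0.0
```

A single blended score would hide this case. Reporting the two numbers separately exposes an answer that looks right but is not backed by the retrieved text.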

Live product

A hosted, public version of this evaluation runs as BookProof:

➡️ BookProof — try it online

BookProof is an existing, related public product. It is not required to run anything in this repository locally.

Architecture

One evaluation engine, three access surfaces, plus the hosted product:

Research / evaluation engine (src/) — chunking, deterministic index build, retrieval, the five methods, metrics, and claim verification.
Local FastAPI web app (product_mvp/server_longbook_verifier.py) — upload a document + golden questions in the browser and get scored locally.
Local stdio MCP server (product_mvp/mcp_longbook_server.py) — exposes the engine to MCP clients (e.g. Claude Code / Codex) over stdio, locally only.
BookProof public product/API — a deployed instance offering a rate-limited public demo and a separate token-gated verification API (see BookProof API).

See docs/ARCHITECTURE.md for a diagram.

Research methods

The benchmark reports evidence-term coverage, answer-term coverage, retrieval context recall, and task-completion behavior separately — because short-answer correctness and evidence grounding can diverge. Two experiments are documented in paper/:

Experiment A — a pilot single-book benchmark (~64k words, 40 gold questions, 5 retrieval methods, 5 external consumer AI systems under a free-tier protocol).
Experiment B — an extended stress test on a 240,767-word corpus (~320,220 tokens, 80 gold questions, 5 retrieval methods).

These are a pilot plus stress-test package, not a universal model ranking or state-of-the-art claim. Full methods and results are in paper/; the research package is archived at DOI 10.5281/zenodo.20513116.

Quick start

python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt

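The activation command above is for Windows; on macOS or Linux, run source .venv/bin/activate instead. Details: docs/LOCAL_RUN.md.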

Run the local web app

python -m uvicorn product_mvp.server_longbook_verifier:app --host 127.0.0.1 --port 8078

Then open http://127.0.0.1:8078/ and upload a document + golden questions.

Use the local MCP server

python product_mvp\mcp_longbook_server.py

The MCP server runs locally over stdio; an MCP client launches it as a subprocess. It also requires the mcp package: python -m pip install mcp. See docs/MCP.md.

MCP client configuration

Generic mcpServers entry (replace the path with the location of your cloned repo — see mcp_config.example.json):

{
  "mcpServers": {
    "longbook-proof-local": {
      "command": "python",
      "args": ["EDIT_THIS_PATH/product_mvp/mcp_longbook_server.py"]
    }
  }
}
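
If you would rather generate that entry than edit the path by hand, a small script along these lines could do it. This is a sketch: build_mcp_config is a hypothetical helper, and /path/to/your/clone is a placeholder for your own checkout.

```python
import json
from pathlib import Path


def build_mcp_config(repo_root: str, server_name: str = "longbook-proof-local") -> dict:
    """Return an mcpServers entry that points at the cloned repo's MCP script."""
    script = Path(repo_root) / "product_mvp" / "mcp_longbook_server.py"
    return {
        "mcpServers": {
            server_name: {
                "command": "python",
                # Forward slashes are accepted by Python on Windows too and
                # avoid backslash escaping inside JSON.
                "args": [script.as_posix()],
            }
        }
    }


config = build_mcp_config("/path/to/your/clone")  # placeholder path
print(json.dumps(config, indent=2))
```

Paste the printed JSON into your client's MCP configuration file. Where that file lives depends on the client.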

MCP tools

All tools are local-only; each is either read-only or runs an allowlisted local script:

Tool	Description
`longbook_status`	Read-only project summary (allowed roots, scripts, report/run counts, default backend).
`list_books`	List `.txt` / `.md` / `.docx` files under the project book folder (or a sub-path inside the project).
`list_reports`	List report-like files (`.md` / `.txt` / `.json` / `.jsonl` / `.csv`).
`read_report`	Read a report-like file with truncation.
`run_chunking`	Chunk a book into a `.jsonl` (`src/chunk_book.py`).
`run_index_build`	Build a retrieval index (`src/build_index.py`, `hashing_numpy`).
`run_retrieve`	Return ranked chunks from an existing local index (`src/retrieve.py`).
`run_eval`	Run a retrieval-evaluation method over a book + questions (`src/run_eval.py`).
`generate_report_tables`	Build summary CSV tables from run folders (`src/report_tables.py`).
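
For orientation, this is roughly what a tool invocation looks like on the wire. MCP clients send JSON-RPC 2.0 requests with the method tools/call, one JSON message per line over stdio. The sketch only builds the message. It assumes longbook_status takes no arguments, since the tool argument schemas are not listed here, and a real client must complete the MCP initialize handshake before calling tools.

```python
import json
from typing import Optional


def tool_call_request(request_id: int, tool: str, arguments: Optional[dict] = None) -> str:
    """Serialise an MCP tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    # json.dumps emits no raw newlines by default, which the stdio
    # transport requires (messages are newline-delimited).
    return json.dumps(message)


line = tool_call_request(1, "longbook_status")  # assumed to need no arguments
print(line)
```

In normal use your MCP client builds and sends these messages for you, so the sketch is mainly useful for debugging or for reading server logs.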

Repository structure

src/             evaluation engine (chunking, index, retrieval, methods, metrics, claim checks)
product_mvp/     local FastAPI web app + local stdio MCP server + site/ frontend
paper/           research write-ups (methods, results, limitations) + CITATION.cff
docs/            LOCAL_RUN, MCP, ARCHITECTURE
scripts/         Windows helpers (run_web.bat, run_mcp.bat)

Data policy

Copyrighted corpora, source manuscripts, private evaluation data, and user uploads are intentionally excluded from this repository. The tools operate on documents you provide.

Security model

Confirmed in product_mvp/mcp_longbook_server.py: the MCP server runs over local stdio only and calls an allowlisted set of local scripts. It does not provide shell execution or remote access. Specifically, it:

runs no arbitrary shell commands (no shell=True);
enforces strict read/write path checks (reads confined to the project root; writes confined to outputs/, reports/, and product_mvp/runs/);
rejects paths containing .env / secret / key / token / password;
runs child scripts with stdin=DEVNULL and applies a timeout;
makes no cloud or external model calls.
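
The checks described above can be sketched as follows. This is an illustration of the pattern, not the code in product_mvp/mcp_longbook_server.py: the function names, error messages, and the 120-second timeout are assumptions, while the script list and folder names are taken from this README.

```python
import subprocess
import sys
from pathlib import Path

BLOCKED_FRAGMENTS = (".env", "secret", "key", "token", "password")
WRITE_DIRS = ("outputs", "reports", "product_mvp/runs")
ALLOWED_SCRIPTS = {
    "src/chunk_book.py", "src/build_index.py", "src/retrieve.py",
    "src/run_eval.py", "src/report_tables.py",
}


def is_inside(path: Path, root: Path) -> bool:
    """True if path is root itself or lies somewhere beneath it."""
    try:
        path.relative_to(root)
        return True
    except ValueError:
        return False


def check_path(candidate: str, project_root: str, for_write: bool = False) -> Path:
    """Resolve candidate and reject it unless it passes every path rule."""
    root = Path(project_root).resolve()
    resolved = (root / candidate).resolve()  # collapses ".." and follows symlinks
    if not is_inside(resolved, root):
        raise PermissionError("path escapes the project root")
    relative = resolved.relative_to(root).as_posix().lower()
    if any(fragment in relative for fragment in BLOCKED_FRAGMENTS):
        raise PermissionError("path looks sensitive")
    if for_write and not any(is_inside(resolved, root / d) for d in WRITE_DIRS):
        raise PermissionError("writes are limited to the allowlisted folders")
    return resolved


def run_allowlisted(script: str, args: list[str], project_root: str,
                    timeout: float = 120.0) -> subprocess.CompletedProcess:
    """Run an allowlisted script with no shell, no stdin, and a timeout."""
    if script not in ALLOWED_SCRIPTS:
        raise PermissionError(f"script not allowlisted: {script}")
    command = [sys.executable, str(Path(project_root) / script), *args]
    return subprocess.run(command, stdin=subprocess.DEVNULL, capture_output=True,
                          text=True, timeout=timeout, check=False)
```

Passing the command as a list, without shell=True, means arguments are never interpreted by a shell. Resolving the path before checking it means a candidate such as ../outside.txt is rejected rather than silently followed.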

BookProof API

The hosted BookProof product exposes:

a public, rate-limited demo endpoint (capped document size, capped questions, one run per IP per day, inputs deleted after processing), and
a separate token-gated verification API (authenticated via an X-BookProof-Token header) with a machine-readable spec endpoint.

No token is included in this repository.
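
As a rough sketch of how a client might attach the token, the snippet below builds (but does not send) an authenticated request with the standard library. Only the X-BookProof-Token header name comes from this README. The URL is a deliberate placeholder, the BOOKPROOF_TOKEN environment variable is a hypothetical name, and the JSON body is empty because the request schema is not documented here (consult the spec endpoint for it).

```python
import os
import urllib.request

# Placeholder only: the real endpoint URL is not given in this README.
API_URL = "https://example.invalid/bookproof/verify"


def build_request(payload: bytes, token: str) -> urllib.request.Request:
    """Build a POST request carrying the token in the X-BookProof-Token header."""
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-BookProof-Token": token,
        },
    )


# Read the token from the environment so it never lands in source control.
token = os.environ.get("BOOKPROOF_TOKEN", "")
request = build_request(b"{}", token)
```

Keeping the token in an environment variable rather than in a config file is consistent with the repository shipping without one.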

Limitations

Evaluation is lexical/retrieval-based and deterministic (hashing_numpy); it is not a semantic-embedding or model-graded benchmark.
The published results are a pilot + stress test, not a universal ranking or SOTA claim.
The MCP server expects the local project files and runs entirely on your machine.

License

MIT.
