PageIndex MCP (self-hosted)
A self-hosted MCP server exposing PageIndex's
vectorless, reasoning-based document retrieval. The pageindex/ directory is a
vendored copy of VectifyAI's open-source PageIndex package (MIT licensed, see
pageindex/LICENSE.upstream).
How it works:
Ingest (app/ingest.py): builds a hierarchical "table of contents" tree for a PDF using an LLM (OpenAI by default, configurable via pageindex/config.yaml + LiteLLM). This costs a small amount of LLM usage, once per document.
Serve (app/server.py): exposes list_documents, get_document, get_document_structure, get_page_content as MCP tools over streamable HTTP, protected by a bearer token. The connecting agent (e.g. Claude) does the navigation/reasoning itself - serving is free after ingest.
Text files: anything that isn't a PDF (code, Jupyter notebooks, markdown, any UTF-8 file up to 10 MB) is stored as plain text without LLM ingest - instantly available, zero cost. The same MCP tools serve them, with 1-indexed line numbers taking the role of page numbers (e.g. get_page_content(doc_id, "1-200") returns the first 200 lines). Notebook outputs are stripped on upload; only markdown and code cells are kept.
Web UI (/): minimal document manager - create folders (projects), upload PDFs and text files (PDF ingest runs in a background worker, one at a time), watch processing status, delete documents. Unlock with the same bearer token; it is kept in the browser's localStorage.
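The line-number addressing for text files can be illustrated with a short sketch. The helpers parse_range and slice_lines are hypothetical and are not taken from app/server.py; they only show how a spec such as "1-200" maps onto 1-indexed, inclusive line numbers. Accepting a single number ("5") and clamping to the end of the file are assumptions, not documented behavior.

```python
def parse_range(spec: str) -> tuple[int, int]:
    """Parse "N" or "N-M" into a 1-indexed, inclusive (start, end) pair."""
    start_text, sep, end_text = spec.strip().partition("-")
    start = int(start_text)
    # A bare "N" is treated as the single line N (assumption).
    end = int(end_text) if sep else start
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {spec!r}")
    return start, end


def slice_lines(text: str, spec: str) -> str:
    """Return the requested lines of `text`; ranges past the end are clamped."""
    start, end = parse_range(spec)
    lines = text.splitlines()
    # Convert the 1-indexed inclusive range to a 0-indexed half-open slice.
    return "\n".join(lines[start - 1:end])
```

With this convention, slice_lines(text, "1-200") yields the first 200 lines, matching the get_page_content example above.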
Setup
1. Copy .env.example to .env and fill in:
   OPENAI_API_KEY - used only during ingest (tree building)
   PAGEINDEX_MCP_API_KEY - bearer token clients must send, e.g. openssl rand -hex 32
   optional: PAGEINDEX_MODEL (any LiteLLM model id for tree-building; default gpt-4o-2024-11-20) and PAGEINDEX_INGEST_WORKERS (parallel PDF ingests, default 2)
2. Build and start:
   docker compose up -d --build
3. Upload PDFs and text files via the web UI at https://<your-domain>/ (unlock with the PAGEINDEX_MCP_API_KEY). PDF ingest runs in the background; the list shows processing/done/failed per document. Text files are done immediately.
   Alternatively via CLI inside the container:
   docker compose exec pageindex-mcp python3 app/ingest.py /data/pdfs/lecture01.pdf --project "Machine Learning"
   Trees are saved to <data>/trees/<doc_id>.json and registered in <data>/documents.json.
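If openssl is not at hand, a token of the same shape can be produced with Python's standard library. This is a minimal sketch: generate_api_key is an illustrative name, and secrets.token_hex(32) returns 64 hexadecimal characters, the same length as the output of openssl rand -hex 32.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random hex token; 32 bytes gives 64 hex characters."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print a line that can be pasted into .env.
    print(f"PAGEINDEX_MCP_API_KEY={generate_api_key()}")
```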
Connecting an MCP client
{
"mcpServers": {
"pageindex-self": {
"type": "http",
"url": "https://<your-domain>/mcp",
"headers": {
"Authorization": "Bearer <PAGEINDEX_MCP_API_KEY>"
}
}
}
}
For Claude Code:
claude mcp add --transport http pageindex-self https://<your-domain>/mcp \
  --header "Authorization: Bearer <PAGEINDEX_MCP_API_KEY>"
For opencode (~/.config/opencode/opencode.json, or a project-level opencode.json):
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"pageindex-self": {
"type": "remote",
"url": "https://<your-domain>/mcp",
"enabled": true,
"headers": {
"Authorization": "Bearer <PAGEINDEX_MCP_API_KEY>"
}
}
}
}
To keep the token out of the config file, opencode supports env substitution:
"Authorization": "Bearer {env:PAGEINDEX_MCP_API_KEY}"
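For clients without env substitution, the generic mcpServers entry from the first example in this section can be generated from the environment rather than edited by hand. A minimal sketch: build_client_config is a hypothetical helper and example.com is a placeholder domain; only the JSON shape comes from the example in this section.

```python
import json
import os


def build_client_config(domain: str, token: str, name: str = "pageindex-self") -> dict:
    """Build the generic mcpServers entry for a self-hosted instance."""
    return {
        "mcpServers": {
            name: {
                "type": "http",
                "url": f"https://{domain}/mcp",
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }


if __name__ == "__main__":
    # Fall back to the placeholder if the variable is not set.
    token = os.environ.get("PAGEINDEX_MCP_API_KEY", "<PAGEINDEX_MCP_API_KEY>")
    print(json.dumps(build_client_config("example.com", token), indent=2))
```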
Deployment
The compose file attaches the service to the external dokploy-network, so in
Dokploy you only need to add a domain pointing at service pageindex-mcp,
port 8000 (Traefik handles TLS). The container port is intentionally not
published on the host - the bearer token must only travel over HTTPS.
For plain local use (no Dokploy), swap the networks section for the
commented-out 127.0.0.1 port binding in docker-compose.yml.
GET /health is unauthenticated and returns ok - useful for uptime checks.
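An uptime check against /health might look like the sketch below. The function names are hypothetical, and the check assumes the endpoint answers with status 200 and a plain-text body of ok; if the server wraps the value (for example in JSON), is_healthy would need adjusting.

```python
import urllib.request


def is_healthy(status: int, body: bytes) -> bool:
    """Assumed contract: HTTP 200 with a body that is exactly "ok"."""
    return status == 200 and body.strip() == b"ok"


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """GET <base_url>/health without credentials and report success."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.status, resp.read())
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False
```

Calling check_health("https://<your-domain>") with the real domain returns True only when the service is reachable and reports ok.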
Persistence
../files/data/ (PDFs, generated trees, registry) is bind-mounted and persists
across rebuilds/restarts. On Dokploy this is the app's files storage dir,
which survives redeploys (the code dir does not). Back it up if you don't want
to re-run ingest.
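A backup of the data directory can be as simple as a timestamped archive. The sketch below is one way to do it, not part of the project: backup_data_dir and the archive naming are assumptions, and the destination should live outside the data directory.

```python
import tarfile
import time
from pathlib import Path


def backup_data_dir(data_dir: str, dest_dir: str) -> Path:
    """Write <dest_dir>/pageindex-data-<timestamp>.tar.gz and return its path."""
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"data directory not found: {src}")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"pageindex-data-{time.strftime('%Y%m%d-%H%M%S')}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store entries under the directory's own name, e.g. data/trees/...
        tar.add(src, arcname=src.name)
    return archive
```

For example, backup_data_dir("../files/data", "/var/backups/pageindex") would archive the PDFs, trees, and registry together; the backup path here is only an illustration.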