pdf-context
Server Configuration
Environment variables that configure the server. All of them are optional; an unset variable falls back to the listed default, where one is given.
| Name | Required | Description | Default |
|---|---|---|---|
| STORAGE_DIR | No | Legacy alternative for storage folder (use PDF_CONTEXT_STORAGE_DIR instead) | |
| PDF_DATA_DIR | No | Legacy alternative for PDF folder (use PDF_CONTEXT_PDF_DATA_DIR instead) | |
| PDF_CONTEXT_STORAGE_DIR | No | SQLite + Chroma storage folder (must not overlap PDF folder) | storage |
| PDF_CONTEXT_PDF_DATA_DIR | No | PDF watch folder (must not overlap storage) | data/pdfs |
| PDF_CONTEXT_WATCH_ENABLED | No | Auto-ingest on folder changes | true |
| PDF_CONTEXT_EMBEDDING_MODEL | No | Local embedding model | all-MiniLM-L6-v2 |
| PDF_CONTEXT_EMBEDDING_PROVIDER | No | Embedding provider (sentence_transformers or ollama) | sentence_transformers |
| PDF_CONTEXT_CHECKPOINT_PAGE_INTERVAL | No | Resume checkpoint interval during large ingests | 50 |
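The two directory settings in the table above could be resolved along these lines. This is a minimal sketch, not the server's actual code: the helper names (`resolve_dir`, `overlaps`, `resolve_dirs`) are hypothetical, and the precedence (prefixed name first, then the legacy name, then the default) is an assumption inferred from the "use ... instead" notes. The overlap check reads "must not overlap" as "neither folder may equal or contain the other", which is also an assumption.

```python
import os
from pathlib import Path

# Defaults copied from the table above.
DEFAULTS = {
    "PDF_CONTEXT_STORAGE_DIR": "storage",
    "PDF_CONTEXT_PDF_DATA_DIR": "data/pdfs",
}

# Legacy names from the table, keyed by their preferred replacement.
LEGACY = {
    "PDF_CONTEXT_STORAGE_DIR": "STORAGE_DIR",
    "PDF_CONTEXT_PDF_DATA_DIR": "PDF_DATA_DIR",
}


def resolve_dir(name, env):
    """Assumed precedence: prefixed name, then legacy name, then default."""
    value = env.get(name) or env.get(LEGACY[name]) or DEFAULTS[name]
    return Path(value).resolve()


def overlaps(a, b):
    """True if the two paths are equal or one contains the other."""
    return a == b or a in b.parents or b in a.parents


def resolve_dirs(env=None):
    """Return (storage_dir, pdf_dir), rejecting overlapping folders."""
    env = os.environ if env is None else env
    storage = resolve_dir("PDF_CONTEXT_STORAGE_DIR", env)
    pdfs = resolve_dir("PDF_CONTEXT_PDF_DATA_DIR", env)
    if overlaps(storage, pdfs):
        raise ValueError(f"storage folder {storage} overlaps PDF folder {pdfs}")
    return storage, pdfs
```

Calling `resolve_dirs({})` yields the two defaults resolved against the current working directory. Setting `PDF_CONTEXT_STORAGE_DIR=data` would raise under this sketch, because `data` contains the default `data/pdfs` folder.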
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": false } |
| prompts | { "listChanged": false } |
| resources | { "subscribe": false, "listChanged": false } |
| experimental | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| search_pdf_context_toolA | Semantic search over PDFs indexed by this MCP server only. Use when the user asks for facts, quotes, definitions, or comparisons drawn from ingested PDFs; wants page citations; or names a document, chapter, or section in the corpus. Prefer chapter_id/section_id/page filters when scope is known. Do not use for general world knowledge, repo/code tasks, git, or questions that do not require content from this server's PDF folder. If unsure whether the question is about indexed PDFs, call list_documents first. |
| list_documents_toolA | List PDFs indexed by this MCP server with type and structure summary. Call first when unsure whether the user's question is about this corpus, when they ask what books/papers are available, or before search when no document name was given. Do not use for listing files in the git repo or filesystem outside this server's ingested index. |
| get_document_profile_toolA | Get document type, retrieval profile, and how to query a specific PDF. Use when starting work on a named document, when choosing between semantic search vs sequential reading, or when the user asks how a document is classified (textbook, paper, etc.). Do not use for documents not in this server's index or for general ML/PDF advice unrelated to a specific ingested file. |
| set_document_type_toolA | Override auto-classified document type for an indexed PDF. Use only when the user explicitly asks to change doc_type or when misclassification is confirmed and affects retrieval. Set reingest=true only if they want chunks re-built under the new profile. Do not use proactively on every document or for repo configuration tasks. |
| list_structure_toolA | Get the full table-of-contents / structure tree for an indexed PDF. Use when the user needs the complete outline, section hierarchy, or node IDs for navigation—not just top-level chapters. Do not use for repo directory trees or documents not in list_documents. |
| list_chapters_toolA | List chapters for an indexed PDF (primarily textbooks). Use when the user asks about chapter numbers, chapter titles, chapter IDs, or page ranges at the chapter level. Pair with get_section_content for reading; note outline end_page may equal start_page—use get_next_chunks to read further. Do not use for git branches, code modules, or non-PDF structure. |
| get_section_content_toolA | Get ordered text chunks for a chapter or section node in an indexed PDF. Use for structured reading, teaching a section, or fetching passage text when node_id is known (from list_chapters or list_structure). Prefer over search when the user wants sequential content at a known location. Do not use without a valid node_id from this document's structure tree. |
| get_next_chunks_toolA | Read the next sequential chunks in an indexed PDF from a cursor. Use after get_section_content or a prior get_next_chunks call to continue chapter-by-chapter or page-by-page reading. Pass cursor from the previous response's global_sequence_index. Do not use for unrelated search queries; use search_pdf_context instead. |
| get_ingest_status_toolA | Get ingest queue status and health for this MCP server's index. Use when search returns nothing, the user added new PDFs, ingestion may still be running, or they ask whether documents are ready. Check before assuming a document is missing from the corpus. Do not use on every message—only when index readiness or failures matter. |
| reingest_document_toolA | Force re-index of a PDF in this server's corpus. Use when the user explicitly requests re-ingest, after PDF file changes, or after set_document_type with reingest. Not for routine queries. Do not use proactively or for code/build tasks. |
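The sequential-reading pattern described for get_section_content and get_next_chunks (fetch a node, then keep passing the previous response's global_sequence_index as the cursor) might look like the sketch below. Only `node_id`, `cursor`, and `global_sequence_index` come from the descriptions above. The response shape (a top-level `global_sequence_index` next to a `chunks` list of `{"text": ...}` items), the `call_tool` callable, and the in-memory `fake_call_tool` stand-in are all assumptions made for illustration; the real payloads may differ.

```python
from typing import Callable, Iterator

# Tool names as they appear in the Name column of the table above.
SECTION_TOOL = "get_section_content_toolA"
NEXT_TOOL = "get_next_chunks_toolA"

# Hypothetical transport: (tool name, arguments) -> decoded response dict.
ToolCall = Callable[[str, dict], dict]


def read_section(call_tool: ToolCall, node_id: str, max_calls: int = 10) -> Iterator[str]:
    """Yield chunk texts for a node, then follow the cursor until no chunks remain."""
    response = call_tool(SECTION_TOOL, {"node_id": node_id})
    calls = 1
    while response.get("chunks"):  # assumed field name
        for chunk in response["chunks"]:
            yield chunk["text"]  # assumed field name
        if calls >= max_calls:
            break
        # Cursor comes from the previous response, as the tool description says.
        cursor = response["global_sequence_index"]
        response = call_tool(NEXT_TOOL, {"cursor": cursor})
        calls += 1


def fake_call_tool(name: str, args: dict) -> dict:
    """In-memory stand-in for a real MCP client, serving two chunks per call."""
    corpus = ["a", "b", "c", "d", "e"]
    start = 0 if name == SECTION_TOOL else args["cursor"]
    batch = corpus[start:start + 2]
    return {
        "chunks": [{"text": text} for text in batch],
        "global_sequence_index": start + len(batch),
    }


texts = list(read_section(fake_call_tool, "chapter-1"))
```

The `max_calls` cap is a design choice of this sketch: it keeps a client from paging through an entire book when only a section was requested.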
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/mematcha/pdf-context'