pdf-context

by mematcha

Server Configuration

Describes the environment variables used to configure the server. All of them are optional.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| STORAGE_DIR | No | Legacy alternative for storage folder (use PDF_CONTEXT_STORAGE_DIR instead) | |
| PDF_DATA_DIR | No | Legacy alternative for PDF folder (use PDF_CONTEXT_PDF_DATA_DIR instead) | |
| PDF_CONTEXT_STORAGE_DIR | No | SQLite + Chroma storage folder (must not overlap PDF folder) | storage |
| PDF_CONTEXT_PDF_DATA_DIR | No | PDF watch folder (must not overlap storage) | data/pdfs |
| PDF_CONTEXT_WATCH_ENABLED | No | Auto-ingest on folder changes | true |
| PDF_CONTEXT_EMBEDDING_MODEL | No | Local embedding model | all-MiniLM-L6-v2 |
| PDF_CONTEXT_EMBEDDING_PROVIDER | No | Embedding provider (sentence_transformers or ollama) | sentence_transformers |
| PDF_CONTEXT_CHECKPOINT_PAGE_INTERVAL | No | Resume checkpoint interval during large ingests | 50 |
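
The folder variables listed above can be read as a small resolution routine. The following is a minimal sketch, not the server's actual code: it assumes that the `PDF_CONTEXT_*` names take precedence over the legacy names, and that "overlap" means the two folders are identical or one is nested inside the other. The function name `resolve_dirs` is hypothetical.

```python
import os
from pathlib import Path


def resolve_dirs(env=None):
    """Resolve the storage and PDF folders from environment variables.

    Assumption: PDF_CONTEXT_* names win over the legacy names; the
    defaults are the ones listed in the configuration table.
    """
    env = os.environ if env is None else env
    storage = env.get("PDF_CONTEXT_STORAGE_DIR") or env.get("STORAGE_DIR") or "storage"
    pdfs = env.get("PDF_CONTEXT_PDF_DATA_DIR") or env.get("PDF_DATA_DIR") or "data/pdfs"
    storage_path = Path(storage).resolve()
    pdf_path = Path(pdfs).resolve()

    # Assumption: "overlap" means equal paths or one nested in the other.
    if (
        storage_path == pdf_path
        or storage_path in pdf_path.parents
        or pdf_path in storage_path.parents
    ):
        raise ValueError("storage folder and PDF folder must not overlap")
    return storage_path, pdf_path
```

With no variables set, this resolves to `storage` and `data/pdfs` under the current working directory, which do not overlap.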

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | { "listChanged": false } |
| prompts | { "listChanged": false } |
| resources | { "subscribe": false, "listChanged": false } |
| experimental | {} |
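
In MCP, a `listChanged` value of `false` indicates that the server does not send list-changed notifications for that area, and `subscribe: false` indicates that resource subscriptions are not supported. A minimal sketch of how a client might read these flags follows; the helper name `supports` is hypothetical.

```python
# Capabilities as advertised by the server (copied from the listing above).
CAPABILITIES = {
    "tools": {"listChanged": False},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}


def supports(capabilities, area, flag):
    """Return True only if the capability area exists and the flag is set."""
    return bool(capabilities.get(area, {}).get(flag, False))


# With every flag false, a client should not wait for list-changed
# notifications and may need to re-list tools itself to see changes.
needs_manual_refresh = not supports(CAPABILITIES, "tools", "listChanged")
```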

Tools

Functions exposed to the LLM to take actions

search_pdf_context_tool

Semantic search over PDFs indexed by this MCP server only.

Use when the user asks for facts, quotes, definitions, or comparisons drawn from ingested PDFs; wants page citations; or names a document, chapter, or section in the corpus. Prefer chapter_id/section_id/page filters when scope is known.

Do not use for general world knowledge, repo/code tasks, git, or questions that do not require content from this server's PDF folder. If unsure whether the question is about indexed PDFs, call list_documents first.
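
A scoped search can be expressed as an MCP `tools/call` request. The sketch below only builds the JSON-RPC message. It assumes the tool is registered as `search_pdf_context_tool` and accepts a `query` argument; neither the argument schema nor the ID formats are documented here, so `query` and the example value `"ch-3"` are hypothetical.

```python
import json


def build_search_call(query, request_id=1, **filters):
    """Build an MCP `tools/call` JSON-RPC request for the search tool.

    Assumptions: the tool name and the `query` argument. Only
    chapter_id, section_id and page filters are named in the tool
    description; their value formats are not documented.
    """
    arguments = {"query": query}
    # Omit filters that were not supplied, so only known scope is sent.
    arguments.update({k: v for k, v in filters.items() if v is not None})
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_pdf_context_tool", "arguments": arguments},
    }


# "ch-3" is a made-up chapter ID used only for illustration.
request = build_search_call("definition of entropy", chapter_id="ch-3", section_id=None)
print(json.dumps(request, indent=2))
```

Leaving out unknown filters keeps the call as broad as the available information allows, which matches the advice to filter only when the scope is known.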

list_documents_tool

List PDFs indexed by this MCP server with type and structure summary.

Call first when unsure whether the user's question is about this corpus, when they ask what books/papers are available, or before search when no document name was given.

Do not use for listing files in the git repo or filesystem outside this server's ingested index.

get_document_profile_tool

Get document type, retrieval profile, and how to query a specific PDF.

Use when starting work on a named document, when choosing between semantic search vs sequential reading, or when the user asks how a document is classified (textbook, paper, etc.).

Do not use for documents not in this server's index or for general ML/PDF advice unrelated to a specific ingested file.

set_document_type_tool

Override auto-classified document type for an indexed PDF.

Use only when the user explicitly asks to change doc_type or when misclassification is confirmed and affects retrieval. Set reingest=true only if they want chunks re-built under the new profile.

Do not use proactively on every document or for repo configuration tasks.

list_structure_tool

Get the full table-of-contents / structure tree for an indexed PDF.

Use when the user needs the complete outline, section hierarchy, or node IDs for navigation—not just top-level chapters.

Do not use for repo directory trees or documents not in list_documents.

list_chapters_tool

List chapters for an indexed PDF (primarily textbooks).

Use when the user asks about chapter numbers, chapter titles, chapter IDs, or page ranges at the chapter level. Pair with get_section_content for reading; note outline end_page may equal start_page—use get_next_chunks to read further.

Do not use for git branches, code modules, or non-PDF structure.

get_section_content_tool

Get ordered text chunks for a chapter or section node in an indexed PDF.

Use for structured reading, teaching a section, or fetching passage text when node_id is known (from list_chapters or list_structure). Prefer over search when the user wants sequential content at a known location.

Do not use without a valid node_id from this document's structure tree.

get_next_chunks_tool

Read the next sequential chunks in an indexed PDF from a cursor.

Use after get_section_content or a prior get_next_chunks call to continue chapter-by-chapter or page-by-page reading. Pass cursor from the previous response's global_sequence_index.

Do not use for unrelated search queries; use search_pdf_context instead.
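
The cursor-based reading flow described above can be sketched as a loop. Here `call_tool` is a hypothetical client function that returns a tool result as a dict. The `cursor` argument and the `global_sequence_index` field come from the tool description; the `document`, `node_id`, `chunks` and `text` names, the `_tool` suffix on tool names, and the reading of `global_sequence_index` as the next start position are all assumptions.

```python
def read_sequentially(call_tool, document, node_id, max_calls=3):
    """Collect chunk text starting at a node, then follow the cursor.

    Stops when the response carries no cursor, returns no chunks, or
    after `max_calls` follow-up requests.
    """
    result = call_tool("get_section_content_tool", {"document": document, "node_id": node_id})
    texts = [c["text"] for c in result["chunks"]]
    for _ in range(max_calls):
        cursor = result.get("global_sequence_index")
        if cursor is None or not result["chunks"]:
            break
        result = call_tool("get_next_chunks_tool", {"document": document, "cursor": cursor})
        texts.extend(c["text"] for c in result["chunks"])
    return texts


# Stand-in for a real MCP client, used only to exercise the loop.
def fake_call_tool(name, arguments):
    corpus = ["intro", "method", "results", "summary"]
    start = 0 if name == "get_section_content_tool" else arguments["cursor"]
    chunks = [{"text": t} for t in corpus[start:start + 2]]
    end = start + len(chunks)
    return {"chunks": chunks, "global_sequence_index": end if end < len(corpus) else None}


pages = read_sequentially(fake_call_tool, "book.pdf", "node-1")
```

Capping the number of follow-up calls keeps a long chapter from being pulled in one go; a real client would likely let the user decide when to continue.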

get_ingest_status_tool

Get ingest queue status and health for this MCP server's index.

Use when search returns nothing, the user added new PDFs, ingestion may still be running, or they ask whether documents are ready. Check before assuming a document is missing from the corpus.

Do not use on every message—only when index readiness or failures matter.
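
The guidance above amounts to checking index readiness only after a search comes back empty. A minimal sketch, assuming a hypothetical `call_tool` client function; the `query` argument, the `results`, `pending` and `failed` fields, and the `_tool` suffix on tool names are assumptions rather than documented parts of the server's responses.

```python
def search_with_readiness_check(call_tool, query):
    """Search, and on an empty result consult ingest status once."""
    hits = call_tool("search_pdf_context_tool", {"query": query}).get("results", [])
    if hits:
        return {"status": "ok", "results": hits}
    # Only reached when the search found nothing.
    status = call_tool("get_ingest_status_tool", {})
    if status.get("pending", 0) > 0:
        return {"status": "ingesting", "results": []}
    if status.get("failed", 0) > 0:
        return {"status": "ingest_failed", "results": []}
    return {"status": "no_match", "results": []}


# Stand-in client: the search finds nothing while two PDFs are queued.
def fake_call_tool(name, arguments):
    if name == "get_ingest_status_tool":
        return {"pending": 2, "failed": 0}
    return {"results": []}


outcome = search_with_readiness_check(fake_call_tool, "heat equation")
```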

reingest_document_tool

Force re-index of a PDF in this server's corpus.

Use when the user explicitly requests re-ingest, after PDF file changes, or after set_document_type with reingest. Not for routine queries.

Do not use proactively or for code/build tasks.

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mematcha/pdf-context'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.
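
The curl request above can also be issued from Python's standard library. This is a minimal sketch that assumes the endpoint returns a JSON body, which the listing does not state; the helper names `server_url` and `fetch_server` are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner, name, timeout=10):
    """GET the server record and decode it, assuming a JSON response body."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as resp:
        return json.load(resp)


# Same URL as in the curl command; call fetch_server(...) to perform the request.
url = server_url("mematcha", "pdf-context")
```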