Schema | oxidize-pdf

Server Configuration

Environment variables used to configure the server.

Name	Required	Description	Default
`OXIDIZE_WORKSPACE`	No	Path to your PDF workspace directory

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }
`logging`	{}
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`extensions`	{ "io.modelcontextprotocol/ui": {} }
`experimental`	{}
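
The capability flags above tell a client which optional notifications and requests to expect. A minimal sketch, assuming the standard MCP method names (`notifications/<feature>/list_changed`, `resources/subscribe`); the helper names are hypothetical and not part of this server:

```python
from typing import Any, Dict, List

# Capabilities copied from the table above.
SERVER_CAPABILITIES: Dict[str, Dict[str, Any]] = {
    "tools": {"listChanged": True},
    "logging": {},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "extensions": {"io.modelcontextprotocol/ui": {}},
    "experimental": {},
}


def expected_list_changed_notifications(capabilities: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the list_changed notification methods the server may emit."""
    methods = []
    for feature in ("tools", "prompts", "resources"):
        # A missing feature or a missing flag both count as "not supported".
        if capabilities.get(feature, {}).get("listChanged", False):
            methods.append(f"notifications/{feature}/list_changed")
    return methods


def can_subscribe_to_resources(capabilities: Dict[str, Dict[str, Any]]) -> bool:
    """True only if the server advertises resource subscriptions."""
    return bool(capabilities.get("resources", {}).get("subscribe", False))
```

With the values listed here, a client would only need to handle tool-list change notifications and should not attempt to subscribe to resources.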

Tools

Functions exposed to the LLM to take actions

Name	Description
`read_pdf`	Read a single PDF's document-level metadata without parsing its content. Returns a JSON object with: page_count, is_encrypted, version, title, author, subject, keywords, and (when include_page_details=true) a `pages` array of {index, width, height, rotation}. Read-only: never modifies the file. Use this to inspect what a PDF is before deciding how to process it. For structural validation, corruption/PDF-A checks, or comparing two files use analyze_pdf instead; for the actual text use extract_text. Encrypted files without a password return {is_encrypted, locked, message} rather than metadata.
`extract_text`	Extract the raw, unformatted text of a PDF as a single string. Returns JSON {text, page_count} (plus `page` when a specific page was requested). Read-only. Use this when you want the plain reading text. If you need Markdown structure or chunking for LLM/RAG pipelines use convert_pdf; if you need each text run with its on-page coordinates and font use extract_entities.
`convert_pdf`	Convert a whole PDF into a text representation for downstream LLM use. Returns JSON: {content, format} for 'markdown', or {chunks, format} for 'chunks'/'rag' (each chunk carries its index and page_numbers; rag chunks add token_estimate and heading_context). Read-only. Use this when you need structure or chunking. If you just want the raw reading text use extract_text; for per-run coordinates use extract_entities.
`analyze_pdf`	Inspect a PDF's structural health or conformance (does not read content). Returns JSON keyed by the chosen check: validate → {valid, error_count, warning_count}; corruption → {corrupted, corruption_type, severity, found_pages, file_size, errors}; compliance → {level, is_valid, error_count, warning_count, compliance_percentage}; compare → {structurally_equivalent, content_equivalent, similarity_score, difference_count}. Read-only. Use this to verify a file is well-formed, archival-grade, or identical to another. To read titles/author/page counts use read_pdf; for the text use extract_text.
`extract_entities`	Extract every text run of a PDF together with its layout geometry. Returns JSON {path, entities, entity_count, page_count} where each entity is {text, page (0-based), x, y, font_size, font_name}. Coordinates are in PDF points with the origin at the bottom-left of the page. Read-only. Use this for layout-aware tasks (table reconstruction, positional lookup, locating a label on the page). If you only need the reading text without coordinates, use extract_text; for Markdown or RAG chunks use convert_pdf.
`manipulate_pdf`	Restructure the pages of existing PDF file(s) and write a new PDF. Each operation writes to output_path (overwriting any existing file) and returns JSON {status, operation}; on a missing required argument it returns {error, code}. Per-operation requirements: merge→input_paths; rotate→degrees; extract_pages→page_indices; overlay→overlay_path; split→output_path is a directory. Page indices are 0-based. Use this for page-level structure. To stamp notes/highlights use annotate_pdf; to fill form fields use manage_forms; to encrypt use secure_pdf.
`annotate_pdf`	Stamp a sticky note or highlight onto a page of an existing PDF. Writes the annotated copy to output_path (overwriting any existing file) and returns JSON {status, annotation_type}; out-of-range pages or coordinates outside the page bounds return {error, code}. Coordinates are in PDF points with the origin at the bottom-left. Use this to mark up a document. To reorder/rotate/overlay whole pages use manipulate_pdf; to author a new PDF use create_pdf.
`manage_forms`	Create, fill, read or validate PDF form fields. Returns JSON per operation: create→{status, fields_created}; fill→{status, fields_filled}; read→{path, fields, page_count}; validate→{valid, fields}. 'create'/'fill' write output_path (overwriting); 'read'/'validate' are read-only computations. Honest limitations: 'fill' lays the values into a new overlay at computed positions rather than mapping them onto the original AcroForm widgets; 'read' returns the page's text runs (not declared AcroForm field objects); 'validate' currently enforces only a non-empty (required) rule per value. For page-structure edits use manipulate_pdf; to read prose use extract_text.
`secure_pdf`	Encrypt a PDF, report its encryption status, or verify its signatures. Returns JSON per operation: encrypt→{status, operation, page_count, note}; permissions→{path, is_encrypted, unlocked, permissions}; verify_signatures→{path, signatures, signature_count}. 'permissions' and 'verify_signatures' are read-only; 'encrypt' writes output_path (overwriting). Caveat: 'encrypt' rebuilds the document from its text, preserving content and layout but possibly dropping images, embedded fonts and vector graphics (the current API has no in-place encryption). To encrypt a PDF you are authoring, pass the passwords to save_pdf instead.
`create_pdf`	Open an in-memory PDF building session; the first step of authoring a PDF. Returns JSON {session_id, status, page_size}. No file is written here; this only allocates a session (with one blank starting page) held in server memory and subject to a TTL. Not idempotent: each call creates a new session. Workflow: create_pdf → add_pdf_content (text / new pages, repeatable) → save_pdf (writes the file and closes the session). To annotate or fill an existing PDF instead of authoring one, use annotate_pdf or manage_forms.
`add_pdf_content`	Append text or a new page to an open create_pdf session (step 2 of 3). Mutates the in-memory session; nothing is written to disk until save_pdf. Returns JSON {status, session_id, page_count} on success, or {error, code} if the session is missing/inactive or required text fields are absent. Coordinates use PDF points with the origin at the bottom-left of the page. Call repeatedly to build up pages, then call save_pdf. This only works on a session from create_pdf; to add notes/highlights to an existing PDF file use annotate_pdf instead.
`save_pdf`	Render an open create_pdf session to a PDF file (step 3 of 3, terminal). Builds a Document from the session's accumulated pages, writes it to output_path (overwriting any existing file), then deletes the session, so the session_id is no longer usable afterwards. Returns JSON {status, path, page_count}, or {error, code} if the session is missing. Only finalizes sessions created via create_pdf/add_pdf_content. To encrypt an already-saved PDF use secure_pdf; to add annotations use annotate_pdf.
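
The per-operation requirements of manipulate_pdf and the three-step authoring workflow can be sketched client-side as follows. This is an illustration under stated assumptions, not the server's own code: the request envelope is the standard MCP `tools/call` JSON-RPC shape, `output_path`, `session_id` and the per-operation argument names come from the descriptions above, while the `operation` and `text` argument names and the `call` transport hook are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

# Extra argument each manipulate_pdf operation requires, per the tool description.
# "split" needs no extra argument, but its output_path must be a directory.
MANIPULATE_REQUIRED: Dict[str, Optional[str]] = {
    "merge": "input_paths",
    "rotate": "degrees",
    "extract_pages": "page_indices",
    "overlay": "overlay_path",
    "split": None,
}


def tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def manipulate_arguments(operation: str, output_path: str, **extra: Any) -> Dict[str, Any]:
    """Check the per-operation requirement before sending manipulate_pdf."""
    if operation not in MANIPULATE_REQUIRED:
        raise ValueError(f"unknown operation: {operation}")
    required = MANIPULATE_REQUIRED[operation]
    if required is not None and required not in extra:
        raise ValueError(f"{operation} requires {required}")
    # "operation" as an argument name is an assumption; the rest is from the description.
    return {"operation": operation, "output_path": output_path, **extra}


def author_pdf(call: Callable[[str, Dict[str, Any]], Dict[str, Any]],
               lines: List[str], output_path: str) -> Dict[str, Any]:
    """Run create_pdf -> add_pdf_content (repeated) -> save_pdf through `call`.

    `call(name, arguments)` is a hypothetical hook that sends the tool call
    and returns the tool's decoded JSON result.
    """
    session_id = call("create_pdf", {})["session_id"]
    for line in lines:
        # "text" is an assumed argument name.
        result = call("add_pdf_content", {"session_id": session_id, "text": line})
        if "error" in result:
            raise RuntimeError(f"add_pdf_content failed: {result}")
    # save_pdf writes the file and deletes the session; session_id is unusable afterwards.
    return call("save_pdf", {"session_id": session_id, "output_path": output_path})
```

Validating locally avoids a round trip that would only return {error, code}. Because create_pdf is not idempotent, a retry wrapper around `author_pdf` would leave orphaned sessions until their TTL expires.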

Prompts

Interactive templates invoked by user choice

Name	Description
`create-invoice`	Guide the LLM through creating a PDF invoice.
`extract-for-rag`	Guide the LLM through extracting PDF content for RAG ingestion.
`review-pdf`	Guide the LLM through a comprehensive PDF review.
`compare-documents`	Guide the LLM through comparing two PDF documents.
`fill-form`	Guide the LLM through filling a PDF form.

Resources

Contextual data attached and managed by the client

Name	Description
`get_fonts`	List available built-in PDF fonts.
`get_page_sizes`	List available page sizes with dimensions in points.
`get_capabilities`	List server capabilities: available tools, version, features.
`get_version`	Return version information for oxidize-pdf and the MCP server.
`get_workspace`	List PDF files in the configured workspace directory.
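
Page sizes are reported in points, and the tools that take positions (annotate_pdf, add_pdf_content, extract_entities) use a bottom-left origin. A minimal sketch of the unit and origin conversions a client might need; the helper names are hypothetical, and the US Letter dimensions are the standard 8.5 x 11 inch values rather than output read from `get_page_sizes`:

```python
from typing import Tuple

POINTS_PER_INCH = 72.0  # default PDF user space: 1 point = 1/72 inch


def inches_to_points(inches: float) -> float:
    """Convert a length in inches to PDF points."""
    return inches * POINTS_PER_INCH


def mm_to_points(mm: float) -> float:
    """Convert a length in millimetres to PDF points (25.4 mm per inch)."""
    return mm / 25.4 * POINTS_PER_INCH


def top_left_to_pdf(x: float, y_from_top: float, page_height: float) -> Tuple[float, float]:
    """Map a top-left-origin position (in points) to bottom-left-origin PDF coordinates."""
    return (x, page_height - y_from_top)


# US Letter: 8.5in x 11in, i.e. 612 x 792 points.
letter_width = inches_to_points(8.5)
letter_height = inches_to_points(11)

# A point one inch in from the left edge and one inch down from the top edge.
x, y = top_left_to_pdf(72.0, 72.0, letter_height)
```

Only the y axis flips; x is measured from the left edge in both conventions.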
