convert_pdf_to_card_html
Convert a local PDF into a portable, source-linked HTML reader with preserved text, page previews, and cropped tables/figures as images.
Instructions
The generated reader is static HTML with embedded assets. It keeps source text intact, includes source-page previews for verification, and crops detected tables, figures, and display formulas as images. Optional MCP sampling is limited to validated style-token choices or boundary-only card polish; raw CSS and source-text rewrites are rejected.
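The two sampling guardrails described above can be illustrated with a minimal sketch. It assumes a hypothetical allow-list of style tokens and a hypothetical representation of cards as a list of strings; the token names, function names, and data shapes below are illustrative and are not taken from the tool itself.

```python
# Hypothetical allow-list; the tool's real token names are not documented here.
ALLOWED_TOKENS = {
    "accent": {"blue", "green", "slate"},
    "density": {"compact", "comfortable"},
}


def validate_style_choice(choice: dict) -> dict:
    """Accept only allow-listed token/value pairs; anything else (e.g. raw CSS) raises."""
    for key, value in choice.items():
        if key not in ALLOWED_TOKENS:
            raise ValueError(f"unknown style token: {key!r}")
        if value not in ALLOWED_TOKENS[key]:
            raise ValueError(f"value {value!r} is not allowed for {key!r}")
    return dict(choice)


def is_boundary_only(cards_before: list, cards_after: list) -> bool:
    """True if only card boundaries moved, i.e. the concatenated text is identical."""
    return "".join(cards_before) == "".join(cards_after)
```

Under these assumptions, a choice such as `{"accent": "blue"}` passes, a key such as `"css"` raises `ValueError`, and merging two cards passes the boundary check while inserting or changing a single character fails it.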
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| ocr | No | Try optional OCR fallback for image-only PDFs. | |
| theme | No | Reader theme name. The default soft theme is calm and minimal. | soft |
| title | No | Optional reader title override. | |
| offline | No | Use only already-cached optional ML models. | |
| pdf_path | Yes | Absolute or relative path to the input PDF. | |
| max_pages | No | Optional limit for very large PDFs. | |
| standalone | No | Keep true to embed images, CSS, and JavaScript in one HTML file. | |
| output_path | No | Optional path for the generated HTML file. | |
| text_engine | No | "char_geometry" or "pdfplumber_words". Defaults to geometry-based spacing. | char_geometry |
| style_engine | No | "fixed", "pdf", or "sampling". "fixed" preserves the original soft reader palette. "pdf" derives a bounded style from local PDF visuals. "sampling" asks the host LLM to choose validated style tokens from local PDF style hints. | |
| table_engine | No | "auto", "pdfplumber", or "gmft". Auto uses gmft when installed. | auto |
| model_cache_dir | No | Optional cache directory for local ML table model weights. | |
| postprocess_engine | No | "none" or "sampling". Sampling asks the host LLM for boundary-only card polish operations, validates exact text preservation, and rewrites the reader. | none |
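As a rough illustration of the schema above, the following sketch assembles an arguments object and checks the enumerated options before a client would send it. The helper name `build_arguments` and the path `papers/example.pdf` are placeholders, and the value types for `max_pages` (integer) and `standalone` (boolean) are assumptions, since the table does not state types.

```python
import json

# Allowed values copied from the input schema table.
ENUMS = {
    "text_engine": {"char_geometry", "pdfplumber_words"},
    "style_engine": {"fixed", "pdf", "sampling"},
    "table_engine": {"auto", "pdfplumber", "gmft"},
    "postprocess_engine": {"none", "sampling"},
}


def build_arguments(pdf_path: str, **options) -> dict:
    """Hypothetical helper: return a tool-arguments dict, rejecting unknown enum values."""
    if not pdf_path:
        raise ValueError("pdf_path is required")
    for name, value in options.items():
        allowed = ENUMS.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    return {"pdf_path": pdf_path, **options}


args = build_arguments(
    "papers/example.pdf",  # placeholder path
    theme="soft",
    table_engine="auto",
    max_pages=20,          # assumed to be an integer
    standalone=True,       # assumed to be a boolean
)
print(json.dumps(args, indent=2))
```

Checking the enumerated options on the client side is only a convenience; the tool presumably performs its own validation.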
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |