Skip to main content
Glama
260,856 tools. Last updated 2026-07-05 08:54

"A tool for reading PDF files" matching MCP tools:

  • Merge multiple PDF files into a single document. Preserves bookmarks, links, and formatting. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Minimum 2 files, no maximum. Files are concatenated in array order. 100 sats per merge regardless of file count. Use convert_file instead if you need format conversion (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key, no account needed. Requires create_payment with toolName='merge_pdfs'.
    Connector
  • Convert HTML or Markdown to a pixel-perfect PDF. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Great for generating invoices, reports, receipts, or formatted documents programmatically. Supports full HTML/CSS including tables, images (base64 or URL), and inline styles. For Markdown input, set format='markdown'. 50 sats per conversion. Use convert_file instead for converting existing files between formats (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='convert_html_to_pdf'.
    Connector
  • Download a PDF from a URL and extract all text content, page by page. Use this to read the full text of a specific document — for example, an annual report PDF linked from a search_filings result. Best combined with search_filings: use search_filings to locate the document, then parse_pdf_to_text for the full text. Do not use for PDFs that are already well-represented in the database — search_filings is faster and returns pre-ranked, relevant excerpts. Not suitable for scanned (image-only) PDFs without embedded text; those pages will be returned as "(no extractable text)". Args: pdf_url: Direct HTTPS URL to the PDF file, e.g. https://example.com/report.pdf. Must be publicly accessible; authentication-protected URLs will fail. Returns: All text from the PDF with "--- Page N ---" separators between pages. Returns an error string if the download fails, the URL does not point to a valid PDF, or the document exceeds the 60-second download timeout.
    Connector
  • USE WHEN reading the full content of a Pine Script v6 documentation file. Returns the file content; when limit is set, a header shows the char range and offset to continue reading. AFTER calling this tool, use offset=<end> to continue if the header indicates more content is available. For large files (ta.md, strategy.md, collections.md, drawing.md, general.md), prefer list_sections() + get_section() instead. Data sourced from bundled Pine Script v6 documentation.
    Connector
  • Upload a file (base64) and attach it to a page (editor+) — an image, PDF, dataset, etc. Returns the serve URL plus a ready-to-paste `markdown` snippet; then call update_page or patch_page to place it in the body (images render inline as ![](…), other files as a download card). The payload is inline base64 and rides through the model's context, so it is capped at 5 MB — keep it to small files (screenshots, charts, short PDFs). For larger files use request_attachment_upload (a direct PUT URL, bytes off-context), or the tela editor (drag-drop).
    Connector
  • Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.
    Connector

Matching MCP Servers

Matching MCP Connectors

  • Merge/split/compress/watermark/encrypt/decrypt PDF and convert images over MCP.

  • 斯特丹STERDAN天猫旗舰店产品咨询MCP Server。洛阳30年源头工厂,高端钢制办公家具,1374个SKU,涵盖保密柜、更衣柜、公寓床、货架、快递柜。BIFMA认证,出口35+国家。8个工具:产品目录查询、场景推荐、认证资质、采购政策、维护指南等。

  • Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.
    Connector
  • Persist income/expense lines as saved Zen data. Use after reading a user-provided statement, PDF, or described transactions — not to correct existing rows (use edit_transaction for that, or delete + recreate for direction flips). Groups lines under one operation; cashflow is attributed to operation_date month, not today. Send whole statements in one call with expected_count set; the response lists saved and not_saved rows — if ok is false, check not_saved, fix ONLY those rows, and resend ONLY them. Rows in saved are already persisted: never resend them.
    Connector
  • Lists **what's in** each extracted artefact for a filing — section counts, item names, and the page each item came from — without returning any of the bulky factor tables, descriptions, or rate rows themselves. **Call this FIRST**, before `get_filing_extracts`, for any "what does this filing contain" question. It costs a fraction of the tokens and tells you which file + which section you need to pull in detail. `get_filing_extracts` is then the targeted second call once you know the SERFF + file + section that actually answer the user's question. Use this when the user asks: - "What forms does this filing include?" / "List the form numbers in TSIS-134726605." - "How many exclusions does it carry? What are they called?" - "What rate tables are in this filing, and which PDF page are they on?" - "List the discounts / endorsements / coverages this filing offers." - "Where in the source PDF is the territory rate table?" - Any "how many", "what are the names of", or "which page is X on" question about a filing's extracted artefacts. Wrong surface for: - Anything that needs the actual numeric content (factor values, full rate rows, full exclusion text). Call `get_filing_extracts` instead, narrowing `files` to just the one(s) you discovered here. Whitelist (same as `get_filing_extracts`): - `calculations.json` — example rate-calculation walk-throughs. - `coverages.json` — coverage definitions (perils, limits, applicability). - `deductibles.json` — deductible options + factors. - `discounts.json` — discount / surcharge schedules. - `endorsements.json` — optional endorsements / riders. - `examples.json` — worked policyholder rating examples. - `exclusions.json` — coverage exclusions + the conditions they apply to. - `extraction_summary.json` — structured filing-overview fields. - `final_rating_calculation.json` — canonical rating expression. - `forms.json` — policy form numbers + types. - `rates_data.json` — base rates + rate-table headers. - `underwriting_guidelines.json` — eligibility / UW rules. Per item the tool returns `{ name, source_page? }`. The item name is picked from whichever identifying field exists (`name` → `form_number` → `id` → `key` → `code` → `coverage` → `label` → `title`). `source_page` is the page in the source PDF where the item was extracted from, when the pipeline recorded one. `rates_data.json` items additionally carry `source_file` — the source PDF the rate table lives in — when the filing has a single source PDF. Multi-source filings get `source_file_note` flagging the limit (per-item `source_file` on non-rate extracts needs a pipeline-side change, deferred). Args: `serff` (required), `files` (optional — pass a subset of the whitelist to narrow; omit for all 12). Returns: `{ serff, files: { "<name>": { file_name, filing_ref?, confidence?, sections: { "<key>": { count, items: [...] } }, total_items } }, count, skipped }`.
    Connector
  • Parse a file using Firecrawl's /v2/parse endpoint. In local/non-cloud MCP mode, this tool reads filePath from the MCP server filesystem and posts multipart data to the configured self-hosted FIRECRAWL_API_URL, preserving the existing direct-read behavior. In hosted CLOUD_SERVICE mode, this tool is a two-call flow because hosted MCP cannot read your local filesystem: 1. Call with filePath, contentType, parse options, and optional declaredSizeBytes. The hosted server mints a short-lived upload URL and returns a safe local curl PUT command plus nextToolCall. 2. Run the returned curl command locally, then call firecrawl_parse again with uploadRef and the desired parse options. The hosted server calls /v2/parse server-side with your session credential. **Best for:** Extracting content from a local document (PDF, Word, Excel, HTML, etc.); pulling structured data out of a file with JSON format; converting binary documents into markdown for downstream reasoning. **Not recommended for:** Remote URLs (use firecrawl_scrape); multiple files at once (call parse multiple times); documents that require interactive actions, screenshots, or change tracking — those aren't supported by the parse endpoint. **Common mistakes:** In hosted mode, do not pass both filePath and uploadRef. Phase 1 uses filePath only to generate upload instructions; phase 2 uses uploadRef only to parse server-side. **Supported file types:** .html, .htm, .xhtml, .pdf, .docx, .doc, .odt, .rtf, .xlsx, .xls **Unsupported options:** actions, screenshot/branding/changeTracking formats, waitFor > 0, location, mobile, proxy values other than "auto" or "basic". **Privacy:** Set `redactPII: true` to return content with personally identifiable information redacted. **CRITICAL - Format Selection (same rules as firecrawl_scrape):** When the user asks for SPECIFIC data points from a document, you MUST use JSON format with a schema. Only use markdown when the user needs the ENTIRE document content. **Handling PDFs:** Add `"parsers": ["pdf"]` (optionally with `pdfOptions.maxPages`) when parsing a PDF so the PDF engine is invoked explicitly. For very long documents, cap `maxPages` to keep the response within token limits. **Hosted phase 1 example:** ```json { "name": "firecrawl_parse", "arguments": { "filePath": "/absolute/path/to/document.pdf", "contentType": "application/pdf", "formats": ["markdown"], "parsers": ["pdf"], "zeroDataRetention": true } } ``` **Hosted phase 2 example:** ```json { "name": "firecrawl_parse", "arguments": { "uploadRef": "upload-ref-from-phase-1", "formats": ["markdown"], "parsers": ["pdf"], "zeroDataRetention": true } } ``` **Returns:** Phase 1 hosted upload instructions or a parsed document with markdown, html, links, summary, json, or query results depending on the requested formats.
    Connector
  • Fetch and convert a Microsoft Learn documentation webpage to markdown format. This tool retrieves the latest complete content of Microsoft documentation webpages including Azure, .NET, Microsoft 365, and other Microsoft technologies. ## When to Use This Tool - When search results provide incomplete information or truncated content - When you need complete step-by-step procedures or tutorials - When you need troubleshooting sections, prerequisites, or detailed explanations - When search results reference a specific page that seems highly relevant - For comprehensive guides that require full context ## Usage Pattern Use this tool AFTER microsoft_docs_search when you identify specific high-value pages that need complete content. The search tool gives you an overview; this tool gives you the complete picture. ## URL Requirements - The URL must be a valid HTML documentation webpage from the microsoft.com domain - Binary files (PDF, DOCX, images, etc.) are not supported ## Output Format markdown with headings, code blocks, tables, and links preserved.
    Connector
  • Download a completed report as PDF. Returns base64-encoded PDF content. Confirm report status='completed' via atlas_get_report(report_id) first. report_id from atlas_start_report response or atlas_list_reports. Free.
    Connector
  • Check the status of a file upload created by files-create_upload_url. Returns status: 'pending' (not uploaded yet), 'completed' (file attached, includes file metadata), or 'expired' (link timed out). Required: token (string, from files-create_upload_url response).
    Connector
  • USE WHEN discovering what Pine Script v6 documentation is available. Returns a categorised list of doc file paths with one-line descriptions. AFTER calling this tool, call get_doc(path) for small files or list_sections(path) then get_section(path, header) for large files (ta.md, strategy.md, collections.md, drawing.md, general.md). Data sourced from bundled Pine Script v6 documentation.
    Connector
  • Returns all published Arco sources for a term — Lexicon entries, blog articles, wiki pages, and podcast episodes — ordered by recommended reading sequence. Read-only. Use this when you need a reading list or reference list for a term. Use cite_term instead when you need a formatted citation for a specific publication type.
    Connector
  • Prepare a paid PDF render from arbitrary Handlebars-flavoured HTML. Use only when no starter fits (one-off layouts, custom branding). Prefer render_template_to_pdf when a starter matches. Validates your HTML and returns the exact, ready-to-execute HTTP request to run against pdfzen's render endpoint — POST /v402/render/pdf (x402, $0.006 USDC on Base, no API key) or POST /v1/render/pdf (pdfzen API key). pdfzen renders are executed over HTTP, not streamed in-band over MCP; this tool is the bridge.
    Connector
  • Discover sheet names and used dimensions before reading or editing a WorkPaper. Returns metadata only; use read_range or read_cell for values.
    Connector
  • Return an inline PDF artifact from supplied report_meta, tables, metrics, and summary content; this read-only renderer does not persist hosted files. Use this only when a structured report payload already exists; use report_docx_generate for editable Word output or compliance_edd_report to build the memo first.
    Connector
  • Lists the **source files** (PDFs, XLS spreadsheets, DOC manuals, ZIP archives) ingested for a SERFF id. Returns metadata only — name, size in bytes, MIME-class type (`pdf` / `spreadsheet` / `document` / `csv` / `archive` / `other`), file extension, modified timestamp. Pair with `get_filing_source_file_link` to mint a signed download link the user can click — list names here, mint a link there. Use this to: - triage a filing whose summary looks thin ("did we even ingest the right files?"), - discover the XLSM rater / rate manual PDF / rating-samples spreadsheet for a filing, - confirm which artefacts a filing actually shipped (e.g. is there a separate rate manual XLS, or just the PDF?). Returns `{ error: ... }` if no source files exist for the SERFF id.
    Connector
  • Sign a PDF: opens an interactive widget where the user draws, types or uploads a signature and places it on the document. Optionally pass signature_name to pre-render a handwritten-style signature. ALWAYS use this for PDF signing requests — never sign or modify the PDF yourself; the user reviews and downloads in the widget. All processing happens locally in the user's browser — the file is never uploaded. Podpisz PDF: narysuj, wpisz lub wgraj podpis i umieść go na dokumencie; plik nie opuszcza przeglądarki.
    Connector