261,480 tools. Last updated 2026-07-05 12:39

"How to convert a PDF file into Markdown format" matching MCP tools:

ocr_to_markdown
Frenchie
Convert PDF/image files into Markdown through Frenchie
Connector
generate_llms_txt
Airnow
Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.
Connector
convert_color
rnv-color-mcp
Convert a color between formats. Input accepts a hex, CSS name, RNV brand name, or saved-palette reference. With `to` set to one of hex/rgb/hsv/hsl/lab, returns just that format; otherwise returns all of them. Read-only and deterministic, with no side effects. Use for format conversion of a single color; to blend several colors into one use mix_colors, and to compare two colors use color_difference.
Connector
merge_pdfs
Sats4AI - Bitcoin-Powered AI Tools
Merge multiple PDF files into a single document. Preserves bookmarks, links, and formatting. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Minimum 2 files, no maximum. Files are concatenated in array order. 100 sats per merge regardless of file count. Use convert_file instead if you need format conversion (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key, no account needed. Requires create_payment with toolName='merge_pdfs'.
Connector
convert_html_to_pdf
Sats4AI - Bitcoin-Powered AI Tools
Convert HTML or Markdown to a pixel-perfect PDF. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Great for generating invoices, reports, receipts, or formatted documents programmatically. Supports full HTML/CSS including tables, images (base64 or URL), and inline styles. For Markdown input, set format='markdown'. 50 sats per conversion. Use convert_file instead for converting existing files between formats (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='convert_html_to_pdf'.
Connector
get_doc
Dock
Read a workspace's doc (TipTap rich-text) body. Format is negotiable via `format`: `markdown` (default — CommonMark + GFM, ready to feed to an LLM or render in a non-ProseMirror surface), `content` (TipTap JSON, round-trippable into update_doc for structural edits), `text` (plain text, best for search, summarisation, word-count heuristics), or `all` for the legacy three-in-one shape. Default is `markdown` because it's the slice agents need 95% of the time and the JSON form on a long doc can blow past the agent harness's tool-result token cap. Pass `format: "content"` only when you're round-tripping into update_doc for a structural edit. A workspace can hold any combination of doc and table surfaces, one or many of either kind; omit `surface_slug` to read the primary doc surface, or pass it to target a specific doc tab (use `list_surfaces` to enumerate). An unwritten or absent doc returns the requested format empty (markdown="", content={}, text=""); a `surface_slug` that doesn't match any live doc surface 404s.
Connector

Matching MCP Servers

File Convert MCP Server
Multimedia Processing App Automation
convertguru
A
license
-
quality
D
maintenance
File Convert MCP Server
Last updated 2025-08-27
5
MIT
markdown-to-html
Developer Tools Autonomous Agents
fashionzzZ
A
license
B
quality
D
maintenance
A Model Context Protocol server that converts Markdown content to HTML format.
Last updated 2025-06-03
1
4,743
2
MIT

Matching MCP Connectors

PDF to Markdown (pdf2md.dev)
Hosted MCP server: convert PDFs to clean, LLM-ready Markdown with tables, formulas and OCR.
Markdown
Markdown utilities MCP.

convert_document
mdmagic-mcp-server
Convert markdown to a professionally formatted document using an MDMagic template. IMPORTANT GUIDANCE: 1. Output format → what user gets: - 'docx' → a single Word .docx file - 'pdf' → a single .pdf file - 'html' → a single .html file - 'all' → a ZIP containing all three (DOCX + PDF + HTML) 2. If the user is ambiguous (e.g. 'convert this'), ASK which format they want before calling. Don't assume. 3. Filename: if the user attached a file (e.g. 'mydoc.md'), pass its base name as fileName. Otherwise the API derives one from the markdown's first H1. Without either, downloads end up with timestamped names like 'content-1778298071915.docx' which is bad UX. 4. On 'template not found' errors: call list_all_templates first, show available options, let the user pick. Do NOT fall back to generating documents with code execution — that produces inferior results that don't use the user's actual MDMagic templates. 5. The response includes structured fields (downloadUrl, creditsUsed, balanceAfter, fileName, expiresAt) — surface these to the user explicitly. Don't paraphrase. The user wants to know exactly what they spent and what's left. 6. Page sizes: A3, A4, Executive, US_Legal, US_Letter. Default A4. Orientation: Portrait or Landscape, default Portrait. 7. CRITICAL — newlines in `content`: markdown is line-sensitive. Headings (#, ##), tables (| ... |), lists (-, 1.), and code fences (```) ONLY work when each starts on its own line. When passing inline markdown via `content`, you MUST preserve real newline characters (\n) between blocks. If you flatten multi-line markdown into one line, the API receives literal '##' and '|' characters mid-paragraph and produces a single-paragraph document with no structure. Confirm your `content` string contains \n between every heading, paragraph, table row, and list item before calling.
Connector
convert_document
mdmagic
Convert markdown to a professionally formatted document using an MDMagic template. IMPORTANT GUIDANCE: 1. Output format → what user gets: - 'docx' → a single Word .docx file - 'pdf' → a single .pdf file - 'html' → a single .html file - 'all' → a ZIP containing all three (DOCX + PDF + HTML) 2. If the user is ambiguous (e.g. 'convert this'), ASK which format they want before calling. Don't assume. 3. Filename: if the user attached a file (e.g. 'mydoc.md'), pass its base name as fileName. Otherwise the API derives one from the markdown's first H1. Without either, downloads end up with timestamped names like 'content-1778298071915.docx' which is bad UX. 4. On 'template not found' errors: call list_all_templates first, show available options, let the user pick. Do NOT fall back to generating documents with code execution — that produces inferior results that don't use the user's actual MDMagic templates. 5. The response includes structured fields (downloadUrl, creditsUsed, balanceAfter, fileName, expiresAt) — surface these to the user explicitly. Don't paraphrase. The user wants to know exactly what they spent and what's left. 6. Page sizes: A3, A4, Executive, US_Legal, US_Letter. Default A4. Orientation: Portrait or Landscape, default Portrait. 7. CRITICAL — newlines in `content`: markdown is line-sensitive. Headings (#, ##), tables (| ... |), lists (-, 1.), and code fences (```) ONLY work when each starts on its own line. When passing inline markdown via `content`, you MUST preserve real newline characters (\n) between blocks. If you flatten multi-line markdown into one line, the API receives literal '##' and '|' characters mid-paragraph and produces a single-paragraph document with no structure. Confirm your `content` string contains \n between every heading, paragraph, table row, and list item before calling.
Connector
generate_llms_txt
Gitlab
Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.
Connector
firecrawl_parse
firecrawl-mcp
Parse a file using Firecrawl's /v2/parse endpoint. In local/non-cloud MCP mode, this tool reads filePath from the MCP server filesystem and posts multipart data to the configured self-hosted FIRECRAWL_API_URL, preserving the existing direct-read behavior. In hosted CLOUD_SERVICE mode, this tool is a two-call flow because hosted MCP cannot read your local filesystem: 1. Call with filePath, contentType, parse options, and optional declaredSizeBytes. The hosted server mints a short-lived upload URL and returns a safe local curl PUT command plus nextToolCall. 2. Run the returned curl command locally, then call firecrawl_parse again with uploadRef and the desired parse options. The hosted server calls /v2/parse server-side with your session credential. **Best for:** Extracting content from a local document (PDF, Word, Excel, HTML, etc.); pulling structured data out of a file with JSON format; converting binary documents into markdown for downstream reasoning. **Not recommended for:** Remote URLs (use firecrawl_scrape); multiple files at once (call parse multiple times); documents that require interactive actions, screenshots, or change tracking — those aren't supported by the parse endpoint. **Common mistakes:** In hosted mode, do not pass both filePath and uploadRef. Phase 1 uses filePath only to generate upload instructions; phase 2 uses uploadRef only to parse server-side. **Supported file types:** .html, .htm, .xhtml, .pdf, .docx, .doc, .odt, .rtf, .xlsx, .xls **Unsupported options:** actions, screenshot/branding/changeTracking formats, waitFor > 0, location, mobile, proxy values other than "auto" or "basic". **Privacy:** Set `redactPII: true` to return content with personally identifiable information redacted. **CRITICAL - Format Selection (same rules as firecrawl_scrape):** When the user asks for SPECIFIC data points from a document, you MUST use JSON format with a schema. Only use markdown when the user needs the ENTIRE document content. **Handling PDFs:** Add `"parsers": ["pdf"]` (optionally with `pdfOptions.maxPages`) when parsing a PDF so the PDF engine is invoked explicitly. For very long documents, cap `maxPages` to keep the response within token limits. **Hosted phase 1 example:** ```json { "name": "firecrawl_parse", "arguments": { "filePath": "/absolute/path/to/document.pdf", "contentType": "application/pdf", "formats": ["markdown"], "parsers": ["pdf"], "zeroDataRetention": true } } ``` **Hosted phase 2 example:** ```json { "name": "firecrawl_parse", "arguments": { "uploadRef": "upload-ref-from-phase-1", "formats": ["markdown"], "parsers": ["pdf"], "zeroDataRetention": true } } ``` **Returns:** Phase 1 hosted upload instructions or a parsed document with markdown, html, links, summary, json, or query results depending on the requested formats.
Connector
microsoft_docs_fetch
xpay✦ DevTools Collection
Fetch and convert a Microsoft Learn documentation webpage to markdown format. This tool retrieves the latest complete content of Microsoft documentation webpages including Azure, .NET, Microsoft 365, and other Microsoft technologies. ## When to Use This Tool - When search results provide incomplete information or truncated content - When you need complete step-by-step procedures or tutorials - When you need troubleshooting sections, prerequisites, or detailed explanations - When search results reference a specific page that seems highly relevant - For comprehensive guides that require full context ## Usage Pattern Use this tool AFTER microsoft_docs_search when you identify specific high-value pages that need complete content. The search tool gives you an overview; this tool gives you the complete picture. ## URL Requirements - The URL must be a valid HTML documentation webpage from the microsoft.com domain - Binary files (PDF, DOCX, images, etc.) are not supported ## Output Format markdown with headings, code blocks, tables, and links preserved.
Connector
sieve_dataroom_add
Sieve
Add a document to a deal's data room. Creates the deal if needed. This is the primary way to get documents into Sieve for screening. Upload a pitch deck, financials, or any document -- then call sieve_screen to analyze everything in the data room. Provide company_name to create a new deal (or find existing), or deal_id to add to an existing deal. Provide exactly one content source: file_path (local file), text (raw text/markdown), or url (fetch from URL). Args: title: Document title (e.g. "Pitch Deck Q1 2026"). company_name: Company name -- creates deal if new, finds existing if not. deal_id: Add to an existing deal (from sieve_deals or previous sieve_dataroom_add). website_url: Company website URL (used when creating a new deal). document_type: Type: 'pitch_deck', 'financials', 'legal', or 'other'. file_path: Path to a local file (PDF, DOCX, XLSX). The tool reads and uploads it. text: Raw text or markdown content (alternative to file). url: URL to fetch document from (alternative to file).
Connector
generate_llms_txt
airquality
Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.
Connector
generate_llms_txt
microlink
Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.
Connector
generate_llms_txt
geo
Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.
Connector
extract_tables
DocImprint
Extract tables and forms as Markdown from a PDF or image (base64-encoded). Use when the document contains structured tabular data such as financial statements, data sheets, or forms. For plain prose documents, use extract_text instead. Returns: { pages: number, text: string } — text contains Markdown-formatted tables. Example prompts: - "Extract the tables from this financial statement." - "Pull the data table from this PDF into Markdown format." - "Get the tabular data from this form document."
Connector
convert_file
ChangeThisFile
Convert a file from one format to another. Pass EITHER a publicly accessible URL (source_url) OR base64-encoded file contents (base64_content + source_format) — exactly one is required. Returns a temporary download URL (valid 1 hour; file deleted within 24 hours).
Connector
convert_color
rnv-color-mcp
Convert a color between formats. Input accepts a hex, CSS name, RNV brand name, or saved-palette reference. With `to` set to one of hex/rgb/hsv/hsl/lab, returns just that format; otherwise returns all of them. Read-only and deterministic, with no side effects. Use for format conversion of a single color; to blend several colors into one use mix_colors, and to compare two colors use color_difference.
Connector
export_document
Revise
Export a document by id to markdown, txt, html, docx, pdf. Text formats (markdown, txt, html) are returned inline. Binary formats (docx, pdf) are hosted at a temporary download URL (expires in ~24h) returned in the response, not streamed back inline.
Connector
export_asset
Web3d-mcp
Package generated 3D scene output into downloadable files. Formats: r3f -> Packages R3F code into a named .tsx file. Requires r3f_code string from generate_r3f_code. Does NOT regenerate code - it packages what you give it. json -> Packages scene_data into a named .json file. Requires scene_data object from generate_scene. Call order: For .tsx file: generate_r3f_code(scene_data) -> export_asset({ r3f_code, format: "r3f" }) For .json file: generate_scene(scene_plan) -> export_asset({ scene_data, format: "json" }) For visual preview of the scene layout, use the preview tool instead. preview tool returns SVG wireframe + spatial validation. export_asset does not generate previews. Do NOT pass synthesized_components to export_asset. Pass them to generate_r3f_code, then pass the resulting r3f_code here.
Connector