260,856 tools. Last updated 2026-07-05 08:54

"Tools or methods to convert PDF to Markdown" matching MCP tools:

google_flights_location_search
Searchapi
Search for airports and cities to get their identifiers for Google Flights tools.
Returns:
- IATA airport codes (e.g., 'JFK') for specific airports
- kgmid (e.g., '/m/02_286') for cities - searches all airports in that city
Use this tool when you have a city name like 'New York' or 'Paris' and need to convert it to codes that the flight tools accept.
Note: Common IATA codes like JFK, LAX, SFO, LHR, CDG, NRT can be used directly without this tool.
Connector
convert_html_to_pdf
Sats4AI - Bitcoin-Powered AI Tools
Convert HTML or Markdown to a pixel-perfect PDF. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Great for generating invoices, reports, receipts, or formatted documents programmatically. Supports full HTML/CSS including tables, images (base64 or URL), and inline styles. For Markdown input, set format='markdown'. 50 sats per conversion. Use convert_file instead for converting existing files between formats (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='convert_html_to_pdf'.
Connector
convert_document
mdmagic-mcp-server
Convert markdown to a professionally formatted document using an MDMagic template.
IMPORTANT GUIDANCE:
1. Output format → what user gets:
   - 'docx' → a single Word .docx file
   - 'pdf' → a single .pdf file
   - 'html' → a single .html file
   - 'all' → a ZIP containing all three (DOCX + PDF + HTML)
2. If the user is ambiguous (e.g. 'convert this'), ASK which format they want before calling. Don't assume.
3. Filename: if the user attached a file (e.g. 'mydoc.md'), pass its base name as fileName. Otherwise the API derives one from the markdown's first H1. Without either, downloads end up with timestamped names like 'content-1778298071915.docx' which is bad UX.
4. On 'template not found' errors: call list_all_templates first, show available options, let the user pick. Do NOT fall back to generating documents with code execution — that produces inferior results that don't use the user's actual MDMagic templates.
5. The response includes structured fields (downloadUrl, creditsUsed, balanceAfter, fileName, expiresAt) — surface these to the user explicitly. Don't paraphrase. The user wants to know exactly what they spent and what's left.
6. Page sizes: A3, A4, Executive, US_Legal, US_Letter. Default A4. Orientation: Portrait or Landscape, default Portrait.
7. CRITICAL — newlines in `content`: markdown is line-sensitive. Headings (#, ##), tables (| ... |), lists (-, 1.), and code fences (```) ONLY work when each starts on its own line. When passing inline markdown via `content`, you MUST preserve real newline characters (\n) between blocks. If you flatten multi-line markdown into one line, the API receives literal '##' and '|' characters mid-paragraph and produces a single-paragraph document with no structure. Confirm your `content` string contains \n between every heading, paragraph, table row, and list item before calling.
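The newline requirement in point 7 can be checked on the caller's side before the tool is invoked. The following is a minimal sketch, not part of MDMagic: the helper names `looks_flattened` and `join_blocks` are hypothetical, and the check is a heuristic that only looks for a heading marker appearing after other text on the same line.

```python
import re

# A heading marker ("#" to "######") that follows other text on the same
# line suggests that the newlines between blocks were lost.
_MIDLINE_HEADING = re.compile(r"\S #{1,6} \S")


def looks_flattened(content: str) -> bool:
    """Heuristic: True if any line contains a heading marker mid-line."""
    return any(_MIDLINE_HEADING.search(line) for line in content.splitlines())


def join_blocks(blocks: list[str]) -> str:
    """Join markdown blocks with a blank line (two real newlines) between them."""
    return "\n\n".join(block.strip("\n") for block in blocks)


content = join_blocks(["# Report", "## Summary", "All good."])
if looks_flattened(content):
    raise ValueError("markdown appears to be flattened onto one line")
```

Building `content` from a list of blocks keeps the real newline characters explicit, so the string passed to the tool never depends on how a multi-line literal was copied or escaped.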
Connector
upload_attachment
tela
Upload a file (base64) and attach it to a page (editor+) — an image, PDF, dataset, etc. Returns the serve URL plus a ready-to-paste `markdown` snippet; then call update_page or patch_page to place it in the body (images render inline as ![](…), other files as a download card). The payload is inline base64 and rides through the model's context, so it is capped at 5 MB — keep it to small files (screenshots, charts, short PDFs). For larger files use request_attachment_upload (a direct PUT URL, bytes off-context), or the tela editor (drag-drop).
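A caller can check the size limit before sending anything through the model's context. This is a rough sketch under two assumptions that the description does not state: that the 5 MB cap applies to the base64-encoded payload, and that 5 MB means 5 * 1024 * 1024 bytes. The function name is hypothetical.

```python
import base64

# Assumption: the cap is measured on the encoded payload, in binary megabytes.
MAX_INLINE_BYTES = 5 * 1024 * 1024


def encode_for_inline_upload(data: bytes) -> str:
    """Base64-encode file bytes, refusing payloads above the assumed cap."""
    encoded = base64.b64encode(data).decode("ascii")
    if len(encoded) > MAX_INLINE_BYTES:
        raise ValueError(
            f"encoded payload is {len(encoded)} bytes; "
            "use request_attachment_upload for larger files"
        )
    return encoded
```

Base64 grows data by roughly a third, so a file of about 3.75 MB already reaches the assumed limit once encoded.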
Connector
convert_document
mdmagic
Convert markdown to a professionally formatted document using an MDMagic template.
IMPORTANT GUIDANCE:
1. Output format → what user gets:
   - 'docx' → a single Word .docx file
   - 'pdf' → a single .pdf file
   - 'html' → a single .html file
   - 'all' → a ZIP containing all three (DOCX + PDF + HTML)
2. If the user is ambiguous (e.g. 'convert this'), ASK which format they want before calling. Don't assume.
3. Filename: if the user attached a file (e.g. 'mydoc.md'), pass its base name as fileName. Otherwise the API derives one from the markdown's first H1. Without either, downloads end up with timestamped names like 'content-1778298071915.docx' which is bad UX.
4. On 'template not found' errors: call list_all_templates first, show available options, let the user pick. Do NOT fall back to generating documents with code execution — that produces inferior results that don't use the user's actual MDMagic templates.
5. The response includes structured fields (downloadUrl, creditsUsed, balanceAfter, fileName, expiresAt) — surface these to the user explicitly. Don't paraphrase. The user wants to know exactly what they spent and what's left.
6. Page sizes: A3, A4, Executive, US_Legal, US_Letter. Default A4. Orientation: Portrait or Landscape, default Portrait.
7. CRITICAL — newlines in `content`: markdown is line-sensitive. Headings (#, ##), tables (| ... |), lists (-, 1.), and code fences (```) ONLY work when each starts on its own line. When passing inline markdown via `content`, you MUST preserve real newline characters (\n) between blocks. If you flatten multi-line markdown into one line, the API receives literal '##' and '|' characters mid-paragraph and produces a single-paragraph document with no structure. Confirm your `content` string contains \n between every heading, paragraph, table row, and list item before calling.
Connector
firecrawl_parse
firecrawl-mcp
Parse a file using Firecrawl's /v2/parse endpoint. In local/non-cloud MCP mode, this tool reads filePath from the MCP server filesystem and posts multipart data to the configured self-hosted FIRECRAWL_API_URL, preserving the existing direct-read behavior. In hosted CLOUD_SERVICE mode, this tool is a two-call flow because hosted MCP cannot read your local filesystem:
1. Call with filePath, contentType, parse options, and optional declaredSizeBytes. The hosted server mints a short-lived upload URL and returns a safe local curl PUT command plus nextToolCall.
2. Run the returned curl command locally, then call firecrawl_parse again with uploadRef and the desired parse options. The hosted server calls /v2/parse server-side with your session credential.
**Best for:** Extracting content from a local document (PDF, Word, Excel, HTML, etc.); pulling structured data out of a file with JSON format; converting binary documents into markdown for downstream reasoning.
**Not recommended for:** Remote URLs (use firecrawl_scrape); multiple files at once (call parse multiple times); documents that require interactive actions, screenshots, or change tracking — those aren't supported by the parse endpoint.
**Common mistakes:** In hosted mode, do not pass both filePath and uploadRef. Phase 1 uses filePath only to generate upload instructions; phase 2 uses uploadRef only to parse server-side.
**Supported file types:** .html, .htm, .xhtml, .pdf, .docx, .doc, .odt, .rtf, .xlsx, .xls
**Unsupported options:** actions, screenshot/branding/changeTracking formats, waitFor > 0, location, mobile, proxy values other than "auto" or "basic".
**Privacy:** Set `redactPII: true` to return content with personally identifiable information redacted.
**CRITICAL - Format Selection (same rules as firecrawl_scrape):** When the user asks for SPECIFIC data points from a document, you MUST use JSON format with a schema. Only use markdown when the user needs the ENTIRE document content.
**Handling PDFs:** Add `"parsers": ["pdf"]` (optionally with `pdfOptions.maxPages`) when parsing a PDF so the PDF engine is invoked explicitly. For very long documents, cap `maxPages` to keep the response within token limits.
**Hosted phase 1 example:**
```json
{
  "name": "firecrawl_parse",
  "arguments": {
    "filePath": "/absolute/path/to/document.pdf",
    "contentType": "application/pdf",
    "formats": ["markdown"],
    "parsers": ["pdf"],
    "zeroDataRetention": true
  }
}
```
**Hosted phase 2 example:**
```json
{
  "name": "firecrawl_parse",
  "arguments": {
    "uploadRef": "upload-ref-from-phase-1",
    "formats": ["markdown"],
    "parsers": ["pdf"],
    "zeroDataRetention": true
  }
}
```
**Returns:** Phase 1 hosted upload instructions or a parsed document with markdown, html, links, summary, json, or query results depending on the requested formats.
Connector

Matching MCP Servers

markdown-to-html
Developer Tools Autonomous Agents
fashionzzZ
License: A, Quality: B, Maintenance: D
A Model Context Protocol server that converts Markdown content to HTML format.
Last updated 2025-06-03
1
4,743
2
MIT
web-to-markdown-mcp
Web Scraping Browser Automation
sidney
License: A, Quality: A, Maintenance: B
Fetches a URL and returns the main content as clean Markdown, using plain HTTP when possible and headless Chromium for JavaScript-rendered or bot-protected pages.
Last updated 2026-04-27
1
MIT

Matching MCP Connectors

PDF to Markdown (pdf2md.dev)
Hosted MCP server: convert PDFs to clean, LLM-ready Markdown with tables, formulas and OCR.
Content to Social
Transform any blog post or article URL into ready-to-post social media content for Twitter/X threads, LinkedIn posts, Instagram captions, Facebook posts, and email newsletters. Pay-per-event: $0.07 for all 5 platforms, $0.03 for single platform.

microsoft_docs_fetch
xpay✦ DevTools Collection
Fetch and convert a Microsoft Learn documentation webpage to markdown format. This tool retrieves the latest complete content of Microsoft documentation webpages including Azure, .NET, Microsoft 365, and other Microsoft technologies.
## When to Use This Tool
- When search results provide incomplete information or truncated content
- When you need complete step-by-step procedures or tutorials
- When you need troubleshooting sections, prerequisites, or detailed explanations
- When search results reference a specific page that seems highly relevant
- For comprehensive guides that require full context
## Usage Pattern
Use this tool AFTER microsoft_docs_search when you identify specific high-value pages that need complete content. The search tool gives you an overview; this tool gives you the complete picture.
## URL Requirements
- The URL must be a valid HTML documentation webpage from the microsoft.com domain
- Binary files (PDF, DOCX, images, etc.) are not supported
## Output Format
markdown with headings, code blocks, tables, and links preserved.
Connector
sieve_dataroom_add
Sieve
Add a document to a deal's data room. Creates the deal if needed. This is the primary way to get documents into Sieve for screening. Upload a pitch deck, financials, or any document -- then call sieve_screen to analyze everything in the data room. Provide company_name to create a new deal (or find existing), or deal_id to add to an existing deal. Provide exactly one content source: file_path (local file), text (raw text/markdown), or url (fetch from URL).
Args:
- title: Document title (e.g. "Pitch Deck Q1 2026").
- company_name: Company name -- creates deal if new, finds existing if not.
- deal_id: Add to an existing deal (from sieve_deals or previous sieve_dataroom_add).
- website_url: Company website URL (used when creating a new deal).
- document_type: Type: 'pitch_deck', 'financials', 'legal', or 'other'.
- file_path: Path to a local file (PDF, DOCX, XLSX). The tool reads and uploads it.
- text: Raw text or markdown content (alternative to file).
- url: URL to fetch document from (alternative to file).
Connector
create_powersource_docs
Heista
Build a complete creative intelligence profile from internal brand documents — creative briefs, brand guidelines, product specs, customer research, competitive analysis. Takes any mix of file_ids (from a previous upload), document_urls (public PDF/DOCX/TXT/MD links, up to 10), or documents_inline (base64-encoded files with filename), plus an optional context_url for layering live brand context (colors, fonts, current messaging) and optional idempotency_key. Returns a job_id; poll with get_powersource. Output shape is identical to create_powersource_url: identity, offer, selling points, voice, buyer profile, tensions, angles, emotional arcs, ctas, narrative. Use this when the user says "I have a brief", "here's my brand guidelines", "use this document", drops a PDF / DOCX / strategy deck, or when the truth lives in internal materials rather than the public website. The pipeline reads text only — convert PDFs to markdown before submitting via documents_inline when possible. Costs 100 credits. Do NOT use for URL-only scans — use create_powersource_url. For URL + docs combined (highest fidelity, triangulates public messaging against internal strategy), use create_powersource_full.
Connector
fetch
Sofya
Fetch one or more URLs and return their content as clean markdown. Use this to read articles, documentation, blog posts, or any page where you need the complete text, not just a snippet from search. Also supports PDF, DOCX, and other document formats. Costs 1 credit per URL. Max 10 URLs per request. Failed URLs are not charged.
Set include_raw_html=true to also get the raw HTML source in each result. Useful for inspecting embedded URLs, data attributes, iframes, or script tags that are stripped during markdown conversion. Returns null for non-HTML content (PDF, DOCX, etc.). Same cost.
Returns: results (array of {title, url, content, raw_html, published_time, success, error}), credits_used, credits_remaining.
Args:
- urls: List of URLs to fetch (max 10)
- include_raw_html: Include raw HTML source in each result (default false)
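Because each request accepts at most 10 URLs, a longer list has to be split on the caller's side. A minimal sketch of that batching follows; the function name is hypothetical, and dropping duplicate URLs is an added design choice (it avoids paying a credit twice for the same page), not something the tool description requires.

```python
def batch_urls(urls: list[str], max_per_request: int = 10) -> list[list[str]]:
    """Split URLs into request-sized batches, dropping duplicates in order."""
    unique = list(dict.fromkeys(urls))  # dict preserves first-seen order
    return [
        unique[start:start + max_per_request]
        for start in range(0, len(unique), max_per_request)
    ]
```

Each inner list can then be passed as the `urls` argument of one call.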
Connector
billing.get_portal
Admin Substitute
Get a Stripe Billing Portal URL for the human to manage their subscription — update payment methods, view invoices, change plans, or cancel. Requires an existing Stripe subscription.
Connector
extract_tables
DocImprint
Extract tables and forms as Markdown from a PDF or image (base64-encoded). Use when the document contains structured tabular data such as financial statements, data sheets, or forms. For plain prose documents, use extract_text instead. Returns: { pages: number, text: string } — text contains Markdown-formatted tables.
Example prompts:
- "Extract the tables from this financial statement."
- "Pull the data table from this PDF into Markdown format."
- "Get the tabular data from this form document."
Connector
extract_pdf
paper-mcp
Extract a PDF to clean Markdown/LaTeX text via MinerU (useful for papers with no open-access full text: supply the user's PDF and get readable text back). Provide pdf_url (downloaded server-side, SSRF-guarded) OR pdf_base64. The formula and table options toggle math and table reconstruction. Returns {task_id, status, cached, content, chars}: a recently seen (cached) or small PDF comes back with `content` in one call; a fresh PDF (MinerU is GPU-heavy and can take minutes) returns status='running' plus a task_id, after which you call extract_pdf_result(task_id) to fetch the text.
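The two-call pattern described here (an immediate result, or status='running' followed by polling) can be sketched as a small loop. This is an illustration under assumptions: `fetch_result` stands in for whatever client function invokes extract_pdf_result, any status other than 'running' is treated as final because the description names no other status values, and the poll interval and limit are arbitrary.

```python
import time
from typing import Any, Callable, Dict


def wait_for_extraction(
    first: Dict[str, Any],
    fetch_result: Callable[[str], Dict[str, Any]],
    poll_seconds: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the extraction response is no longer 'running'."""
    response = first
    polls = 0
    while response.get("status") == "running":
        if polls >= max_polls:
            raise TimeoutError("extraction still running after max_polls")
        sleep(poll_seconds)
        # Hypothetical wrapper around the extract_pdf_result tool call.
        response = fetch_result(first["task_id"])
        polls += 1
    return response
```

A cached or small PDF returns from the first call without ever entering the loop, so the same helper covers both paths.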
Connector
export_document
Revise
Export a document by id to markdown, txt, html, docx, or pdf. Text formats (markdown, txt, html) are returned inline. Binary formats (docx, pdf) are hosted at a temporary download URL (expires in ~24h) returned in the response, not streamed back inline.
Connector
list_holdings_reports
Pensiata - Bulgarian Pension Fund Analytics
List available markdown holdings reports for Bulgarian pension funds. Reports contain detailed portfolio holdings data extracted from official PDF filings and converted to structured markdown with metadata (allocation %, exposure, top holdings). Use this tool to discover what reports are available before loading specific ones with `read_holdings_report`. Filter by manager, fund type, or date range.
Connector
format_table
IA-QA — 130+ QA & Dev Tools for AI Agents
Convert a JSON array of objects into a Markdown table. Automatically detects columns, aligns headers, and fills missing keys with empty cells. Use when an agent needs to present structured data — tool results, model comparisons, test reports — as a readable table in a response or document.
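The conversion described above can be approximated in a few lines. This is a sketch of the general technique, not the tool's implementation: `to_markdown_table` is a hypothetical name, columns are taken in first-seen order, and cell padding for visual alignment is left out.

```python
from typing import Any, Dict, List


def to_markdown_table(rows: List[Dict[str, Any]]) -> str:
    """Render a list of dicts as a Markdown table; missing keys become empty cells."""
    columns: List[str] = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    if not columns:
        return ""

    def cell(value: Any) -> str:
        text = "" if value is None else str(value)
        # Escape pipes and flatten newlines so one value stays in one cell.
        return text.replace("|", "\\|").replace("\n", " ")

    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| " + " | ".join(cell(row.get(col)) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])
```

Escaping the pipe character matters because an unescaped `|` inside a value would be read as a column boundary.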
Connector
iliad_document_parsing
AXIS Toolbox — Agentic Commerce Codebase Intelligence
AXIS-owned document → Markdown extractor. Accepts either `document_url` (https fetch + 50 MiB cap + 60s timeout) or `document_base64` (inline bytes, 50 MiB decoded cap) — exactly one. Optional `mime_type` hint (application/pdf, application/vnd.openxmlformats-officedocument.wordprocessingml.document, text/html, text/markdown, text/plain); we sniff from magic bytes + URL extension when omitted.
Format dispatch:
- PDF → pdfjs-dist text extraction (one block per page with `--- page N ---` separators)
- DOCX → mammoth → markdown (tables preserved)
- HTML → tag-strip with heading + list + entity handling (NOT a full HTML→MD converter — bring turndown if you need fancier)
- plain text + markdown → passthrough
Returns `{markdown, format_detected, byte_size, page_count, table_count, truncated}`. Output capped at 1 MiB markdown with a truncation marker.
Engineer mode (X-Agent-Mode: engineer — Document Intelligence, $0.10): adds an `engineer` block with retrieval chunks (heading-aware, overlapping) + extract-to-caller-schema (pass `json_schema` → a grammar-constrained, validated typed object) + image OCR (image/* via document_base64) — typed data, not just markdown.
Requires Authorization: Bearer <api_key>.
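For PDF input, the `--- page N ---` separators make it possible to recover per-page text from the returned `markdown` field. The sketch below assumes each separator sits on its own line immediately before that page's text; the description does not spell out the exact placement, so treat the pattern as a guess to verify against real output. The function name is hypothetical.

```python
import re
from typing import Dict

# Assumption: the separator occupies a whole line and precedes the page text.
PAGE_SEPARATOR = re.compile(r"^--- page (\d+) ---[ \t]*$", re.MULTILINE)


def split_pages(markdown: str) -> Dict[int, str]:
    """Map page number to page text, based on '--- page N ---' separator lines."""
    matches = list(PAGE_SEPARATOR.finditer(markdown))
    pages: Dict[int, str] = {}
    for index, match in enumerate(matches):
        end = matches[index + 1].start() if index + 1 < len(matches) else len(markdown)
        pages[int(match.group(1))] = markdown[match.end():end].strip()
    return pages
```

If the response has `truncated` set, the last page in the mapping may be incomplete.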
Connector
microsoft_docs_fetch
Microsoft Learn MCP
Fetch and convert a Microsoft Learn documentation webpage to markdown format. This tool retrieves the latest complete content of Microsoft documentation webpages including Azure, .NET, Microsoft 365, and other Microsoft technologies.
## When to Use This Tool
- When search results provide incomplete information or truncated content
- When you need complete step-by-step procedures or tutorials
- When you need troubleshooting sections, prerequisites, or detailed explanations
- When search results reference a specific page that seems highly relevant
- For comprehensive guides that require full context
## Usage Pattern
Use this tool AFTER microsoft_docs_search when you identify specific high-value pages that need complete content. The search tool gives you an overview; this tool gives you the complete picture.
## URL Requirements
- The URL must be a valid HTML documentation webpage from the microsoft.com domain
- Binary files (PDF, DOCX, images, etc.) are not supported
## Output Format
markdown with headings, code blocks, tables, and links preserved.
Connector
rendex_screenshot
Rendex: Rendering API for Images, PDFs & Content Extraction
Use this when the user asks to screenshot, capture, or take a picture of a webpage/URL, or to render raw HTML or Markdown to an image or PDF. Do NOT use to get a reusable hosted image URL (use rendex_render_link) or to make a branded multi-format document (use render_artifact). Captures a screenshot or PDF of any webpage, raw HTML, or Markdown. Supports full-page capture, dark mode, ad blocking, custom viewports, CSS/JS injection, cookie/header injection, PDF output, HTML and Markdown rendering, and progressive fallback for heavy sites. Returns partial renders on timeout by default (bestAttempt mode). Costs 1 render credit per call. Cookie/header injection requires Starter+; geo-targeting requires Pro+.
Connector
t54_list_operations
Agentic Swarm Marketplace (T54 x402 MCP)
Returns operationIds, HTTP methods, paths, and query parameter names from the bundled OpenAPI spec (no network). Use before t54_x402_request or per-SKU tools.
Connector