Parse a file using Firecrawl's /v2/parse endpoint.
In local/non-cloud MCP mode, this tool reads filePath from the MCP server filesystem and posts multipart data to the configured self-hosted FIRECRAWL_API_URL, preserving the existing direct-read behavior.
In hosted CLOUD_SERVICE mode, this tool is a two-call flow because hosted MCP cannot read your local filesystem:
1. Call with filePath, contentType, parse options, and optional declaredSizeBytes. The hosted server mints a short-lived upload URL and returns a safe local curl PUT command plus nextToolCall.
2. Run the returned curl command locally, then call firecrawl_parse again with uploadRef and the desired parse options. The hosted server calls /v2/parse server-side with your session credential.
**Best for:** Extracting content from a local document (PDF, Word, Excel, HTML, etc.); pulling structured data out of a file with JSON format; converting binary documents into markdown for downstream reasoning.
**Not recommended for:** Remote URLs (use firecrawl_scrape); multiple files at once (call parse multiple times); documents that require interactive actions, screenshots, or change tracking — those aren't supported by the parse endpoint.
**Common mistakes:** In hosted mode, do not pass both filePath and uploadRef. Phase 1 uses filePath only to generate upload instructions; phase 2 uses uploadRef only to parse server-side.
**Supported file types:** .html, .htm, .xhtml, .pdf, .docx, .doc, .odt, .rtf, .xlsx, .xls
**Unsupported options:** actions, screenshot/branding/changeTracking formats, waitFor > 0, location, mobile, proxy values other than "auto" or "basic".
**Privacy:** Set `redactPII: true` to return content with personally identifiable information redacted.
**CRITICAL - Format Selection (same rules as firecrawl_scrape):**
When the user asks for SPECIFIC data points from a document, you MUST use JSON format with a schema. Only use markdown when the user needs the ENTIRE document content.
**Handling PDFs:**
Add `"parsers": ["pdf"]` (optionally with `pdfOptions.maxPages`) when parsing a PDF so the PDF engine is invoked explicitly. For very long documents, cap `maxPages` to keep the response within token limits.
**Hosted phase 1 example:**
```json
{
"name": "firecrawl_parse",
"arguments": {
"filePath": "/absolute/path/to/document.pdf",
"contentType": "application/pdf",
"formats": ["markdown"],
"parsers": ["pdf"],
"zeroDataRetention": true
}
}
```
**Hosted phase 2 example:**
```json
{
"name": "firecrawl_parse",
"arguments": {
"uploadRef": "upload-ref-from-phase-1",
"formats": ["markdown"],
"parsers": ["pdf"],
"zeroDataRetention": true
}
}
```
**Returns:** Phase 1 hosted upload instructions or a parsed document with markdown, html, links, summary, json, or query results depending on the requested formats.