crawl
Extract website content and save it as structured markdown. Configure parameters like depth, CSS selectors, and anti-bot bypass for tailored results.
Instructions
Crawls a website and saves its content to a file as structured markdown.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to crawl | |
| max_depth | No | Maximum crawling depth | |
| include_external | No | Whether to include external links | |
| verbose | No | Enable verbose output | |
| output_file | No | Path to output file (generated if not provided) | |
| wait_for_selector | No | CSS selector to wait for before extracting content. Useful for single-page applications. | |
| return_content | No | Whether to return the extracted content directly in the MCP response | |
| magic | No | Enable magic mode to bypass anti-bots and simulate a real browser | |
| css_selector | No | Specific CSS selector to extract only targeted elements from the page | |
| js_code | No | Custom JavaScript code to execute on the page before extraction (requires the environment variable CRAWL4AI_MCP_ALLOW_JS=true) | |
| session_id | No | Persistent session identifier to keep cookies and browser state across requests | |
| delay_before_return_html | No | Delay in seconds to wait before extracting HTML (useful for heavy JS pages) | |
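
As an illustration of how the parameters above fit together, the following sketch builds a JSON-RPC `tools/call` request of the kind an MCP client sends to invoke a tool. The parameter names are taken from the table. The Python types in `KNOWN_PARAMS` and the helper `build_crawl_request` are assumptions made for this example and are not part of the tool itself; the schema above does not state types or defaults.

```python
import json

# Parameter names come from the Input Schema table.
# The Python types are ASSUMPTIONS; the table does not state them.
KNOWN_PARAMS = {
    "url": str,
    "max_depth": int,
    "include_external": bool,
    "verbose": bool,
    "output_file": str,
    "wait_for_selector": str,
    "return_content": bool,
    "magic": bool,
    "css_selector": str,
    "js_code": str,
    "session_id": str,
    "delay_before_return_html": (int, float),
}


def build_crawl_request(request_id, **arguments):
    """Hypothetical helper: validate arguments and wrap them in a tools/call request."""
    if "url" not in arguments:
        raise ValueError("'url' is required")
    for name, value in arguments.items():
        if name not in KNOWN_PARAMS:
            raise ValueError(f"unknown parameter: {name}")
        expected = KNOWN_PARAMS[name]
        # bool is a subclass of int in Python, so reject it explicitly
        # wherever a boolean is not the expected type.
        if isinstance(value, bool) and expected is not bool:
            raise TypeError(f"{name}: unexpected boolean")
        if not isinstance(value, expected):
            raise TypeError(f"{name}: unexpected type {type(value).__name__}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "crawl", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_crawl_request(
        1,
        url="https://example.com",
        max_depth=2,
        wait_for_selector="#app",  # wait for a single-page app to render
        return_content=True,       # also return the markdown in the response
    )
    print(json.dumps(request, indent=2))
```

Only `url` is required, so a minimal call passes just that one argument. Validating names on the client side catches typos such as `css_selectors` before the request reaches the server.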