
Crawl4AI MCP Server

server-info.json (6.19 kB)
{
  "name": "Crawl4AI-MCP-Server",
  "instructions": "This server provides web scraping capabilities using Crawl4AI. The server acts as the 'hands and eyes' while the client AI acts as the 'brain'. \n\nAvailable tools:\n• get_page_structure: Analyze webpage structure and content\n• crawl_with_schema: Execute precise data extraction using schemas\n• take_screenshot: Capture visual representation of webpages\n\nAll tools support proper error handling and async operation.",
  "fastmcp_version": "2.10.6",
  "mcp_version": "1.12.2",
  "server_version": "2.10.6",
  "tools": [
    {
      "key": "server_status",
      "name": "server_status",
      "description": "Get the current status and capabilities of the Crawl4AI MCP server.\n\nThis tool provides comprehensive information about server health, available features,\nconfiguration status, and operational capabilities. Use this to verify server\nconnectivity and understand what web scraping capabilities are available.\n\nReturns:\n dict: Server status information including:\n - server_name: The name of the MCP server\n - version: Current server version \n - status: Operational status (operational/error)\n - transport: Communication transport type (stdio)\n - working_directory: Current server working directory\n - capabilities: List of available server capabilities\n - dependencies: Status of key dependencies\n - message: Human-readable status message\n\nExample response:\n {\n \"server_name\": \"Crawl4AI-MCP-Server\",\n \"version\": \"1.0.0\",\n \"status\": \"operational\", \n \"capabilities\": [\"web_crawling\", \"content_extraction\", \"screenshot_capture\", \"schema_based_extraction\"]\n }",
      "input_schema": {
        "properties": {},
        "type": "object"
      },
      "annotations": null,
      "tags": null,
      "enabled": true
    },
    {
      "key": "get_page_structure",
      "name": "get_page_structure",
      "description": "Fetch and analyze the structural content of a webpage for AI analysis.\n\nThis is the fundamental \"eyes\" tool that provides the raw material for client AI\nto understand webpage structure. It returns clean, structured content without\nexecuting any extraction schemas.\n\nArgs:\n url: The URL of the webpage to crawl and analyze\n format: Output format - 'html' for cleaned HTML or 'markdown' for raw markdown\n ctx: MCP context for logging and progress reporting\n \nReturns:\n str: The webpage content in the requested format (HTML or Markdown)\n \nRaises:\n Exception: If the webpage cannot be accessed or processed",
      "input_schema": {
        "properties": {
          "url": {
            "description": "The URL of the webpage to analyze",
            "title": "Url",
            "type": "string"
          },
          "format": {
            "default": "html",
            "description": "Output format: 'html' for cleaned HTML or 'markdown' for raw markdown",
            "pattern": "^(html|markdown)$",
            "title": "Format",
            "type": "string"
          }
        },
        "required": ["url"],
        "type": "object"
      },
      "annotations": null,
      "tags": null,
      "enabled": true
    },
    {
      "key": "crawl_with_schema",
      "name": "crawl_with_schema",
      "description": "Execute precision data extraction using AI-generated schemas with JsonCssExtractionStrategy.\n\nThis is the 'hands' tool that performs targeted data extraction based on schemas\nprovided by the client AI. It uses CSS selectors to extract specific data points\nfrom webpages and returns structured JSON results.\n\nArgs:\n url: The URL of the webpage to crawl and extract data from\n extraction_schema: JSON string defining field names and their CSS selectors\n ctx: MCP context for logging and progress reporting\n \nReturns:\n str: JSON string containing the extracted data according to the schema\n \nRaises:\n Exception: If the webpage cannot be accessed, schema is invalid, or extraction fails",
      "input_schema": {
        "properties": {
          "url": {
            "description": "The URL of the webpage to crawl and extract data from",
            "title": "Url",
            "type": "string"
          },
          "extraction_schema": {
            "description": "JSON string containing the extraction schema with field names and CSS selectors. Example: '{\"title\": \"h1\", \"price\": \".price\", \"description\": \".desc\"}'",
            "title": "Extraction Schema",
            "type": "string"
          }
        },
        "required": ["url", "extraction_schema"],
        "type": "object"
      },
      "annotations": null,
      "tags": null,
      "enabled": true
    },
    {
      "key": "take_screenshot",
      "name": "take_screenshot",
      "description": "Capture a visual screenshot of a webpage for media representation.\n\nThis is the visual capture tool that provides screenshot images of webpages\nfor the client AI to analyze. It returns base64-encoded image data that can\nbe processed by FastMCP's native image handling capabilities.\n\nArgs:\n url: The URL of the webpage to capture\n ctx: MCP context for logging and progress reporting\n \nReturns:\n str: JSON string containing base64-encoded screenshot data and metadata\n \nRaises:\n Exception: If the webpage cannot be accessed or screenshot fails",
      "input_schema": {
        "properties": {
          "url": {
            "description": "The URL of the webpage to capture as a screenshot",
            "title": "Url",
            "type": "string"
          }
        },
        "required": ["url"],
        "type": "object"
      },
      "annotations": null,
      "tags": null,
      "enabled": true
    }
  ],
  "prompts": [],
  "resources": [],
  "templates": [],
  "capabilities": {
    "tools": {
      "listChanged": true
    },
    "resources": {
      "subscribe": false,
      "listChanged": false
    },
    "prompts": {
      "listChanged": false
    },
    "logging": {}
  }
}
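Note that crawl_with_schema takes its extraction_schema as a JSON string, not a JSON object. A minimal Python sketch of preparing that argument (the field names and selectors are the hypothetical example from the tool's own description, not a real page):

```python
import json

# Hypothetical schema for a product page: field names mapped to
# CSS selectors, as shown in the crawl_with_schema description.
schema = {
    "title": "h1",
    "price": ".price",
    "description": ".desc",
}

# The tool expects the schema serialized as a JSON *string*.
extraction_schema = json.dumps(schema)

# Round-tripping the string confirms it is valid JSON before it
# is passed to the tool as the extraction_schema argument.
assert json.loads(extraction_schema) == schema
```

Serializing with json.dumps (rather than hand-writing the string) avoids the quoting mistakes that would otherwise surface server-side as "schema is invalid" errors.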

MCP directory API

All information about MCP servers is available via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Nexus-Digital-Automations/crawl4ai-mcp'
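The same lookup can be made programmatically; a minimal sketch in Python using only the standard library (the Accept header and the fetch_server_info helper name are assumptions for illustration, not part of the documented API):

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/Nexus-Digital-Automations/crawl4ai-mcp"

def fetch_server_info(url: str = API_URL) -> dict:
    """Fetch a server record from the MCP directory API (sketch)."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# The request object can be inspected without hitting the network:
req = urllib.request.Request(API_URL)
assert req.get_full_url() == API_URL
```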

If you have feedback or need assistance with the MCP directory API, please join our Discord server.