Glama

MCP Server for Crawl4AI

by omgwtfwow

parse_sitemap

Extract URLs from XML sitemaps to discover all site pages, plan crawl strategies, or verify sitemap validity. Supports regex filtering for targeted URL extraction.

Instructions

[STATELESS] Extract URLs from XML sitemaps. Use when discovering all site pages, planning crawl strategies, or checking sitemap validity. Supports regex filtering. Try sitemap.xml or robots.txt first. Creates a new browser each time.
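To make the behaviour described above concrete, here is a minimal sketch of sitemap URL extraction with optional regex filtering. It is not the server's actual implementation: the function name `extract_urls`, the inline `SAMPLE_SITEMAP` document, and the choice of the Python standard library are all assumptions made for illustration, and the sketch parses a string rather than fetching anything through a browser.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Optional

# Hypothetical sample data in the sitemaps.org format (not taken from a real site).
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/first-post</loc></url>
  <url><loc>https://example.com/blog/second-post</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""


def extract_urls(sitemap_xml: str, filter_pattern: Optional[str] = None) -> List[str]:
    """Return the text of every <loc> element, optionally filtered by a regex.

    Illustrative helper only; the real tool may behave differently.
    """
    root = ET.fromstring(sitemap_xml)

    urls: List[str] = []
    for element in root.iter():
        # Namespaced tags look like "{namespace}loc", so compare the local name only.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())

    if filter_pattern is not None:
        pattern = re.compile(filter_pattern)
        # re.search matches anywhere in the URL, not just at the start.
        urls = [url for url in urls if pattern.search(url)]

    return urls


if __name__ == "__main__":
    for url in extract_urls(SAMPLE_SITEMAP, filter_pattern=r"/blog/"):
        print(url)
```

Because the sketch matches on the local tag name `loc`, it also picks up the `<loc>` entries of a sitemap index file, which list further sitemaps rather than pages.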

Input Schema

Name | Required | Description | Default
filter_pattern | No | Optional regex pattern to filter URLs | (none)
url | Yes | URL of the sitemap (e.g., https://example.com/sitemap.xml) | (none)

Input Schema (JSON Schema)

{
  "properties": {
    "filter_pattern": {
      "description": "Optional regex pattern to filter URLs",
      "type": "string"
    },
    "url": {
      "description": "URL of the sitemap (e.g., https://example.com/sitemap.xml)",
      "type": "string"
    }
  },
  "required": [
    "url"
  ],
  "type": "object"
}
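As an illustration of how the schema above translates into a tool call, the following sketch builds a JSON-RPC `tools/call` request of the kind an MCP client sends. The helper name `build_parse_sitemap_call` and the `request_id` parameter are hypothetical; only the tool name `parse_sitemap` and the two argument names come from this page. The checks mirror the schema: `url` is required, `filter_pattern` is optional, and both are strings.

```python
import json
from typing import Any, Dict, Optional


def build_parse_sitemap_call(
    url: str,
    filter_pattern: Optional[str] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that calls the parse_sitemap tool.

    Hypothetical helper for illustration; a real MCP client library
    would normally construct and send this message for you.
    """
    # "url" is the only required property in the input schema.
    if not isinstance(url, str) or not url:
        raise ValueError("url is required and must be a non-empty string")

    arguments: Dict[str, Any] = {"url": url}

    # "filter_pattern" is optional, so include it only when supplied.
    if filter_pattern is not None:
        if not isinstance(filter_pattern, str):
            raise TypeError("filter_pattern must be a string")
        arguments["filter_pattern"] = filter_pattern

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "parse_sitemap", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_parse_sitemap_call(
        "https://example.com/sitemap.xml",
        filter_pattern=r"/blog/",
    )
    print(json.dumps(request, indent=2))
```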

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/omgwtfwow/mcp-crawl4ai-ts'
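The same request can be made from a script. The sketch below is a rough Python equivalent of the curl command, using only the standard library. The helper names `server_url` and `fetch_server` are hypothetical, and the sketch assumes the endpoint returns a JSON body; the response shape is not documented on this page, so the result is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot break the path.
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> Any:
    """GET the server record and parse it, assuming the body is JSON."""
    request = urllib.request.Request(
        server_url(owner, name),
        method="GET",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    record = fetch_server("omgwtfwow", "mcp-crawl4ai-ts")
    print(json.dumps(record, indent=2))
```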

If you have feedback or need assistance with the MCP directory API, please join our Discord server.