Schema | crawler-mcp

crawler-mcp

Environment variables that configure the server. None are required; each falls back to the default shown.

Name	Required	Description	Default
`MCP_HTTP_PORT`	No	Port for HTTP mode	3001
`MCP_TRANSPORT`	No	Transport mode: stdio or http	stdio
`CRAWLER_MAX_CHARS`	No	Default cap on returned page content in characters	20000
`CRAWLER_TIMEOUT_MS`	No	Per-request timeout in milliseconds	15000
`CRAWLER_USER_AGENT`	No	User-Agent for all requests	crawler-mcp/1.0
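
As an illustration of how the table above might be consumed, here is a minimal Python sketch that reads the five variables and falls back to the listed defaults. The `CrawlerConfig` class and `load_config` function are hypothetical names introduced for this example, and the transport validation is an assumption; the server's actual startup code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CrawlerConfig:
    """Hypothetical container for the documented settings."""
    transport: str
    http_port: int
    max_chars: int
    timeout_ms: int
    user_agent: str


def load_config(env: Mapping[str, str] = os.environ) -> CrawlerConfig:
    """Read the documented variables, using the documented defaults when unset."""
    transport = env.get("MCP_TRANSPORT", "stdio")
    # Assumption: anything other than the two documented modes is rejected.
    if transport not in ("stdio", "http"):
        raise ValueError(f"MCP_TRANSPORT must be 'stdio' or 'http', got {transport!r}")
    return CrawlerConfig(
        transport=transport,
        http_port=int(env.get("MCP_HTTP_PORT", "3001")),
        max_chars=int(env.get("CRAWLER_MAX_CHARS", "20000")),
        timeout_ms=int(env.get("CRAWLER_TIMEOUT_MS", "15000")),
        user_agent=env.get("CRAWLER_USER_AGENT", "crawler-mcp/1.0"),
    )
```

Passing a plain dict instead of `os.environ` (for example `load_config({"MCP_TRANSPORT": "http"})`) makes the fallback behaviour easy to check without touching the real environment.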

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }
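
In the Model Context Protocol, `listChanged: true` under `tools` indicates that the server may send a `notifications/tools/list_changed` notification when its tool list changes. The sketch below shows one way a client could act on that; `should_refresh_tools` is a hypothetical helper for this example, not part of any SDK.

```python
import json

# The capability object exactly as listed in the table above.
server_capabilities = json.loads('{"tools": {"listChanged": true}}')


def should_refresh_tools(capabilities: dict, message: dict) -> bool:
    """Return True when the client should call tools/list again."""
    supports = bool(capabilities.get("tools", {}).get("listChanged", False))
    is_notice = message.get("method") == "notifications/tools/list_changed"
    return supports and is_notice


notice = {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"}
refresh = should_refresh_tools(server_capabilities, notice)
```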

Functions exposed to the LLM to take actions

Name	Description
`fetch_page`	Fetch a single web page and return its readable content as Markdown, plain text, or raw HTML. Automatically renders JavaScript-heavy pages with a headless browser when needed.
`extract_links`	Extract all hyperlinks from a web page, resolved to absolute URLs. Optionally restrict to links on the same domain.
`crawl_site`	Recursively crawl a website starting from a URL, following links up to a maximum depth and page count. Returns a short content summary for each page visited. Stays on the same domain by default.
`extract_by_selector`	Extract specific data from a page using a CSS selector. Returns each matching element's text, or an attribute value when `attribute` is given (e.g. selector='a.product', attribute='href').
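
MCP clients invoke tools with a JSON-RPC `tools/call` request. The following sketch builds such a request for the selector example in the table. The `selector` and `attribute` arguments come from the description above; the `url` argument name and the example URL are assumptions, since the tool's input schema is not shown here, and `tool_call` is a hypothetical helper.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that calls one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call(
    "extract_by_selector",
    {
        "url": "https://example.com/products",  # assumed parameter name
        "selector": "a.product",
        "attribute": "href",
    },
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would build and send this message; writing it out by hand is only meant to show what the arguments look like on the wire.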

Interactive templates invoked by user choice

Name	Description
No prompts

Contextual data attached and managed by the client

Name	Description
No resources

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/shadab15github/crawler-mcp'
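
The same request can be issued from Python with the standard library. This is a sketch that assumes the endpoint returns a JSON body, which the text above does not state; the `Accept` header is likewise an assumption, and `build_server_request` and `fetch_server_info` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(owner: str, name: str) -> urllib.request.Request:
    """Prepare the GET request for one server entry without sending it."""
    return urllib.request.Request(
        f"{API_BASE}/{owner}/{name}",
        method="GET",
        headers={"Accept": "application/json"},  # assumed, not documented above
    )


def fetch_server_info(owner: str, name: str, timeout: float = 15.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    request = build_server_request(owner, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server_info("shadab15github", "crawler-mcp")` performs the network request equivalent to the curl command.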
