crawler-mcp
Server Configuration
Environment variables that configure the server. All of them are optional and fall back to the defaults shown.
| Name | Required | Description | Default |
|---|---|---|---|
| MCP_HTTP_PORT | No | Port for HTTP mode | 3001 |
| MCP_TRANSPORT | No | Transport mode: stdio or http | stdio |
| CRAWLER_MAX_CHARS | No | Default cap on returned page content in characters | 20000 |
| CRAWLER_TIMEOUT_MS | No | Per-request timeout in milliseconds | 15000 |
| CRAWLER_USER_AGENT | No | User-Agent for all requests | crawler-mcp/1.0 |
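To make the defaults in the table concrete, here is a minimal sketch of how a process might read these variables and fall back to the documented defaults. The variable names and default values come from the table; the `CrawlerConfig` class and `load_config` function are hypothetical names used for illustration only and are not taken from the server's source.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CrawlerConfig:
    """Hypothetical container for the settings listed in the table."""
    transport: str
    http_port: int
    max_chars: int
    timeout_ms: int
    user_agent: str


def load_config(env: Mapping[str, str] = os.environ) -> CrawlerConfig:
    """Read each variable, using the documented default when it is unset."""
    transport = env.get("MCP_TRANSPORT", "stdio")
    # The table documents only two transport modes.
    if transport not in ("stdio", "http"):
        raise ValueError(f"MCP_TRANSPORT must be 'stdio' or 'http', got {transport!r}")
    return CrawlerConfig(
        transport=transport,
        http_port=int(env.get("MCP_HTTP_PORT", "3001")),
        max_chars=int(env.get("CRAWLER_MAX_CHARS", "20000")),
        timeout_ms=int(env.get("CRAWLER_TIMEOUT_MS", "15000")),
        user_agent=env.get("CRAWLER_USER_AGENT", "crawler-mcp/1.0"),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise without touching the real environment, for example `load_config({"MCP_TRANSPORT": "http", "MCP_HTTP_PORT": "8080"})`.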
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": true } |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| fetch_page | Fetch a single web page and return its readable content as Markdown, plain text, or raw HTML. Automatically renders JavaScript-heavy pages with a headless browser when needed. |
| extract_links | Extract all hyperlinks from a web page, resolved to absolute URLs. Optionally restrict to links on the same domain. |
| crawl_site | Recursively crawl a website starting from a URL, following links up to a maximum depth and page count. Returns a short content summary for each page visited. Stays on the same domain by default. |
| extract_by_selector | Extract specific data from a page using a CSS selector. Returns each matching element's text, or an attribute value when
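MCP clients invoke tools like these by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for `fetch_page`. The envelope follows the MCP protocol, but the argument name `url` is an assumption: this page does not list the tools' input schemas, so check the server's `tools/list` response for the real parameter names. `build_tool_call` is a hypothetical helper.

```python
import json
from itertools import count
from typing import Any, Dict

# JSON-RPC requests need a unique id per request; a simple counter is enough here.
_request_ids = count(1)


def build_tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# ASSUMPTION: "url" is a guessed argument name, not confirmed by this page.
request = build_tool_call("fetch_page", {"url": "https://example.com"})

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

In stdio mode the serialized request would be written to the server's standard input; in HTTP mode it would be sent to the configured port. Most MCP client libraries handle this framing for you.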
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/shadab15github/crawler-mcp'
If you have feedback or need assistance with the MCP directory API, please join our Discord server.