Glama

Server Configuration

Describes the environment variables the server reads. None of them are required.

Name | Required | Description | Default
CRAWL4AI_LANG | No | The language for the interface (e.g., 'en' for English, 'ja' for Japanese) | en
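
As a rough illustration of the default above, a launcher script might resolve the language as follows. The helper name `resolve_language` is hypothetical and not part of the server; how the server itself reads the variable is not documented on this page.

```python
import os


def resolve_language(env=None):
    """Return the interface language, falling back to the documented default 'en'.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    value = env.get("CRAWL4AI_LANG", "").strip()
    # An unset or empty variable is treated as "use the default".
    return value or "en"


print(resolve_language({"CRAWL4AI_LANG": "ja"}))
print(resolve_language({}))
```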

Capabilities

Features and capabilities supported by this server

Capability | Details
tasks
{
  "list": {},
  "cancel": {},
  "requests": {
    "tools": {
      "call": {}
    },
    "prompts": {
      "get": {}
    },
    "resources": {
      "read": {}
    }
  }
}
tools
{
  "listChanged": true
}
prompts
{
  "listChanged": false
}
resources
{
  "subscribe": false,
  "listChanged": false
}
experimental
{}
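
A client can inspect these capability objects before relying on a feature. The sketch below copies the values listed above into one dictionary and checks nested keys; the `supports` helper is hypothetical, and it assumes that an empty object means "supported" while an explicit `false` means "not supported".

```python
import json

# Capability values copied from the listing above.
CAPABILITIES = json.loads("""
{
  "tasks": {
    "list": {},
    "cancel": {},
    "requests": {
      "tools": {"call": {}},
      "prompts": {"get": {}},
      "resources": {"read": {}}
    }
  },
  "tools": {"listChanged": true},
  "prompts": {"listChanged": false},
  "resources": {"subscribe": false, "listChanged": false},
  "experimental": {}
}
""")


def supports(caps, *path):
    """Return True if the nested key path exists and is not explicitly false."""
    node = caps
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return node is not False


print(supports(CAPABILITIES, "tools", "listChanged"))
print(supports(CAPABILITIES, "resources", "subscribe"))
```

Under that reading, a client would listen for tool list changes but would not attempt resource subscriptions.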

Tools

Functions exposed to the LLM to take actions

Name | Description
crawl_url (A)

Extract web page content with JavaScript support. Use wait_for_js=true for SPAs. Use content_offset/content_limit to paginate the response. Use output_path to persist the full unsliced content to disk as markdown and receive a slim metadata-only response.
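
The pagination parameters can be sketched as a series of MCP `tools/call` requests. This is only an illustration: the `url` argument name, the assumption that offsets and limits count characters, and the idea that the total length is known in advance are not confirmed by this page. Only `wait_for_js`, `content_offset`, and `content_limit` appear in the description above.

```python
import json


def crawl_url_requests(url, total_chars, page_size, wait_for_js=False):
    """Yield JSON-RPC tools/call payloads that page through one document."""
    offsets = range(0, total_chars, page_size)
    for request_id, offset in enumerate(offsets, start=1):
        yield {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {
                "name": "crawl_url",
                "arguments": {
                    "url": url,  # assumed argument name
                    "wait_for_js": wait_for_js,
                    "content_offset": offset,
                    "content_limit": page_size,
                },
            },
        }


pages = list(crawl_url_requests("https://example.com", total_chars=25000, page_size=10000))
print(json.dumps(pages[0], indent=2))
```

In practice a client would more likely keep requesting pages until the response reports no remaining content, since the total size is rarely known up front.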

deep_crawl_site (A)

Crawl multiple pages from a site with configurable depth. Use output_path (directory) to persist per-URL markdown files + index.json; the response is then slimmed to metadata only.

crawl_url_with_fallback (A)

Crawl with fallback strategies for anti-bot sites. Use content_offset/content_limit to paginate the response. Use output_path to persist the full unsliced content to disk as markdown and receive a slim response.

intelligent_extract (A)

Extract specific data from web pages using an LLM. Use output_path to persist the full extraction output to disk as JSON and receive a slim response.

extract_entities (A)

Extract entities (emails, phones, etc.) from web pages. Use output_path to persist the full entity extraction output to disk as JSON and receive a slim response.

extract_structured_data (B)

Extract structured data using CSS selectors or an LLM. Use output_path to persist the full extraction (including table_data) to disk as JSON and receive a slim response.

extract_youtube_transcript (A)

Extract YouTube transcripts with timestamps. Works with public captioned videos. Supports fallback to page crawl. Use output_path to persist the full unsliced transcript to disk as markdown.

batch_extract_youtube_transcripts (B)

Extract transcripts from multiple YouTube videos. Max 3 URLs per call. Supply output_path (directory) in the request to persist per-video markdown files + index.json and receive a slim response.

get_youtube_video_info (A)

Get YouTube video metadata and transcript availability. Use output_path to persist the full transcript to disk as markdown and receive a slim response.

extract_youtube_comments (B)

Extract YouTube video comments. Supports pagination via comment_offset. Use output_path to persist the full unsliced comment list to disk as JSON; the response is then slimmed to metadata only.

process_file (B)

Convert PDF, Word, Excel, PowerPoint, and ZIP files to markdown. Use output_path to persist the full unsliced converted markdown to disk and receive a slim response.

get_supported_file_formats (A)

Get supported file formats (PDF, Office, ZIP) and their capabilities.

enhanced_process_large_content (B)

Process large content with chunking and BM25 filtering. Use output_path to persist chunks + summaries to disk as JSON and receive a slim response.

search_google (A)

Search Google with genre filtering. Genres: academic, news, technical, commercial, social. Supply output_path in the request to persist the full unsliced result set to disk as JSON and receive a slim response.
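
A caller may want to reject an unknown genre before sending the request. The sketch below validates against the five genres named above. The argument names `query` and `genre` and the helper `search_google_arguments` are assumptions for illustration; the real input schema is not shown on this page, and get_search_genres remains the authoritative source for the genre list.

```python
# Genres as listed in the search_google description above.
GENRES = ("academic", "news", "technical", "commercial", "social")


def search_google_arguments(query, genre=None, output_path=None):
    """Build an arguments dict for search_google, validating the genre locally."""
    if genre is not None and genre not in GENRES:
        raise ValueError(f"unknown genre {genre!r}; expected one of {', '.join(GENRES)}")
    arguments = {"query": query}  # assumed argument name
    if genre is not None:
        arguments["genre"] = genre  # assumed argument name
    if output_path is not None:
        arguments["output_path"] = output_path
    return arguments


print(search_google_arguments("model context protocol", genre="technical"))
```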

batch_search_google (A)

Perform multiple Google searches. Max 3 queries per call. Supply output_path in the request to persist the full result set to disk as JSON and receive a slim response.

search_and_crawl (A)

Search Google and crawl top results. Combines search with full content extraction. Supply output_path (directory) in the request to persist per-page markdown (unsliced) + index.json and receive a slim response.

get_search_genres (A)

Get available search genres for targeted searching.

batch_crawl (A)

Crawl multiple URLs with fallback. Max 3 URLs per call. Use output_path (directory) to persist full per-URL markdown + index.json; the return shape stays a list, and each successful item gains an output_file key.
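
Because the batch tools cap the number of inputs per call, a longer list has to be split on the client side. A minimal sketch, assuming the caller simply issues one call per chunk; `chunked` is a hypothetical helper, not part of the server.

```python
MAX_URLS_PER_CALL = 3  # limit stated in the batch_crawl description


def chunked(items, size=MAX_URLS_PER_CALL):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


urls = [f"https://example.com/page/{n}" for n in range(1, 8)]
for batch in chunked(urls):
    # Each batch would become the URL list of one batch_crawl call.
    print(len(batch), batch[0])
```

The same helper applies to the other tools that state a per-call maximum, with `size` set to the limit given in that tool's description.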

multi_url_crawl (A)

Multi-URL crawl with pattern-based config. Max 5 URL patterns per call. Use output_path (directory) to persist full per-URL markdown + index.json; the return shape stays a list, and each successful item gains an output_file key.

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/walksoda/crawl-mcp'
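
The same request can be made from Python using only the standard library. This is a sketch: it assumes the endpoint returns JSON, and the `Accept` header is an addition for illustration rather than something the API is known to require.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/walksoda/crawl-mcp"


def build_request(url=SERVER_URL):
    """Build the GET request equivalent to the curl command above."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Accept": "application/json"},
    )


def fetch_server_record(url=SERVER_URL, timeout=10):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server_record()` performs the network request; `build_request()` alone can be used to inspect the request without sending it.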

If you have feedback or need assistance with the MCP directory API, please join our Discord server.