Schema | solocrawl


Environment variables read by the server. All of them are optional.

Name	Required	Description
`SOLOCRAWL_LOG_FILE`	No	Optional log file path (also logs to stderr)
`SOLOCRAWL_LOG_LEVEL`	No	Log level: DEBUG, INFO, WARNING, ERROR
`SOLOCRAWL_PROXY_LIST`	No	Comma-separated proxy URLs
`SOLOCRAWL_PROXY_MODE`	No	Proxy mode: list (rotate a pool) or endpoint (single rotating endpoint)
`SOLOCRAWL_USER_AGENT`	No	Override HTTP User-Agent for API requests
`SOLOCRAWL_MAX_RETRIES`	No	Retries on network errors / rate limits
`SOLOCRAWL_SEARXNG_URL`	No	Base URL of a self-hosted SearXNG instance (enables the searxng provider)
`SOLOCRAWL_PROXY_ENABLED`	No	Enable optional proxy layer
`SOLOCRAWL_PROXY_ENDPOINT`	No	Single rotating proxy endpoint
`SOLOCRAWL_PROXY_PASSWORD`	No	Proxy auth password
`SOLOCRAWL_PROXY_USERNAME`	No	Proxy auth username
`SOLOCRAWL_RESPECT_ROBOTS`	No	Honour robots.txt on scrape (fail-open); set false to skip
`SOLOCRAWL_BROWSER_ALLOWED`	No	Allow Playwright fallback when installed
`SOLOCRAWL_MAX_CONCURRENCY`	No	Global fetch concurrency limit
`SOLOCRAWL_TIMEOUT_SECONDS`	No	Per-request timeout in seconds
`SOLOCRAWL_ENABLE_PROVIDERS`	No	Comma-separated opt-in provider names
`SOLOCRAWL_PER_DOMAIN_LIMIT`	No	Per-domain concurrency limit
`SOLOCRAWL_CACHE_TTL_SECONDS`	No	In-memory fetch cache TTL in seconds (0 = disabled)
`SOLOCRAWL_MAX_RESPONSE_BYTES`	No	Cap on fetched response body size (10 MiB)
`SOLOCRAWL_ALLOW_INTERNAL_URLS`	No	Allow scraping localhost/private IPs (dev only)
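
As a rough illustration of how the variables above fit together, the following sketch builds an environment for a pooled-proxy setup and runs a small consistency check before the server is launched. The proxy hostnames, the chosen values, the `"true"`/`"false"` spelling for boolean flags, and both helper functions are assumptions made for this example; the check is not the server's own validation logic.

```python
import os


def build_solocrawl_env(base=None):
    """Return a copy of `base` (or the current environment) with a few
    SoloCrawl variables set. All values below are illustrative."""
    env = dict(os.environ if base is None else base)
    env.update({
        "SOLOCRAWL_LOG_LEVEL": "DEBUG",       # one of DEBUG, INFO, WARNING, ERROR
        "SOLOCRAWL_TIMEOUT_SECONDS": "20",
        "SOLOCRAWL_CACHE_TTL_SECONDS": "0",   # 0 disables the in-memory cache
        # Assumption: boolean flags are spelled "true" / "false".
        "SOLOCRAWL_PROXY_ENABLED": "true",
        "SOLOCRAWL_PROXY_MODE": "list",       # rotate a pool of proxies
        # Hypothetical proxy URLs, comma-separated as the table describes.
        "SOLOCRAWL_PROXY_LIST": "http://proxy-a.example:8080,http://proxy-b.example:8080",
    })
    return env


def check_proxy_settings(env):
    """Return a list of problems found; an empty list means the proxy
    variables are mutually consistent (hypothetical client-side check)."""
    problems = []
    mode = env.get("SOLOCRAWL_PROXY_MODE")
    if mode not in (None, "list", "endpoint"):
        problems.append(f"unknown SOLOCRAWL_PROXY_MODE: {mode!r}")
    if mode == "list" and not env.get("SOLOCRAWL_PROXY_LIST"):
        problems.append("mode 'list' needs SOLOCRAWL_PROXY_LIST")
    if mode == "endpoint" and not env.get("SOLOCRAWL_PROXY_ENDPOINT"):
        problems.append("mode 'endpoint' needs SOLOCRAWL_PROXY_ENDPOINT")
    return problems


env = build_solocrawl_env({})
print(check_proxy_settings(env))
```

The resulting mapping can be passed as the `env` argument of `subprocess.Popen` or copied into an MCP client's server configuration; the launch command itself is not given on this page.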

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }
`logging`	{}
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`extensions`	{ "io.modelcontextprotocol/ui": {} }
`experimental`	{}
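
A client can read these flags to decide which optional MCP features to rely on. The sketch below copies the capability values from the table into a dictionary and derives a few yes/no hints from them. The `client_hints` helper and its key names are hypothetical; only the capability values come from this page, and the feature meanings follow the Model Context Protocol specification.

```python
import json

# Capability object assembled from the table above.
CAPABILITIES = json.loads("""
{
  "tools": {"listChanged": true},
  "logging": {},
  "prompts": {"listChanged": false},
  "resources": {"subscribe": false, "listChanged": false},
  "extensions": {"io.modelcontextprotocol/ui": {}},
  "experimental": {}
}
""")


def client_hints(caps):
    """Summarise which optional behaviours a client may expect
    (hypothetical helper, not part of any SDK)."""
    return {
        # The server may send notifications/tools/list_changed.
        "watch_tool_list": bool(caps.get("tools", {}).get("listChanged")),
        # The server will not announce prompt list changes.
        "watch_prompt_list": bool(caps.get("prompts", {}).get("listChanged")),
        # resources/subscribe is not offered.
        "can_subscribe_resources": bool(caps.get("resources", {}).get("subscribe")),
        # Presence of the logging capability means logging/setLevel is available.
        "can_set_log_level": "logging" in caps,
    }


hints = client_hints(CAPABILITIES)
print(hints)
```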

Functions exposed to the LLM to take actions

Name	Description
`web_search`	Search the web across SoloCrawl's configured providers and return unified results.
`scrape`	Fetch a URL and return the main page content as markdown suitable for LLM context.
`research`	Search the web, scrape the top results, and return an aggregated cited report.
`package_version`	Look up the latest or constraint-satisfying version of a package from a registry.
`list_providers`	List the registered search and package providers (default vs. opt-in).
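
MCP clients invoke these tools with a JSON-RPC `tools/call` request. The sketch below shows what such a request might look like for the scrape tool. The input schema is not listed on this page, so the `url` argument name is an assumption, and `tools_call` is a hypothetical helper rather than part of any SDK; in practice the real argument names should be taken from the server's `tools/list` response.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC ids


def tools_call(name, arguments):
    """Build an MCP `tools/call` request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumption: the scrape tool accepts a `url` argument.
request = tools_call("scrape", {"url": "https://example.com/"})
print(json.dumps(request, indent=2))
```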

Interactive templates invoked by user choice

Name	Description
No prompts

Contextual data attached and managed by the client

Name	Description
No resources
