crawl_url

by elad12390

Extract and analyze webpage text content for research, quoting, or data processing by fetching URLs and returning clean text output.

Instructions

Fetch a URL with crawl4ai when you need the actual page text for quoting or analysis.

Input Schema

Name        Required  Description                                              Default
url         Yes       HTTP(S) URL (ideally from web_search output)            -
reasoning   Yes       Why you're crawling this URL (required for analytics)   -
max_chars   No        Trim textual result to this many characters             CRAWL_MAX_CHARS
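
An illustrative call with these parameters is sketched below. The values are examples only, and the payload follows the generic MCP tools/call shape (a tool name plus an arguments object); it is not output captured from this server.

    # Illustrative arguments for a tools/call request against this tool (values are made up).
    example_call = {
        "name": "crawl_url",
        "arguments": {
            "url": "https://example.com/article",
            "reasoning": "Need the article body to quote a definition verbatim",
            "max_chars": 4000,  # optional; the server falls back to CRAWL_MAX_CHARS
        },
    }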

Implementation Reference

  • The main handler function for the 'crawl_url' MCP tool. It takes a URL, fetches cleaned markdown content using the CrawlerClient, handles errors, clamps output length, and tracks usage.
    @mcp.tool()
    async def crawl_url(
        url: Annotated[str, "HTTP(S) URL (ideally from web_search output)"],
        reasoning: Annotated[str, "Why you're crawling this URL (required for analytics)"],
        max_chars: Annotated[int, "Trim textual result to this many characters"] = CRAWL_MAX_CHARS,
    ) -> str:
        """Fetch a URL with crawl4ai when you need the actual page text for quoting or analysis."""
        start_time = time.time()
        success = False
        error_msg = None
        result = ""
        try:
            text = await crawler_client.fetch(url, max_chars=max_chars)
            result = clamp_text(text, MAX_RESPONSE_CHARS)
            success = True
        except Exception as exc:  # noqa: BLE001
            error_msg = str(exc)
            result = f"Crawl failed for {url}: {exc}"
        finally:
            # Track usage
            response_time = (time.time() - start_time) * 1000
            tracker.track_usage(
                tool_name="crawl_url",
                reasoning=reasoning,
                parameters={"url": url, "max_chars": max_chars},
                response_time_ms=response_time,
                success=success,
                error_message=error_msg,
                response_size=len(result.encode("utf-8")),
            )
        return result
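  • The handler calls a clamp_text helper that is not shown on this page. A minimal sketch of the assumed behavior (hard truncation at a character limit) follows; the "[truncated]" marker is an invention for illustration, not the project's actual output.
    # Hypothetical sketch of clamp_text; the real helper is defined elsewhere in the project.
    def clamp_text(text: str, limit: int) -> str:
        if limit <= 0 or len(text) <= limit:
            return text
        return text[:limit] + "\n\n[truncated]"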
  • The @mcp.tool() decorator registers the crawl_url function as an available tool in the FastMCP server.
    @mcp.tool()
  • Input schema defined via Annotated type hints: url (str, required), reasoning (str, required), max_chars (int, optional default CRAWL_MAX_CHARS). Output: str (markdown content).
    async def crawl_url(
        url: Annotated[str, "HTTP(S) URL (ideally from web_search output)"],
        reasoning: Annotated[str, "Why you're crawling this URL (required for analytics)"],
        max_chars: Annotated[int, "Trim textual result to this many characters"] = CRAWL_MAX_CHARS,
    ) -> str:
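  • For orientation, the input schema FastMCP derives from these Annotated hints would look roughly like the sketch below. Only the field names, types, and descriptions are taken from the signature above; the exact serialization is an assumption, not a dump from the server.
    # Rough shape of the generated input schema (illustrative).
    input_schema = {
        "type": "object",
        "properties": {
            "url": {"type": "string", "description": "HTTP(S) URL (ideally from web_search output)"},
            "reasoning": {"type": "string", "description": "Why you're crawling this URL (required for analytics)"},
            "max_chars": {"type": "integer", "description": "Trim textual result to this many characters"},
        },
        "required": ["url", "reasoning"],
    }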
  • The fetch method of CrawlerClient performs the actual crawling using crawl4ai's AsyncWebCrawler, extracts markdown/content, and applies character limiting. Called by the tool handler.
    async def fetch(self, url: str, *, max_chars: int | None = None) -> str:
        """Fetch *url* and return cleaned markdown, trimmed to *max_chars*."""
        run_config = CrawlerRunConfig(cache_mode=self.cache_mode)
        async with AsyncWebCrawler() as crawler:
            result = await crawler.arun(url=url, config=run_config)
        if getattr(result, "error", None):
            raise RuntimeError(str(result.error))  # type: ignore
        text = (
            getattr(result, "markdown", None)
            or getattr(result, "content", None)
            or getattr(result, "html", None)
            or ""
        )
        text = text.strip()
        if not text:
            raise RuntimeError("Crawl completed but returned no readable content.")
        limit = max_chars or CRAWL_MAX_CHARS
        return clamp_text(text, limit)
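  • The crawl4ai building blocks used by fetch can also be exercised on their own. Below is a minimal standalone sketch; the CacheMode.BYPASS setting is an assumption for illustration, since the project's actual cache_mode default is not shown on this page.
    import asyncio

    from crawl4ai import AsyncWebCrawler, CacheMode, CrawlerRunConfig

    async def main() -> None:
        # Same crawl4ai calls the fetch method above relies on, outside the server.
        run_config = CrawlerRunConfig(cache_mode=CacheMode.BYPASS)
        async with AsyncWebCrawler() as crawler:
            result = await crawler.arun(url="https://example.com", config=run_config)
        print(str(result.markdown or "")[:500])

    asyncio.run(main())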

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/elad12390/web-research-assistant'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.