fetch_pages_batch
Fetch up to 25 web pages in parallel and return each as clean Markdown. Reduces wait time for batch scraping tasks.
Instructions
Fetch many web pages in parallel and return each one's clean Markdown.
Use this whenever you need to read more than one URL at once — it is far faster than calling fetch_page in a loop because the upstream scraper handles the concurrency.
Args:
urls: Up to 25 URLs.
max_tokens: Optional per-URL soft cap on the returned Markdown.
Returns:
A list of {url, ok, data?, error?} objects in the same order as the
input URLs. data is {title, word_count, markdown, links, ...} on
success; error contains the failure reason otherwise.
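
Because failures are reported per URL in the returned list, a caller will usually split that list on the ok flag before using it. The following is a minimal sketch, assuming ok is a boolean and error is a plain string. The helper name split_results and the sample data are hypothetical and exist only for illustration.

```python
from typing import Any


def split_results(
    results: list[dict[str, Any]],
) -> tuple[dict[str, str], dict[str, str]]:
    """Split batch results into {url: markdown} and {url: error reason}."""
    pages: dict[str, str] = {}
    failures: dict[str, str] = {}
    for item in results:
        if item.get("ok"):
            # On success, the page content lives under data["markdown"].
            pages[item["url"]] = item["data"]["markdown"]
        else:
            # On failure, only the error field is expected to be present.
            failures[item["url"]] = str(item.get("error", "unknown error"))
    return pages, failures


# Hypothetical response, shaped like the documented {url, ok, data?, error?} objects.
sample = [
    {
        "url": "https://example.com/a",
        "ok": True,
        "data": {"title": "A", "word_count": 1, "markdown": "# A", "links": []},
    },
    {"url": "https://example.com/b", "ok": False, "error": "timeout"},
]

pages, failures = split_results(sample)
```

Keying both dictionaries by URL keeps each result tied to its input even if the caller later reorders or filters them.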
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| urls | Yes | Up to 25 URLs. | |
| max_tokens | No | Optional per-URL soft cap on the returned Markdown. | None |
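
The source does not say what happens when more than 25 URLs are passed, so a cautious caller may prefer to split longer lists before calling the tool. This is a sketch of one way to do that. The helper name chunk_urls and the constant MAX_BATCH_SIZE are hypothetical, and the example URLs are placeholders.

```python
from collections.abc import Iterator

MAX_BATCH_SIZE = 25  # the documented per-call limit for urls


def chunk_urls(urls: list[str], size: int = MAX_BATCH_SIZE) -> Iterator[list[str]]:
    """Yield consecutive slices of urls, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# 60 placeholder URLs are split into batches of 25, 25, and 10.
urls = [f"https://example.com/page/{i}" for i in range(60)]
batches = list(chunk_urls(urls))
```

Each batch can then be passed to fetch_pages_batch as its own call, and the per-batch result lists concatenated to preserve the original order.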
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | List of {url, ok, data?, error?} objects in the same order as the input URLs. | |
Implementation Reference
- src/ai_first_scraper_mcp/server.py:59-81 (handler) The async handler that fetches many web pages in parallel. It POSTs the list of URLs to the upstream scraper's /batch endpoint and returns the JSON response containing clean Markdown for each page.

      async def fetch_pages_batch(urls: list[str], max_tokens: Optional[int] = None) -> list[dict]:
          """Fetch many web pages in parallel and return each one's clean Markdown.

          Use this whenever you need to read more than one URL at once — it is far
          faster than calling fetch_page in a loop because the upstream scraper
          handles the concurrency.

          Args:
              urls: Up to 25 URLs.
              max_tokens: Optional per-URL soft cap on the returned Markdown.

          Returns:
              A list of `{url, ok, data?, error?}` objects in the same order as the
              input URLs. `data` is `{title, word_count, markdown, links, ...}` on
              success; `error` contains the failure reason otherwise.
          """
          body: dict = {"urls": urls}
          if max_tokens:
              body["max_tokens"] = max_tokens
          async with httpx.AsyncClient(timeout=DEFAULT_TIMEOUT) as client:
              resp = await client.post(f"{SCRAPER_URL}/batch", json=body)
              resp.raise_for_status()
              return resp.json()

- The function signature defines the input schema (list[str] urls, Optional[int] max_tokens) and return type (list[dict]). The docstring documents the output shape as {url, ok, data?, error?} with data containing {title, word_count, markdown, links, ...}.
- src/ai_first_scraper_mcp/server.py:32-32 (registration) The tool is registered via the @mcp.tool() decorator on line 58, applied to the fetch_pages_batch function. The FastMCP instance is created on line 32.

      mcp = FastMCP("ai-first-scraper")
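
The request-body logic in the handler can be looked at in isolation. The sketch below mirrors those lines from the excerpt, but the standalone function name build_batch_body is hypothetical and not part of the server. One detail worth noticing is that the truthiness check on max_tokens treats 0 the same as None, so in both cases the field is left out of the payload.

```python
from typing import Optional


def build_batch_body(urls: list[str], max_tokens: Optional[int] = None) -> dict:
    """Build the JSON payload the handler POSTs to the /batch endpoint."""
    body: dict = {"urls": urls}
    # Same check as the handler: None and 0 are both falsy, so neither is sent.
    if max_tokens:
        body["max_tokens"] = max_tokens
    return body


without_cap = build_batch_body(["https://example.com"])
with_cap = build_batch_body(["https://example.com"], max_tokens=2000)
zero_cap = build_batch_body(["https://example.com"], max_tokens=0)
```

Here without_cap and zero_cap contain only the urls key, while with_cap also carries max_tokens.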