search_and_scrape
Search the web and scrape full content from top results in a single call, with parallel processing and deduplication across sources.
Instructions
Search the web and extract full content from the top results in one call. Scrapes run in parallel (max 5 concurrent), content is deduplicated across sources, and each source is scored on relevance and quality.

Returns JSON with the fields query, combinedContent, sources (array of {url, title, content, contentType, scores}; included when include_sources=true), summary ({urlsSearched, urlsScraped, processingTimeMs}), and sizeMetadata ({totalLength, estimatedTokens, sizeCategory}).

When the search returns no matches, the response has empty combinedContent and urlsSearched: 0. Individual scrape failures are skipped silently, so urlsScraped < urlsSearched indicates partial failure.

num_results controls how many sources are scraped; more sources are slower, and calls typically take 2-15s. Calls are subject to a per-tenant rate limit with provider fallback. Results are not cached, because each call combines a live search and scrape.

Use web_search instead if you only need URLs; use scrape_page for a single known URL.
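The failure signals described above (zero matches, silently skipped scrapes) can be checked on the client side. The following is a minimal sketch, assuming the response has already been parsed into a Python dict; the function name `summarize_response` and the status labels are hypothetical and not part of the tool.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Classify a parsed search_and_scrape response using its documented fields."""
    summary = response.get("summary") or {}
    searched = summary.get("urlsSearched", 0)
    scraped = summary.get("urlsScraped", 0)

    if searched == 0:
        status = "no_matches"   # empty combinedContent, urlsSearched: 0
    elif scraped < searched:
        status = "partial"      # some scrapes were silently skipped
    else:
        status = "complete"

    # "sources" is only present when include_sources=true, so tolerate its absence.
    sources = response.get("sources") or []

    return {
        "status": status,
        "failed_scrapes": max(searched - scraped, 0),
        "has_content": bool(response.get("combinedContent")),
        "source_urls": [s.get("url") for s in sources],
    }
```

A caller might retry with a higher num_results when the status is "partial" and too few sources remain, though whether that helps depends on why the scrapes failed.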
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | The research question or topic to search and extract content for. Use natural language or keyword-rich queries. | |
| num_results | No | Number of top search results to scrape (1-10). More sources = slower but more comprehensive. | 3 |
| include_sources | No | Include per-source content and quality scores in response. Set false to reduce response size. | true |
| deduplicate | No | Remove duplicate paragraphs across sources. Disable only if exact repetition matters. | true |
| max_length_per_source | No | Max content bytes extracted per source. | 50000 |
| total_max_length | No | Max total bytes for combined output. Reduce for faster, more concise results. | 300000 |
| filter_by_query | No | Remove sources with low relevance to the query. Enable for precision over recall. | false |
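As an illustration of the parameters in the table above, here is a minimal sketch of a helper that fills in the documented defaults and checks the num_results range before a call is made. The helper name `build_arguments` and the `DEFAULTS` mapping are hypothetical; how the resulting arguments are sent to the tool depends on the client and is not shown.

```python
from typing import Any

# Defaults as documented in the input schema table.
DEFAULTS: dict[str, Any] = {
    "num_results": 3,
    "include_sources": True,
    "deduplicate": True,
    "max_length_per_source": 50000,
    "total_max_length": 300000,
    "filter_by_query": False,
}


def build_arguments(query: str, **overrides: Any) -> dict[str, Any]:
    """Return a full argument dict for search_and_scrape (hypothetical helper)."""
    if not query.strip():
        raise ValueError("query is required")

    # Reject names that are not in the documented schema.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    args = {"query": query, **DEFAULTS, **overrides}

    # The schema documents num_results as 1-10.
    if not 1 <= args["num_results"] <= 10:
        raise ValueError("num_results must be between 1 and 10")
    return args
```

For example, `build_arguments("rust async runtimes", num_results=5, include_sources=False)` keeps every other parameter at its documented default while asking for more sources and a smaller response.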
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
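| combinedContent | No | Combined content from the scraped sources. Empty when the search returns no matches. | |
| query | No | The query that was searched. | |
| sizeMetadata | No | Object with totalLength, estimatedTokens, and sizeCategory. | |
| sources | No | Array of {url, title, content, contentType, scores}. Included when include_sources=true. | |
| summary | No | Object with urlsSearched, urlsScraped, and processingTimeMs. | |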