search
Run a multi-engine web search to find relevant links and snippets for discovery queries. Returns a ranked, deduplicated list of URLs to support further research.
Instructions
Run a multi-engine web search and return a ranked, deduplicated link list.
Best for:
- Discovery queries ("what is X", "find me X", "who is X").
- Getting a list of URLs you can hand to `fetch` / `fetch_batch` next.
- Topics likely to be after your knowledge cutoff (use `freshness="week"`).
- Filtering to specific domains (`include_domains=["python.org"]`) or
content types (`category="paper"|"pdf"|"github"|"news"|"forum"|"blog"`).
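A minimal sketch of how the arguments for such calls might be assembled. `build_search_args` is a hypothetical helper, not part of the tool; only the parameter names and the allowed `freshness` and `category` values come from this page.

```python
from typing import Any, Optional

# Allowed values as listed in the Args section of this page.
FRESHNESS = {"day", "week", "month", "year"}
CATEGORIES = {"news", "pdf", "github", "paper", "forum", "blog"}


def build_search_args(
    query: str,
    freshness: Optional[str] = None,
    include_domains: Optional[list[str]] = None,
    category: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble the arguments for one `search` call."""
    if freshness is not None and freshness not in FRESHNESS:
        raise ValueError(f"unknown freshness: {freshness!r}")
    if category is not None and category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    args: dict[str, Any] = {"query": query}
    # Leave unset options out so the tool's own defaults apply.
    if freshness is not None:
        args["freshness"] = freshness
    if include_domains:
        args["include_domains"] = list(include_domains)
    if category is not None:
        args["category"] = category
    return args
```

For example, `build_search_args("asyncio task groups", freshness="week", include_domains=["python.org"])` yields a dict with just those three keys, which can then be passed to whatever client invokes `search`.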
Not recommended for:
- You already know the URL -> use `fetch` instead.
- You want both links AND their full text in one call -> use `research`.
- You want to query pages already in the local cache -> use `cache_search`.
- Reading PDFs/DOCX from a known URL -> use `read_doc`.
Returns:
- markdown (default): numbered list of `n. title`, `<url>`, snippet — ~40%
fewer tokens than json.
- json: dict with `results` (list of {title,url,snippet,engines,score}),
`engines`, `cached`, optional `errors` map, optional `hint` string.
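As an illustration of the json shape described above, the following sketch pulls URLs out of a payload in the order returned, ready to hand to `fetch` / `fetch_batch`. The helper names and the sample payload are made up for the example; only the field names (`results`, `url`, `errors`, and so on) come from this page.

```python
from typing import Any


def urls_for_fetch(payload: dict[str, Any], limit: int = 5) -> list[str]:
    """Return up to `limit` URLs, keeping the ranking order of `results`."""
    results = payload.get("results", [])
    return [item["url"] for item in results[:limit]]


def engine_errors(payload: dict[str, Any]) -> dict[str, Any]:
    """Return the optional `errors` map, or an empty dict when it is absent."""
    return dict(payload.get("errors") or {})


# Made-up sample payload that follows the documented field names.
sample = {
    "results": [
        {"title": "A", "url": "https://example.com/a", "snippet": "...",
         "engines": ["duckduckgo", "mojeek"], "score": 2.0},
        {"title": "B", "url": "https://example.com/b", "snippet": "...",
         "engines": ["startpage"], "score": 1.0},
    ],
    "engines": ["duckduckgo", "mojeek", "startpage"],
    "cached": False,
}

first_urls = urls_for_fetch(sample, limit=1)
```

Checking `engine_errors(payload)` before trusting an empty `results` list helps distinguish "nothing found" from "an engine failed".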
Common mistakes:
- Passing a URL as `query` — that's `fetch`'s job.
- Cranking `max_results` to 50 hoping for better recall; engines cap around
10-20 results each, so anything beyond that is mostly duplicate noise.
- Adding `engines=["brave","bing","baidu"]` by default. Those engines need
captcha-friendly conditions; stick with the defaults unless the defaults returned 0 results.
- Using `category="news"` for breaking news without also setting
`freshness="day"` — the index lag is days, not minutes.
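The mistakes listed above can be caught before a call is made. The checker below is a rough sketch: `lint_search_args` is a hypothetical helper, and the threshold of 20 is taken from the "5-20 is the useful range" guidance on this page, not from any enforced limit.

```python
from urllib.parse import urlparse

# Engines this page advises against enabling by default.
CAPTCHA_PRONE = {"brave", "bing", "baidu"}


def lint_search_args(args: dict) -> list[str]:
    """Return warnings for the common mistakes described on this page."""
    warnings = []
    query = str(args.get("query", "")).strip()
    parsed = urlparse(query)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        warnings.append("query looks like a URL; use fetch instead")
    if (args.get("max_results") or 0) > 20:
        warnings.append("max_results above 20 mostly adds duplicate noise")
    risky = CAPTCHA_PRONE & set(args.get("engines") or [])
    if risky:
        warnings.append("captcha-prone engines requested: " + ", ".join(sorted(risky)))
    if args.get("category") == "news" and args.get("freshness") != "day":
        warnings.append("for breaking news, also set freshness='day'")
    return warnings
```

The news check cannot know whether a query is about breaking news, so its message is phrased as a suggestion.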
Args:
query: Natural-language query (the same string a human would type).
engines: Subset of `engines()`. None = duckduckgo+mojeek+startpage.
max_results: Merged result count after dedup. 5-20 is the useful range.
use_cache: Reuse the last result for this exact (query, engines,
max_results) within the cache TTL. False forces a re-fetch.
max_age_hours: Treat cached results older than this as a miss. Use
0 to force-refresh while keeping cache writes; None = use server
default TTL (7 days).
freshness: "day"|"week"|"month"|"year" — restrict to recent results.
include_domains: List of domains to restrict to (e.g. ["python.org"]).
exclude_domains: List of domains to exclude.
category: "news"|"pdf"|"github"|"paper"|"forum"|"blog" — content-type
shortcut. "paper" => arxiv/acm/springer/ieee/etc; "forum" =>
reddit/HN/stackexchange; "github" => github.com only.
include_text: Substring required in title or snippet (case-insensitive).
exclude_text: Substring forbidden in title or snippet.
format: "markdown" (default) or "json".
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | ||
| engines | No | ||
| max_results | No | ||
| use_cache | No | ||
| max_age_hours | No | ||
| freshness | No | ||
| include_domains | No | ||
| exclude_domains | No | ||
| category | No | ||
| include_text | No | ||
| exclude_text | No | ||
| format | No | markdown |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | ||
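The interaction between `use_cache` and `max_age_hours` described under Args can be summarised in a few lines. This is only a sketch of the documented semantics, not the server's code; `should_reuse_cache` is a hypothetical name, and the strict comparison is an assumption chosen so that `max_age_hours=0` always misses.

```python
from typing import Optional

DEFAULT_TTL_HOURS = 7 * 24  # documented server default TTL of 7 days


def should_reuse_cache(
    age_hours: float,
    use_cache: bool = True,
    max_age_hours: Optional[float] = None,
) -> bool:
    """Decide whether a cached result of the given age counts as a hit."""
    if not use_cache:
        # use_cache=False always forces a re-fetch.
        return False
    limit = DEFAULT_TTL_HOURS if max_age_hours is None else max_age_hours
    # Assumption: strict '<', so a limit of 0 treats every entry as a miss.
    return age_hours < limit
```

Under these assumptions, a one-hour-old entry is reused by default, while passing `max_age_hours=0` forces a refresh even though the fresh result would still be written back to the cache.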