whw23

searxng-http-mcp

search

Read-only, Idempotent

Search the web using SearXNG, aggregating results from over 200 engines with privacy. Use categories to focus on specific content types or specify engines for precise sources.

Instructions

Search the web using SearXNG metasearch engine.

Aggregates results from 200+ search engines (Google, Bing, DuckDuckGo, Brave, etc.) with privacy. Returns results, answers, suggestions, corrections, and infoboxes. Use 'categories' to focus on specific content types. Use 'pages' for more results.

Not suitable for autocomplete suggestions (use autocomplete tool) or discovering available engines/categories (use engine_info tool). Results are cached for 60 seconds.

Input Schema

Name | Required | Description | Default
query | Yes | The search query to use |
categories | No | Comma-separated category names to focus on (e.g., 'general,news,science'). Prefer this over 'engines' to narrow results; categories leverage multiple engines automatically. Call engine_info to discover available categories. |
engines | No | Comma-separated engine names to use (e.g., 'google,arxiv,wikipedia'). Only use when you need a specific source; otherwise prefer 'categories'. Overrides category-based engine selection when set. |
language | No | Search language code (e.g., 'en', 'zh', 'ja', 'de'). Filters results to the specified language. Omit to search all languages. |
time_range | No | Restrict results to those published within this time window. Omit for no time restriction. |
safesearch | No | Safe search level: 0=off, 1=moderate, 2=strict |
pageno | No | Starting page number. Use with 'pages' for pagination. |
pages | No | Number of pages to fetch in parallel (multi-page fanout). Higher values return more results but increase latency. Use 2-3 for comprehensive research. |
max_results | No | Maximum number of results to return. Applied after aggregation across pages. |
format | No | Result detail level: 'compact' returns title/url/content only, 'full' includes engines/score/category/date/thumbnails | compact
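
As a rough illustration of how a client might use these parameters, the sketch below builds an MCP `tools/call` request for this tool. The helper name `build_search_call` is hypothetical, and its range checks simply mirror the constraints listed in the implementation reference (`pages` 1 to 5, `max_results` 1 to 100, `safesearch` 0 to 2); they are not taken from any client library.

```python
import json

# Allowed values, copied from the tool's parameter constraints.
TIME_RANGES = {"day", "week", "month", "year"}
FORMATS = {"compact", "full"}


def build_search_call(query: str, request_id: int = 1, **options) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the `search` tool.

    `build_search_call` is a hypothetical helper for illustration only.
    """
    if not query:
        raise ValueError("query is required")
    if not 1 <= options.get("pages", 1) <= 5:
        raise ValueError("pages must be between 1 and 5")
    if not 1 <= options.get("max_results", 10) <= 100:
        raise ValueError("max_results must be between 1 and 100")
    if options.get("pageno", 1) < 1:
        raise ValueError("pageno must be >= 1")
    if options.get("safesearch", 0) not in (0, 1, 2):
        raise ValueError("safesearch must be 0, 1, or 2")
    if options.get("time_range") not in TIME_RANGES | {None}:
        raise ValueError("time_range must be day, week, month, or year")
    if options.get("format", "compact") not in FORMATS:
        raise ValueError("format must be 'compact' or 'full'")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search", "arguments": {"query": query, **options}},
    }


# Example: recent news and science results, two pages, capped at 20 results.
request = build_search_call(
    "rust async runtimes",
    categories="news,science",
    time_range="year",
    pages=2,
    max_results=20,
)
print(json.dumps(request, indent=2))
```

Only the options that are actually passed end up in `arguments`, so the server-side defaults apply to everything else.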

Output Schema

Name | Required | Description | Default
result | Yes | |
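
The single `result` field carries the JSON string that the tool returns. Below is a minimal sketch of unpacking it on the client side, assuming the payload shape visible in the implementation reference (`results`, `number_of_results`, and the optional `answers`, `suggestions`, `corrections`, `infoboxes`, and `cached` keys). The sample payload and the `summarize` helper are made up for illustration.

```python
import json

# Hypothetical sample of the JSON string carried in `result` (compact format).
sample_result = json.dumps({
    "results": [
        {
            "title": "Example Domain",
            "url": "https://example.com",
            "content": "Illustrative snippet.",
        },
    ],
    "number_of_results": 1,
    "suggestions": ["example domains"],
    "cached": True,
})


def summarize(result: str) -> dict:
    """Decode the tool's JSON string and pull out the commonly used parts."""
    data = json.loads(result)
    return {
        "urls": [r["url"] for r in data.get("results", []) if "url" in r],
        # Optional keys are absent when there is nothing to report,
        # so every lookup falls back to an empty default.
        "answers": data.get("answers", []),
        "suggestions": data.get("suggestions", []),
        "from_cache": data.get("cached", False),
    }


summary = summarize(sample_result)
print(summary)
```

Reading every optional key with a default keeps the client robust when a response contains only `results`.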

Implementation Reference

  • The 'search' tool implementation — an async function decorated with @mcp.tool() that queries the SearXNG metasearch engine. Accepts parameters: query, categories, engines, language, time_range, safesearch, pageno, pages, max_results, format. Makes parallel HTTP requests to SearXNG, aggregates results, supports caching, and returns JSON output with results/answers/suggestions/corrections/infoboxes.
    @mcp.tool(
        annotations=ToolAnnotations(
            readOnlyHint=True,
            destructiveHint=False,
            idempotentHint=True,
            openWorldHint=True,
        )
    )
    async def search(
        query: Annotated[str, Field(
            description="The search query to use",
        )],
        categories: Annotated[str, Field(
            description=(
                "Comma-separated category names to focus on (e.g., 'general,news,science'). "
                "Prefer this over 'engines' to narrow results — categories leverage multiple engines automatically. "
                "Call engine_info to discover available categories."
            ),
        )] = "",
        engines: Annotated[str, Field(
            description=(
                "Comma-separated engine names to use (e.g., 'google,arxiv,wikipedia'). "
                "Only use when you need a specific source; otherwise prefer 'categories'. "
                "Overrides category-based engine selection when set."
            ),
        )] = "",
        language: Annotated[str, Field(
            description=(
                "Search language code (e.g., 'en', 'zh', 'ja', 'de'). "
                "Filters results to the specified language. Omit to search all languages."
            ),
        )] = "",
        time_range: Annotated[
            Literal["day", "week", "month", "year"] | None,
            Field(description="Restrict results to those published within this time window. Omit for no time restriction."),
        ] = None,
        safesearch: Annotated[
            Literal[0, 1, 2],
            Field(description="Safe search level: 0=off, 1=moderate, 2=strict"),
        ] = 0,
        pageno: Annotated[int, Field(
            ge=1,
            description="Starting page number. Use with 'pages' for pagination.",
        )] = 1,
        pages: Annotated[int, Field(
            ge=1, le=5,
            description=(
                "Number of pages to fetch in parallel (multi-page fanout). "
                "Higher values return more results but increase latency. Use 2-3 for comprehensive research."
            ),
        )] = 1,
        max_results: Annotated[int, Field(
            ge=1, le=100,
            description="Maximum number of results to return. Applied after aggregation across pages.",
        )] = 10,
        format: Annotated[
            Literal["compact", "full"],
            Field(description="Result detail level: 'compact' returns title/url/content only, 'full' includes engines/score/category/date/thumbnails"),
        ] = "compact",
    ) -> str:
        """Search the web using SearXNG metasearch engine.
    
        Aggregates results from 200+ search engines (Google, Bing, DuckDuckGo, Brave, etc.)
        with privacy. Returns results, answers, suggestions, corrections, and infoboxes.
        Use 'categories' to focus on specific content types. Use 'pages' for more results.
    
        Not suitable for autocomplete suggestions (use autocomplete tool) or discovering
        available engines/categories (use engine_info tool). Results are cached for 60 seconds.
        """
        fields = COMPACT_FIELDS if format == "compact" else FULL_FIELDS
    
        params: dict = {"q": query, "format": "json"}
        if categories:
            params["categories"] = categories
        if language:
            params["language"] = language
        if time_range is not None:
            params["time_range"] = time_range
        if safesearch:
            params["safesearch"] = str(safesearch)
        if engines:
            params["engines"] = engines
    
        cache_params = {**params, "pageno": pageno, "pages": pages, "format": format}
        cache_k = _cache_key(cache_params)
        cached = _get_cached(cache_k)
        if cached is not None:
            results = cached["results"][:max_results]
            output = {**cached, "results": results, "number_of_results": len(results), "cached": True}
            return json.dumps(output, ensure_ascii=False)
    
        all_results = []
        all_answers: set[str] = set()
        all_suggestions: set[str] = set()
        all_corrections: set[str] = set()
        all_infoboxes = []
        errors: list[str] = []
    
        client = await _get_client()
        tasks = []
        for page in range(pageno, pageno + pages):
            page_params = {**params, "pageno": str(page)}
            tasks.append(
                client.get(
                    f"{SEARXNG_BASE_URL}/search",
                    params=page_params,
                    timeout=30.0,
                )
            )
        responses = await asyncio.gather(*tasks, return_exceptions=True)
    
        for resp in responses:
            if isinstance(resp, Exception):
                errors.append(str(resp))
                continue
            if resp.status_code != 200:
                errors.append(f"HTTP {resp.status_code}")
                continue
            data = resp.json()
            all_results.extend(_trim_result(r, fields) for r in data.get("results", []))
            all_answers.update(data.get("answers", []))
            all_suggestions.update(data.get("suggestions", []))
            all_corrections.update(data.get("corrections", []))
            all_infoboxes.extend(data.get("infoboxes", []))
            for engine_name, error_msg in data.get("unresponsive_engines", []):
                errors.append(f"{engine_name}: {error_msg}")
    
        if not all_results and not all_answers:
            return json.dumps(_build_diagnostics(query, params, errors), ensure_ascii=False)
    
        output: dict = {
            "results": all_results,
            "number_of_results": len(all_results),
        }
        if all_answers:
            output["answers"] = list(all_answers)
        if all_suggestions:
            output["suggestions"] = list(all_suggestions)
        if all_corrections:
            output["corrections"] = list(all_corrections)
        if all_infoboxes:
            output["infoboxes"] = all_infoboxes
    
        _set_cache(cache_k, output)
    
        results = output["results"][:max_results]
        return_data = {**output, "results": results, "number_of_results": len(results)}
        return json.dumps(return_data, ensure_ascii=False)
  • Pydantic/FastMCP type annotations and Field definitions for the search tool's input parameters, defining types, defaults, descriptions, and constraints (e.g., max_results ge=1 le=100, pages ge=1 le=5, format Literal['compact','full']).
  • The @mcp.tool() decorator registers the 'search' function as an MCP tool on the FastMCP instance, with annotations: readOnlyHint=True, destructiveHint=False, idempotentHint=True, openWorldHint=True.
  • Cache helper functions (_cache_key, _get_cached, _set_cache) used by the search tool for TTL-based caching of search results (60-second default CACHE_TTL, max 256 entries).
    def _cache_key(params: dict) -> str:
        return json.dumps(params, sort_keys=True)
    
    
    def _get_cached(key: str) -> dict | None:
        if key in _cache:
            ts, data = _cache[key]
            if time.monotonic() - ts < CACHE_TTL:
                return data
            del _cache[key]
        return None
    
    
    def _set_cache(key: str, data: dict):
        now = time.monotonic()
        expired = [k for k, (ts, _) in _cache.items() if now - ts >= CACHE_TTL]
        for k in expired:
            del _cache[k]
        if len(_cache) >= MAX_CACHE_SIZE:
            oldest = min(_cache, key=lambda k: _cache[k][0])
            del _cache[oldest]
        _cache[key] = (now, data)
  • Runtime modification of the search tool's description in the Starlette app lifespan — dynamically appends available categories from the SearXNG config into the tool's description.
    search_tool = mcp._tool_manager._tools.get("search")
    if search_tool:
        original_desc = search_tool.description or ""
        search_tool.description = (
            f"{original_desc}\n\n"
            f"Available categories: {categories_str}\n"
            f"Use the engine_info tool to discover available engines and their categories."
        )
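
To make the page fanout in the `search` implementation easier to follow in isolation, here is a stripped-down sketch of the same pattern: one coroutine per page, `asyncio.gather(..., return_exceptions=True)` so that a failed page does not abort the others, and errors collected next to the merged results. `fake_fetch_page` is a hypothetical stand-in for the real HTTP request, and its page contents are invented. Unlike the original, the sketch returns the errors alongside the results to make them visible.

```python
import asyncio


async def fake_fetch_page(page: int) -> dict:
    """Hypothetical stand-in for the HTTP call; page 2 fails on purpose."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if page == 2:
        raise TimeoutError(f"page {page} timed out")
    return {"results": [f"result {page}-{i}" for i in range(1, 3)]}


async def fetch_pages(pageno: int, pages: int, max_results: int) -> dict:
    """Fetch `pages` consecutive pages concurrently and merge what succeeded."""
    tasks = [fake_fetch_page(p) for p in range(pageno, pageno + pages)]
    # return_exceptions=True turns failures into values instead of raising.
    responses = await asyncio.gather(*tasks, return_exceptions=True)

    results: list[str] = []
    errors: list[str] = []
    for resp in responses:
        if isinstance(resp, Exception):
            errors.append(str(resp))
            continue
        results.extend(resp["results"])

    # The cap is applied only after all pages have been merged.
    trimmed = results[:max_results]
    return {"results": trimmed, "number_of_results": len(trimmed), "errors": errors}


merged = asyncio.run(fetch_pages(pageno=1, pages=3, max_results=3))
print(merged)
```

Because `gather` returns values in the order the tasks were passed, merged results stay in page order even though the requests run concurrently.
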
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly, idempotent, openWorld. Description adds non-obvious behaviors: 'Aggregates results from 200+ search engines... with privacy', 'Returns results, answers, suggestions, corrections, and infoboxes', and 'Results are cached for 60 seconds'. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Seven short sentences covering five distinct purposes: purpose statement, aggregation details, usage tips, exclusions, and caching behavior. No wasted words; front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and detailed parameter descriptions, the description covers what is needed: it specifies what is returned (results, answers, etc.), clarifies caching, and references sibling tools. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (baseline 3). The description adds contextual usage guidance for 'categories' and 'pages' beyond the schema descriptions. It does not elaborate on the other parameters, though the schema itself is detailed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Search the web using SearXNG metasearch engine' (specific verb+resource) and distinguishes from siblings by specifying 'Not suitable for autocomplete suggestions (use autocomplete tool) or discovering available engines/categories (use engine_info tool)'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use the tool ('Use categories to focus on specific content types. Use pages for more results.') and when not to ('Not suitable for autocomplete suggestions... or discovering available engines/categories'), and names the alternatives (autocomplete, engine_info).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/whw23/searxng_http_mcp'