Google Search MCP Server

by jspv

search

Perform web searches using Google Custom Search Engine to find information, filter results by date, language, or site, and control SafeSearch settings.

Instructions

Google Programmable Search (CSE) via MCP.

Parameters:
    q: Query string. Trimmed; required.
    num: Number of results to return (1..10; clamped).
    start: 1-based index for pagination start (clamped to >=1).
    siteSearch: Limit results to a site (or domain) per CSE rules.
    siteSearchFilter: "i" to include or "e" to exclude `siteSearch`.
    safe: SafeSearch level: "off" or "active".
    gl: Geolocation/country code.
    hl: UI language.
    lr: Language restrict (e.g., "lang_en").
    useSiteRestrict: Use the siterestrict endpoint variant.
    dateRestrict: Time filter (e.g., "d7", "m3", "y1").
    exactTerms, orTerms, excludeTerms: Query modifiers.
    cxOverride: Override the configured CSE ID (avoid echoing to clients).
    lean_fields: If True, request a smaller response via fields projection.

Returns:
    A dict with keys: provider, query (sanitized), searchInfo, nextPage,
    latency_ms, results (normalized), raw (subset), trace (q hash).

Raises:
    ValueError: For invalid parameter values (e.g., unsupported safe).
    RuntimeError: For Google API errors or network failures.
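Because `num` is clamped to 10 per call, collecting more results means feeding each response's `nextPage` value back in as `start`. The sketch below illustrates that loop under stated assumptions: `collect_results`, `max_pages`, and the stand-in `fake_search` are hypothetical names used only for illustration, and `search_fn` is assumed to be any async callable that accepts the tool's parameters and returns the dict described under Returns.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Any async callable with the tool's keyword parameters (an assumption of this sketch).
SearchFn = Callable[..., Awaitable[dict[str, Any]]]


async def collect_results(
    search_fn: SearchFn,
    q: str,
    max_results: int = 25,
    max_pages: int = 10,
    **filters: Any,
) -> list[dict[str, Any]]:
    """Page through results by passing each response's nextPage back as start."""
    collected: list[dict[str, Any]] = []
    start = 1
    for _ in range(max_pages):  # hard stop so a misbehaving backend cannot loop forever
        remaining = max_results - len(collected)
        if remaining <= 0:
            break
        # The tool clamps num to 1..10, so never ask for more than 10 per page.
        page = await search_fn(q=q, num=min(10, remaining), start=start, **filters)
        collected.extend(page.get("results") or [])
        next_start = page.get("nextPage")
        if not next_start:  # no further page reported
            break
        start = int(next_start)
    return collected[:max_results]


async def fake_search(q: str, num: int = 5, start: int = 1, **_: Any) -> dict[str, Any]:
    """Stand-in for the real tool: pretends the query has 23 results in total."""
    total = 23
    end = min(total, start + num - 1)
    results = [
        {"title": f"result {i}", "url": f"https://example.com/{i}", "snippet": "", "rank": i}
        for i in range(start, end + 1)
    ]
    return {"results": results, "nextPage": end + 1 if end < total else None}


if __name__ == "__main__":
    hits = asyncio.run(collect_results(fake_search, "mcp servers", max_results=25))
    print(len(hits))
```

With a real client, `search_fn` would wrap whatever mechanism invokes the `search` tool; the loop itself does not depend on how the call is transported.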

Input Schema

Name              Required  Description  Default
q                 Yes
num               No
start             No
siteSearch        No
siteSearchFilter  No
safe              No
gl                No
hl                No
lr                No
useSiteRestrict   No
dateRestrict      No
exactTerms        No
orTerms           No
excludeTerms      No
cxOverride        No
lean_fields       No

Output Schema

No arguments
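Although no output schema is listed, the Returns section and the handler's return statement imply a fairly stable shape. The following is a sketch of that shape as `TypedDict` classes. The class names, the `missing_keys` helper, and the value types are assumptions inferred from the documented keys, not a published schema.

```python
from typing import Any, Optional, TypedDict


class SearchResult(TypedDict):
    """One normalized result: {title, url, snippet, rank}."""
    title: Optional[str]
    url: Optional[str]
    snippet: Optional[str]
    rank: int


class SearchResponse(TypedDict):
    """Top-level dict returned by the tool (types are inferred, not authoritative)."""
    provider: str                 # "google-cse"
    query: dict[str, Any]         # sanitized echo of the request, without key/cx
    searchInfo: dict[str, Any]    # Google's searchInformation object
    nextPage: Optional[int]       # start index of the next page, if any
    latency_ms: int
    results: list[SearchResult]
    raw: dict[str, Any]           # subset of the raw response, e.g. {"kind": ...}
    trace: dict[str, Any]         # e.g. {"q_hash": ...}


def missing_keys(payload: dict[str, Any]) -> set[str]:
    """Return the documented top-level keys that are absent from a payload."""
    return set(SearchResponse.__annotations__) - set(payload)


example: SearchResponse = {
    "provider": "google-cse",
    "query": {"q": "model context protocol", "num": 1, "start": 1},
    "searchInfo": {},
    "nextPage": 2,
    "latency_ms": 120,
    "results": [
        {"title": "Example", "url": "https://example.com/", "snippet": "...", "rank": 1}
    ],
    "raw": {"kind": None},
    "trace": {"q_hash": "hypothetical-hash"},
}
```

A client could call `missing_keys` on a response before relying on fields such as `nextPage`.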

Implementation Reference

  • The primary handler for the 'search' tool. Decorated with @mcp.tool(), it declares typed input parameters and a docstring that describes the schema, calls the Google CSE API, normalizes the results, and returns structured data.
    @mcp.tool()
    async def search(
        q: str,
        num: int = 5,
        start: int = 1,
        siteSearch: str | None = None,
        siteSearchFilter: str | None = None,  # "i" include | "e" exclude
        safe: str | None = None,  # "off" | "active" (CSE)
        gl: str | None = None,
        hl: str | None = None,
        lr: str | None = None,
        useSiteRestrict: bool = False,
        dateRestrict: str | None = None,  # e.g. "d7", "m3", "y1"
        exactTerms: str | None = None,
        orTerms: str | None = None,
        excludeTerms: str | None = None,
        cxOverride: str | None = None,
        lean_fields: bool = True,  # shrink Google payload
    ) -> dict[str, Any]:
        """Google Programmable Search (CSE) via MCP.
    
        Parameters:
            q: Query string. Trimmed; required.
            num: Number of results to return (1..10; clamped).
            start: 1-based index for pagination start (clamped to >=1).
            siteSearch: Limit results to a site (or domain) per CSE rules.
            siteSearchFilter: "i" to include or "e" to exclude `siteSearch`.
            safe: SafeSearch level: "off" or "active".
            gl: Geolocation/country code.
            hl: UI language.
            lr: Language restrict (e.g., "lang_en").
            useSiteRestrict: Use the siterestrict endpoint variant.
            dateRestrict: Time filter (e.g., "d7", "m3", "y1").
            exactTerms, orTerms, excludeTerms: Query modifiers.
            cxOverride: Override the configured CSE ID (avoid echoing to clients).
            lean_fields: If True, request a smaller response via fields projection.
    
        Returns:
            A dict with keys: provider, query (sanitized), searchInfo, nextPage,
            latency_ms, results (normalized), raw (subset), trace (q hash).
    
        Raises:
            ValueError: For invalid parameter values (e.g., unsupported safe).
            RuntimeError: For Google API errors or network failures.
        """
        # --- Input hygiene
        q = q.strip()
        num = max(1, min(10, int(num)))
        start = max(1, int(start))
        if safe and safe not in {"off", "active"}:
            raise ValueError('safe must be "off" or "active" for Google CSE')
    
        # Endpoint selection
        endpoint = (
            GOOGLE_CSE_SITERESTRICT_ENDPOINT if useSiteRestrict else GOOGLE_CSE_ENDPOINT
        )
    
        # Base params
        params: dict[str, Any] = {
            "key": GOOGLE_API_KEY,
            "cx": cxOverride or GOOGLE_CX,
            "q": q,
            "num": num,
            "start": start,
        }
        # Optional params
        if siteSearch:
            params["siteSearch"] = siteSearch
        if siteSearchFilter:
            params["siteSearchFilter"] = siteSearchFilter  # "i" or "e"
        if safe:
            params["safe"] = safe
        if gl:
            params["gl"] = gl
        if hl:
            params["hl"] = hl
        if lr:
            params["lr"] = lr
        if dateRestrict:
            params["dateRestrict"] = dateRestrict
        if exactTerms:
            params["exactTerms"] = exactTerms
        if orTerms:
            params["orTerms"] = orTerms
        if excludeTerms:
            params["excludeTerms"] = excludeTerms
    
        # Lean response projection to save bandwidth (smaller payloads)
        if lean_fields:
            params["fields"] = (
                "items(title,link,snippet),"
                "queries(nextPage(startIndex)),"
                "searchInformation(searchTime,totalResults),"
                "kind"
            )
    
        t0 = time.perf_counter()
        data = await _cse_get(endpoint, params)
        dt = round((time.perf_counter() - t0) * 1000)
    
        # Optional query/latency logging (no secrets)
        if LOG_QUERIES:
            endpoint_name = (
                "siterestrict" if endpoint == GOOGLE_CSE_SITERESTRICT_ENDPOINT else "cse"
            )
            parts = [
                f"q_hash={_hash_q(q)}",
                f"dt_ms={dt}",
                f"num={num}",
                f"start={start}",
                f"safe={safe or '-'}",
                f"endpoint={endpoint_name}",
            ]
            if LOG_QUERY_TEXT:
                parts.append(f'q="{q}"')
            _logger.info("search %s", " ".join(parts))
    
        items = data.get("items") or []
        next_page = (data.get("queries", {}).get("nextPage") or [{}])[0].get("startIndex")
    
        # Avoid leaking secrets (e.g., API key/cx) in the echoed query payload
        echoed_query = {
            "q": q,
            "num": num,
            "start": start,
            "safe": safe,
            "gl": gl,
            "hl": hl,
            "lr": lr,
            "siteSearch": siteSearch,
            "siteSearchFilter": siteSearchFilter,
            "dateRestrict": dateRestrict,
            "exactTerms": exactTerms,
            "orTerms": orTerms,
            "excludeTerms": excludeTerms,
            # "cx": (cxOverride or GOOGLE_CX),  # omit unless you want it visible
        }
    
        return {
            "provider": "google-cse",
            "query": echoed_query,
            "searchInfo": data.get("searchInformation", {}),
            "nextPage": next_page,
            "latency_ms": dt,
            "results": _normalize(items),
            "raw": {"kind": data.get("kind")},
            "trace": {"q_hash": _hash_q(q)},
        }
  • server.py:39-39 (registration)
    Creates the FastMCP server instance named 'google-search' to which the 'search' tool is registered via decorator.
    mcp = FastMCP(name="google-search")
  • Helper function to normalize Google search results, applying domain filtering and formatting into {title, url, snippet, rank}.
    def _normalize(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
        """Transform raw Google CSE `items` into a concise result list.
    
        Applies optional domain allowlist filtering (ALLOW_DOMAINS) and returns
        dictionaries with shape: {title, url, snippet, rank}.
        """
        results: list[dict[str, Any]] = []
        for i, it in enumerate(items or [], start=1):
            url = it.get("link")
            if ALLOW_DOMAINS:
                # Skip results outside the allowlist
                from urllib.parse import urlparse
    
                host = (urlparse(url).netloc or "").lower()
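                # Match the domain itself or a true subdomain; a bare suffix check
                # would also accept e.g. "notexample.com" for "example.com".
                if not any(
                    host == dom or host.endswith("." + dom.lstrip("."))
                    for dom in ALLOW_DOMAINS
                ):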
                    continue
            results.append(
                {
                    "title": it.get("title"),
                    "url": url,
                    "snippet": it.get("snippet"),
                    "rank": i,
                }
            )
        return results
  • Helper to perform HTTP GET to Google CSE API with retry logic for transient errors.
    async def _cse_get(endpoint: str, params: dict[str, Any]) -> dict[str, Any]:
        """Issue a GET to the Google CSE endpoint with small retries.
    
        Makes up to 3 attempts, retrying on common transient errors (429/5xx)
        and network errors with exponential backoff and jitter, honoring
        Retry-After when present. Raises RuntimeError with status/text for
        non-retryable or exhausted HTTP errors and wraps httpx exceptions on
        network errors.
        """
        # Retry small: 3 attempts on 429/5xx with jitter
        for attempt in range(3):
            try:
                r = await _http.get(endpoint, params=params)
                # Log status at DEBUG for observability (avoid logging params with secrets)
                if _logger.isEnabledFor(logging.DEBUG):
                    _logger.debug(
                        "cse_get attempt=%d status=%s", attempt + 1, r.status_code
                    )
                r.raise_for_status()
                return r.json()
            except httpx.HTTPStatusError as e:
                status = e.response.status_code
                retryable = status in (429, 500, 502, 503, 504) and attempt < 2
                # Respect Retry-After if present, else use capped backoff with jitter
                delay = None
                if retryable:
                    ra = e.response.headers.get("Retry-After")
                    if ra:
                        try:
                            delay = max(0.0, float(ra))
                        except ValueError:
                            delay = None
                    if delay is None:
                        delay = (0.2 + random.random() * 0.4) * (2**attempt)
                    if _logger.isEnabledFor(logging.DEBUG):
                        _logger.debug(
                            "cse_get retrying attempt=%d http_status=%s delay=%.2fs",
                            attempt + 1,
                            status,
                            delay,
                        )
                    await asyncio.sleep(delay)
                    continue
                # Bubble up a clean MCP error string
                detail = e.response.text[:500]
                raise RuntimeError(f"CSE request failed ({status}): {detail}") from e
            except httpx.HTTPError as e:
                if attempt < 2:
                    delay = (0.2 + random.random() * 0.4) * (2**attempt)
                    if _logger.isEnabledFor(logging.DEBUG):
                        _logger.debug(
                            "cse_get network error attempt=%d delay=%.2fs error=%s",
                            attempt + 1,
                            delay,
                            str(e),
                        )
                    await asyncio.sleep(delay)
                    continue
                raise RuntimeError(f"Network error contacting CSE: {e!s}") from e
  • Input/output schema inferred from type annotations and detailed docstring describing parameters, return value, and exceptions.
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It does well by describing the return structure ('Returns: A dict with keys...'), error conditions ('Raises: ValueError...'), and implementation details like 'clamped' ranges and 'sanitized' queries. However, it doesn't mention rate limits, authentication requirements, or cost implications.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Parameters, Returns, Raises) and uses bullet-like formatting. While comprehensive, some sentences could be more concise (e.g., 'Google Programmable Search (CSE) via MCP' could be simplified). Overall, it's efficiently organized with minimal wasted space.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (16 parameters, no annotations), the description provides substantial context. It covers parameters thoroughly, describes the return structure (though an output schema exists), and documents error conditions. The main gap is lack of usage guidance and some behavioral aspects like rate limits. For a search tool with many parameters, this is quite complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage for 16 parameters, the description provides excellent compensation. It explains every parameter's purpose, constraints (e.g., '1..10; clamped'), and special behaviors (e.g., 'Trimmed; required', 'avoid echoing to clients'). The parameter explanations add substantial meaning beyond what the bare schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states this tool performs 'Google Programmable Search (CSE) via MCP,' which is a specific verb+resource combination. However, without sibling tools to differentiate from, it cannot achieve the highest score of 5. The purpose is unambiguous: it executes search queries through Google's Custom Search Engine.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. There are no sibling tools mentioned, but it doesn't discuss typical use cases, prerequisites, or limitations. The agent receives no help in determining appropriate contexts for invoking this search tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
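One way such guidance could be made concrete is a handful of argument sets for common situations. The sketch below is illustrative only: the helper names are hypothetical, and only the dictionary keys and value formats (such as "i"/"e" and "d7") come from the tool's documented parameters.

```python
from typing import Any


def recent_on_site(q: str, site: str, days: int = 7, num: int = 5) -> dict[str, Any]:
    """Arguments for: results from one site, limited to the last `days` days."""
    return {
        "q": q,
        "num": num,
        "siteSearch": site,
        "siteSearchFilter": "i",      # "i" includes the site
        "dateRestrict": f"d{days}",   # e.g. "d7" for the past week
    }


def excluding_site(q: str, site: str, num: int = 5) -> dict[str, Any]:
    """Arguments for: general results with one site filtered out."""
    return {"q": q, "num": num, "siteSearch": site, "siteSearchFilter": "e"}


def english_safe(q: str, num: int = 5) -> dict[str, Any]:
    """Arguments for: English-language results with SafeSearch enabled."""
    return {"q": q, "num": num, "lr": "lang_en", "safe": "active"}


if __name__ == "__main__":
    print(recent_on_site("release notes", "example.com"))
```

Each dict can be passed as the arguments of a `search` tool call; parameters left out fall back to the defaults in the handler's signature.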

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jspv/google_search_mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.