Glama

Scan a site for AI agent readability

scan_site
Read-only · Idempotent

Scans a public URL to evaluate AI agent readability, providing scores and actionable remediation hints for failing checks.

Instructions

Runs the agent-ready.dev scanner against a URL and returns structured results: Vercel score, llmstxt.org score, and per-check findings with remediation hints. Scans may take up to ~60s; if the local poll deadline elapses, the tool returns the scan id and asks you to poll with get_scan.

Input Schema

Name       Required  Description
url        Yes       Fully-qualified URL to scan, including scheme (https://...).
pageLimit  No        Optional maximum number of pages to crawl from the root URL.
                     Capped by your plan (Free 25, Pro 250, Team 2000).
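A well-formed set of scan_site arguments, matching the schema above, might look like the sketch below. The URL and page limit are illustrative values, not documented defaults:

```typescript
// Illustrative scan_site arguments matching the input schema above.
// The concrete values are examples only; neither field has a default.
const args = {
  url: "https://example.com", // required; must include the scheme
  pageLimit: 25,              // optional; capped by plan (Free 25, Pro 250, Team 2000)
};
```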

Implementation Reference

  • The main handler function for the scan_site tool. Accepts a URL (and optional pageLimit), posts the scan via the API, polls for completion up to scanTimeoutMs, and returns the result JSON or a running placeholder with the scan id for later polling.
    export async function scanSite(
      config: Config,
      input: { url: string; pageLimit?: number },
    ): Promise<ToolResult> {
      let placeholder: ScanPlaceholder;
      try {
        placeholder = (await postScan(config, {
          url: input.url,
          pageLimit: input.pageLimit,
        })) as ScanPlaceholder;
      } catch (err) {
        if (err instanceof ApiError) {
          throw new ToolError(err.code, err.message);
        }
        throw err;
      }
    
      if (!placeholder.id) {
        throw new ToolError(
          "invalid_response",
          "Scan accepted but no id returned from /api/v1/scans.",
        );
      }
    
      // The REST endpoint always returns 202 + a placeholder. Poll for completion
      // up to the configured scan timeout, then return whatever we have. Callers
      // can use get_scan with the id to fetch the final result later.
      const deadline = Date.now() + config.scanTimeoutMs;
      let last: ScanResult | ScanPlaceholder = placeholder;
      while (Date.now() < deadline) {
        await sleep(POLL_INTERVAL_MS);
        try {
          last = (await getScanFromApi(config, placeholder.id)) as ScanResult;
        } catch (err) {
          // Transient errors during polling — keep trying until the deadline.
          if (err instanceof ApiError && err.status === 429) {
            // Rate-limited reads — back off for one extra interval.
            await sleep(POLL_INTERVAL_MS);
            continue;
          }
          if (err instanceof ApiError && err.status === 404) {
            // Scan vanished — shouldn't happen, but surface it clearly.
            throw new ToolError(
              "not_found",
              `Scan ${placeholder.id} disappeared during polling.`,
            );
          }
          // Other errors: re-throw — the caller can decide whether to retry.
          if (err instanceof ApiError) {
            throw new ToolError(err.code, err.message);
          }
          throw err;
        }
    
        const status = (last as ScanResult).status;
        if (status && status !== "running" && status !== "queued") {
          // Terminal state — return the full payload.
          return { content: [{ type: "text", text: JSON.stringify(last) }] };
        }
      }
    
      // Deadline hit while scan is still running. Return the placeholder so the
      // caller can poll later via get_scan.
      return {
        content: [
          {
            type: "text",
            text: JSON.stringify({
              id: placeholder.id,
              status: "running",
              url: input.url,
              pollUrl: placeholder.pollUrl ?? `/api/v1/scans/${placeholder.id}`,
              message:
                "Scan still running after the local poll deadline. Call get_scan with this id to fetch the final result.",
            }),
          },
        ],
      };
    }
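The terminal-state check inside the polling loop reads as a small predicate. The sketch below assumes, as the types above suggest, that "running" and "queued" are the only non-terminal statuses:

```typescript
// Non-terminal statuses the poll loop should keep waiting on,
// per the ScanPlaceholder status union ("running" | "queued").
const NON_TERMINAL = new Set(["running", "queued"]);

// Returns true when a scan status means polling can stop.
// An absent status is treated as non-terminal, since the
// placeholder returned on 202 may omit it.
function isTerminal(status: string | undefined): boolean {
  return status !== undefined && !NON_TERMINAL.has(status);
}
```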
  • src/server.ts:28-44 (registration)
    Registration of the 'scan_site' tool on the MCP server, including title, description, inputSchema (from types.ts), annotations, and a callback that delegates to scanSite().
    server.registerTool(
      "scan_site",
      {
        title: "Scan a site for AI agent readability",
        description:
          "Runs the agent-ready.dev scanner against a URL and returns structured results: Vercel score, llmstxt.org score, and per-check findings with remediation hints. Scans may take up to ~60s; if the local poll deadline elapses, the tool returns the scan id and asks you to poll with get_scan.",
        inputSchema: scanSiteInputShape,
        annotations: READ_ONLY_OPEN_WORLD,
      },
      async (args) => {
        try {
          return await scanSite(config, args);
        } catch (err) {
          return toolErrorToContent(err);
        }
      },
    );
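On the caller side, a result whose text payload still reports status "running" carries the scan id to hand to get_scan. A minimal sketch of that handoff, assuming the placeholder shape produced by scanSite above (scanIdToPoll is a hypothetical helper, not part of the server):

```typescript
// Minimal shape of the placeholder payload returned when the
// local poll deadline elapses (see scanSite above).
interface RunningPlaceholder {
  id: string;
  status: string;
  url?: string;
  pollUrl?: string;
  message?: string;
}

// Extracts the scan id from a tool result's text content so the
// caller can follow up with get_scan. Returns null for terminal
// results, which carry the full scan payload instead.
function scanIdToPoll(text: string): string | null {
  const payload = JSON.parse(text) as RunningPlaceholder;
  return payload.status === "running" && payload.id ? payload.id : null;
}
```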
  • Input schema definition for scan_site tool using Zod. Defines 'url' (required, url-format, max 2000 chars) and 'pageLimit' (optional integer 1-2000) with descriptions.
    export const scanSiteInputShape = {
      url: z
        .string()
        .url()
        .max(2000)
        .describe("Fully-qualified URL to scan, including scheme (https://...)."),
      pageLimit: z
        .number()
        .int()
        .min(1)
        .max(2000)
        .optional()
        .describe(
          "Optional maximum number of pages to crawl from the root URL. Capped by your plan (Free 25, Pro 250, Team 2000).",
        ),
    } as const;
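For readers without Zod at hand, the same constraints can be approximated in plain TypeScript. checkScanSiteInput below is a hypothetical mirror of the shape above, not part of the server; its URL check uses the URL constructor, which may differ from Zod's .url() in edge cases:

```typescript
// Plain-TypeScript mirror of scanSiteInputShape's constraints.
// Returns a human-readable error string, or null when valid.
// (The error strings here are made up for illustration.)
function checkScanSiteInput(input: { url: string; pageLimit?: number }): string | null {
  try {
    new URL(input.url); // throws on malformed or relative URLs
  } catch {
    return "url must be a fully-qualified URL, including scheme";
  }
  if (input.url.length > 2000) return "url exceeds 2000 characters";
  if (input.pageLimit !== undefined) {
    if (!Number.isInteger(input.pageLimit)) return "pageLimit must be an integer";
    if (input.pageLimit < 1 || input.pageLimit > 2000) {
      return "pageLimit must be between 1 and 2000";
    }
  }
  return null;
}
```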
  • Supporting types (ToolResult, ScanPlaceholder, ScanResult), the ToolError class, sleep utility, and constants used by the scan_site handler.
    import type { Config } from "../client.js";
    import { ApiError, getScanFromApi, postScan } from "../client.js";
    
    export class ToolError extends Error {
      constructor(
        public readonly code: string,
        message: string,
      ) {
        super(message);
        this.name = "ToolError";
      }
    }
    
    export interface ToolResult {
      [key: string]: unknown;
      content: Array<{ type: "text"; text: string }>;
    }

    export interface ScanPlaceholder {
      id: string;
      status: "running" | "queued";
      url?: string;
      pollUrl?: string;
      message?: string;
    }

    export interface ScanResult {
      status?: string;
      id?: string;
    }

    export const POLL_INTERVAL_MS = 2_000;

    export function sleep(ms: number): Promise<void> {
      return new Promise((resolve) => setTimeout(resolve, ms));
    }
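The registration snippet above delegates failures to toolErrorToContent, which this listing does not show. Below is a sketch of what such a helper might look like, assuming errors should surface as isError text content; the shape is an assumption, not the actual implementation (ToolError is duplicated from the listing above to keep the sketch self-contained):

```typescript
// Duplicated from the supporting-types listing for self-containment.
class ToolError extends Error {
  constructor(
    public readonly code: string,
    message: string,
  ) {
    super(message);
    this.name = "ToolError";
  }
}

// Hypothetical sketch: converts a thrown error into MCP text content
// flagged as an error, preserving ToolError codes when present.
function toolErrorToContent(err: unknown): {
  isError: true;
  content: Array<{ type: "text"; text: string }>;
} {
  const code = err instanceof ToolError ? err.code : "internal_error";
  const message = err instanceof Error ? err.message : String(err);
  return {
    isError: true,
    content: [{ type: "text", text: JSON.stringify({ code, message }) }],
  };
}
```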
Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint. Description adds timing behavior (up to ~60s) and fallback mechanism (return scan id for polling), which goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: first defines purpose and output, second explains timing and polling. No filler, front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple scan tool with two parameters, description covers purpose, timing, polling fallback, and output summary. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters. Description adds plan capping info for pageLimit, providing context beyond the schema's description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it runs a scanner against a URL and returns structured scores and per-check findings. Differentiates from the sibling 'get_scan', which is for polling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions scan may take ~60s and if deadline elapses, returns scan id for polling with get_scan. Provides clear context for when to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
