Glama

kya_web_fetch

Fetch web pages with your identity token to authenticate as an authorized actor, avoid bot detection, and automatically log visits in your shopping journal.

Instructions

Fetch a web page with your Badge identity attached. Your Kya-Token header is injected automatically — merchants see you as an authorized actor, not a bot. Your visit is automatically recorded in your shopping journal.

Call kya_getAgentIdentity first. Then use this instead of web_fetch when shopping at merchant sites. HTTPS only. Returns status, headers, and body (5MB max, 30s timeout). Redirects are not followed — check the Location header if you receive a 3xx status.
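Because redirects are not followed, the caller decides whether to issue a second fetch. A minimal sketch of that decision follows. The helper name nextRedirectUrl and the FetchLike shape are hypothetical, and it assumes the Location header survives the tool's response-header filtering and is returned under a lowercase key.

```typescript
// Hypothetical helper: not part of the kya_web_fetch source.
interface FetchLike {
  status: number;
  headers: Record<string, string>; // assumed to use lowercase keys
  url: string;
}

/** Returns the next HTTPS URL to fetch for a 3xx result, or null if there is none. */
function nextRedirectUrl(result: FetchLike): string | null {
  if (result.status < 300 || result.status >= 400) return null;
  const location = result.headers["location"];
  if (!location) return null;
  let next: URL;
  try {
    // Location may be relative, so resolve it against the URL that was fetched.
    next = new URL(location, result.url);
  } catch {
    return null;
  }
  // The tool accepts HTTPS only, so a downgrade to http: is not worth retrying.
  return next.protocol === "https:" ? next.href : null;
}
```

A caller would pass the returned URL to a new kya_web_fetch call and cap the number of hops to avoid redirect loops.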

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The HTTPS URL to fetch (e.g., 'https://etsy.com/products') | |
| method | No | HTTP method (default: GET) | GET |
| headers | No | Additional headers to include in the request | |
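The constraints on these three parameters can be checked before the tool is called. The sketch below is illustrative only: checkArgs and KyaWebFetchArgs are hypothetical names, and the rules are taken from the schema and implementation shown on this page (a 2000-character URL limit, HTTPS only, uppercase GET, HEAD, or OPTIONS, and a Kya-Token header that the tool always overrides).

```typescript
// Hypothetical pre-flight check; checkArgs is not part of the kya_web_fetch source.
interface KyaWebFetchArgs {
  url: string;
  method?: string;
  headers?: Record<string, string>;
}

const ALLOWED_METHODS = ["GET", "HEAD", "OPTIONS"];

/** Returns a list of human-readable problems; an empty list means the args look valid. */
function checkArgs(args: KyaWebFetchArgs): string[] {
  const problems: string[] = [];
  if (args.url.length > 2000) problems.push("url exceeds 2000 characters");
  try {
    if (new URL(args.url).protocol !== "https:") problems.push("url must use HTTPS");
  } catch {
    problems.push("url is not a valid URL");
  }
  if (args.method !== undefined && !ALLOWED_METHODS.includes(args.method)) {
    problems.push(`method ${args.method} is not one of GET, HEAD, OPTIONS`);
  }
  for (const name of Object.keys(args.headers ?? {})) {
    // The tool drops any caller-supplied Kya-Token, whatever its casing.
    if (name.toLowerCase() === "kya-token") {
      problems.push("Kya-Token is set by the tool and will be ignored");
    }
  }
  return problems;
}

const ok = checkArgs({ url: "https://etsy.com/products", method: "GET" });
const bad = checkArgs({
  url: "http://etsy.com/products",
  method: "POST",
  headers: { "KYA-TOKEN": "x" },
});
```

Here ok is empty and bad lists three problems, one per rule that was broken.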

Implementation Reference

  • The webFetch function is the core implementation of kya_web_fetch. It validates the URL (HTTPS only, plus an SSRF check), rejects methods other than GET, HEAD, and OPTIONS, injects the Kya-Token identity header, fetches the page with a 30s timeout and 5MB body limit without following redirects, filters the response headers, and fires a browse_declared journal event.
    export async function webFetch(
      url: string,
      method?: string,
      headers?: Record<string, string>,
    ): Promise<WebFetchResult> {
      // 1. URL validation (before identity — need merchant from URL)
      let parsed: URL;
      try {
        parsed = new URL(url);
      } catch {
        return { error: "Invalid URL", code: "INVALID_URL" };
      }
    
      // 2. Scheme check — HTTPS only (HTTP allowed in tests)
      const isHttps = parsed.protocol === "https:";
      const isTestHttp = parsed.protocol === "http:" && process.env.VITEST;
      if (!isHttps && !isTestHttp) {
        return { error: "URL must use HTTPS", code: "INVALID_URL" };
      }
    
      // 3. SSRF check
      if (!isPublicOrigin(url)) {
        return { error: "Cannot fetch private or internal URLs", code: "BLOCKED_URL" };
      }
    
      // 4. Method check (before identity — reject invalid methods without touching badge state)
      const resolvedMethod = (method ?? "GET").toUpperCase();
      if (!ALLOWED_METHODS.has(resolvedMethod)) {
        return {
          error: `Method ${resolvedMethod} not allowed. Use GET, HEAD, or OPTIONS.`,
          code: "METHOD_NOT_ALLOWED",
        };
      }
    
      // 5. Identity check — get badge token for this merchant
      const merchant = parsed.hostname.replace(/^www\./, "");
      const identitySession = getLatestIdentitySession();
      let token = identitySession?.verificationToken ?? getCachedBadgeToken(merchant);
      if (!token) {
        // Enroll on-the-fly for this merchant
        token = await enrollAndCacheBadgeToken(merchant);
      }
      if (!token) {
        return {
          error: "Call kya_getAgentIdentity with a merchant first to establish identity",
          code: "NO_IDENTITY",
        };
      }
    
      // 6. Build request headers — our token wins over any agent-provided Kya-Token
      // Filter out case-variant kya-token headers to prevent agents from overriding
      const sanitizedHeaders: Record<string, string> = {};
      if (headers) {
        for (const [k, v] of Object.entries(headers)) {
          if (k.toLowerCase() !== "kya-token") {
            sanitizedHeaders[k] = v;
          }
        }
      }
      const requestHeaders: Record<string, string> = {
        ...sanitizedHeaders,
        "Kya-Token": token,
      };
    
      // 7. Execute fetch
      let response: Response;
      try {
        response = await fetch(url, {
          method: resolvedMethod,
          headers: requestHeaders,
          redirect: "manual",
          signal: AbortSignal.timeout(FETCH_TIMEOUT_MS),
        });
      } catch (err) {
        if (err instanceof Error && (err.name === "TimeoutError" || err.name === "AbortError")) {
          return { error: "Request timed out", code: "TIMEOUT" };
        }
        return { error: "Failed to fetch URL", code: "FETCH_ERROR" };
      }
    
      // 8. Read body with size cap (streaming to avoid loading full response into memory)
      let body = "";
      let truncated = false;
      try {
        if (response.body) {
          const reader = response.body.getReader();
          const decoder = new TextDecoder();
          let bytesRead = 0;
          for (;;) {
            const { done, value } = await reader.read();
            if (done) break;
            const remaining = MAX_BODY_BYTES - bytesRead;
            if (value.byteLength > remaining) {
              // Slice chunk to exact byte limit before decoding
              body += decoder.decode(value.subarray(0, remaining), { stream: false });
              truncated = true;
              await reader.cancel();
              break;
            }
            body += decoder.decode(value, { stream: true });
            bytesRead += value.byteLength;
          }
          if (!truncated) body += decoder.decode(); // flush remaining
        } else {
          // Fallback if no readable stream
          body = await response.text();
          if (body.length > MAX_BODY_BYTES) {
            body = body.slice(0, MAX_BODY_BYTES);
            truncated = true;
          }
        }
      } catch (err) {
        process.stderr.write(`[badge] body read failed: ${err instanceof Error ? err.message : err}\n`);
        body = "";
      }
    
      // 9. Filter response headers
      const responseHeaders: Record<string, string> = {};
      for (const [key, value] of response.headers.entries()) {
        if (KEEP_HEADERS.has(key.toLowerCase())) {
          responseHeaders[key.toLowerCase()] = value;
        }
      }
    
      const result: WebFetchSuccess = {
        status: response.status,
        headers: responseHeaders,
        body,
        truncated,
        url: response.url || url,
      };
    
      // 10. Auto-declare (fire-and-forget)
      fireBrowseDeclared(merchant, identitySession?.tripId ?? null);
    
      return result;
    }
  • src/index.ts:233-261 (registration)
    The MCP server registration for 'kya_web_fetch'. It defines the input schema (url, method, and headers parameters) and the handler callback, which calls webFetch and returns the JSON-serialized result, setting isError when the result carries an error.
    server.tool(
      "kya_web_fetch",
      `Fetch a web page with your Badge identity attached. Your Kya-Token header is injected automatically — merchants see you as an authorized actor, not a bot. Your visit is automatically recorded in your shopping journal.
    
    Call kya_getAgentIdentity first. Then use this instead of web_fetch when shopping at merchant sites. HTTPS only. Returns status, headers, and body (5MB max, 30s timeout). Redirects are not followed — check the Location header if you receive a 3xx status.`,
      {
        url: z.string().max(2000).describe(
          "The HTTPS URL to fetch (e.g., 'https://etsy.com/products')"
        ),
        method: z.enum(["GET", "HEAD", "OPTIONS"]).optional().describe(
          "HTTP method (default: GET)"
        ),
        headers: z.record(z.string(), z.string()).optional().describe(
          "Additional headers to include in the request"
        ),
      },
      async ({ url, method, headers }) => {
        const result = await webFetch(url, method, headers as Record<string, string> | undefined);
        if ("error" in result) {
          return {
            content: [{ type: "text" as const, text: JSON.stringify(result) }],
            isError: true,
          };
        }
        return {
          content: [{ type: "text" as const, text: JSON.stringify(result) }],
        };
      }
    );
  • Fire-and-forget helper that reports a browse_declared event to the Kya API to log the visit in the shopping journal.
    function fireBrowseDeclared(merchant: string, tripId: string | null): void {
      try {
        const apiUrl = getEnvApiUrl() || DEFAULT_API_URL;
        const installId = getOrCreateInstallId();
    
        const payload = {
          install_id: installId,
          badge_version: BADGE_VERSION,
          event_type: "browse_declared",
          merchant,
          agent_type: AGENT_TYPE,
          agent_model: getAgentModel(),
          trip_id: tripId ?? randomUUID(),
          timestamp: Date.now(),
        };
    
        const controller = new AbortController();
        const timer = setTimeout(() => controller.abort(), DECLARE_TIMEOUT_MS);
    
        fetch(`${apiUrl}/api/badge/report`, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify(payload),
          signal: controller.signal,
        })
          .then((res) => {
            clearTimeout(timer);
            if (!res.ok) {
              process.stderr.write(`[badge] browse_declared failed: HTTP ${res.status}\n`);
            }
          })
          .catch(() => {
            clearTimeout(timer);
            process.stderr.write("[badge] browse_declared failed: network error\n");
          });
      } catch {
        // Never propagate — fire-and-forget
      }
    }
  • Zod schema definitions for the kya_web_fetch tool parameters: url (string, max 2000), method (GET/HEAD/OPTIONS, optional), headers (record, optional).
    {
      url: z.string().max(2000).describe(
        "The HTTPS URL to fetch (e.g., 'https://etsy.com/products')"
      ),
      method: z.enum(["GET", "HEAD", "OPTIONS"]).optional().describe(
        "HTTP method (default: GET)"
      ),
      headers: z.record(z.string(), z.string()).optional().describe(
        "Additional headers to include in the request"
      ),
    },
  • TypeScript interfaces defining the success (WebFetchSuccess) and error (WebFetchError) result types for the webFetch function.
    export interface WebFetchSuccess {
      status: number;
      headers: Record<string, string>;
      body: string;
      truncated: boolean;
      url: string;
    }
    
    export interface WebFetchError {
      error: string;
      code: string;
    }
    
    export type WebFetchResult = WebFetchSuccess | WebFetchError;
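Since the handler returns the result as JSON text, a caller can parse it and narrow the union with an "error" in check, the same test the registration code uses. The sketch below redeclares the interfaces so it is self-contained. The summarize function is hypothetical, and treating TIMEOUT and FETCH_ERROR as retryable is an assumption rather than documented behavior.

```typescript
interface WebFetchSuccess {
  status: number;
  headers: Record<string, string>;
  body: string;
  truncated: boolean;
  url: string;
}

interface WebFetchError {
  error: string;
  code: string;
}

type WebFetchResult = WebFetchSuccess | WebFetchError;

// Hypothetical helper: turns a result into a one-line summary for logging.
function summarize(result: WebFetchResult): string {
  if ("error" in result) {
    switch (result.code) {
      case "TIMEOUT":
      case "FETCH_ERROR":
        // Assumption: transient network failures are worth one retry.
        return `retryable: ${result.error}`;
      case "NO_IDENTITY":
        return "call kya_getAgentIdentity, then retry";
      default:
        // INVALID_URL, BLOCKED_URL, METHOD_NOT_ALLOWED: fix the arguments instead.
        return `fatal: ${result.error}`;
    }
  }
  const note = result.truncated ? " (body truncated)" : "";
  return `HTTP ${result.status}, ${result.body.length} chars${note}`;
}

// The tool returns JSON text, so parse it before narrowing.
const parsed = JSON.parse('{"error":"Request timed out","code":"TIMEOUT"}') as WebFetchResult;
const line = summarize(parsed);
```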
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully covers behavioral traits: automatic Kya-Token injection, shopping journal recording, 5MB max size, 30s timeout, no redirect following, and HTTPS requirement. This leaves little ambiguity about the tool's operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
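One consequence of the size limit is worth illustrating: the cap is counted in bytes, so a cut can land in the middle of a multi-byte UTF-8 character. The sketch below mirrors the slicing approach of the streaming loop in webFetch but works on an in-memory list of chunks. The function decodeCapped is a hypothetical stand-in, not the tool's code.

```typescript
// Sketch: cap a chunked byte stream at maxBytes and decode it as UTF-8.
function decodeCapped(
  chunks: Uint8Array[],
  maxBytes: number,
): { body: string; truncated: boolean } {
  const decoder = new TextDecoder();
  let body = "";
  let bytesRead = 0;
  for (const chunk of chunks) {
    const remaining = maxBytes - bytesRead;
    if (chunk.byteLength > remaining) {
      // Slice to the exact byte limit; a non-streaming decode flushes the decoder.
      body += decoder.decode(chunk.subarray(0, remaining));
      return { body, truncated: true };
    }
    // Streaming decode keeps a partial character buffered until the next chunk.
    body += decoder.decode(chunk, { stream: true });
    bytesRead += chunk.byteLength;
  }
  body += decoder.decode(); // flush anything still buffered
  return { body, truncated: false };
}

const bytes = new TextEncoder().encode("héllo"); // 6 bytes: "é" takes two
const cut = decodeCapped([bytes], 2);
// cut.body is "h" followed by U+FFFD, because the cap split "é" in half
```

When truncated is true, a caller should expect that the final character of the body may be the Unicode replacement character.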

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (eight short sentences) and front-loaded with the main action, then elaborates on identity, prerequisites, and constraints. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that there is no output schema, the description mentions the return values (status, headers, body) and the behavior on redirects. It also covers prerequisites and limits. Slight gap: it does not say what happens if kya_getAgentIdentity is not called first. Still, it is sufficient for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage, with descriptions for all three parameters. The description adds little meaning beyond the schema, apart from the context that the URL must use HTTPS. A baseline of 3 is appropriate because the schema already documents the parameters adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool fetches a web page with a badge identity, distinguishing it from a generic web_fetch by specifying use case (shopping at merchant sites). The verb 'fetch' and resource 'web page' are explicit, and the context of automatic identity attachment sets it apart from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit prerequisites (call kya_getAgentIdentity first), when to use the tool (instead of web_fetch when shopping), and constraints (HTTPS only, redirects not followed). It also tells the agent how to handle a 3xx status by checking the Location header.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kyalabs-Io/mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.