Skip to main content
Glama

fetch_with_license

Determine the license of any image or webpage URL by analyzing host heuristics and metadata, enabling go/no-go decisions before use.

Instructions

Given an arbitrary URL (image or webpage), determine its license via host heuristics + page metadata (, dc.rights, og tags). Set probe: true to also download the bytes. Use when an agent already has a URL and needs a go/no-go decision before shipping.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYes
probeNo

Implementation Reference

  • The core implementation of fetchWithLicense. Takes a URL, probes if it's an image (using HEAD request), then either returns license heuristics for images (with optional download via probe:true) or parses HTML page metadata for license info.
    export async function fetchWithLicense(
      url: string,
      opts: FetchWithLicenseOptions = {},
    ): Promise<FetchWithLicenseResult> {
      const fetcher = opts.fetcher ?? fetch;
      const ua = opts.userAgent ?? "webfetch-mcp/0.1";
      const publicUrl = assertPublicHttpUrl(url);
      if (!publicUrl.ok) throw new Error(publicUrl.error);
      // Probe the URL shallowly to see if it's an image.
      let head: Response;
      try {
        head = await fetcher(url, {
          method: "HEAD",
          headers: { "User-Agent": ua },
          signal: opts.signal,
        });
      } catch {
        // Some servers disallow HEAD; fall through to GET.
        head = new Response(null);
      }
      const ct = (head.headers.get("content-type") ?? "").split(";")[0]!.trim();
    
      if (ct.startsWith("image/")) {
        const heur = heuristicLicenseFromUrl(url);
        const res: FetchWithLicenseResult = {
          license: heur.license,
          confidence: heur.confidence,
          sourcePageUrl: url,
          attributionLine: buildAttribution({ license: heur.license, sourceUrl: url }),
        };
        if (opts.probe) {
          const dl = await downloadImage(url, { fetcher, userAgent: ua, signal: opts.signal });
          res.bytes = dl.bytes;
          res.mime = dl.mime;
          res.sha256 = dl.sha256;
          res.cachedPath = dl.cachedPath;
          try {
            const meta = await readImageMetadata(dl.bytes);
            res.embeddedMetadata = meta;
            // Reconcile — embedded metadata wins on tie.
            if (meta.artist && !res.author) res.author = meta.artist;
            if (meta.copyright) res.copyright = meta.copyright;
            if (meta.license !== "UNKNOWN" && meta.confidence.license >= res.confidence) {
              res.license = meta.license;
              res.confidence = Math.max(res.confidence, meta.confidence.license);
              if (meta.licenseUrl) res.licenseUrl = meta.licenseUrl;
            }
            res.attributionLine = buildAttribution({
              license: res.license,
              author: res.author,
              sourceUrl: res.sourcePageUrl,
            });
          } catch {
            // non-fatal — keep host-heuristic result
          }
        }
        return res;
      }
    
      // Treat as webpage: fetch HTML and extract hints.
      const resp = await fetcher(url, {
        headers: { "User-Agent": ua, Accept: "text/html" },
        signal: opts.signal,
      });
      if (!resp.ok) {
        return { license: "UNKNOWN", confidence: 0, sourcePageUrl: url };
      }
      const html = await resp.text();
      return parseHtmlLicense(html, url);
    }
  • parseHtmlLicense helper function that extracts license metadata from HTML using regex matching for <link rel=license>, dc.rights, og:image:license, and article:author tags.
    export function parseHtmlLicense(html: string, url: string): FetchWithLicenseResult {
      const metaLicense =
        matchAttr(html, /<link[^>]+rel=["']license["'][^>]+href=["']([^"']+)["']/i) ??
        matchAttr(html, /<meta[^>]+name=["']dc.rights["'][^>]+content=["']([^"']+)["']/i) ??
        matchAttr(html, /<meta[^>]+property=["']og:image:license["'][^>]+content=["']([^"']+)["']/i);
      const ogAuthor = matchAttr(
        html,
        /<meta[^>]+property=["']article:author["'][^>]+content=["']([^"']+)["']/i,
      );
      const heur = heuristicLicenseFromUrl(url);
    
      let license: License = heur.license;
      let confidence = heur.confidence;
      if (metaLicense) {
        const coerced = coerceLicense(metaLicense);
        if (coerced !== "UNKNOWN") {
          license = coerced;
          confidence = Math.max(confidence, 0.7);
        }
      }
    
      return {
        license,
        confidence,
        sourcePageUrl: url,
        author: ogAuthor ?? undefined,
        attributionLine: buildAttribution({
          license,
          author: ogAuthor ?? undefined,
          sourceUrl: url,
        }),
      };
    }
  • Zod input schema for the fetch_with_license MCP tool. Takes a url (string) and optional probe (boolean, default false) to also download bytes.
    export const fetchWithLicenseSchema = z.object({
      url: z.string().url(),
      probe: z
        .boolean()
        .default(false)
        .describe("When true, also download the bytes if this URL is an image"),
    });
  • MCP tool registration for 'fetch_with_license'. Defines name, description (prompt surface for LLM), inputSchema, and handler that calls core fetchWithLicense and renders the result as JSON.
    {
      name: "fetch_with_license",
      description:
        "Given an arbitrary URL (image or webpage), determine its license via host heuristics + page metadata (<link rel=license>, dc.rights, og tags). Set probe: true to also download the bytes. Use when an agent already has a URL and needs a go/no-go decision before shipping.",
      inputSchema: fetchWithLicenseSchema,
      async handler(args) {
        const r = await fetchWithLicense(args.url, { probe: args.probe });
        return renderJson({
          license: r.license,
          confidence: r.confidence,
          author: r.author,
          attributionLine: r.attributionLine,
          sourcePageUrl: r.sourcePageUrl,
          mime: r.mime,
          sha256: r.sha256,
          cachedPath: r.cachedPath,
          byteSize: r.bytes?.byteLength,
        });
      },
    },
  • MCP server handler that receives CallToolRequest, parses input via schema, and dispatches to the tool's handler. This is the entry point that wires the tool to the MCP protocol.
    server.setRequestHandler(CallToolRequestSchema, async (req) => {
      const tool = TOOLS.find((t) => t.name === req.params.name);
      if (!tool) {
        return { content: [{ type: "text", text: `unknown tool: ${req.params.name}` }], isError: true };
      }
      try {
        const parsed = tool.inputSchema.parse(req.params.arguments ?? {});
        const out = await tool.handler(parsed);
        return out as any;
      } catch (e) {
        const msg = (e as Error).message ?? "unknown error";
        return { content: [{ type: "text", text: `error: ${msg}` }], isError: true };
      }
    });
    
    const transport = new StdioServerTransport();
    await server.connect(transport);
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Given no annotations, the description carries full burden; it discloses heuristics, page metadata scanning, and the probe parameter triggering download. Sufficiently transparent for a simple tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no unnecessary words. First sentence describes function, second provides usage guidance. Efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description implies return of license info. For a 2-param tool, it covers behavior and usage. Could mention return format or error handling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but the description explains the url parameter as target and probe parameter as triggering download, adding value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'determine its license' and the resource 'URL', and distinguishes from siblings like download_image and search_images through its specific function.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage context: 'Use when an agent already has a URL and needs a go/no-go decision before shipping.' Lacks explicit alternatives or when-not-to-use, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ashlrai/webfetch'

If you have feedback or need assistance with the MCP directory API, please join our Discord server