Skip to main content
Glama

extract

Extract structured data from any URL using a JSON schema. Define the data structure you need and get organized results from web content.

Instructions

Extract structured data from a URL using a JSON schema. Costs 5 credits.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYesURL to extract data from
schemaNoJSON schema defining the structure to extract
promptNoNatural language instruction for extraction

Implementation Reference

  • The handler function for the 'extract' tool that processes the URL, schema, and prompt parameters, constructs the request body, and calls the API endpoint via apiPost('/extract', body)
    async ({ url, schema, prompt }) => {
      const body: Record<string, unknown> = { url };
      if (schema) body.schema = schema;
      if (prompt) body.prompt = prompt;
      return jsonResult(await apiPost("/extract", body));
    }
  • src/index.ts:109-123 (registration)
    Tool registration using server.tool() that defines the 'extract' tool name, description, input schema using Zod (url, optional schema, optional prompt), and the handler function
    server.tool(
      "extract",
      "Extract structured data from a URL using a JSON schema. Costs 5 credits.",
      {
        url: z.string().describe("URL to extract data from"),
        schema: z.record(z.unknown()).optional().describe("JSON schema defining the structure to extract"),
        prompt: z.string().optional().describe("Natural language instruction for extraction"),
      },
      async ({ url, schema, prompt }) => {
        const body: Record<string, unknown> = { url };
        if (schema) body.schema = schema;
        if (prompt) body.prompt = prompt;
        return jsonResult(await apiPost("/extract", body));
      }
    );
  • Input schema definition using Zod: url (required string), schema (optional record for JSON structure), and prompt (optional string for natural language instructions)
    {
      url: z.string().describe("URL to extract data from"),
      schema: z.record(z.unknown()).optional().describe("JSON schema defining the structure to extract"),
      prompt: z.string().optional().describe("Natural language instruction for extraction"),
    },
  • Helper function apiPost() that makes POST requests to the SearchClaw API with 30-second timeout, proper headers, and error handling
    async function apiPost(path: string, body: Record<string, unknown>) {
      const controller = new AbortController();
      const timeout = setTimeout(() => controller.abort(), 30000);
      try {
        const response = await fetch(`${API_BASE}${path}`, {
          method: "POST",
          headers: { ...headers, "Content-Type": "application/json" },
          body: JSON.stringify(body),
          signal: controller.signal,
        });
        if (!response.ok) {
          const text = await response.text();
          throw new Error(`SearchClaw API error ${response.status}: ${text}`);
        }
        return response.json();
      } finally {
        clearTimeout(timeout);
      }
    }
  • Helper function jsonResult() that formats API responses into MCP content format with JSON stringification
    function jsonResult(data: unknown) {
      return { content: [{ type: "text" as const, text: JSON.stringify(data, null, 2) }] };
    }

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/CSteenkamp/searchclaw-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server