pipeline
Search web pages and extract structured data from top results in a single operation using a defined JSON schema.
Instructions
Search + extract in one call. The killer feature — find pages via search, then extract structured data from top results. Costs 3+ credits.
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search query to find relevant pages | |
| schema | Yes | JSON schema defining data to extract from results | |
| max_results | No | Max search results (default: 10) | |
| extract_from | No | Number of top results to extract from (default: 5) |
Implementation Reference
- src/index.ts:192-193 (handler)The handler function for the 'pipeline' tool that executes the search + extract logic. It calls apiPost('/pipeline', ...) with the query, schema, max_results, and extract_from parameters, then formats the result using jsonResult().
async ({ query, schema, max_results, extract_from }) => jsonResult(await apiPost("/pipeline", { query, schema, max_results, extract_from })) - src/index.ts:186-191 (schema)Input schema definition for the 'pipeline' tool using Zod. Defines 4 parameters: query (string, required), schema (record, required), max_results (number, optional, default 10), and extract_from (number, optional, default 5).
{ query: z.string().describe("Search query to find relevant pages"), schema: z.record(z.unknown()).describe("JSON schema defining data to extract from results"), max_results: z.number().optional().default(10).describe("Max search results (default: 10)"), extract_from: z.number().optional().default(5).describe("Number of top results to extract from (default: 5)"), }, - src/index.ts:183-194 (registration)Complete registration of the 'pipeline' tool with the MCP server using server.tool(). Includes tool name, description, input schema, and handler function.
server.tool( "pipeline", "Search + extract in one call. The killer feature — find pages via search, then extract structured data from top results. Costs 3+ credits.", { query: z.string().describe("Search query to find relevant pages"), schema: z.record(z.unknown()).describe("JSON schema defining data to extract from results"), max_results: z.number().optional().default(10).describe("Max search results (default: 10)"), extract_from: z.number().optional().default(5).describe("Number of top results to extract from (default: 5)"), }, async ({ query, schema, max_results, extract_from }) => jsonResult(await apiPost("/pipeline", { query, schema, max_results, extract_from })) ); - src/index.ts:41-59 (helper)Helper function apiPost() that makes HTTP POST requests to the SearchClaw API with timeout handling, error handling, and JSON response parsing. Used by the pipeline tool handler.
async function apiPost(path: string, body: Record<string, unknown>) { const controller = new AbortController(); const timeout = setTimeout(() => controller.abort(), 30000); try { const response = await fetch(`${API_BASE}${path}`, { method: "POST", headers: { ...headers, "Content-Type": "application/json" }, body: JSON.stringify(body), signal: controller.signal, }); if (!response.ok) { const text = await response.text(); throw new Error(`SearchClaw API error ${response.status}: ${text}`); } return response.json(); } finally { clearTimeout(timeout); } } - src/index.ts:61-63 (helper)Helper function jsonResult() that formats API responses into the MCP tool result format with proper content type and JSON stringification.
function jsonResult(data: unknown) { return { content: [{ type: "text" as const, text: JSON.stringify(data, null, 2) }] }; }