read_webpage
Extract webpage content in formats optimized for LLM processing, including text, markdown, HTML, and screenshots with configurable options.
Instructions
Extract content from a webpage in a format optimized for LLMs
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
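| url | Yes | URL of the webpage to read. | |
| format | No | Output format: one of `Default`, `Markdown`, `HTML`, `Text`, `Screenshot`, `Pageshot`. The handler sends `Default` when this is omitted. | |
| with_links | No | When true, the handler sends the `X-With-Links-Summary: true` header. | |
| with_images | No | When true, the handler sends the `X-With-Images-Summary: true` header. | |
| with_generated_alt | No | When true, the handler sends the `X-With-Generated-Alt: true` header. | |
| no_cache | No | When true, the handler sends the `X-No-Cache: true` header. | |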
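
To make the input shape concrete, the following is a minimal sketch of the arguments a client might pass, wrapped in an MCP `tools/call` request. The type names `ReadWebPageFormat` and `ReadWebPageArgs`, the example URL, and the request `id` are illustrative assumptions and are not part of the server; the format values are copied from the Zod schema listed under Implementation Reference.

```typescript
// Hypothetical types mirroring the input parameters of read_webpage.
type ReadWebPageFormat =
  | 'Default' | 'Markdown' | 'HTML' | 'Text' | 'Screenshot' | 'Pageshot';

interface ReadWebPageArgs {
  url: string;                  // the only required parameter
  format?: ReadWebPageFormat;
  with_links?: boolean;
  with_images?: boolean;
  with_generated_alt?: boolean;
  no_cache?: boolean;
}

// Example arguments: fetch a page as Markdown and request a links summary.
const exampleArgs: ReadWebPageArgs = {
  url: 'https://example.com/',
  format: 'Markdown',
  with_links: true,
};

// JSON-RPC request a client would send to invoke the tool.
const exampleRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: { name: 'read_webpage', arguments: exampleArgs },
};

console.log(JSON.stringify(exampleRequest, null, 2));
```

Omitted optional parameters are simply left out of `arguments`; only `url` has to be present.
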
Implementation Reference
- index.ts:37-63 (handler): The handler function that implements the read_webpage tool. It sends a POST request to Jina AI's Reader API (https://r.jina.ai/) with the provided URL and format, adds one request header for each optional flag that is set, throws if the response is not OK, and parses the JSON response using ReaderResponseSchema.

      async function readWebPage(params: z.infer<typeof ReadWebPageSchema>) {
        const headers: Record<string, string> = {
          'Authorization': `Bearer ${JINA_API_KEY}`,
          'Content-Type': 'application/json',
          'Accept': 'application/json'
        };

        if (params.with_links) headers['X-With-Links-Summary'] = 'true';
        if (params.with_images) headers['X-With-Images-Summary'] = 'true';
        if (params.with_generated_alt) headers['X-With-Generated-Alt'] = 'true';
        if (params.no_cache) headers['X-No-Cache'] = 'true';

        const response = await fetch('https://r.jina.ai/', {
          method: 'POST',
          headers,
          body: JSON.stringify({
            url: params.url,
            options: params.format || 'Default'
          })
        });

        if (!response.ok) {
          throw new Error(`Jina AI API error: ${response.statusText}`);
        }

        return ReaderResponseSchema.parse(await response.json());
      }

- schemas.ts:35-42 (schema): Zod schema defining the input parameters for the read_webpage tool: required URL and optional flags for format, links, images, alt text generation, and cache.

      export const ReadWebPageSchema = z.object({
        url: z.string(),
        format: z.enum(['Default', 'Markdown', 'HTML', 'Text', 'Screenshot', 'Pageshot']).optional(),
        with_links: z.boolean().optional(),
        with_images: z.boolean().optional(),
        with_generated_alt: z.boolean().optional(),
        no_cache: z.boolean().optional()
      });

- index.ts:113-117 (registration): Registration of the read_webpage tool in the MCP server's list tools handler. Specifies the tool name, description, and converts the Zod schema to JSON schema for the protocol.

      {
        name: "read_webpage",
        description: "Extract content from a webpage in a format optimized for LLMs",
        inputSchema: zodToJsonSchema(ReadWebPageSchema)
      },
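
The handler's mapping from parameters to request headers and body can be pulled out into a pure function, which makes it possible to inspect what would be sent without calling the API. This is a sketch under stated assumptions, not code from the server: `ReaderParams` and `buildReaderRequest` are hypothetical names, and the API key is passed as an argument instead of being read from `JINA_API_KEY`. The header names and the `options` fallback are copied from the handler listed above.

```typescript
// Hypothetical parameter type; mirrors the fields the handler reads.
interface ReaderParams {
  url: string;
  format?: string;
  with_links?: boolean;
  with_images?: boolean;
  with_generated_alt?: boolean;
  no_cache?: boolean;
}

// Builds the headers and JSON body the handler would pass to fetch().
// Performs no network I/O, so it is safe to call in tests.
function buildReaderRequest(apiKey: string, params: ReaderParams) {
  const headers: Record<string, string> = {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
    'Accept': 'application/json',
  };

  // Each flag adds its header only when it is truthy.
  if (params.with_links) headers['X-With-Links-Summary'] = 'true';
  if (params.with_images) headers['X-With-Images-Summary'] = 'true';
  if (params.with_generated_alt) headers['X-With-Generated-Alt'] = 'true';
  if (params.no_cache) headers['X-No-Cache'] = 'true';

  // Same fallback as the handler: a missing format becomes 'Default'.
  const body = JSON.stringify({
    url: params.url,
    options: params.format || 'Default',
  });

  return { headers, body };
}

const sketch = buildReaderRequest('placeholder-key', {
  url: 'https://example.com/',
  format: 'Markdown',
  no_cache: true,
});
console.log(sketch.headers, sketch.body);
```

Because unset flags add no header at all, a call with only `url` sends just the three base headers and `options: 'Default'`.
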