
read_webpage

Extracts webpage content in a format optimized for LLM processing (text, Markdown, HTML, or a screenshot), with optional flags for link summaries, image summaries, generated alt text, and cache bypass.

Instructions

Extract content from a webpage in a format optimized for LLMs

Input Schema

Name                Required  Description  Default
url                 Yes
format              No
with_links          No
with_images         No
with_generated_alt  No
no_cache            No

Input Schema (JSON Schema)

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": false,
  "properties": {
    "format": {
      "enum": ["Default", "Markdown", "HTML", "Text", "Screenshot", "Pageshot"],
      "type": "string"
    },
    "no_cache": { "type": "boolean" },
    "url": { "type": "string" },
    "with_generated_alt": { "type": "boolean" },
    "with_images": { "type": "boolean" },
    "with_links": { "type": "boolean" }
  },
  "required": ["url"],
  "type": "object"
}
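As a rough illustration of what this schema accepts, the sketch below checks a candidate arguments object by hand, with no validation library. The names `ReadWebpageArgs`, `FORMATS`, `BOOLEAN_KEYS`, and `isReadWebpageArgs` are hypothetical and not part of the server; only the field names, the enum values, the required `url`, and the `additionalProperties: false` rule come from the schema above.

```typescript
// Enum values copied from the "format" property of the schema.
const FORMATS = ["Default", "Markdown", "HTML", "Text", "Screenshot", "Pageshot"] as const;
type Format = (typeof FORMATS)[number];

// Hypothetical type name; fields mirror the schema's properties.
interface ReadWebpageArgs {
  url: string;
  format?: Format;
  with_links?: boolean;
  with_images?: boolean;
  with_generated_alt?: boolean;
  no_cache?: boolean;
}

const BOOLEAN_KEYS = ["with_links", "with_images", "with_generated_alt", "no_cache"] as const;

// Hypothetical type guard: returns true only for objects the schema would accept.
function isReadWebpageArgs(value: unknown): value is ReadWebpageArgs {
  if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
  const record = value as Record<string, unknown>;

  // additionalProperties: false, so unknown keys are rejected.
  const allowed = new Set<string>(["url", "format", ...BOOLEAN_KEYS]);
  if (!Object.keys(record).every((key) => allowed.has(key))) return false;

  // "url" is the only required property and must be a string.
  if (typeof record.url !== "string") return false;

  // "format", when present, must be one of the enum values.
  if (record.format !== undefined && !FORMATS.includes(record.format as Format)) return false;

  // The remaining properties, when present, must be booleans.
  return BOOLEAN_KEYS.every((key) => record[key] === undefined || typeof record[key] === "boolean");
}
```

A call such as `isReadWebpageArgs({ url: "https://example.com", format: "Markdown" })` passes, while an object without `url` or with an unlisted key is rejected.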

Implementation Reference

  • index.ts:37-63 (handler)
    The handler function that implements the read_webpage tool. It sends a POST request to Jina AI's reader API (https://r.jina.ai/) with the provided URL and format, sets an optional request header for each enabled flag (link summary, image summary, generated alt text, cache bypass), throws on a non-OK response, and parses the JSON response with ReaderResponseSchema.
    async function readWebPage(params: z.infer<typeof ReadWebPageSchema>) {
      const headers: Record<string, string> = {
        'Authorization': `Bearer ${JINA_API_KEY}`,
        'Content-Type': 'application/json',
        'Accept': 'application/json'
      };

      if (params.with_links) headers['X-With-Links-Summary'] = 'true';
      if (params.with_images) headers['X-With-Images-Summary'] = 'true';
      if (params.with_generated_alt) headers['X-With-Generated-Alt'] = 'true';
      if (params.no_cache) headers['X-No-Cache'] = 'true';

      const response = await fetch('https://r.jina.ai/', {
        method: 'POST',
        headers,
        body: JSON.stringify({
          url: params.url,
          options: params.format || 'Default'
        })
      });

      if (!response.ok) {
        throw new Error(`Jina AI API error: ${response.statusText}`);
      }

      return ReaderResponseSchema.parse(await response.json());
    }
  • Zod schema defining the input parameters for the read_webpage tool: required URL and optional flags for format, links, images, alt text generation, and cache.
    export const ReadWebPageSchema = z.object({
      url: z.string(),
      format: z.enum(['Default', 'Markdown', 'HTML', 'Text', 'Screenshot', 'Pageshot']).optional(),
      with_links: z.boolean().optional(),
      with_images: z.boolean().optional(),
      with_generated_alt: z.boolean().optional(),
      no_cache: z.boolean().optional()
    });
  • index.ts:113-117 (registration)
    Registration of the read_webpage tool in the MCP server's list-tools handler. The entry specifies the tool name and description, and converts the Zod schema to JSON Schema for the protocol.
    {
      name: "read_webpage",
      description: "Extract content from a webpage in a format optimized for LLMs",
      inputSchema: zodToJsonSchema(ReadWebPageSchema)
    },
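The mapping from tool parameters to request headers in the handler above can be sketched as a pure function that builds the request without sending it, which makes the behavior easy to inspect offline. This is a minimal sketch, not the server's code: `ReaderParams`, `ReaderRequest`, and `buildReaderRequest` are hypothetical names, and passing the API key as an argument is an assumption (the handler reads `JINA_API_KEY` from its own scope). The header names, endpoint, and body shape are copied from the listing.

```typescript
// Hypothetical name; fields mirror ReadWebPageSchema.
interface ReaderParams {
  url: string;
  format?: string;
  with_links?: boolean;
  with_images?: boolean;
  with_generated_alt?: boolean;
  no_cache?: boolean;
}

// Hypothetical container for the pieces the handler passes to fetch.
interface ReaderRequest {
  endpoint: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assembles the request the handler would send, without sending it.
function buildReaderRequest(params: ReaderParams, apiKey: string): ReaderRequest {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
    Accept: "application/json",
  };

  // Each optional flag becomes one request header, set only when the flag is true.
  if (params.with_links) headers["X-With-Links-Summary"] = "true";
  if (params.with_images) headers["X-With-Images-Summary"] = "true";
  if (params.with_generated_alt) headers["X-With-Generated-Alt"] = "true";
  if (params.no_cache) headers["X-No-Cache"] = "true";

  return {
    endpoint: "https://r.jina.ai/",
    method: "POST",
    headers,
    // The format falls back to "Default" when omitted, as in the handler.
    body: JSON.stringify({ url: params.url, options: params.format || "Default" }),
  };
}

// Example: request link summaries and bypass the cache for one page.
const example = buildReaderRequest(
  { url: "https://example.com", with_links: true, no_cache: true },
  "test-key",
);
console.log(example.headers);
```

Separating request construction from the network call is a design choice of this sketch; the listed handler does both in one function.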

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/joeblockchain/mcp-jina-ai'