read_url

Extract and convert web content from URLs into structured, LLM-readable text for analysis and processing.

Instructions

Convert any URL to LLM-friendly text using Jina.ai Reader

Input Schema

Name                 Required  Description                                       Default
url                  Yes       URL to process
no_cache             No        Bypass cache for fresh results
format               No        Response format (json or stream)                  json
timeout              No        Maximum time in seconds to wait for webpage load
target_selector      No        CSS selector to focus on specific elements
wait_for_selector    No        CSS selector to wait for specific elements
remove_selector      No        CSS selector to exclude specific elements
with_links_summary   No        Gather all links at the end of response
with_images_summary  No        Gather all images at the end of response
with_generated_alt   No        Add alt text to images lacking captions
with_iframe          No        Include iframe content in response
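
To make the parameter table concrete, here is a minimal sketch of a typed argument object for a `read_url` call. The names `ReadUrlArgs`, `example_args`, and `call_params` are hypothetical and not part of the server; the example URL and selector values are placeholders.

```typescript
// Hypothetical type mirroring the input schema; the name ReadUrlArgs is an assumption.
interface ReadUrlArgs {
  url: string; // the only required parameter
  no_cache?: boolean;
  format?: 'json' | 'stream'; // schema default is 'json'
  timeout?: number; // seconds to wait for the page to load
  target_selector?: string;
  wait_for_selector?: string;
  remove_selector?: string;
  with_links_summary?: boolean;
  with_images_summary?: boolean;
  with_generated_alt?: boolean;
  with_iframe?: boolean;
}

// Placeholder values: read only the <article> element, drop navigation
// and footer, bypass the cache, and append a list of links.
const example_args: ReadUrlArgs = {
  url: 'https://example.com/article',
  no_cache: true,
  timeout: 15,
  target_selector: 'article',
  remove_selector: 'nav, footer',
  with_links_summary: true,
};

// The tool name and arguments as they would appear in the params of a tool call.
const call_params = { name: 'read_url', arguments: example_args };
```

Omitted optional fields are simply not sent, so the server adds no corresponding header for them.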

Implementation Reference

  • CallToolRequest handler that implements the core logic for the 'read_url' tool: validates input, constructs headers with optional parameters, fetches from Jina.ai Reader API, and returns the processed text content.
    this.server.setRequestHandler(
      CallToolRequestSchema,
      async (request) => {
        if (request.params.name !== 'read_url') {
          throw new McpError(
            ErrorCode.MethodNotFound,
            `Unknown tool: ${request.params.name}`,
          );
        }

        const args = request.params.arguments as Record<
          string,
          unknown
        >;

        if (
          !args ||
          typeof args.url !== 'string' ||
          !is_valid_url(args.url)
        ) {
          throw new McpError(
            ErrorCode.InvalidParams,
            'Invalid or missing URL parameter',
          );
        }

        try {
          const headers: Record<string, string> = {
            Accept:
              typeof args.format === 'string' &&
              args.format === 'stream'
                ? 'text/event-stream'
                : 'application/json',
            'Content-Type': 'application/json',
            Authorization: `Bearer ${JINAAI_API_KEY}`,
          };

          // Optional headers from documentation
          if (typeof args.no_cache === 'boolean' && args.no_cache) {
            headers['X-No-Cache'] = 'true';
          }
          if (typeof args.timeout === 'number') {
            headers['X-Timeout'] = args.timeout.toString();
          }
          if (typeof args.target_selector === 'string') {
            headers['X-Target-Selector'] = args.target_selector;
          }
          if (typeof args.wait_for_selector === 'string') {
            headers['X-Wait-For-Selector'] = args.wait_for_selector;
          }
          if (typeof args.remove_selector === 'string') {
            headers['X-Remove-Selector'] = args.remove_selector;
          }
          if (
            typeof args.with_links_summary === 'boolean' &&
            args.with_links_summary
          ) {
            headers['X-With-Links-Summary'] = 'true';
          }
          if (
            typeof args.with_images_summary === 'boolean' &&
            args.with_images_summary
          ) {
            headers['X-With-Images-Summary'] = 'true';
          }
          if (
            typeof args.with_generated_alt === 'boolean' &&
            args.with_generated_alt
          ) {
            headers['X-With-Generated-Alt'] = 'true';
          }
          if (
            typeof args.with_iframe === 'boolean' &&
            args.with_iframe
          ) {
            headers['X-With-Iframe'] = 'true';
          }

          const response = await fetch(this.base_url + args.url, {
            headers,
          });

          if (!response.ok) {
            throw new Error(`HTTP error! status: ${response.status}`);
          }

          const result = await response.text();

          return {
            content: [
              {
                type: 'text',
                text: result,
              },
            ],
          };
        } catch (error) {
          const message =
            error instanceof Error ? error.message : String(error);
          throw new McpError(
            ErrorCode.InternalError,
            `Failed to process URL: ${message}`,
          );
        }
      },
    );
  • Input schema defining parameters for the 'read_url' tool, including required 'url' and various optional Jina.ai Reader options.
    inputSchema: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'URL to process',
        },
        no_cache: {
          type: 'boolean',
          description: 'Bypass cache for fresh results',
          default: false,
        },
        format: {
          type: 'string',
          description: 'Response format (json or stream)',
          enum: ['json', 'stream'],
          default: 'json',
        },
        timeout: {
          type: 'number',
          description: 'Maximum time in seconds to wait for webpage load',
        },
        target_selector: {
          type: 'string',
          description: 'CSS selector to focus on specific elements',
        },
        wait_for_selector: {
          type: 'string',
          description: 'CSS selector to wait for specific elements',
        },
        remove_selector: {
          type: 'string',
          description: 'CSS selector to exclude specific elements',
        },
        with_links_summary: {
          type: 'boolean',
          description: 'Gather all links at the end of response',
        },
        with_images_summary: {
          type: 'boolean',
          description: 'Gather all images at the end of response',
        },
        with_generated_alt: {
          type: 'boolean',
          description: 'Add alt text to images lacking captions',
        },
        with_iframe: {
          type: 'boolean',
          description: 'Include iframe content in response',
        },
      },
      required: ['url'],
    },
  • src/index.ts:61-131 (registration)
    Registers the 'read_url' tool in the ListToolsRequest handler, providing name, description, and schema.
    this.server.setRequestHandler(
      ListToolsRequestSchema,
      async () => ({
        tools: [
          {
            name: 'read_url',
            description:
              'Convert any URL to LLM-friendly text using Jina.ai Reader',
            inputSchema: {
              type: 'object',
              properties: {
                url: {
                  type: 'string',
                  description: 'URL to process',
                },
                no_cache: {
                  type: 'boolean',
                  description: 'Bypass cache for fresh results',
                  default: false,
                },
                format: {
                  type: 'string',
                  description: 'Response format (json or stream)',
                  enum: ['json', 'stream'],
                  default: 'json',
                },
                timeout: {
                  type: 'number',
                  description:
                    'Maximum time in seconds to wait for webpage load',
                },
                target_selector: {
                  type: 'string',
                  description:
                    'CSS selector to focus on specific elements',
                },
                wait_for_selector: {
                  type: 'string',
                  description:
                    'CSS selector to wait for specific elements',
                },
                remove_selector: {
                  type: 'string',
                  description:
                    'CSS selector to exclude specific elements',
                },
                with_links_summary: {
                  type: 'boolean',
                  description:
                    'Gather all links at the end of response',
                },
                with_images_summary: {
                  type: 'boolean',
                  description:
                    'Gather all images at the end of response',
                },
                with_generated_alt: {
                  type: 'boolean',
                  description:
                    'Add alt text to images lacking captions',
                },
                with_iframe: {
                  type: 'boolean',
                  description: 'Include iframe content in response',
                },
              },
              required: ['url'],
            },
          },
        ],
      }),
    );
  • Utility function to validate if the provided URL string is valid, used in the read_url handler for input validation.
    const is_valid_url = (url: string): boolean => {
      try {
        new URL(url);
        return true;
      } catch {
        return false;
      }
    };
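
The argument-to-header mapping performed by the handler above can be sketched as a standalone pure function, which makes it easy to check in isolation. This is a minimal sketch, not the server's actual code: the function name `build_reader_headers`, the `ToolArgs` alias, and the `'test-key'` value are hypothetical; the header names and conditions are copied from the handler.

```typescript
type ToolArgs = Record<string, unknown>;

// Hypothetical helper: builds the request headers from tool arguments,
// following the same conditions as the read_url handler.
function build_reader_headers(
  args: ToolArgs,
  api_key: string,
): Record<string, string> {
  const headers: Record<string, string> = {
    Accept:
      args.format === 'stream' ? 'text/event-stream' : 'application/json',
    'Content-Type': 'application/json',
    Authorization: `Bearer ${api_key}`,
  };

  // Boolean flags: the header is sent only when the argument is exactly true.
  const flags: Array<[string, string]> = [
    ['no_cache', 'X-No-Cache'],
    ['with_links_summary', 'X-With-Links-Summary'],
    ['with_images_summary', 'X-With-Images-Summary'],
    ['with_generated_alt', 'X-With-Generated-Alt'],
    ['with_iframe', 'X-With-Iframe'],
  ];
  for (const [arg, header] of flags) {
    if (args[arg] === true) headers[header] = 'true';
  }

  // CSS selectors: passed through verbatim when given as strings.
  const selectors: Array<[string, string]> = [
    ['target_selector', 'X-Target-Selector'],
    ['wait_for_selector', 'X-Wait-For-Selector'],
    ['remove_selector', 'X-Remove-Selector'],
  ];
  for (const [arg, header] of selectors) {
    const value = args[arg];
    if (typeof value === 'string') headers[header] = value;
  }

  // Timeout: a number in seconds, serialized as a string.
  if (typeof args.timeout === 'number') {
    headers['X-Timeout'] = args.timeout.toString();
  }

  return headers;
}

// Example: streaming response, cache bypassed, 10-second timeout.
const headers = build_reader_headers(
  { format: 'stream', no_cache: true, timeout: 10, with_iframe: false },
  'test-key',
);
```

Arguments of the wrong type, or boolean flags set to false, produce no header at all, matching the handler's behavior of silently ignoring them.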

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/spences10/mcp-jinaai-reader'
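
The same request can be made from TypeScript. This is a minimal sketch under stated assumptions: the helper names `mcp_server_url` and `fetch_server_info` are hypothetical, the owner/repository path layout is inferred from the single URL in the curl command above, and the response is assumed to be JSON, so it is typed as `unknown` rather than given a guessed shape.

```typescript
const MCP_API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Hypothetical helper: builds the directory URL for one server entry.
function mcp_server_url(owner: string, repo: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Hypothetical helper: fetches the entry and parses it as JSON (assumed format).
async function fetch_server_info(
  owner: string,
  repo: string,
): Promise<unknown> {
  const response = await fetch(mcp_server_url(owner, repo));
  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }
  return response.json();
}
```

For example, `fetch_server_info('spences10', 'mcp-jinaai-reader').then(console.log)` requests the same URL as the curl command.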

If you have feedback or need assistance with the MCP directory API, please join our Discord server.