read_url

Extract and convert web content from URLs into structured, LLM-readable text for analysis and processing.

Instructions

Convert any URL to LLM-friendly text using Jina.ai Reader

Input Schema

Name	Required	Description	Default
`url`	Yes	URL to process
`no_cache`	No	Bypass cache for fresh results	false
`format`	No	Response format (json or stream)	json
`timeout`	No	Maximum time in seconds to wait for webpage load
`target_selector`	No	CSS selector to focus on specific elements
`wait_for_selector`	No	CSS selector to wait for specific elements
`remove_selector`	No	CSS selector to exclude specific elements
`with_links_summary`	No	Gather all links at the end of response
`with_images_summary`	No	Gather all images at the end of response
`with_generated_alt`	No	Add alt text to images lacking captions
`with_iframe`	No	Include iframe content in response
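
To make the table concrete, here is a minimal sketch of the arguments a client might send. The `ReadUrlArgs` interface is a hypothetical type that mirrors the table (the server does not export it), and the URL and selector values are placeholders.

```typescript
// Hypothetical type mirroring the parameter table; not part of the server code.
interface ReadUrlArgs {
	url: string;
	no_cache?: boolean;
	format?: 'json' | 'stream';
	timeout?: number;
	target_selector?: string;
	wait_for_selector?: string;
	remove_selector?: string;
	with_links_summary?: boolean;
	with_images_summary?: boolean;
	with_generated_alt?: boolean;
	with_iframe?: boolean;
}

// Example arguments: read only the <article> element, drop navigation
// and footer, bypass the cache, and append a link summary.
const example_args: ReadUrlArgs = {
	url: 'https://example.com/post',
	no_cache: true,
	timeout: 30,
	target_selector: 'article',
	remove_selector: 'nav, footer',
	with_links_summary: true,
};

// Params of a tool call carrying those arguments: a tool name plus an
// arguments object, which is what the handler reads from request.params.
const call_params = {
	name: 'read_url',
	arguments: example_args,
};

console.log(JSON.stringify(call_params, null, 2));
```

Only `url` is required; every omitted option is simply left out of the request.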

Implementation Reference

src/index.ts:133-238 (handler)

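CallToolRequest handler implementing the core logic of the 'read_url' tool: it rejects unknown tool names, validates the URL, maps the optional parameters to Jina.ai Reader request headers, fetches the page through the Reader API, and returns the response body as text content. Any failure during the fetch is rethrown as an McpError.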

this.server.setRequestHandler(
	CallToolRequestSchema,
	async (request) => {
		if (request.params.name !== 'read_url') {
			throw new McpError(
				ErrorCode.MethodNotFound,
				`Unknown tool: ${request.params.name}`,
			);
		}

		const args = request.params.arguments as Record<
			string,
			unknown
		>;

		if (
			!args ||
			typeof args.url !== 'string' ||
			!is_valid_url(args.url)
		) {
			throw new McpError(
				ErrorCode.InvalidParams,
				'Invalid or missing URL parameter',
			);
		}

		try {
			const headers: Record<string, string> = {
				Accept:
					typeof args.format === 'string' &&
					args.format === 'stream'
						? 'text/event-stream'
						: 'application/json',
				'Content-Type': 'application/json',
				Authorization: `Bearer ${JINAAI_API_KEY}`,
			};

			// Optional headers from documentation
			if (typeof args.no_cache === 'boolean' && args.no_cache) {
				headers['X-No-Cache'] = 'true';
			}
			if (typeof args.timeout === 'number') {
				headers['X-Timeout'] = args.timeout.toString();
			}
			if (typeof args.target_selector === 'string') {
				headers['X-Target-Selector'] = args.target_selector;
			}
			if (typeof args.wait_for_selector === 'string') {
				headers['X-Wait-For-Selector'] = args.wait_for_selector;
			}
			if (typeof args.remove_selector === 'string') {
				headers['X-Remove-Selector'] = args.remove_selector;
			}
			if (
				typeof args.with_links_summary === 'boolean' &&
				args.with_links_summary
			) {
				headers['X-With-Links-Summary'] = 'true';
			}
			if (
				typeof args.with_images_summary === 'boolean' &&
				args.with_images_summary
			) {
				headers['X-With-Images-Summary'] = 'true';
			}
			if (
				typeof args.with_generated_alt === 'boolean' &&
				args.with_generated_alt
			) {
				headers['X-With-Generated-Alt'] = 'true';
			}
			if (
				typeof args.with_iframe === 'boolean' &&
				args.with_iframe
			) {
				headers['X-With-Iframe'] = 'true';
			}

			const response = await fetch(this.base_url + args.url, {
				headers,
			});

			if (!response.ok) {
				throw new Error(`HTTP error! status: ${response.status}`);
			}

			const result = await response.text();

			return {
				content: [
					{
						type: 'text',
						text: result,
					},
				],
			};
		} catch (error) {
			const message =
				error instanceof Error ? error.message : String(error);
			throw new McpError(
				ErrorCode.InternalError,
				`Failed to process URL: ${message}`,
			);
		}
	},
);
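
The parameter-to-header mapping in the handler can be sketched as a standalone function, which makes it testable without network access. `build_headers` and its `api_key` parameter are hypothetical names introduced for illustration; the header names and conditions are taken from the handler above.

```typescript
// Hypothetical helper: the same mapping the handler performs inline,
// extracted into a pure function. Not part of the original source.
const build_headers = (
	args: Record<string, unknown>,
	api_key: string,
): Record<string, string> => {
	const headers: Record<string, string> = {
		Accept:
			args.format === 'stream'
				? 'text/event-stream'
				: 'application/json',
		'Content-Type': 'application/json',
		Authorization: `Bearer ${api_key}`,
	};

	// Boolean flags: the header is sent only when the flag is exactly true.
	const flags: Array<[string, string]> = [
		['no_cache', 'X-No-Cache'],
		['with_links_summary', 'X-With-Links-Summary'],
		['with_images_summary', 'X-With-Images-Summary'],
		['with_generated_alt', 'X-With-Generated-Alt'],
		['with_iframe', 'X-With-Iframe'],
	];
	for (const [key, header] of flags) {
		if (args[key] === true) headers[header] = 'true';
	}

	// String options: forwarded verbatim when present.
	const selectors: Array<[string, string]> = [
		['target_selector', 'X-Target-Selector'],
		['wait_for_selector', 'X-Wait-For-Selector'],
		['remove_selector', 'X-Remove-Selector'],
	];
	for (const [key, header] of selectors) {
		const value = args[key];
		if (typeof value === 'string') headers[header] = value;
	}

	// Numeric timeout is serialized as a string header value.
	if (typeof args.timeout === 'number') {
		headers['X-Timeout'] = args.timeout.toString();
	}

	return headers;
};

const example_headers = build_headers(
	{ url: 'https://example.com', no_cache: true, timeout: 15, with_iframe: false },
	'test-key',
);
console.log(example_headers);
```

As in the handler, an option with the wrong type (for example `timeout: '15'`) produces no header and no error; it is silently ignored.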

src/index.ts:68-127 (schema)

Input schema defining parameters for the 'read_url' tool, including required 'url' and various optional Jina.ai Reader options.

inputSchema: {
	type: 'object',
	properties: {
		url: {
			type: 'string',
			description: 'URL to process',
		},
		no_cache: {
			type: 'boolean',
			description: 'Bypass cache for fresh results',
			default: false,
		},
		format: {
			type: 'string',
			description: 'Response format (json or stream)',
			enum: ['json', 'stream'],
			default: 'json',
		},
		timeout: {
			type: 'number',
			description:
				'Maximum time in seconds to wait for webpage load',
		},
		target_selector: {
			type: 'string',
			description:
				'CSS selector to focus on specific elements',
		},
		wait_for_selector: {
			type: 'string',
			description:
				'CSS selector to wait for specific elements',
		},
		remove_selector: {
			type: 'string',
			description:
				'CSS selector to exclude specific elements',
		},
		with_links_summary: {
			type: 'boolean',
			description:
				'Gather all links at the end of response',
		},
		with_images_summary: {
			type: 'boolean',
			description:
				'Gather all images at the end of response',
		},
		with_generated_alt: {
			type: 'boolean',
			description:
				'Add alt text to images lacking captions',
		},
		with_iframe: {
			type: 'boolean',
			description: 'Include iframe content in response',
		},
	},
	required: ['url'],
},
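
The handler shown earlier checks only `url` explicitly, so a caller that wants to catch mistyped options before sending them could run a check along these lines. This is a minimal sketch under that assumption: `check_args` and `optional_types` are hypothetical names, and the expected types are copied from the schema above.

```typescript
// Expected primitive type of each optional parameter, per the schema.
const optional_types: Record<string, 'boolean' | 'string' | 'number'> = {
	no_cache: 'boolean',
	format: 'string',
	timeout: 'number',
	target_selector: 'string',
	wait_for_selector: 'string',
	remove_selector: 'string',
	with_links_summary: 'boolean',
	with_images_summary: 'boolean',
	with_generated_alt: 'boolean',
	with_iframe: 'boolean',
};

// Hypothetical checker: returns a list of problems instead of throwing.
const check_args = (args: Record<string, unknown>): string[] => {
	const problems: string[] = [];
	if (typeof args.url !== 'string') {
		problems.push('url must be a string');
	}
	for (const key of Object.keys(optional_types)) {
		const value = args[key];
		if (value !== undefined && typeof value !== optional_types[key]) {
			problems.push(`${key} must be a ${optional_types[key]}`);
		}
	}
	// The schema restricts format to the two enum values.
	if (
		typeof args.format === 'string' &&
		args.format !== 'json' &&
		args.format !== 'stream'
	) {
		problems.push("format must be 'json' or 'stream'");
	}
	return problems;
};

console.log(
	check_args({ url: 'https://example.com', timeout: '30', format: 'xml' }),
);
// → [ 'timeout must be a number', "format must be 'json' or 'stream'" ]
```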

src/index.ts:61-131 (registration)

Registers the 'read_url' tool in the ListToolsRequest handler, providing name, description, and schema.

this.server.setRequestHandler(
	ListToolsRequestSchema,
	async () => ({
		tools: [
			{
				name: 'read_url',
				description:
					'Convert any URL to LLM-friendly text using Jina.ai Reader',
				inputSchema: {
					type: 'object',
					properties: {
						url: {
							type: 'string',
							description: 'URL to process',
						},
						no_cache: {
							type: 'boolean',
							description: 'Bypass cache for fresh results',
							default: false,
						},
						format: {
							type: 'string',
							description: 'Response format (json or stream)',
							enum: ['json', 'stream'],
							default: 'json',
						},
						timeout: {
							type: 'number',
							description:
								'Maximum time in seconds to wait for webpage load',
						},
						target_selector: {
							type: 'string',
							description:
								'CSS selector to focus on specific elements',
						},
						wait_for_selector: {
							type: 'string',
							description:
								'CSS selector to wait for specific elements',
						},
						remove_selector: {
							type: 'string',
							description:
								'CSS selector to exclude specific elements',
						},
						with_links_summary: {
							type: 'boolean',
							description:
								'Gather all links at the end of response',
						},
						with_images_summary: {
							type: 'boolean',
							description:
								'Gather all images at the end of response',
						},
						with_generated_alt: {
							type: 'boolean',
							description:
								'Add alt text to images lacking captions',
						},
						with_iframe: {
							type: 'boolean',
							description: 'Include iframe content in response',
						},
					},
					required: ['url'],
				},
			},
		],
	}),
);

src/index.ts:27-34 (helper)

Utility function that checks whether a string parses as a URL; the read_url handler uses it for input validation.

```
const is_valid_url = (url: string): boolean => {
	try {
		new URL(url);
		return true;
	} catch {
		return false;
	}
};
```
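
A quick sketch of what this check accepts and rejects. Because `new URL` only requires a parseable absolute URL, non-web schemes such as `mailto:` also pass. `is_http_url` is a hypothetical stricter variant, not part of the source, shown only to illustrate one way to restrict input to web pages.

```typescript
// Copied from the helper above so the snippet runs on its own.
const is_valid_url = (url: string): boolean => {
	try {
		new URL(url);
		return true;
	} catch {
		return false;
	}
};

// Hypothetical stricter variant: additionally require http or https.
const is_http_url = (url: string): boolean => {
	try {
		const { protocol } = new URL(url);
		return protocol === 'http:' || protocol === 'https:';
	} catch {
		return false;
	}
};

console.log(is_valid_url('https://example.com')); // true
console.log(is_valid_url('example.com')); // false: no scheme, so new URL throws
console.log(is_valid_url('mailto:someone@example.com')); // true: any scheme parses
console.log(is_http_url('mailto:someone@example.com')); // false
```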
