Glama
spences10

MCP JinaAI Reader Server

read_url

Extract and convert web content from URLs into structured, LLM-readable text for analysis and processing.

Instructions

Convert any URL to LLM-friendly text using Jina.ai Reader

Input Schema

Name                 Required  Description                                       Default
url                  Yes       URL to process
no_cache             No        Bypass cache for fresh results
format               No        Response format (json or stream)                  json
timeout              No        Maximum time in seconds to wait for webpage load
target_selector      No        CSS selector to focus on specific elements
wait_for_selector    No        CSS selector to wait for specific elements
remove_selector      No        CSS selector to exclude specific elements
with_links_summary   No        Gather all links at the end of response
with_images_summary  No        Gather all images at the end of response
with_generated_alt   No        Add alt text to images lacking captions
with_iframe          No        Include iframe content in response
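
As an illustration of how these parameters combine, the sketch below builds an arguments object for a read_url call. The ReadUrlArgs interface and the sample values (URL, selectors, timeout) are assumptions made for this example; only the field names and types come from the schema listed above.

```typescript
// Hypothetical type mirroring the input schema; not exported by the server itself.
interface ReadUrlArgs {
	url: string;
	no_cache?: boolean;
	format?: 'json' | 'stream';
	timeout?: number;
	target_selector?: string;
	wait_for_selector?: string;
	remove_selector?: string;
	with_links_summary?: boolean;
	with_images_summary?: boolean;
	with_generated_alt?: boolean;
	with_iframe?: boolean;
}

// Sample arguments (illustrative values): bypass the cache, keep only the
// <article> element, drop the footer, and append a link summary.
const example_args: ReadUrlArgs = {
	url: 'https://example.com/blog/post',
	no_cache: true,
	timeout: 30,
	target_selector: 'article',
	remove_selector: 'footer',
	with_links_summary: true,
};

// The handler reads request.params.name and request.params.arguments,
// so a call carries the arguments in this shape.
const call_params = {
	name: 'read_url',
	arguments: example_args,
};

console.log(JSON.stringify(call_params, null, 2));
```

Options left out of the object are simply not sent; the handler only adds a header when the matching argument is present and has the expected type.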

Implementation Reference

  • CallToolRequest handler that implements the core logic for the 'read_url' tool: validates input, constructs headers with optional parameters, fetches from Jina.ai Reader API, and returns the processed text content.
    this.server.setRequestHandler(
    	CallToolRequestSchema,
    	async (request) => {
    		if (request.params.name !== 'read_url') {
    			throw new McpError(
    				ErrorCode.MethodNotFound,
    				`Unknown tool: ${request.params.name}`,
    			);
    		}
    
    		const args = request.params.arguments as Record<
    			string,
    			unknown
    		>;
    
    		if (
    			!args ||
    			typeof args.url !== 'string' ||
    			!is_valid_url(args.url)
    		) {
    			throw new McpError(
    				ErrorCode.InvalidParams,
    				'Invalid or missing URL parameter',
    			);
    		}
    
    		try {
    			const headers: Record<string, string> = {
    				Accept:
    					typeof args.format === 'string' &&
    					args.format === 'stream'
    						? 'text/event-stream'
    						: 'application/json',
    				'Content-Type': 'application/json',
    				Authorization: `Bearer ${JINAAI_API_KEY}`,
    			};
    
    			// Optional headers from documentation
    			if (typeof args.no_cache === 'boolean' && args.no_cache) {
    				headers['X-No-Cache'] = 'true';
    			}
    			if (typeof args.timeout === 'number') {
    				headers['X-Timeout'] = args.timeout.toString();
    			}
    			if (typeof args.target_selector === 'string') {
    				headers['X-Target-Selector'] = args.target_selector;
    			}
    			if (typeof args.wait_for_selector === 'string') {
    				headers['X-Wait-For-Selector'] = args.wait_for_selector;
    			}
    			if (typeof args.remove_selector === 'string') {
    				headers['X-Remove-Selector'] = args.remove_selector;
    			}
    			if (
    				typeof args.with_links_summary === 'boolean' &&
    				args.with_links_summary
    			) {
    				headers['X-With-Links-Summary'] = 'true';
    			}
    			if (
    				typeof args.with_images_summary === 'boolean' &&
    				args.with_images_summary
    			) {
    				headers['X-With-Images-Summary'] = 'true';
    			}
    			if (
    				typeof args.with_generated_alt === 'boolean' &&
    				args.with_generated_alt
    			) {
    				headers['X-With-Generated-Alt'] = 'true';
    			}
    			if (
    				typeof args.with_iframe === 'boolean' &&
    				args.with_iframe
    			) {
    				headers['X-With-Iframe'] = 'true';
    			}
    
    			const response = await fetch(this.base_url + args.url, {
    				headers,
    			});
    
    			if (!response.ok) {
    				throw new Error(`HTTP error! status: ${response.status}`);
    			}
    
    			const result = await response.text();
    
    			return {
    				content: [
    					{
    						type: 'text',
    						text: result,
    					},
    				],
    			};
    		} catch (error) {
    			const message =
    				error instanceof Error ? error.message : String(error);
    			throw new McpError(
    				ErrorCode.InternalError,
    				`Failed to process URL: ${message}`,
    			);
    		}
    	},
    );
  • Input schema defining parameters for the 'read_url' tool, including required 'url' and various optional Jina.ai Reader options.
    inputSchema: {
    	type: 'object',
    	properties: {
    		url: {
    			type: 'string',
    			description: 'URL to process',
    		},
    		no_cache: {
    			type: 'boolean',
    			description: 'Bypass cache for fresh results',
    			default: false,
    		},
    		format: {
    			type: 'string',
    			description: 'Response format (json or stream)',
    			enum: ['json', 'stream'],
    			default: 'json',
    		},
    		timeout: {
    			type: 'number',
    			description:
    				'Maximum time in seconds to wait for webpage load',
    		},
    		target_selector: {
    			type: 'string',
    			description:
    				'CSS selector to focus on specific elements',
    		},
    		wait_for_selector: {
    			type: 'string',
    			description:
    				'CSS selector to wait for specific elements',
    		},
    		remove_selector: {
    			type: 'string',
    			description:
    				'CSS selector to exclude specific elements',
    		},
    		with_links_summary: {
    			type: 'boolean',
    			description:
    				'Gather all links at the end of response',
    		},
    		with_images_summary: {
    			type: 'boolean',
    			description:
    				'Gather all images at the end of response',
    		},
    		with_generated_alt: {
    			type: 'boolean',
    			description:
    				'Add alt text to images lacking captions',
    		},
    		with_iframe: {
    			type: 'boolean',
    			description: 'Include iframe content in response',
    		},
    	},
    	required: ['url'],
    },
  • src/index.ts:61-131 (registration)
    Registers the 'read_url' tool in the ListToolsRequest handler, providing name, description, and schema.
    this.server.setRequestHandler(
    	ListToolsRequestSchema,
    	async () => ({
    		tools: [
    			{
    				name: 'read_url',
    				description:
    					'Convert any URL to LLM-friendly text using Jina.ai Reader',
    				inputSchema: {
    					type: 'object',
    					properties: {
    						url: {
    							type: 'string',
    							description: 'URL to process',
    						},
    						no_cache: {
    							type: 'boolean',
    							description: 'Bypass cache for fresh results',
    							default: false,
    						},
    						format: {
    							type: 'string',
    							description: 'Response format (json or stream)',
    							enum: ['json', 'stream'],
    							default: 'json',
    						},
    						timeout: {
    							type: 'number',
    							description:
    								'Maximum time in seconds to wait for webpage load',
    						},
    						target_selector: {
    							type: 'string',
    							description:
    								'CSS selector to focus on specific elements',
    						},
    						wait_for_selector: {
    							type: 'string',
    							description:
    								'CSS selector to wait for specific elements',
    						},
    						remove_selector: {
    							type: 'string',
    							description:
    								'CSS selector to exclude specific elements',
    						},
    						with_links_summary: {
    							type: 'boolean',
    							description:
    								'Gather all links at the end of response',
    						},
    						with_images_summary: {
    							type: 'boolean',
    							description:
    								'Gather all images at the end of response',
    						},
    						with_generated_alt: {
    							type: 'boolean',
    							description:
    								'Add alt text to images lacking captions',
    						},
    						with_iframe: {
    							type: 'boolean',
    							description: 'Include iframe content in response',
    						},
    					},
    					required: ['url'],
    				},
    			},
    		],
    	}),
    );
  • Utility function to validate if the provided URL string is valid, used in the read_url handler for input validation.
    const is_valid_url = (url: string): boolean => {
    	try {
    		new URL(url);
    		return true;
    	} catch {
    		return false;
    	}
    };
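
The per-option if blocks in the handler follow one pattern: boolean flags become a 'true' header only when set, while string and number options are forwarded as-is. A minimal sketch of that mapping as a standalone function is given below. The names build_headers, is_http_url, FLAG_HEADERS, STRING_HEADERS, and the api_key parameter are hypothetical and introduced here for illustration; they are not part of the server's source. The sketch also restricts URLs to http: and https:, since new URL() on its own accepts other schemes such as mailto:. Whether the server should reject those is an assumption.

```typescript
type Args = Record<string, unknown>;

// Boolean options: the header is sent as 'true' only when the flag is true.
const FLAG_HEADERS: Record<string, string> = {
	no_cache: 'X-No-Cache',
	with_links_summary: 'X-With-Links-Summary',
	with_images_summary: 'X-With-Images-Summary',
	with_generated_alt: 'X-With-Generated-Alt',
	with_iframe: 'X-With-Iframe',
};

// String options: the value is forwarded unchanged.
const STRING_HEADERS: Record<string, string> = {
	target_selector: 'X-Target-Selector',
	wait_for_selector: 'X-Wait-For-Selector',
	remove_selector: 'X-Remove-Selector',
};

// Hypothetical helper reproducing the handler's header logic.
const build_headers = (
	args: Args,
	api_key: string,
): Record<string, string> => {
	const headers: Record<string, string> = {
		Accept:
			args.format === 'stream'
				? 'text/event-stream'
				: 'application/json',
		'Content-Type': 'application/json',
		Authorization: `Bearer ${api_key}`,
	};
	for (const [name, header] of Object.entries(FLAG_HEADERS)) {
		if (args[name] === true) headers[header] = 'true';
	}
	for (const [name, header] of Object.entries(STRING_HEADERS)) {
		const value = args[name];
		if (typeof value === 'string') headers[header] = value;
	}
	if (typeof args.timeout === 'number') {
		headers['X-Timeout'] = args.timeout.toString();
	}
	return headers;
};

// Stricter variant of is_valid_url (an assumption): only http(s) URLs pass.
const is_http_url = (url: string): boolean => {
	try {
		const { protocol } = new URL(url);
		return protocol === 'http:' || protocol === 'https:';
	} catch {
		return false;
	}
};
```

Keeping the option-to-header mapping in two lookup tables means a new Reader option needs one table entry rather than another if block, and the function can be unit-tested without a network call.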
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden but only states the basic function without disclosing behavioral traits like rate limits, authentication needs, error handling, or performance characteristics. It mentions the external service (Jina.ai Reader) but doesn't explain implications of using a third-party service.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that clearly states the tool's purpose without unnecessary words. It's appropriately sized and front-loaded with the core functionality, making every word earn its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 11 parameters and no output schema, the description is insufficient. It doesn't explain what 'LLM-friendly text' means in practice, doesn't describe the response format, and provides no guidance on parameter interactions or error cases. The lack of output schema increases the need for more descriptive context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, providing comprehensive parameter documentation. The description adds no parameter-specific information beyond the schema, maintaining the baseline score. It doesn't explain relationships between parameters or provide usage examples.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Convert') and resource ('any URL'), and it names both the method ('using Jina.ai Reader') and the output format ('LLM-friendly text'). It identifies this as a URL-to-text conversion tool; there are no sibling tools to differentiate it from.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('Convert any URL to LLM-friendly text') but provides no explicit guidance on when to use this tool versus alternatives, prerequisites, or limitations. With no sibling tools, the baseline is adequate but lacks specific usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/spences10/mcp-jinaai-reader'
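
The same request can be made from TypeScript. The sketch below assumes a runtime with a global fetch (for example Node 18 or later) and that the endpoint returns JSON; the response shape is not documented here, so it is left as unknown. The names API_BASE, server_url, and fetch_server_info are hypothetical helpers introduced for this example.

```typescript
const API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Build the directory URL for a server; path segments are encoded defensively.
const server_url = (owner: string, name: string): string =>
	`${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;

// Assumes the endpoint responds with JSON; the shape is left untyped.
const fetch_server_info = async (
	owner: string,
	name: string,
): Promise<unknown> => {
	const response = await fetch(server_url(owner, name));
	if (!response.ok) {
		throw new Error(`HTTP error! status: ${response.status}`);
	}
	return response.json();
};
```

Calling fetch_server_info('spences10', 'mcp-jinaai-reader') requests the same URL as the curl command above.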
