Jina AI Remote MCP Server

Official
by jina-ai

parallel_read_url

Extract clean content from multiple web pages in parallel to compare sources or gather information efficiently. The input schema accepts at most five URLs per call.

Instructions

Read multiple web pages in parallel to extract clean content efficiently. For best results, provide multiple URLs that you need to extract simultaneously. This is useful for comparing content across multiple sources or gathering information from multiple pages at once. 💡 Use this when you need to analyze multiple sources simultaneously for efficiency.

Input Schema

Name    | Required | Description                                                                              | Default
timeout | No       | Timeout in milliseconds for all URL reads                                                | 30000
urls    | Yes      | Array of URL configurations to read in parallel (maximum 5 URLs for optimal performance) |
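
As an illustration of the schema, the arguments for one call might look like the sketch below. The type names, `MAX_URLS`, and `checkArgs` are hypothetical and exist only for this example; the field names, the five-URL cap, and the per-URL `withAllLinks` / `withAllImages` flags follow the tool's registration code. The URL check is a rough stand-in, not the server's actual validation.

```typescript
// Hypothetical type names; the fields mirror the tool's input schema.
interface UrlConfig {
  url: string;
  withAllLinks?: boolean;  // schema default: false
  withAllImages?: boolean; // schema default: false
}

interface ParallelReadUrlArgs {
  urls: UrlConfig[];
  timeout?: number; // milliseconds, schema default: 30000
}

const MAX_URLS = 5;

// Returns a list of problems; an empty list means the arguments look valid.
// The http(s) prefix test is only an approximation of real URL validation.
function checkArgs(args: ParallelReadUrlArgs): string[] {
  const problems: string[] = [];
  if (args.urls.length > MAX_URLS) {
    problems.push(`too many URLs: ${args.urls.length} > ${MAX_URLS}`);
  }
  for (const { url } of args.urls) {
    if (!/^https?:\/\/\S+$/.test(url)) {
      problems.push(`not an http(s) URL: ${url}`);
    }
  }
  if (args.timeout !== undefined && !(args.timeout > 0)) {
    problems.push("timeout must be a positive number of milliseconds");
  }
  return problems;
}

const exampleArgs: ParallelReadUrlArgs = {
  urls: [
    { url: "https://example.com/a" },
    { url: "https://example.com/b", withAllLinks: true },
  ],
  timeout: 15000,
};
```

Calling `checkArgs(exampleArgs)` returns an empty list, while a request with six URLs would report one problem.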

Implementation Reference

  • Registers the parallel_read_url tool with the MCP server and defines its input schema. The handler deduplicates URLs, calls the executeParallelUrlReads utility, formats each successful result as a YAML text content item, applies the token guardrail, and converts thrown errors into an error response.
    server.tool(
      "parallel_read_url",
      "Read multiple web pages in parallel to extract clean content efficiently. For best results, provide multiple URLs that you need to extract simultaneously. This is useful for comparing content across multiple sources or gathering information from multiple pages at once.",
      {
        urls: z.array(z.object({
          url: z.string().url().describe("The complete URL of the webpage or PDF file to read and convert"),
          withAllLinks: z.boolean().default(false).describe("Set to true to extract and return all hyperlinks found on the page as structured data"),
          withAllImages: z.boolean().default(false).describe("Set to true to extract and return all images found on the page as structured data")
        })).max(5).describe("Array of URL configurations to read in parallel (maximum 5 URLs for optimal performance)"),
        timeout: z.number().default(30000).describe("Timeout in milliseconds for all URL reads")
      },
      async ({ urls, timeout }: { urls: Array<{ url: string; withAllLinks: boolean; withAllImages: boolean }>; timeout: number }) => {
        try {
          const props = getProps();

          const uniqueUrls = urls.filter((urlConfig, index, self) =>
            index === self.findIndex(u => u.url === urlConfig.url)
          );

          // Import the utility functions
          const { executeParallelUrlReads } = await import("../utils/read.js");

          // Execute parallel URL reads using the utility
          const results = await executeParallelUrlReads(uniqueUrls, props.bearerToken, timeout);

          // Format results for consistent output
          const contentItems: Array<{ type: 'text'; text: string }> = [];

          for (const result of results) {
            if ('success' in result && result.success) {
              contentItems.push({
                type: "text" as const,
                text: yamlStringify(result.structuredData),
              });
            }
          }

          return applyTokenGuardrail({
            content: contentItems,
          }, props.bearerToken, getClientName());
        } catch (error) {
          return createErrorResponse(`Error: ${error instanceof Error ? error.message : String(error)}`);
        }
      },
    );
  • Helper function executeParallelUrlReads, used by the parallel_read_url handler. It starts one read per URL, combines them with Promise.all, and races the combined promise against an overall timeout with Promise.race.
    export async function executeParallelUrlReads(
      urlConfigs: ReadUrlConfig[],
      bearerToken?: string,
      timeout: number = 30000
    ): Promise<ReadUrlResponse[]> {
      const timeoutPromise = new Promise<never>((_, reject) =>
        setTimeout(() => reject(new Error('Parallel URL read timeout')), timeout)
      );

      const readPromises = urlConfigs.map(urlConfig => readUrlFromConfig(urlConfig, bearerToken));

      return Promise.race([
        Promise.all(readPromises),
        timeoutPromise
      ]);
    }
  • Core helper function readUrlFromConfig, which performs the actual API call to r.jina.ai for a single URL. It normalizes the URL, sets the request headers that control link and image extraction, and parses the response into structured data with the URL, title, content, and optional links and images.
    export async function readUrlFromConfig(
      urlConfig: ReadUrlConfig,
      bearerToken?: string
    ): Promise<ReadUrlResponse> {
      try {
        // Normalize the URL first
        const normalizedUrl = normalizeUrl(urlConfig.url);
        if (!normalizedUrl) {
          return { error: "Invalid or unsupported URL", url: urlConfig.url };
        }

        const headers: Record<string, string> = {
          'Accept': 'application/json',
          'Content-Type': 'application/json',
          'X-Md-Link-Style': 'discarded',
        };

        // Add Authorization header if bearer token is available
        if (bearerToken) {
          headers['Authorization'] = `Bearer ${bearerToken}`;
        }

        if (urlConfig.withAllLinks) {
          headers['X-With-Links-Summary'] = 'all';
        }

        if (urlConfig.withAllImages) {
          headers['X-With-Images-Summary'] = 'true';
        } else {
          headers['X-Retain-Images'] = 'none';
        }

        const response = await fetch('https://r.jina.ai/', {
          method: 'POST',
          headers,
          body: JSON.stringify({ url: normalizedUrl }),
        });

        if (!response.ok) {
          return { error: `HTTP ${response.status}: ${response.statusText}`, url: urlConfig.url };
        }

        const data = await response.json() as any;

        if (!data.data) {
          return { error: "Invalid response data from r.jina.ai", url: urlConfig.url };
        }

        // Prepare structured data
        const structuredData: any = {
          url: data.data.url,
          title: data.data.title,
        };

        if (urlConfig.withAllLinks && data.data.links) {
          structuredData.links = data.data.links.map((link: [string, string]) => ({
            anchorText: link[0],
            url: link[1]
          }));
        }

        if (urlConfig.withAllImages && data.data.images) {
          structuredData.images = data.data.images;
        }

        structuredData.content = data.data.content || "";

        return {
          success: true,
          url: urlConfig.url,
          structuredData,
          withAllLinks: urlConfig.withAllLinks || false,
          withAllImages: urlConfig.withAllImages || false
        };
      } catch (error) {
        return {
          error: error instanceof Error ? error.message : String(error),
          url: urlConfig.url
        };
      }
    }
  • Includes 'parallel_read_url' in the GUARDRAIL_TOOLS array, so the token truncation guardrail is applied to its responses to keep them within client limits.
    export const GUARDRAIL_TOOLS = [
      'read_url',
      'parallel_read_url',
    ];
  • src/index.ts:21-21 (registration)
    Calls registerJinaTools which includes the parallel_read_url tool registration.
    registerJinaTools(this.server, () => this.props);
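
The flow described in the references above (deduplicate, read everything concurrently under one deadline, keep only the successes) can be sketched in isolation. This is a minimal sketch, not the server's code: `dedupeByUrl`, `withTimeout`, `readAll`, and the simplified `ReadResult` type are hypothetical names, and the actual network call is replaced by an injected `readOne` function so the example runs without contacting r.jina.ai. Unlike the helper shown above, this variant clears the timer once the reads settle.

```typescript
// Sketch only: every name in this block is hypothetical, not the server's own.
interface UrlConfig {
  url: string;
  withAllLinks?: boolean;
  withAllImages?: boolean;
}

// Simplified stand-in for the success/error result shapes shown above.
type ReadResult =
  | { success: true; url: string; content: string }
  | { error: string; url: string };

// Keep the first configuration seen for each exact URL string, in input order.
function dedupeByUrl<T extends { url: string }>(configs: T[]): T[] {
  const seen = new Set<string>();
  return configs.filter((config) => {
    if (seen.has(config.url)) return false;
    seen.add(config.url);
    return true;
  });
}

// Race the work against a timer, clearing the timer once the work settles.
function withTimeout<T>(work: Promise<T>, ms: number, message: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(message)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Deduplicate, read all URLs concurrently, and keep only successful results.
async function readAll(
  configs: UrlConfig[],
  readOne: (config: UrlConfig) => Promise<ReadResult>,
  timeoutMs: number = 30000,
): Promise<Array<{ url: string; content: string }>> {
  const unique = dedupeByUrl(configs);
  const results = await withTimeout(
    Promise.all(unique.map((config) => readOne(config))),
    timeoutMs,
    "Parallel URL read timeout",
  );
  const successes: Array<{ url: string; content: string }> = [];
  for (const result of results) {
    if ("success" in result) {
      successes.push({ url: result.url, content: result.content });
    }
  }
  return successes;
}
```

Two properties of this pattern are worth keeping in mind. Deduplication compares exact strings, so `https://example.com` and `https://example.com/` count as different URLs, and the options on a later duplicate are ignored. The deadline is also all-or-nothing: if the timer wins the race, results from reads that had already finished are discarded, and the in-flight requests are not cancelled.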

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jina-ai/MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.