jina_reader_process

Extract and convert webpage content from URLs into clean, LLM-friendly text using the Jina Reader API. Supports single or multiple URLs with basic or advanced extraction depth options.

Instructions

Convert any URL to clean, LLM-friendly text using Jina Reader API

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| extract_depth | No | The depth of the extraction process. "advanced" retrieves more data but costs more credits. | basic |
| url | Yes | | |
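
The input described above can be sketched as a TypeScript type, together with a small normalisation step. This is only an illustration: `JinaReaderProcessInput` and `normalize_input` are hypothetical names that do not appear in the server's code, and the `basic` fallback is taken from the Default column.

```typescript
// Hypothetical type mirroring the documented input schema.
type ExtractDepth = 'basic' | 'advanced';

interface JinaReaderProcessInput {
  url: string | string[]; // a single URL or a list of URLs
  extract_depth?: ExtractDepth; // optional, documented default is "basic"
}

// Hypothetical helper: always produce an array of URLs and an explicit depth.
function normalize_input(input: JinaReaderProcessInput): {
  urls: string[];
  extract_depth: ExtractDepth;
} {
  const urls = Array.isArray(input.url) ? input.url : [input.url];
  return { urls, extract_depth: input.extract_depth ?? 'basic' };
}

const single = normalize_input({ url: 'https://example.com' });
const many = normalize_input({
  url: ['https://example.com/a', 'https://example.com/b'],
  extract_depth: 'advanced',
});

console.log(single, many);
```

Normalising to an array up front is one way to let later code treat single and multiple URLs uniformly.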

Implementation Reference

  • Core handler logic: validates the URL, POSTs it to the Jina Reader API, and returns the extracted LLM-friendly text together with metadata (title, date, word count). Called by the tool handler.
    async process_content(url: string): Promise<ProcessingResult> {
      // Validate URL
      validate_processing_urls(url, this.name);

      const process_url = async () => {
        const api_key = validate_api_key(
          config.processing.jina_reader.api_key,
          this.name,
        );

        const data = await http_json<any>(
          this.name,
          'https://r.jina.ai/',
          {
            method: 'POST',
            headers: {
              Authorization: `Bearer ${api_key}`,
              Accept: 'application/json',
              'Content-Type': 'application/json',
            },
            body: JSON.stringify({ url }),
            signal: AbortSignal.timeout(
              config.processing.jina_reader.timeout ?? 30000,
            ),
          },
        );

        if (!data.data) {
          throw new ProviderError(
            ErrorType.API_ERROR,
            'Invalid response format from Jina Reader',
            this.name,
          );
        }

        return {
          content: data.data.content || '',
          metadata: {
            title: data.data.title || '',
            date: data.data.timestamp || '',
            word_count: (data.data.content || '')
              .split(/\s+/)
              .filter(Boolean).length,
          },
          source_provider: this.name,
        };
      };

      try {
        return await retry_with_backoff(process_url);
      } catch (error: unknown) {
        handle_provider_error(error, this.name, 'process content');
      }
    }
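  • Registers each processing provider as an MCP tool named `${provider.name}_process`, which resolves to 'jina_reader_process' for this provider. The input schema takes url and an optional extract_depth. The handler delegates to the provider's process_content and returns the result as a JSON text response. The process_content shown on this page declares only a url parameter, so extract_depth appears to have no effect for this provider.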
    this.processing_providers.forEach((provider) => {
      server.tool(
        {
          name: `${provider.name}_process`,
          description: provider.description,
          schema: v.object({
            url: v.pipe(
              v.union([v.string(), v.array(v.string())]),
              v.description('URL(s)'),
            ),
            extract_depth: v.optional(
              v.pipe(
                v.union([v.literal('basic'), v.literal('advanced')]),
                v.description('Extraction depth'),
              ),
            ),
          }),
        },
        async ({ url, extract_depth }) => {
          try {
            const result = await provider.process_content(
              url,
              extract_depth,
            );
            return {
              content: [
                {
                  type: 'text' as const,
                  text: JSON.stringify(result, null, 2),
                },
              ],
            };
          } catch (error) {
            const error_response = create_error_response(
              error as Error,
            );
            return {
              content: [
                {
                  type: 'text' as const,
                  text: error_response.error,
                },
              ],
              isError: true,
            };
          }
        },
      );
    });
  • Input schema for the jina_reader_process tool: url accepts a single URL or an array of URLs; extract_depth is optional and must be 'basic' or 'advanced'.
    schema: v.object({
      url: v.pipe(
        v.union([v.string(), v.array(v.string())]),
        v.description('URL(s)'),
      ),
      extract_depth: v.optional(
        v.pipe(
          v.union([v.literal('basic'), v.literal('advanced')]),
          v.description('Extraction depth'),
        ),
      ),
    }),
  • JinaReaderProvider class definition, with name 'jina_reader' and a description. The constructor validates that an API key is configured, and the class implements process_content.
    export class JinaReaderProvider implements ProcessingProvider {
      name = 'jina_reader';
      description =
        'Convert any URL to clean, LLM-friendly text using Jina Reader API';

      constructor() {
        // Validate API key exists at construction time
        validate_api_key(
          config.processing.jina_reader.api_key,
          this.name,
        );
      }

      async process_content(url: string): Promise<ProcessingResult> {
        // Validate URL
        validate_processing_urls(url, this.name);

        const process_url = async () => {
          const api_key = validate_api_key(
            config.processing.jina_reader.api_key,
            this.name,
          );

          const data = await http_json<any>(
            this.name,
            'https://r.jina.ai/',
            {
              method: 'POST',
              headers: {
                Authorization: `Bearer ${api_key}`,
                Accept: 'application/json',
                'Content-Type': 'application/json',
              },
              body: JSON.stringify({ url }),
              signal: AbortSignal.timeout(
                config.processing.jina_reader.timeout ?? 30000,
              ),
            },
          );

          if (!data.data) {
            throw new ProviderError(
              ErrorType.API_ERROR,
              'Invalid response format from Jina Reader',
              this.name,
            );
          }

          return {
            content: data.data.content || '',
            metadata: {
              title: data.data.title || '',
              date: data.data.timestamp || '',
              word_count: (data.data.content || '')
                .split(/\s+/)
                .filter(Boolean).length,
            },
            source_provider: this.name,
          };
        };

        try {
          return await retry_with_backoff(process_url);
        } catch (error: unknown) {
          handle_provider_error(error, this.name, 'process content');
        }
      }
    }
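
The data flow in the snippets above (Reader response, then processing result, then MCP text response) can be illustrated without any network call. The following is a minimal sketch, not the server's actual code: the interfaces are inferred from the fields the handler reads and returns, the function names are hypothetical, and `failure_response` uses the error's message as a stand-in for `create_error_response`, whose implementation is not shown on this page.

```typescript
// Assumed shape of the `data` field in the Reader response (inferred from the handler).
interface ReaderData {
  content?: string;
  title?: string;
  timestamp?: string;
}

// Assumed shape of the processing result (inferred from the returned object).
interface ProcessingResult {
  content: string;
  metadata: { title: string; date: string; word_count: number };
  source_provider: string;
}

interface ToolTextResponse {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
}

// Same word-count expression as the handler: split on whitespace, drop empty strings.
function count_words(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

// Map the Reader payload to the result shape, defaulting missing fields to ''.
function to_processing_result(data: ReaderData, provider: string): ProcessingResult {
  const content = data.content || '';
  return {
    content,
    metadata: {
      title: data.title || '',
      date: data.timestamp || '',
      word_count: count_words(content),
    },
    source_provider: provider,
  };
}

// Success path: pretty-printed JSON wrapped in a single text item.
function success_response(result: ProcessingResult): ToolTextResponse {
  return { content: [{ type: 'text', text: JSON.stringify(result, null, 2) }] };
}

// Error path: a text item plus isError (message extraction is a stand-in).
function failure_response(error: unknown): ToolTextResponse {
  const message = error instanceof Error ? error.message : String(error);
  return { content: [{ type: 'text', text: message }], isError: true };
}

const sample = to_processing_result(
  { content: '  Hello   LLM-friendly\nworld ', title: 'Demo' },
  'jina_reader',
);
const response = success_response(sample);
console.log(response.content[0].text);
```

Filtering out empty strings matters because splitting text with leading or trailing whitespace yields empty entries that would otherwise inflate the word count.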

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/spences10/mcp-omnisearch'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.