fetch_page
Fetches a web page from a given URL and processes its content for LLM context, with an optional limit on the length of the returned content. The tool renders pages through Cloudflare Browser Rendering and can be used directly from Cline or Claude Desktop.
Instructions
Fetches and processes a web page for LLM context
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| maxContentLength | No | Maximum content length to return | 10000 |
| url | Yes | URL to fetch | |
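As a minimal sketch of how these arguments might be checked before a call, the snippet below types the input and applies the default of 10000 that the handler uses. The `FetchPageArgs` interface and the `parseFetchPageArgs` helper are hypothetical names introduced for illustration. They are not part of the server, and the type check on `maxContentLength` is stricter than what the handler itself performs.

```typescript
// Shape of the fetch_page arguments, mirroring the table above.
interface FetchPageArgs {
  url: string;
  maxContentLength?: number;
}

// Hypothetical helper (not in the server): validates raw arguments
// and fills in the default content length limit.
function parseFetchPageArgs(raw: unknown): Required<FetchPageArgs> {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Invalid arguments for fetch_page");
  }
  const { url, maxContentLength } = raw as Record<string, unknown>;
  if (typeof url !== "string") {
    throw new Error("Invalid arguments for fetch_page: url must be a string");
  }
  let limit = 10000; // default taken from the handler's destructuring
  if (maxContentLength !== undefined) {
    // Assumption: reject non-numeric limits; the handler does not check this.
    if (typeof maxContentLength !== "number") {
      throw new Error("Invalid arguments for fetch_page: maxContentLength must be a number");
    }
    limit = maxContentLength;
  }
  return { url, maxContentLength: limit };
}

// Usage: only the required field is supplied, so the default applies.
const exampleArgs = parseFetchPageArgs({ url: "https://example.com" });
console.log(exampleArgs.maxContentLength); // 10000
```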
Implementation Reference
- src/server.ts:219-262 (handler): The primary handler for the 'fetch_page' tool. Validates input arguments, fetches HTML content using BrowserClient, processes it for LLM use with ContentProcessor, applies length truncation, and returns formatted text content or an error.

  ```typescript
  /**
   * Handle the fetch_page tool
   */
  private async handleFetchPage(args: any) {
    // Validate arguments
    if (typeof args !== 'object' || args === null || typeof args.url !== 'string') {
      throw new McpError(ErrorCode.InvalidParams, 'Invalid arguments for fetch_page');
    }

    const { url, maxContentLength = 10000 } = args;

    try {
      // Fetch the page content
      const html = await this.browserClient.fetchContent(url);

      // Process the content for LLM
      const processedContent = this.contentProcessor.processForLLM(html, url);

      // Truncate if necessary
      const truncatedContent = processedContent.length > maxContentLength
        ? processedContent.substring(0, maxContentLength) + '...'
        : processedContent;

      // Return the content
      return {
        content: [
          {
            type: 'text',
            text: truncatedContent,
          },
        ],
      };
    } catch (error) {
      console.error('[Error] Error fetching page:', error);
      return {
        content: [
          {
            type: 'text',
            text: `Error fetching page: ${error instanceof Error ? error.message : String(error)}`,
          },
        ],
        isError: true,
      };
    }
  }
  ```
- src/server.ts:72-89 (schema): Tool schema definition including name, description, and input schema specifying 'url' (required) and optional 'maxContentLength'.

  ```typescript
  {
    name: 'fetch_page',
    description: 'Fetches and processes a web page for LLM context',
    inputSchema: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'URL to fetch',
        },
        maxContentLength: {
          type: 'number',
          description: 'Maximum content length to return',
        },
      },
      required: ['url'],
    },
  },
  ```
- src/server.ts:182-186 (registration): Registration in the tool dispatch switch statement within the CallToolRequestSchema handler, routing 'fetch_page' calls to handleFetchPage.

  ```typescript
  switch (name) {
    case 'fetch_page':
      console.error(`[API] Fetching page: ${args?.url}`);
      return await this.handleFetchPage(args);

    case 'search_documentation':
  ```
- src/browser-client.ts:23-53 (helper): Helper function in BrowserClient that performs the actual HTTP POST to the Cloudflare Browser Rendering API (/content endpoint) to fetch rendered HTML, with error handling.

  ```typescript
  async fetchContent(url: string): Promise<string> {
    try {
      console.error(`[API] Fetching content from: ${url}`);

      // Make the API call to the Cloudflare Worker
      const response = await axios.post(`${this.apiEndpoint}/content`, {
        url,
        rejectResourceTypes: ['image', 'font', 'media'],
        waitUntil: 'networkidle0',
      });

      // Check if the response has the expected structure
      if (response.data && response.data.content) {
        return response.data.content;
      }

      // If we can't find the content, log the response and throw an error
      console.error('[Error] Unexpected response structure:', JSON.stringify(response.data, null, 2));
      throw new Error('Unexpected response structure from Cloudflare Worker');
    } catch (error: any) {
      console.error('[Error] Error fetching content:', error);

      // Log more detailed error information if available
      if (error.response) {
        console.error('[Error] Response status:', error.response.status);
        console.error('[Error] Response data:', JSON.stringify(error.response.data, null, 2));
      }

      throw new Error(`Failed to fetch content: ${error instanceof Error ? error.message : String(error)}`);
    }
  }
  ```
- src/content-processor.ts:11-20 (helper): Helper function that orchestrates content processing: extracts metadata, cleans HTML to markdown-like text, and formats the result with metadata for LLM context.

  ```typescript
  processForLLM(html: string, url: string): string {
    // Extract metadata
    const metadata = this.extractMetadata(html, url);

    // Clean the content
    const cleanedContent = this.cleanContent(html);

    // Format for LLM context
    return this.formatForLLM(cleanedContent, metadata);
  }
  ```
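The flow described in the references above (fetch, process, truncate, then wrap the result or the error) can be sketched in isolation. This is an illustrative reduction, not the server's code: the `ContentFetcher`, `LLMProcessor`, and `ToolResult` interfaces and the `truncate` and `fetchPage` functions are hypothetical names, and the collaborators are passed in as parameters so the flow can be exercised with stubs instead of a live Cloudflare endpoint.

```typescript
// Hypothetical stand-ins for BrowserClient and ContentProcessor.
interface ContentFetcher {
  fetchContent(url: string): Promise<string>;
}
interface LLMProcessor {
  processForLLM(html: string, url: string): string;
}

// Result shape used by the handler: a list of text items plus an optional error flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Mirrors the handler's truncation: cut at `limit` characters and append '...'.
// The returned string can therefore be up to limit + 3 characters long.
function truncate(text: string, limit: number): string {
  return text.length > limit ? text.substring(0, limit) + "..." : text;
}

async function fetchPage(
  fetcher: ContentFetcher,
  processor: LLMProcessor,
  url: string,
  maxContentLength: number = 10000,
): Promise<ToolResult> {
  try {
    const html = await fetcher.fetchContent(url);
    const processed = processor.processForLLM(html, url);
    return { content: [{ type: "text", text: truncate(processed, maxContentLength) }] };
  } catch (error) {
    // Failures are reported inside the result rather than thrown to the caller.
    const message = error instanceof Error ? error.message : String(error);
    return {
      content: [{ type: "text", text: `Error fetching page: ${message}` }],
      isError: true,
    };
  }
}

// Usage with stubs: no network access is needed to follow the flow.
const stubFetcher: ContentFetcher = {
  fetchContent: async () => "<p>Hello, world</p>",
};
const stubProcessor: LLMProcessor = {
  processForLLM: (html) => html.replace(/<[^>]+>/g, ""),
};
fetchPage(stubFetcher, stubProcessor, "https://example.com", 5).then((result) => {
  console.log(result.content[0].text); // Hello...
});
```

One consequence of appending the ellipsis after the cut is that callers who need a hard upper bound should budget three extra characters beyond `maxContentLength`.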