fetch_page

Retrieve and process web pages for LLM context using a URL, with options to include screenshots or limit content length. Part of the Web Content MCP Server for enhanced data extraction.

Instructions

Fetches and processes a web page for LLM context

Input Schema

Name	Required	Description
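`includeScreenshot`	No	Whether to include a base64-encoded screenshot of the page (default: false)
`maxContentLength`	No	Maximum number of characters of processed content to return (default: 10000); longer content is cut at this length and suffixed with '...'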
`url`	Yes	URL to fetch
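
To make the table concrete, here is a minimal sketch of the argument shape and a type guard for it. The names `FetchPageArgs` and `isFetchPageArgs` are hypothetical and do not appear in the server; the guard is also stricter than the handler's own validation, which only checks that `url` is a string.

```typescript
// Hypothetical type mirroring the three parameters in the table above.
interface FetchPageArgs {
  url: string;
  includeScreenshot?: boolean;
  maxContentLength?: number;
}

// Hypothetical guard: also checks the optional fields, which the
// server's handler does not do.
function isFetchPageArgs(value: unknown): value is FetchPageArgs {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.url !== "string") return false;
  if (v.includeScreenshot !== undefined && typeof v.includeScreenshot !== "boolean") {
    return false;
  }
  if (v.maxContentLength !== undefined && typeof v.maxContentLength !== "number") {
    return false;
  }
  return true;
}

// Example arguments: only `url` is required.
const example: FetchPageArgs = {
  url: "https://example.com",
  maxContentLength: 5000,
};
```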

Implementation Reference

src/server.ts:176-227 (handler)
Main handler function executing the fetch_page tool logic: validates input, fetches and processes page content, optionally includes screenshot, truncates if needed, and formats response.
    private async handleFetchPage(args: any) {
      // Validate arguments
      if (typeof args !== 'object' || args === null || typeof args.url !== 'string') {
        throw new McpError(ErrorCode.InvalidParams, 'Invalid arguments for fetch_page');
      }

      const { url, includeScreenshot = false, maxContentLength = 10000 } = args;

      try {
        // Fetch the page content
        const html = await this.browserClient.fetchContent(url);

        // Process the content for LLM
        const processedContent = this.contentProcessor.processForLLM(html, url);

        // Truncate if necessary
        const truncatedContent = processedContent.length > maxContentLength
          ? processedContent.substring(0, maxContentLength) + '...'
          : processedContent;

        // Get screenshot if requested
        let screenshot = null;
        if (includeScreenshot) {
          screenshot = await this.browserClient.takeScreenshot(url);
        }

        // Return the result
        return {
          content: [
            {
              type: 'text',
              text: truncatedContent,
            },
            ...(screenshot ? [{
              type: 'image',
              image: screenshot,
            }] : []),
          ],
        };
      } catch (error) {
        console.error('Error fetching page:', error);
        return {
          content: [
            {
              type: 'text',
              text: `Error fetching page: ${error instanceof Error ? error.message : String(error)}`,
            },
          ],
          isError: true,
        };
      }
    }
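
The truncation and response-assembly steps of the handler can be isolated as two small pure functions. This is only a sketch: `truncateForContext`, `buildContent`, and `ContentItem` are hypothetical names, and the item shapes simply copy what the handler above returns.

```typescript
// Hypothetical item type copying the shapes returned by the handler.
type ContentItem =
  | { type: "text"; text: string }
  | { type: "image"; image: string };

// Same rule as the handler: cut at maxContentLength characters and append
// '...'. The result can therefore be up to three characters longer than
// maxContentLength.
function truncateForContext(text: string, maxContentLength = 10000): string {
  return text.length > maxContentLength
    ? text.substring(0, maxContentLength) + "..."
    : text;
}

// The image item is included only when a screenshot is present.
function buildContent(text: string, screenshot: string | null): ContentItem[] {
  return [
    { type: "text", text },
    ...(screenshot ? [{ type: "image" as const, image: screenshot }] : []),
  ];
}
```

Content exactly `maxContentLength` characters long is returned unchanged, because the comparison is strict.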
src/server.ts:59-79 (schema)
Input schema and metadata definition for the fetch_page tool in the ListTools response.
    name: 'fetch_page',
    description: 'Fetches and processes a web page for LLM context',
    inputSchema: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'URL to fetch',
        },
        includeScreenshot: {
          type: 'boolean',
          description: 'Whether to include a screenshot (base64 encoded)',
        },
        maxContentLength: {
          type: 'number',
          description: 'Maximum content length to return',
        },
      },
      required: ['url'],
    },
    },
src/server.ts:146-147 (registration)
Registration of the fetch_page handler in the tool call dispatcher switch statement.
    case 'fetch_page':
      return await this.handleFetchPage(args);
src/browser-client.ts:20-36 (helper)
Helper method in BrowserClient that fetches rendered HTML content through the Cloudflare Browser Rendering API by POSTing the URL to the configured /content endpoint.
    async fetchContent(url: string): Promise<string> {
      try {
        console.log(`Fetching content from: ${url}`);

        // Make the API call to the Cloudflare Worker
        const response = await axios.post(`${this.apiEndpoint}/content`, {
          url,
          rejectResourceTypes: ['image', 'font', 'media'],
          waitUntil: 'networkidle0',
        });

        return response.data.content;
      } catch (error) {
        console.error('Error fetching content:', error);
        throw new Error(`Failed to fetch content: ${error instanceof Error ? error.message : String(error)}`);
      }
    }
src/content-processor.ts:11-20 (helper)
Helper method in ContentProcessor that converts HTML to LLM-friendly markdown with extracted metadata.
    processForLLM(html: string, url: string): string {
      // Extract metadata
      const metadata = this.extractMetadata(html, url);

      // Clean the content
      const cleanedContent = this.cleanContent(html);

      // Format for LLM context
      return this.formatForLLM(cleanedContent, metadata);
    }
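
The bodies of `extractMetadata`, `cleanContent`, and `formatForLLM` are not shown on this page. The sketch below illustrates the three-stage shape of the pipeline with deliberately simple, assumed stand-ins: a regex-based title lookup, tag stripping, and a plain-text header. The real ContentProcessor is described as producing markdown and likely does considerably more; nothing here should be read as its actual implementation.

```typescript
// Hypothetical metadata shape; the real one may carry more fields.
interface PageMetadata {
  title: string;
  url: string;
}

// Assumed stand-in: reads only the <title> element.
function extractMetadata(html: string, url: string): PageMetadata {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return { title: match?.[1]?.trim() ?? "", url };
}

// Assumed stand-in: drops head/script/style blocks, strips the remaining
// tags, and collapses whitespace. No markdown conversion is attempted.
function cleanContent(html: string): string {
  return html
    .replace(/<(script|style|head)\b[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Assumed stand-in: prefixes the cleaned text with a small metadata header.
function formatForLLM(content: string, metadata: PageMetadata): string {
  return `Title: ${metadata.title}\nURL: ${metadata.url}\n\n${content}`;
}

// Same three steps, in the same order, as the helper shown above.
function processForLLM(html: string, url: string): string {
  const metadata = extractMetadata(html, url);
  const cleanedContent = cleanContent(html);
  return formatForLLM(cleanedContent, metadata);
}
```

Keeping each stage a separate function, as the original helper does, lets the cleaning and formatting rules change independently of each other.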
