# get_url_content
Retrieve the full content of a URL for analysis and processing, enabling detailed insights from live web pages. Ideal for integration into search and scraping workflows.
## Instructions
Get the content of a URL. Use this to retrieve further information and understand the content of each URL.
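As a concrete illustration, a `tools/call` request for this tool over MCP's JSON-RPC transport has the following shape (the URL here is a placeholder):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_url_content",
    "arguments": { "url": "https://example.com" }
  }
}
```

The server replies with a `content` array containing a single `text` item holding the page's markdown, with `isError` set accordingly.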
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL to fetch | (none) |
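The schema only declares that `url` is a required string. A server-side sketch of the corresponding argument check, using a hypothetical helper name that does not appear in `index.ts`, could look like:

```typescript
type ToolArgs = Record<string, unknown>;

// Hypothetical validator mirroring the tool's input schema: 'url' must be
// present, a string, and parseable as a URL.
function validateGetUrlContentArgs(args: ToolArgs): string {
  const url = args["url"];
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("Invalid arguments: 'url' is required and must be a non-empty string");
  }
  // Reject clearly malformed input early; URL's constructor throws on bad input.
  new URL(url);
  return url;
}
```

Running such a check before dispatching to the scraper lets the server return a schema-level error without spinning up a page load.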
## Implementation Reference
- `index.ts:284-302` (handler): The main execution logic for the `get_url_content` tool. It checks whether the Puppeteer scraper is ready, fetches the URL content via the helper function, and returns it as markdown text, or an error if the scraper is not ready.

  ```typescript
  if (name === "get_url_content") {
    if (!scraperReady) {
      return {
        content: [
          {
            type: 'text',
            text: 'Tool not ready: Puppeteer is still initializing. Please try again in a few moments.'
          }
        ],
        isError: true
      }
    }
    const { url } = args
    const result = await fetchAndConvertToMarkdown(url as string)
    return {
      content: [{ type: 'text', text: result }],
      isError: false
    }
  }
  ```
- `index.ts:61-76` (schema): Defines the `Tool` schema for `get_url_content`, including its name, description, and an input schema requiring a `url` string.

  ```typescript
  const READ_URL_TOOL: Tool = {
    name: "get_url_content",
    description:
      "Get the content of a URL. " +
      "Use this for further information retrieving to understand the content of each URL.",
    inputSchema: {
      type: "object",
      properties: {
        url: {
          type: "string",
          description: "URL",
        },
      },
      required: ["url"],
    },
  };
  ```
- `index.ts:92-95` (registration): Registers the tool in the server capabilities section, referencing its description and input schema.

  ```typescript
  get_url_content: {
    description: READ_URL_TOOL.description,
    schema: READ_URL_TOOL.inputSchema,
  },
  ```
- `index.ts:246-248` (registration): Registers `READ_URL_TOOL` in the `ListTools` response handler.

  ```typescript
  server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [WEB_SEARCH_TOOL, READ_URL_TOOL],
  }));
  ```
- `index.ts:226-244` (helper): Core helper that performs the actual scraping via `PuppeteerScraper` and returns the page content as markdown text. Called by the tool handler.

  ```typescript
  async function fetchAndConvertToMarkdown(url: string, timeoutMs: number = 10000) {
    if (!scraperReady || !scraper) {
      throw new Error('Puppeteer is not ready. Please try again in a few moments.')
    }

    try {
      const response = await scraper.scrapePage(url)
      if (response == null) {
        throw new Error(`Failed to fetch the URL: ${url}`)
      }
      const { content } = response
      return content.text
    } catch (error: any) {
      console.error('Error during scrape:', error.message)
      throw error
    }
  }
  ```
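Note that the `timeoutMs` parameter above is accepted but never used. One way to honor it, sketched here as a hypothetical addition rather than code from `index.ts`, is a generic timeout wrapper around the scrape call:

```typescript
// Hypothetical helper: rejects if `promise` does not settle within `timeoutMs`.
function withTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  // Clear the pending timer however the race settles.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

The helper's call site would then become `await withTimeout(scraper.scrapePage(url), timeoutMs)`, so a hung page load surfaces as an ordinary error instead of blocking the tool indefinitely.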