get_webpage_content
Retrieves webpage content in markdown format by scraping URLs, with options to control timing and geographic location for data extraction.
Instructions
Retrieve content of a webpage in markdown
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url_to_scrape | Yes | The URL of the webpage to scrape. | |
| wait_before_scraping | No | Time to wait in milliseconds before starting the scrape. | |
| country | No | Residential country to load the request from (e.g., US, CA, GB). Optional. |
Implementation Reference
- src/tools/getWebpageMarkdown.ts:21-81 (handler)The main handler function that fetches the webpage content using the Olostep API, processes the response, and returns markdown content or error.handler: async ({ url_to_scrape, wait_before_scraping, country }: { url_to_scrape: string; wait_before_scraping: number; country?: string }, apiKey: string, orbitKey?: string) => { try { const headers = new Headers({ 'Content-Type': 'application/json', 'Authorization': `Bearer ${apiKey}` }); const payload = { url_to_scrape: url_to_scrape, wait_before_scraping: wait_before_scraping, formats: ["markdown"], ...(country && { country: country }), ...(orbitKey && { force_connection_id: orbitKey }) }; const response = await fetch(OLOSTEP_SCRAPE_API_URL, { method: 'POST', headers: headers, body: JSON.stringify(payload) }); if (!response.ok) { const errorDetails = await response.json(); return { isError: true, content: [{ type: "text", text: `Olostep API Error: ${response.status} ${response.statusText}. Details: ${JSON.stringify(errorDetails)}` }] }; } const data = await response.json() as OlostepScrapeApiResponse; if (data.result?.markdown_content) { return { content: [{ type: "text", text: data.result.markdown_content }] }; } else { return { isError: true, content: [{ type: "text", text: "Error: No markdown content found in Olostep API response." }] }; } } catch (error: unknown) { return { isError: true, content: [{ type: "text", text: `Error: Failed to scrape webpage. ${error instanceof Error ? error.message : String(error)}` }] }; } }
- Zod schema defining the input parameters for the tool: url_to_scrape (required URL), wait_before_scraping (optional ms), country (optional).schema: { url_to_scrape: z.string().url().describe("The URL of the webpage to scrape."), wait_before_scraping: z.number().int().min(0).default(0).describe("Time to wait in milliseconds before starting the scrape."), country: z.string().optional().describe("Residential country to load the request from (e.g., US, CA, GB). Optional."), },
- src/index.ts:134-146 (registration)MCP server registration of the tool using server.tool(), providing name, description, schema, and a wrapper handler that checks API key and calls the tool's handler.server.tool( getWebpageMarkdown.name, getWebpageMarkdown.description, getWebpageMarkdown.schema, async (params) => { if (!OLOSTEP_API_KEY) return missingApiKeyError; const result = await getWebpageMarkdown.handler(params, OLOSTEP_API_KEY, ORBIT_KEY); return { ...result, content: result.content.map(item => ({ ...item, type: item.type as "text" })) }; } );