get_webpage_content

Retrieves a webpage's content as markdown by scraping its URL, with options to delay the scrape and to choose the country the request is loaded from.

Instructions

Retrieve the content of a webpage as markdown.

Input Schema

Name	Required	Description
`url_to_scrape`	Yes	The URL of the webpage to scrape.
`wait_before_scraping`	No	Time to wait, in milliseconds, before starting the scrape. Must be a non-negative integer; defaults to 0.
`country`	No	Residential country to load the request from (e.g., US, CA, GB).

Implementation Reference

src/tools/getWebpageMarkdown.ts:21-81 (handler)

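The main handler function. It sends a POST request to the Olostep API asking for the page in markdown format, then returns either the markdown content or an error result (for a non-OK HTTP status, a response without markdown content, or a thrown exception).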

handler: async ({ url_to_scrape, wait_before_scraping, country }: { url_to_scrape: string; wait_before_scraping: number; country?: string }, apiKey: string, orbitKey?: string) => {
    try {
        const headers = new Headers({
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${apiKey}`
        });

        const payload = {
            url_to_scrape: url_to_scrape,
            wait_before_scraping: wait_before_scraping,
            formats: ["markdown"],
            ...(country && { country: country }),
            ...(orbitKey && { force_connection_id: orbitKey })
        };

        const response = await fetch(OLOSTEP_SCRAPE_API_URL, {
            method: 'POST',
            headers: headers,
            body: JSON.stringify(payload)
        });

        if (!response.ok) {
            const errorDetails = await response.json();
            return {
                isError: true,
                content: [{
                    type: "text",
                    text: `Olostep API Error: ${response.status} ${response.statusText}. Details: ${JSON.stringify(errorDetails)}`
                }]
            };
        }

        const data = await response.json() as OlostepScrapeApiResponse;

        if (data.result?.markdown_content) {
            return {
                content: [{
                    type: "text",
                    text: data.result.markdown_content
                }]
            };
        } else {
            return {
                isError: true,
                content: [{
                    type: "text",
                    text: "Error: No markdown content found in Olostep API response."
                }]
            };
        }

    } catch (error: unknown) {
        return {
            isError: true,
            content: [{
                type: "text",
                text: `Error: Failed to scrape webpage. ${error instanceof Error ? error.message : String(error)}`
            }]
        };
    }
}
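The conditional spreads in the handler's payload can be easy to misread, so here is a minimal sketch of that step in isolation. The helper name `buildScrapePayload` and the `ScrapePayload` interface are hypothetical (they do not appear in the source); the field names are taken from the handler above.

```typescript
// Hypothetical helper, not part of the source: it isolates the payload
// construction done inline in the handler.
interface ScrapePayload {
    url_to_scrape: string;
    wait_before_scraping: number;
    formats: string[];
    country?: string;
    force_connection_id?: string;
}

function buildScrapePayload(
    url_to_scrape: string,
    wait_before_scraping: number,
    country?: string,
    orbitKey?: string
): ScrapePayload {
    return {
        url_to_scrape,
        wait_before_scraping,
        formats: ["markdown"],
        // A missing or empty value spreads an empty object, so the key is
        // left out of the request body entirely rather than sent as undefined.
        ...(country ? { country } : {}),
        ...(orbitKey ? { force_connection_id: orbitKey } : {}),
    };
}

console.log(JSON.stringify(buildScrapePayload("https://example.com", 0)));
// {"url_to_scrape":"https://example.com","wait_before_scraping":0,"formats":["markdown"]}

console.log(JSON.stringify(buildScrapePayload("https://example.com", 500, "US")));
// {"url_to_scrape":"https://example.com","wait_before_scraping":500,"formats":["markdown"],"country":"US"}
```

The sketch uses a ternary instead of `&&` so that the spread operand is always an object; the resulting payload is the same as in the handler.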

src/tools/getWebpageMarkdown.ts:16-20 (schema)

Zod schema defining the tool's input parameters: url_to_scrape (required, must be a valid URL), wait_before_scraping (non-negative integer in milliseconds, defaults to 0 when omitted), and country (optional string).

schema: {
    url_to_scrape: z.string().url().describe("The URL of the webpage to scrape."),
    wait_before_scraping: z.number().int().min(0).default(0).describe("Time to wait in milliseconds before starting the scrape."),
    country: z.string().optional().describe("Residential country to load the request from (e.g., US, CA, GB). Optional."),
},
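To make the schema's rules concrete, the following is a rough sketch of equivalent checks written with built-ins only. It is an approximation, not the tool's code: `parseScrapeInput` and `ScrapeInput` are hypothetical names, and using the `URL` constructor to approximate Zod's `.url()` check is an assumption.

```typescript
// Hypothetical stand-in for the Zod schema above, using only built-ins.
interface ScrapeInput {
    url_to_scrape: string;
    wait_before_scraping: number;
    country?: string;
}

function parseScrapeInput(raw: {
    url_to_scrape?: unknown;
    wait_before_scraping?: unknown;
    country?: unknown;
}): ScrapeInput {
    // url_to_scrape: required string that parses as an absolute URL.
    const url = raw.url_to_scrape;
    if (typeof url !== "string") {
        throw new Error("url_to_scrape must be a string");
    }
    try {
        new URL(url);
    } catch {
        throw new Error("url_to_scrape must be a valid URL");
    }

    // wait_before_scraping: non-negative integer; 0 when omitted.
    const wait: unknown =
        raw.wait_before_scraping === undefined ? 0 : raw.wait_before_scraping;
    if (typeof wait !== "number" || !Number.isInteger(wait) || wait < 0) {
        throw new Error("wait_before_scraping must be a non-negative integer");
    }

    // country: optional string, passed through unchanged.
    let country: string | undefined;
    const rawCountry = raw.country;
    if (rawCountry !== undefined) {
        if (typeof rawCountry !== "string") {
            throw new Error("country must be a string");
        }
        country = rawCountry;
    }

    return {
        url_to_scrape: url,
        wait_before_scraping: wait,
        ...(country !== undefined ? { country } : {}),
    };
}

// Omitting wait_before_scraping falls back to the default of 0.
console.log(parseScrapeInput({ url_to_scrape: "https://example.com" }));
```

Because the default is applied during parsing, the handler can treat `wait_before_scraping` as a plain `number`, which matches its parameter type.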

src/index.ts:134-146 (registration)

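MCP server registration of the tool using server.tool(), which takes the tool's name, description, and schema, plus a wrapper handler. The wrapper returns an error if no API key is configured; otherwise it calls the tool's handler with the API key and the optional Orbit key.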

server.tool(
    getWebpageMarkdown.name,
    getWebpageMarkdown.description,
    getWebpageMarkdown.schema,
    async (params) => {
        if (!OLOSTEP_API_KEY) return missingApiKeyError;
        const result = await getWebpageMarkdown.handler(params, OLOSTEP_API_KEY, ORBIT_KEY);
        return {
            ...result,
            content: result.content.map(item => ({ ...item, type: item.type as "text" }))
        };
    }
);
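The guard-then-delegate pattern in the wrapper can be sketched on its own, without the MCP SDK. Everything named here is an assumption made for illustration: `withApiKey` and `ToolResult` are hypothetical, and the source references `missingApiKeyError` without showing it, so the message text below is invented.

```typescript
// Hypothetical result shape, modelled on what the handler returns.
type ToolResult = {
    isError?: boolean;
    content: { type: string; text: string }[];
};

// Hypothetical: the real missingApiKeyError is not shown in the source.
const missingApiKeyError: ToolResult = {
    isError: true,
    content: [{ type: "text", text: "Error: API key is not configured." }],
};

// Wraps a handler so it only runs when an API key is available.
function withApiKey<P>(
    apiKey: string | undefined,
    handler: (params: P, apiKey: string) => Promise<ToolResult>
): (params: P) => Promise<ToolResult> {
    return async (params: P) => {
        // Short-circuit before any network call if the key is missing.
        if (!apiKey) return missingApiKeyError;
        const result = await handler(params, apiKey);
        // Re-tag each content item as "text", as the registration code does.
        return {
            ...result,
            content: result.content.map(item => ({ ...item, type: "text" as const })),
        };
    };
}
```

Returning an error result instead of throwing keeps a missing key in the same `isError` / `content` shape as the handler's other failures, so callers handle one format.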

olostep-mcp