visit_page

Navigate to a specific URL and extract the page content in a readable format, optionally capturing a screenshot. Use it to analyze articles, read documentation, or verify information directly from the source.

Instructions

Navigates to a specific URL and extracts the page content in readable format, with option to capture a screenshot. Use this tool to deeply analyze specific web pages, read articles, examine documentation, or verify information directly from the source. Especially useful for in-depth research after identifying relevant pages via search.
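Judging by the handler listed under Implementation Reference, a successful call returns a single text item whose body is a JSON string with `url`, `title`, `content`, `timestamp`, and an optional `screenshot` note. A client might unpack it roughly as follows. This is only a sketch: `VisitPageOutput`, `ToolResultLike`, and `parseVisitPageResult` are hypothetical names introduced here, not part of the server.

```typescript
// Hypothetical shape of the JSON payload, inferred from the handler's JSON.stringify call.
interface VisitPageOutput {
    url: string;
    title: string;
    content: string;      // page content as markdown
    timestamp: string;    // ISO 8601 capture time
    screenshot?: string;  // omitted when no screenshot was taken
}

// Minimal assumed shape of a tool result: content items plus an optional error flag.
interface ToolResultLike {
    content: { type: string; text?: string }[];
    isError?: boolean;
}

// Hypothetical client-side helper: throws on error results, otherwise parses the JSON body.
function parseVisitPageResult(result: ToolResultLike): VisitPageOutput {
    const text = result.content.find((item) => item.type === "text")?.text ?? "";
    if (result.isError) {
        throw new Error(text || "visit_page failed");
    }
    return JSON.parse(text) as VisitPageOutput;
}
```

Because `JSON.stringify` drops keys whose value is `undefined`, the `screenshot` field is simply absent when no screenshot was requested.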

Input Schema


Name	Required	Description	Default
`url`	Yes	URL to visit
`takeScreenshot`	No	Whether to take a screenshot
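To illustrate the two parameters, a call to this tool could be packaged as shown below. This is a minimal sketch that assumes the standard MCP JSON-RPC `tools/call` envelope; `VisitPageArgs` and `buildVisitPageCall` are hypothetical names, and the URL is a placeholder.

```typescript
// Hypothetical type mirroring the table: `url` is required, `takeScreenshot` is optional.
interface VisitPageArgs {
    url: string;
    takeScreenshot?: boolean;
}

// Hypothetical helper that wraps the arguments in a JSON-RPC `tools/call` request.
function buildVisitPageCall(id: number, args: VisitPageArgs) {
    return {
        jsonrpc: "2.0",
        id,
        method: "tools/call",
        params: { name: "visit_page", arguments: args },
    };
}

// Example request: visit a placeholder URL and ask for a screenshot.
const request = buildVisitPageCall(1, {
    url: "https://example.com/docs",
    takeScreenshot: true,
});
```

Leaving out `takeScreenshot` is valid, since only `url` is required.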

Implementation Reference

index.ts:155-166 (schema)

Tool schema definition for 'visit_page' - defines the tool's name, description, and input schema (url required, takeScreenshot optional).

{
    name: "visit_page",
    description: "Navigates to a specific URL and extracts the page content in readable format, with option to capture a screenshot. Use this tool to deeply analyze specific web pages, read articles, examine documentation, or verify information directly from the source. Especially useful for in-depth research after identifying relevant pages via search.",
    inputSchema: {
        type: "object",
        properties: {
            url: { type: "string", description: "URL to visit" },
            takeScreenshot: { type: "boolean", description: "Whether to take a screenshot" },
        },
        required: ["url"],
    },
},
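The handler further down reads its arguments with a TypeScript `as` cast, which performs no check at runtime. If stricter checking were wanted, the schema could be mirrored by a small type guard along these lines. `isVisitPageArgs` is a hypothetical helper, not something present in `index.ts`.

```typescript
// Hypothetical runtime check mirroring the input schema; not part of the server source.
function isVisitPageArgs(
    value: unknown,
): value is { url: string; takeScreenshot?: boolean } {
    // The schema declares type "object".
    if (typeof value !== "object" || value === null) return false;
    const args = value as Record<string, unknown>;

    // `url` is required and must be a string.
    if (typeof args.url !== "string") return false;

    // `takeScreenshot` is optional, but must be a boolean when present.
    return args.takeScreenshot === undefined || typeof args.takeScreenshot === "boolean";
}
```

A guard like this only checks types; whether the string is an acceptable http or https URL is a separate question, handled by `isValidUrl`.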

index.ts:143-186 (registration)

The TOOLS array that registers all available tools including 'visit_page' with the MCP server.

const TOOLS: Tool[] = [
    {
        name: "search_google",
        description: "Performs a web search using Google, ideal for finding current information, news, websites, and general knowledge. Use this tool when you need to research topics, find recent information, or gather data from the web. Returns structured search results with titles, URLs, and snippets.",
        inputSchema: {
            type: "object",
            properties: {
                query: { type: "string", description: "Search query" },
            },
            required: ["query"],
        },
    },
    {
        name: "visit_page",
        description: "Navigates to a specific URL and extracts the page content in readable format, with option to capture a screenshot. Use this tool to deeply analyze specific web pages, read articles, examine documentation, or verify information directly from the source. Especially useful for in-depth research after identifying relevant pages via search.",
        inputSchema: {
            type: "object",
            properties: {
                url: { type: "string", description: "URL to visit" },
                takeScreenshot: { type: "boolean", description: "Whether to take a screenshot" },
            },
            required: ["url"],
        },
    },
    {
        name: "take_screenshot",
        description: "Captures a visual image of the currently loaded webpage. Use this tool when you need to preserve visual information, analyze page layouts, or document the current state of a webpage. Perfect for situations where textual content alone doesn't convey the full context.",
        inputSchema: {
            type: "object",
            properties: {},  // No parameters needed
        },
    },
    {
        name: "search_scholar",
        description: "Searches Google Scholar for academic papers and scholarly articles. Use this tool when researching scientific topics, looking for peer-reviewed research, academic citations, or scholarly literature. Returns structured data including titles, authors, publication details, and citation counts. Ideal for academic research and evidence-based inquiries.",
        inputSchema: {
            type: "object",
            properties: {
                query: { type: "string", description: "Academic search query" },
            },
            required: ["query"],
        },
    },
];

index.ts:1156-1250 (handler)

The main handler for the 'visit_page' tool. It validates the URL, navigates to the page, extracts the content as markdown, optionally takes a screenshot, stores the result in the current session, and returns the result as formatted JSON text.

case "visit_page": {
    // Extract URL and screenshot flag from request
    const { url, takeScreenshot } = request.params.arguments as {
        url: string;                    // Target URL to visit
        takeScreenshot?: boolean;       // Optional screenshot flag
    };

    // Step 1: Validate URL format and security
    if (!isValidUrl(url)) {
        return {
            content: [{
                type: "text" as const,
                text: `Invalid URL: ${url}. Only http and https protocols are supported.`
            }],
            isError: true
        };
    }

    try {
        // Step 2: Visit page and extract content with retry mechanism
        const result = await withRetry(async () => {
            // Navigate to target URL safely
            await safePageNavigation(page, url);
            const title = await page.title();

            // Step 3: Extract and process page content
            const content = await withRetry(async () => {
                // Convert page content to markdown
                const extractedContent = await extractContentAsMarkdown(page);

                // If no content is extracted, throw an error
                if (!extractedContent) {
                    throw new Error('Failed to extract content');
                }

                // Return the extracted content
                return extractedContent;
            });

            // Step 4: Create result object with page data
            const pageResult: ResearchResult = {
                url,      // Original URL
                title,    // Page title
                content,  // Markdown content
                timestamp: new Date().toISOString(),  // Capture time
            };

            // Step 5: Take screenshot if requested
            let screenshotUri: string | undefined;
            if (takeScreenshot) {
                // Capture and process screenshot
                const screenshot = await takeScreenshotWithSizeLimit(page);
                pageResult.screenshotPath = await saveScreenshot(screenshot, title);

                // Get the index for the resource URI
                const resultIndex = currentSession ? currentSession.results.length : 0;
                screenshotUri = `research://screenshots/${resultIndex}`;

                // Notify clients about new screenshot resource
                server.notification({
                    method: "notifications/resources/list_changed"
                });
            }

            // Step 6: Store result in session
            addResult(pageResult);
            return { pageResult, screenshotUri };
        });

        // Step 7: Return formatted result with screenshot URI if taken
        const response: ToolResult = {
            content: [{
                type: "text" as const,
                text: JSON.stringify({
                    url: result.pageResult.url,
                    title: result.pageResult.title,
                    content: result.pageResult.content,
                    timestamp: result.pageResult.timestamp,
                    screenshot: result.screenshotUri ? `View screenshot via *MCP Resources* (Paperclip icon) @ URI: ${result.screenshotUri}` : undefined
                }, null, 2)
            }]
        };

        return response;
    } catch (error) {
        // Handle and format page visit errors
        return {
            content: [{
                type: "text" as const,
                text: `Failed to visit page: ${(error as Error).message}`
            }],
            isError: true
        };
    }
}
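The handler relies on a `withRetry` helper whose source is not included in this listing. One plausible shape for such a helper is sketched below. The attempt count, the fixed delay, and the parameter names are all assumptions; the actual implementation in `index.ts` may differ.

```typescript
// Hypothetical retry helper; the real `withRetry` in index.ts may behave differently.
async function withRetry<T>(
    operation: () => Promise<T>,
    maxAttempts = 3,   // assumed default
    delayMs = 1000,    // assumed default, fixed delay between attempts
): Promise<T> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            // Return as soon as one attempt succeeds.
            return await operation();
        } catch (error) {
            lastError = error;
            // Wait before the next attempt, but not after the final one.
            if (attempt < maxAttempts) {
                await new Promise((resolve) => setTimeout(resolve, delayMs));
            }
        }
    }
    // Every attempt failed: surface the last error to the caller.
    throw lastError;
}
```

Note that the handler nests one `withRetry` call inside another, so the attempt counts multiply. Under the assumed default of three attempts at each level, content extraction could run up to nine times before the error reaches the `catch` block.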

index.ts:534-537 (registration)

Server request handler registration that returns the TOOLS list including 'visit_page' when ListToolsRequestSchema is received.

// Register handler for tool listing requests
server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: TOOLS  // Return list of available research tools
}));

index.ts:768-779 (helper)

URL validation helper used by visit_page to ensure only http/https URLs are accepted.

function isValidUrl(urlString: string): boolean {
    try {
        // Attempt to parse URL string
        const url = new URL(urlString);

        // Only allow HTTP and HTTPS protocols for security
        return url.protocol === 'http:' || url.protocol === 'https:';
    } catch {
        // Return false for any invalid URL format
        return false;
    }
}
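A few sample inputs show how the check behaves. The function is repeated here only so the snippet runs on its own; the sample URLs are illustrative.

```typescript
// Same logic as the helper above, repeated so this snippet is self-contained.
function isValidUrl(urlString: string): boolean {
    try {
        const url = new URL(urlString);
        return url.protocol === 'http:' || url.protocol === 'https:';
    } catch {
        return false;
    }
}

// Accepted: http and https URLs (the parser lowercases the scheme).
const accepted = [
    "https://example.com",
    "http://localhost:3000/path",
    "HTTPS://EXAMPLE.COM",
].map(isValidUrl);

// Rejected: other schemes parse successfully but fail the protocol check,
// while a bare hostname has no scheme and makes `new URL` throw.
const rejected = [
    "ftp://example.com/file",
    "javascript:alert(1)",
    "file:///etc/passwd",
    "example.com",
].map(isValidUrl);

console.log(accepted); // [ true, true, true ]
console.log(rejected); // [ false, false, false, false ]
```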

MCP Web Research Server