Playwright MCP Server

by devskido

playwright_get_visible_html

Extract visible HTML content from web pages with configurable cleaning options for scripts, styles, and comments to obtain clean, structured markup.

Instructions

Get the HTML content of the current page. By default, all <script> tags are removed from the output unless removeScripts is explicitly set to false.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| selector | No | CSS selector to limit the HTML to a specific container | |
| removeScripts | No | Remove all script tags from the HTML | true |
| removeComments | No | Remove all HTML comments | false |
| removeStyles | No | Remove all style tags from the HTML | false |
| removeMeta | No | Remove all meta tags from the HTML | false |
| cleanHtml | No | Perform comprehensive HTML cleaning | false |
| minify | No | Minify the HTML output | false |
| maxLength | No | Maximum number of characters to return | 20000 |
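
To make the defaults in the schema concrete, here is a minimal sketch of how omitted options could be filled in. The `VisibleHtmlArgs` and `ResolvedArgs` types and the `withDefaults` helper are hypothetical names introduced for illustration; they are not part of the server's source.

```typescript
// Hypothetical type mirroring the input schema; every field is optional.
interface VisibleHtmlArgs {
  selector?: string;
  removeScripts?: boolean;
  removeComments?: boolean;
  removeStyles?: boolean;
  removeMeta?: boolean;
  cleanHtml?: boolean;
  minify?: boolean;
  maxLength?: number;
}

// Hypothetical shape of the arguments once defaults are applied.
interface ResolvedArgs {
  selector: string | undefined;
  removeScripts: boolean;
  removeComments: boolean;
  removeStyles: boolean;
  removeMeta: boolean;
  cleanHtml: boolean;
  minify: boolean;
  maxLength: number;
}

// Fill in the documented defaults for any option the caller left out.
function withDefaults(args: VisibleHtmlArgs): ResolvedArgs {
  return {
    selector: args.selector,
    // Scripts are removed unless the caller explicitly passes false.
    removeScripts: args.removeScripts !== false,
    removeComments: args.removeComments ?? false,
    removeStyles: args.removeStyles ?? false,
    removeMeta: args.removeMeta ?? false,
    cleanHtml: args.cleanHtml ?? false,
    minify: args.minify ?? false,
    // Non-numeric or missing values fall back to 20000 characters.
    maxLength: typeof args.maxLength === "number" ? args.maxLength : 20000,
  };
}

// An empty argument object uses every default.
const minimal = withDefaults({});

// Limit the output to one container, keep scripts, and cap the length.
const scoped = withDefaults({ selector: "main", removeScripts: false, maxLength: 5000 });

console.log(minimal, scoped);
```

Because no parameter is required, an empty argument object is a valid call; it returns the full page with script tags stripped and at most 20000 characters of HTML.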

Implementation Reference

  • The VisibleHtmlTool class and its execute method implement the core handler logic for the 'playwright_get_visible_html' tool. The method retrieves HTML from the full page or from the element matching a selector, applies the requested cleaning (scripts are removed by default; comments, styles, and meta tags are removed on request), optionally minifies the result, truncates output longer than maxLength, and returns an error response on failure.
    export class VisibleHtmlTool extends BrowserToolBase {
      /**
       * Execute the visible HTML page tool
       */
      async execute(args: any, context: ToolContext): Promise<ToolResponse> {
        // Check if browser is available
        if (!context.browser || !context.browser.isConnected()) {
          // If browser is not connected, we need to reset the state to force recreation
          resetBrowserState();
          return createErrorResponse(
            "Browser is not connected. The connection has been reset - please retry your navigation."
          );
        }

        // Check if page is available and not closed
        if (!context.page || context.page.isClosed()) {
          return createErrorResponse(
            "Page is not available or has been closed. Please retry your navigation."
          );
        }

        return this.safeExecute(context, async (page) => {
          try {
            const { selector, removeComments, removeStyles, removeMeta, minify, cleanHtml } = args;
            // Default removeScripts to true unless explicitly set to false
            const removeScripts = args.removeScripts === false ? false : true;

            // Get the HTML content
            let htmlContent: string;

            if (selector) {
              // If a selector is provided, get only the HTML for that element
              const element = await page.$(selector);
              if (!element) {
                return createErrorResponse(`Element with selector "${selector}" not found`);
              }
              htmlContent = await page.evaluate((el) => el.outerHTML, element);
            } else {
              // Otherwise get the full page HTML
              htmlContent = await page.content();
            }

            // Determine if we need to apply filters
            const shouldRemoveScripts = removeScripts || cleanHtml;
            const shouldRemoveComments = removeComments || cleanHtml;
            const shouldRemoveStyles = removeStyles || cleanHtml;
            const shouldRemoveMeta = removeMeta || cleanHtml;

            // Apply filters in the browser context
            if (shouldRemoveScripts || shouldRemoveComments || shouldRemoveStyles || shouldRemoveMeta || minify) {
              htmlContent = await page.evaluate(
                ({ html, removeScripts, removeComments, removeStyles, removeMeta, minify }) => {
                  // Create a DOM parser to work with the HTML
                  const parser = new DOMParser();
                  const doc = parser.parseFromString(html, 'text/html');

                  // Remove script tags if requested
                  if (removeScripts) {
                    const scripts = doc.querySelectorAll('script');
                    scripts.forEach(script => script.remove());
                  }

                  // Remove style tags if requested
                  if (removeStyles) {
                    const styles = doc.querySelectorAll('style');
                    styles.forEach(style => style.remove());
                  }

                  // Remove meta tags if requested
                  if (removeMeta) {
                    const metaTags = doc.querySelectorAll('meta');
                    metaTags.forEach(meta => meta.remove());
                  }

                  // Remove HTML comments if requested
                  if (removeComments) {
                    const removeComments = (node) => {
                      const childNodes = node.childNodes;
                      for (let i = childNodes.length - 1; i >= 0; i--) {
                        const child = childNodes[i];
                        if (child.nodeType === 8) { // 8 is for comment nodes
                          node.removeChild(child);
                        } else if (child.nodeType === 1) { // 1 is for element nodes
                          removeComments(child);
                        }
                      }
                    };
                    removeComments(doc.documentElement);
                  }

                  // Get the processed HTML
                  let result = doc.documentElement.outerHTML;

                  // Minify if requested
                  if (minify) {
                    // Simple minification: remove extra whitespace
                    result = result.replace(/>\s+</g, '><').trim();
                  }

                  return result;
                },
                {
                  html: htmlContent,
                  removeScripts: shouldRemoveScripts,
                  removeComments: shouldRemoveComments,
                  removeStyles: shouldRemoveStyles,
                  removeMeta: shouldRemoveMeta,
                  minify
                }
              );
            }

            // Truncate logic
            const maxLength = typeof args.maxLength === 'number' ? args.maxLength : 20000;
            let output = htmlContent;
            if (output.length > maxLength) {
              output = output.slice(0, maxLength) + '\n<!-- Output truncated due to size limits -->';
            }

            return createSuccessResponse(`HTML content:\n${output}`);
          } catch (error) {
            return createErrorResponse(`Failed to get visible HTML content: ${(error as Error).message}`);
          }
        });
      }
    }
  • Defines the tool name, description, and input schema for 'playwright_get_visible_html', specifying parameters for HTML retrieval and cleaning options.
    {
      name: "playwright_get_visible_html",
      description: "Get the HTML content of the current page. By default, all <script> tags are removed from the output unless removeScripts is explicitly set to false.",
      inputSchema: {
        type: "object",
        properties: {
          selector: { type: "string", description: "CSS selector to limit the HTML to a specific container" },
          removeScripts: { type: "boolean", description: "Remove all script tags from the HTML (default: true)" },
          removeComments: { type: "boolean", description: "Remove all HTML comments (default: false)" },
          removeStyles: { type: "boolean", description: "Remove all style tags from the HTML (default: false)" },
          removeMeta: { type: "boolean", description: "Remove all meta tags from the HTML (default: false)" },
          cleanHtml: { type: "boolean", description: "Perform comprehensive HTML cleaning (default: false)" },
          minify: { type: "boolean", description: "Minify the HTML output (default: false)" },
          maxLength: { type: "number", description: "Maximum number of characters to return (default: 20000)" }
        },
        required: [],
      },
    },
  • In the handleToolCall switch statement, the case for 'playwright_get_visible_html' dispatches execution to the visibleHtmlTool instance.
    case "playwright_get_visible_html": return await visibleHtmlTool.execute(args, context);
  • Initializes the VisibleHtmlTool instance in the initializeTools function.
    if (!visibleHtmlTool) visibleHtmlTool = new VisibleHtmlTool(server);
  • Lists 'playwright_get_visible_html' in the BROWSER_TOOLS array used for conditional browser launching.
    "playwright_get_visible_html",

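The filter resolution, minification, and truncation steps in the handler do not depend on the browser, so they can be sketched as standalone functions. This is an illustrative restatement under stated assumptions, not the server's code: the names `resolveFilters`, `minifyHtml`, and `truncateHtml` and the `CleaningArgs` and `FilterFlags` types are hypothetical, while the regular expression, the 20000-character default, and the truncation marker are copied from the handler.

```typescript
// Hypothetical subset of the tool arguments that controls cleaning.
interface CleaningArgs {
  removeScripts?: boolean;
  removeComments?: boolean;
  removeStyles?: boolean;
  removeMeta?: boolean;
  cleanHtml?: boolean;
}

// Hypothetical result type: which filters will actually run.
interface FilterFlags {
  removeScripts: boolean;
  removeComments: boolean;
  removeStyles: boolean;
  removeMeta: boolean;
}

// cleanHtml switches every filter on; otherwise each flag stands alone.
function resolveFilters(args: CleaningArgs): FilterFlags {
  // Scripts are removed unless removeScripts is explicitly false.
  const removeScripts = args.removeScripts !== false;
  return {
    removeScripts: Boolean(removeScripts || args.cleanHtml),
    removeComments: Boolean(args.removeComments || args.cleanHtml),
    removeStyles: Boolean(args.removeStyles || args.cleanHtml),
    removeMeta: Boolean(args.removeMeta || args.cleanHtml),
  };
}

// Same simple minification as the handler: drop whitespace between tags.
function minifyHtml(html: string): string {
  return html.replace(/>\s+</g, "><").trim();
}

// Cut the output at maxLength and append the handler's marker comment.
function truncateHtml(html: string, maxLength: number = 20000): string {
  if (html.length <= maxLength) {
    return html;
  }
  return html.slice(0, maxLength) + "\n<!-- Output truncated due to size limits -->";
}

const flags = resolveFilters({ cleanHtml: true, removeScripts: false });
const compact = minifyHtml("<div>\n  <p>Hi</p>\n</div>\n");
const cut = truncateHtml("abcdef", 3);

console.log(flags, compact, cut);
```

Two details follow from the handler's logic: cleanHtml takes precedence over removeScripts: false, because the flags are combined with a logical OR, and the truncation marker is appended after the cut, so truncated output is slightly longer than maxLength.
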
MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/devskido/customed-playwright'
