BrowserTools MCP

takeScreenshot

Capture browser tab screenshots for monitoring and analysis. This tool enables AI applications to document web content visually through a Chrome extension.

Instructions

Take a screenshot of the current browser tab

Input Schema

Name	Required	Description	Default
No arguments

Implementation Reference

browser-tools-mcp/mcp-server.ts:251-299 (handler)

MCP tool handler for 'takeScreenshot': registers the tool and implements it by sending a POST request to the browser connector server's /capture-screenshot endpoint. It returns a text message reporting success or the error; the saved file path is not included in the result.

server.tool(
  "takeScreenshot",
  "Take a screenshot of the current browser tab",
  async () => {
    return await withServerConnection(async () => {
      try {
        const response = await fetch(
          `http://${discoveredHost}:${discoveredPort}/capture-screenshot`,
          {
            method: "POST",
          }
        );

        const result = await response.json();

        if (response.ok) {
          return {
            content: [
              {
                type: "text",
                text: "Successfully saved screenshot",
              },
            ],
          };
        } else {
          return {
            content: [
              {
                type: "text",
                text: `Error taking screenshot: ${result.error}`,
              },
            ],
          };
        }
      } catch (error: any) {
        const errorMessage =
          error instanceof Error ? error.message : String(error);
        return {
          content: [
            {
              type: "text",
              text: `Failed to take screenshot: ${errorMessage}`,
            },
          ],
        };
      }
    });
  }
);
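
The handler above discards the `path` and `filename` fields that the connector returns on success. As a minimal sketch of how the result text could surface them, the helper below builds the MCP text content from the HTTP status and the parsed reply. `formatScreenshotResult`, `CaptureReply`, and `TextItem` are hypothetical names introduced for illustration and are not part of browser-tools-mcp.

```typescript
// Assumed shape of the connector's JSON reply, inferred from the
// res.json({ path, filename }) and res.status(...).json({ error }) calls.
interface CaptureReply {
  path?: string;
  filename?: string;
  error?: string;
}

interface TextItem {
  type: "text";
  text: string;
}

// Hypothetical helper: turn the HTTP outcome into MCP text content.
function formatScreenshotResult(
  ok: boolean,
  reply: CaptureReply
): { content: TextItem[] } {
  let text: string;
  if (!ok) {
    text = `Error taking screenshot: ${reply.error ?? "unknown error"}`;
  } else if (reply.path) {
    // Include the saved location so the caller can find the file.
    text = `Successfully saved screenshot to ${reply.path}`;
  } else {
    text = "Successfully saved screenshot";
  }
  return { content: [{ type: "text", text }] };
}
```

Inside the handler this would be called as `formatScreenshotResult(response.ok, result)`.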

browser-tools-server/browser-connector.ts:940-1249 (handler)

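Server-side handler for HTTP POST /capture-screenshot: sends a 'take-screenshot' WebSocket message to the Chrome extension, waits up to 10 seconds for the base64 PNG data, saves it as a timestamped file in the configured path (default Downloads/mcp-screenshots), and optionally auto-pastes it into Cursor on macOS. It responds with 503 if the extension is not connected and with 500 on any other failure.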

async captureScreenshot(req: express.Request, res: express.Response) {
  console.log("Browser Connector: Starting captureScreenshot method");
  console.log("Browser Connector: Request headers:", req.headers);
  console.log("Browser Connector: Request method:", req.method);

  if (!this.activeConnection) {
    console.log(
      "Browser Connector: No active WebSocket connection to Chrome extension"
    );
    return res.status(503).json({ error: "Chrome extension not connected" });
  }

  try {
    console.log("Browser Connector: Starting screenshot capture...");
    const requestId = Date.now().toString();
    console.log("Browser Connector: Generated requestId:", requestId);

    // Create promise that will resolve when we get the screenshot data
    const screenshotPromise = new Promise<{
      data: string;
      path?: string;
      autoPaste?: boolean;
    }>((resolve, reject) => {
      console.log(
        `Browser Connector: Setting up screenshot callback for requestId: ${requestId}`
      );
      // Store callback in map
      screenshotCallbacks.set(requestId, { resolve, reject });
      console.log(
        "Browser Connector: Current callbacks:",
        Array.from(screenshotCallbacks.keys())
      );

      // Set timeout to clean up if we don't get a response
      setTimeout(() => {
        if (screenshotCallbacks.has(requestId)) {
          console.log(
            `Browser Connector: Screenshot capture timed out for requestId: ${requestId}`
          );
          screenshotCallbacks.delete(requestId);
          reject(
            new Error(
              "Screenshot capture timed out - no response from Chrome extension"
            )
          );
        }
      }, 10000);
    });

    // Send screenshot request to extension
    const message = JSON.stringify({
      type: "take-screenshot",
      requestId: requestId,
    });
    console.log(
      `Browser Connector: Sending WebSocket message to extension:`,
      message
    );
    this.activeConnection.send(message);

    // Wait for screenshot data
    console.log("Browser Connector: Waiting for screenshot data...");
    const {
      data: base64Data,
      path: customPath,
      autoPaste,
    } = await screenshotPromise;
    console.log("Browser Connector: Received screenshot data, saving...");
    console.log("Browser Connector: Custom path from extension:", customPath);
    console.log("Browser Connector: Auto-paste setting:", autoPaste);

    // Always prioritize the path from the Chrome extension
    let targetPath = customPath;

    // If no path provided by extension, fall back to defaults
    if (!targetPath) {
      targetPath =
        currentSettings.screenshotPath || getDefaultDownloadsFolder();
    }

    // Convert the path for the current platform
    targetPath = convertPathForCurrentPlatform(targetPath);

    console.log(`Browser Connector: Using path: ${targetPath}`);

    if (!base64Data) {
      throw new Error("No screenshot data received from Chrome extension");
    }

    try {
      fs.mkdirSync(targetPath, { recursive: true });
      console.log(`Browser Connector: Created directory: ${targetPath}`);
    } catch (err) {
      console.error(
        `Browser Connector: Error creating directory: ${targetPath}`,
        err
      );
      throw new Error(
        `Failed to create screenshot directory: ${
          err instanceof Error ? err.message : String(err)
        }`
      );
    }

    const timestamp = new Date().toISOString().replace(/[:.]/g, "-");
    const filename = `screenshot-${timestamp}.png`;
    const fullPath = path.join(targetPath, filename);
    console.log(`Browser Connector: Full screenshot path: ${fullPath}`);

    // Remove the data:image/png;base64, prefix if present
    const cleanBase64 = base64Data.replace(/^data:image\/png;base64,/, "");

    // Save the file
    try {
      fs.writeFileSync(fullPath, cleanBase64, "base64");
      console.log(`Browser Connector: Screenshot saved to: ${fullPath}`);
    } catch (err) {
      console.error(
        `Browser Connector: Error saving screenshot to: ${fullPath}`,
        err
      );
      throw new Error(
        `Failed to save screenshot: ${
          err instanceof Error ? err.message : String(err)
        }`
      );
    }

    // Check if running on macOS before executing AppleScript
    if (os.platform() === "darwin" && autoPaste === true) {
      console.log(
        "Browser Connector: Running on macOS with auto-paste enabled, executing AppleScript to paste into Cursor"
      );

      // Create the AppleScript to copy the image to clipboard and paste into Cursor
      // This version is more robust and includes debugging
      const appleScript = `
        -- Set path to the screenshot
        set imagePath to "${fullPath}"
        
        -- Copy the image to clipboard
        try
          set the clipboard to (read (POSIX file imagePath) as «class PNGf»)
        on error errMsg
          log "Error copying image to clipboard: " & errMsg
          return "Failed to copy image to clipboard: " & errMsg
        end try
        
        -- Activate Cursor application
        try
          tell application "Cursor"
            activate
          end tell
        on error errMsg
          log "Error activating Cursor: " & errMsg
          return "Failed to activate Cursor: " & errMsg
        end try
        
        -- Wait for the application to fully activate
        delay 3
        
        -- Try to interact with Cursor
        try
          tell application "System Events"
            tell process "Cursor"
              -- Get the frontmost window
              if (count of windows) is 0 then
                return "No windows found in Cursor"
              end if
              
              set cursorWindow to window 1
              
              -- Try Method 1: Look for elements of class "Text Area"
              set foundElements to {}
              
              -- Try different selectors to find the text input area
              try
                -- Try with class
                set textAreas to UI elements of cursorWindow whose class is "Text Area"
                if (count of textAreas) > 0 then
                  set foundElements to textAreas
                end if
              end try
              
              if (count of foundElements) is 0 then
                try
                  -- Try with AXTextField role
                  set textFields to UI elements of cursorWindow whose role is "AXTextField"
                  if (count of textFields) > 0 then
                    set foundElements to textFields
                  end if
                end try
              end if
              
              if (count of foundElements) is 0 then
                try
                  -- Try with AXTextArea role in nested elements
                  set allElements to UI elements of cursorWindow
                  repeat with anElement in allElements
                    try
                      set childElements to UI elements of anElement
                      repeat with aChild in childElements
                        try
                          if role of aChild is "AXTextArea" or role of aChild is "AXTextField" then
                            set end of foundElements to aChild
                          end if
                        end try
                      end repeat
                    end try
                  end repeat
                end try
              end if
              
              -- If no elements found with specific attributes, try a broader approach
              if (count of foundElements) is 0 then
                -- Just try to use the Command+V shortcut on the active window
                 -- This assumes Cursor already has focus on the right element
                  keystroke "v" using command down
                  delay 1
                  keystroke "here is the screenshot"
                  delay 1
                 -- Try multiple methods to press Enter
                 key code 36 -- Use key code for Return key
                 delay 0.5
                 keystroke return -- Use keystroke return as alternative
                 return "Used fallback method: Command+V on active window"
              else
                -- We found a potential text input element
                set inputElement to item 1 of foundElements
                
                -- Try to focus and paste
                try
                  set focused of inputElement to true
                  delay 0.5
                  
                  -- Paste the image
                  keystroke "v" using command down
                  delay 1
                  
                  -- Type the text
                  keystroke "here is the screenshot"
                  delay 1
                  -- Try multiple methods to press Enter
                  key code 36 -- Use key code for Return key
                  delay 0.5
                  keystroke return -- Use keystroke return as alternative
                  return "Successfully pasted screenshot into Cursor text element"
                on error errMsg
                  log "Error interacting with found element: " & errMsg
                  -- Fallback to just sending the key commands
                  keystroke "v" using command down
                  delay 1
                  keystroke "here is the screenshot"
                  delay 1
                  -- Try multiple methods to press Enter
                  key code 36 -- Use key code for Return key
                  delay 0.5
                  keystroke return -- Use keystroke return as alternative
                  return "Used fallback after element focus error: " & errMsg
                end try
              end if
            end tell
          end tell
        on error errMsg
          log "Error in System Events block: " & errMsg
          return "Failed in System Events: " & errMsg
        end try
      `;

      // Execute the AppleScript
      exec(`osascript -e '${appleScript}'`, (error, stdout, stderr) => {
        if (error) {
          console.error(
            `Browser Connector: Error executing AppleScript: ${error.message}`
          );
          console.error(`Browser Connector: stderr: ${stderr}`);
          // Don't fail the response; log the error and proceed
        } else {
          console.log(`Browser Connector: AppleScript executed successfully`);
          console.log(`Browser Connector: stdout: ${stdout}`);
        }
      });
    } else {
      if (os.platform() === "darwin" && !autoPaste) {
        console.log(
          `Browser Connector: Running on macOS but auto-paste is disabled, skipping AppleScript execution`
        );
      } else {
        console.log(
          `Browser Connector: Not running on macOS, skipping AppleScript execution`
        );
      }
    }

    res.json({
      path: fullPath,
      filename: filename,
    });
  } catch (error) {
    const errorMessage =
      error instanceof Error ? error.message : String(error);
    console.error(
      "Browser Connector: Error capturing screenshot:",
      errorMessage
    );
    res.status(500).json({
      error: errorMessage,
    });
  }
}
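
The request/response correlation used above (a map of callbacks keyed by `requestId`, plus a timeout) can be isolated into a small sketch. The excerpt does not show where the stored callback is resolved, so `settleScreenshot` below is an assumption about that missing half; `waitForScreenshot` and the `pending` map are likewise hypothetical names. Unlike the excerpt, this version also clears the timer once a reply arrives.

```typescript
type ScreenshotPayload = { data: string; path?: string; autoPaste?: boolean };

interface PendingEntry {
  resolve: (payload: ScreenshotPayload) => void;
  reject: (error: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

// Requests that have been sent to the extension but not yet answered.
const pending = new Map<string, PendingEntry>();

// Register a request and return a promise that rejects after timeoutMs.
function waitForScreenshot(
  requestId: string,
  timeoutMs = 10000
): Promise<ScreenshotPayload> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      // Only reject if the request is still outstanding.
      if (pending.delete(requestId)) {
        reject(new Error("Screenshot capture timed out"));
      }
    }, timeoutMs);
    pending.set(requestId, { resolve, reject, timer });
  });
}

// Called when a "screenshot-data" message arrives; returns false if the
// requestId is unknown (already settled or timed out).
function settleScreenshot(
  requestId: string,
  payload: ScreenshotPayload
): boolean {
  const entry = pending.get(requestId);
  if (!entry) return false;
  clearTimeout(entry.timer);
  pending.delete(requestId);
  entry.resolve(payload);
  return true;
}
```

Keying on `Date.now().toString()`, as the excerpt does, could in principle collide for two requests issued in the same millisecond; a counter or UUID would avoid that, though whether it matters depends on how the server is used.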

browser-tools-server/browser-connector.ts:620-632 (registration)

Express route registration for POST /capture-screenshot endpoint in BrowserConnector constructor.

this.app.post(
  "/capture-screenshot",
  async (req: express.Request, res: express.Response) => {
    console.log(
      "Browser Connector: Received request to /capture-screenshot endpoint"
    );
    console.log("Browser Connector: Request body:", req.body);
    console.log(
      "Browser Connector: Active WebSocket connection:",
      !!this.activeConnection
    );
    await this.captureScreenshot(req, res);
  }
);

chrome-extension/devtools.js:967-1005 (handler)

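Chrome extension WebSocket message handler for 'take-screenshot': calls chrome.tabs.captureVisibleTab to capture the visible area of the tab as a base64 PNG data URL, then sends it back over the WebSocket together with the configured screenshot path (if any) and the autoPaste setting. On failure it sends a 'screenshot-error' message instead.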

} else if (message.type === "take-screenshot") {
  console.log("Chrome Extension: Taking screenshot...");
  // Capture screenshot of the current tab
  chrome.tabs.captureVisibleTab(null, { format: "png" }, (dataUrl) => {
    if (chrome.runtime.lastError) {
      console.error(
        "Chrome Extension: Screenshot capture failed:",
        chrome.runtime.lastError
      );
      ws.send(
        JSON.stringify({
          type: "screenshot-error",
          error: chrome.runtime.lastError.message,
          requestId: message.requestId,
        })
      );
      return;
    }

    console.log("Chrome Extension: Screenshot captured successfully");
    // Just send the screenshot data, let the server handle paths
    const response = {
      type: "screenshot-data",
      data: dataUrl,
      requestId: message.requestId,
      // Only include path if it's configured in settings
      ...(settings.screenshotPath && { path: settings.screenshotPath }),
      // Include auto-paste setting
      autoPaste: settings.allowAutoPaste,
    };

    console.log("Chrome Extension: Sending screenshot data response", {
      ...response,
      data: "[base64 data]",
    });

    ws.send(JSON.stringify(response));
  });
} else if (message.type === "get-current-url") {
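
The data URL sent by the extension is turned into a file on the server side by stripping the `data:image/png;base64,` prefix and building a timestamped filename. A minimal sketch of those two steps follows, assuming a Node.js runtime. `decodePngDataUrl` and `screenshotFilename` are hypothetical helper names, and the PNG signature check is an addition that the original code does not perform.

```typescript
import { Buffer } from "node:buffer";

// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,
]);

// Strip the data URL prefix (if present), decode the base64 payload,
// and verify that the result looks like a PNG.
function decodePngDataUrl(dataUrl: string): Buffer {
  const base64 = dataUrl.replace(/^data:image\/png;base64,/, "");
  const bytes = Buffer.from(base64, "base64");
  const header = bytes.subarray(0, PNG_SIGNATURE.length);
  if (!header.equals(PNG_SIGNATURE)) {
    throw new Error("Decoded data does not start with the PNG signature");
  }
  return bytes;
}

// Same naming scheme as the connector: ISO timestamp with ":" and "."
// replaced by "-" so the name is valid on all platforms.
function screenshotFilename(date: Date): string {
  const timestamp = date.toISOString().replace(/[:.]/g, "-");
  return `screenshot-${timestamp}.png`;
}
```

The decoded buffer could then be written with `fs.writeFileSync(fullPath, bytes)`, which is equivalent to the connector's `writeFileSync(fullPath, cleanBase64, "base64")` call.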

Tool Definition Quality

Grade: B (3.1/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden but offers minimal behavioral context. It states what the tool does but doesn't disclose important traits like whether it requires specific permissions, how it handles errors, what format the screenshot returns, or if it affects browser state.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with zero wasted words. It's appropriately sized for a simple tool and front-loads the essential information without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no annotations and no output schema, the description is insufficiently complete. It doesn't explain what the tool returns (image format, size, encoding) or important behavioral aspects like error conditions, making it inadequate for an agent to use confidently.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters with 100% schema description coverage, so the schema already fully documents the lack of inputs. The description doesn't need to add parameter information, and it correctly implies no parameters are required for this operation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('take a screenshot') and target ('current browser tab'), providing a specific verb+resource combination. However, it doesn't differentiate from sibling tools like 'getSelectedElement' or 'runAuditMode' which might also capture visual elements in different contexts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., browser must be open), exclusions, or how it differs from sibling tools that might capture visual data in other ways.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
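
Several of the gaps noted above (prerequisites, return value, side effects, error conditions) could be closed with a fuller description and behavior hints. The following is only a sketch: the wording is drawn from the implementation excerpts on this page, the constant names are hypothetical, and the local `ToolAnnotations` interface mirrors a subset of the hint fields defined by the MCP specification rather than importing them from an SDK.

```typescript
// Subset of the MCP tool annotation hints, declared locally for illustration.
interface ToolAnnotations {
  title?: string;
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
}

// Hypothetical expanded description, based on the code excerpts above.
const takeScreenshotDescription = [
  "Take a PNG screenshot of the visible area of the current browser tab.",
  "Requires the BrowserTools Chrome extension to be connected to the connector server.",
  "Saves the image to disk on the connector's machine and returns a text status message, not image data.",
  "On macOS with auto-paste enabled, the image may also be pasted into Cursor.",
  "Fails with an error message if the extension is not connected or does not respond within 10 seconds.",
].join(" ");

const takeScreenshotAnnotations: ToolAnnotations = {
  title: "Take screenshot",
  readOnlyHint: false, // writes a file to disk
  destructiveHint: false, // does not overwrite or delete existing data
  idempotentHint: false, // every call creates a new timestamped file
};
```

This trades some of the conciseness score for behavior and completeness; whether that is the right balance depends on how many sibling tools compete for the agent's attention.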

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/oenius/browser-tools-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.