BrowserGenie MCP Server

by BrowserGenie

screenshot_element

Take a screenshot of a single webpage element by providing its CSS selector or XPath, capturing only that part.

Instructions

Take a screenshot of a specific element only. Provide a CSS selector or XPath to capture just that element.

Input Schema

Name          Required  Description                                          Default
selector      Yes       CSS selector or XPath for the element to screenshot  -
selectorType  No        Selector type                                        css
format        No        Image format: png (default) or jpeg                  -
quality       No        JPEG quality 0-100                                   -
tabId         No        Target tab ID (defaults to currently active tab)     -
apiKey        No        API key for authentication if enabled                -
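
To make the defaults and ranges in the input schema concrete, here is a minimal sketch that mirrors the parameters in plain TypeScript. The names `ScreenshotElementArgs` and `normalizeArgs` are hypothetical and not part of BrowserGenie; the server itself validates input with Zod. The sketch assumes only what the schema states: `selectorType` falls back to `css`, and `quality` must lie between 0 and 100. The `png` default for `format` is stated in the parameter description but is not a schema-level default, so the sketch leaves `format` untouched.

```typescript
// Hypothetical helper (not part of BrowserGenie): restates the documented
// input parameters so the defaults and ranges are explicit.
type SelectorType = 'css' | 'xpath';
type ImageFormat = 'png' | 'jpeg';

interface ScreenshotElementArgs {
  selector: string;            // required
  selectorType?: SelectorType; // schema default: 'css'
  format?: ImageFormat;        // described as defaulting to png
  quality?: number;            // JPEG quality, 0-100
  tabId?: number;              // defaults to the currently active tab
  apiKey?: string;             // only needed if authentication is enabled
}

function normalizeArgs(args: ScreenshotElementArgs): ScreenshotElementArgs {
  if (args.quality !== undefined && (args.quality < 0 || args.quality > 100)) {
    throw new RangeError('quality must be between 0 and 100');
  }
  // selectorType is the only field with a schema-level default.
  return { ...args, selectorType: args.selectorType ?? 'css' };
}

// A CSS selector needs no selectorType; an XPath must be flagged explicitly.
const cssArgs = normalizeArgs({ selector: '#checkout-button' });
const xpathArgs = normalizeArgs({
  selector: '//table[@id="prices"]',
  selectorType: 'xpath',
  format: 'jpeg',
  quality: 80,
});
```

Because `selectorType` defaults to `css`, passing an XPath expression without `selectorType: 'xpath'` would presumably be interpreted as a CSS selector, so it is safest to set it whenever the selector is an XPath.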

Implementation Reference

  • The handler function for the 'screenshot_element' tool. It accepts a CSS selector or XPath, optionally a selectorType, format, quality, tabId, and apiKey. It sends the 'screenshot_element' command via the WebSocket bridge and returns the resulting image.
    server.tool(
      'screenshot_element',
      'Take a screenshot of a specific element only. Provide a CSS selector or XPath to capture just that element.',
      {
        selector: z.string().describe('CSS selector or XPath for the element to screenshot'),
        selectorType: z.enum(['css', 'xpath']).optional().default('css').describe('Selector type'),
        format: z.enum(['png', 'jpeg']).optional().describe('Image format: png (default) or jpeg'),
        quality: z.number().min(0).max(100).optional().describe('JPEG quality 0-100'),
        tabId: z.number().optional().describe('Target tab ID (defaults to currently active tab)'),
        apiKey: z.string().optional().describe('API key for authentication if enabled'),
      },
      async ({ selector, selectorType, format, quality, tabId, apiKey }) => {
        const result = await bridge.sendCommand({
          command: 'screenshot_element',
          params: { selector, selectorType, format, quality },
          tabId,
          apiKey,
          timeout: LONG_TIMEOUT,
        });
        if (!result.success) {
          return { content: [{ type: 'text', text: `Error: ${result.error?.message}` }], isError: true };
        }
        const data = result.data as { image: string; mimeType: string };
        return {
          content: [{
            type: 'image',
            data: data.image,
            mimeType: data.mimeType,
          }],
        };
      }
    );
  • Input schema for screenshot_element using Zod validation. Accepts: selector (required string), selectorType (optional enum 'css'/'xpath', default 'css'), format (optional enum 'png'/'jpeg'), quality (optional 0-100), tabId (optional number), apiKey (optional string).
    {
      selector: z.string().describe('CSS selector or XPath for the element to screenshot'),
      selectorType: z.enum(['css', 'xpath']).optional().default('css').describe('Selector type'),
      format: z.enum(['png', 'jpeg']).optional().describe('Image format: png (default) or jpeg'),
      quality: z.number().min(0).max(100).optional().describe('JPEG quality 0-100'),
      tabId: z.number().optional().describe('Target tab ID (defaults to currently active tab)'),
      apiKey: z.string().optional().describe('API key for authentication if enabled'),
    },
  • The tool 'screenshot_element' is registered with the MCP server via server.tool() inside registerScreenshotTools(). This function is called from src/tools/index.ts line 33.
  • Registration call chain: registerAllTools() calls registerScreenshotTools(), which registers 'screenshot_element' on the MCP server.
    registerScreenshotTools(server, bridge);
    registerClickTools(server, bridge);
    registerInputTools(server, bridge);
  • WebSocketBridge.sendCommand() is the helper that dispatches the 'screenshot_element' command to the Chrome extension via WebSocket, returning the result asynchronously.
    export class WebSocketBridge {
      private wss: WebSocketServer;
      private client: WebSocket | null = null;
      private pending = new Map<string, PendingRequest>();
      private port: number;
    
      constructor(port: number) {
        this.port = port;
        this.wss = new WebSocketServer({ port });
    
        this.wss.on('connection', (ws) => {
          if (this.client) {
            this.client.close();
          }
          this.client = ws;
          console.error(`[MCP Bridge] Extension connected on port ${this.port}`);
    
          ws.on('message', (data) => {
            try {
              const message = JSON.parse(data.toString());
              if (message.type === 'response' && message.requestId) {
                const pending = this.pending.get(message.requestId);
                if (pending) {
                  clearTimeout(pending.timer);
                  this.pending.delete(message.requestId);
                  pending.resolve({
                    success: message.success,
                    data: message.data,
                    error: message.error,
                  });
                }
              }
            } catch (err) {
              console.error('[MCP Bridge] Failed to parse message:', err);
            }
          });
    
          ws.on('close', () => {
            console.error('[MCP Bridge] Extension disconnected');
            if (this.client === ws) {
              this.client = null;
            }
            this.rejectAllPending('Extension disconnected');
          });
    
          ws.on('error', (err) => {
            console.error('[MCP Bridge] WebSocket error:', err.message);
          });
        });
    
        this.wss.on('listening', () => {
          console.error(`[MCP Bridge] WebSocket server listening on port ${this.port}`);
        });
      }
    
      isConnected(): boolean {
        return this.client !== null && this.client.readyState === WebSocket.OPEN;
      }
    
      async sendCommand(cmd: BridgeCommand): Promise<BridgeResponse> {
        if (!this.isConnected()) {
          return {
            success: false,
            error: {
              code: 'NOT_CONNECTED',
              message: 'Chrome extension is not connected. Ensure the extension is installed, enabled, and the browser is running.',
            },
          };
        }
    
        const id = crypto.randomUUID();
        const timeout = cmd.timeout ?? DEFAULT_TIMEOUT;
    
        return new Promise<BridgeResponse>((resolve, reject) => {
          const timer = setTimeout(() => {
            this.pending.delete(id);
            resolve({
              success: false,
              error: {
                code: 'TIMEOUT',
                message: `Command '${cmd.command}' timed out after ${timeout}ms`,
              },
            });
          }, timeout);
    
          this.pending.set(id, { resolve, reject, timer });
    
          const message = {
            id,
            type: 'request',
            command: cmd.command,
            params: cmd.params,
            tabId: cmd.tabId,
            apiKey: cmd.apiKey,
            timestamp: Date.now(),
          };
    
          this.client!.send(JSON.stringify(message));
        });
      }
    
      private rejectAllPending(reason: string): void {
        for (const [id, pending] of this.pending) {
          clearTimeout(pending.timer);
          pending.resolve({
            success: false,
            error: {
              code: 'CONNECTION_LOST',
              message: reason,
            },
          });
        }
        this.pending.clear();
      }
    
      async close(): Promise<void> {
        this.rejectAllPending('Server shutting down');
        if (this.client) {
          this.client.close();
          this.client = null;
        }
        return new Promise((resolve) => {
          this.wss.close(() => resolve());
        });
      }
    }
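
The request/response correlation inside `sendCommand()` can be isolated into a small sketch: each request is stored under an id together with a timer, and whichever happens first (a matching response or the timeout) settles the promise. The class `PendingRequests` and its `register`/`settle` methods are illustrative names, and the `BridgeResponse` shape is an assumption inferred from how the excerpt above uses it; the actual type definitions are not shown in the source.

```typescript
// Assumed response shape, inferred from the fields used in the bridge excerpt.
interface BridgeResponse {
  success: boolean;
  data?: unknown;
  error?: { code: string; message: string };
}

interface Pending {
  resolve: (response: BridgeResponse) => void;
  timer: ReturnType<typeof setTimeout>;
}

// Illustrative class: tracks in-flight requests by id.
class PendingRequests {
  private pending = new Map<string, Pending>();

  // The returned promise always resolves (never rejects), mirroring the
  // bridge, which reports a timeout as a { success: false } result.
  register(id: string, timeoutMs: number): Promise<BridgeResponse> {
    return new Promise<BridgeResponse>((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        resolve({
          success: false,
          error: { code: 'TIMEOUT', message: `Timed out after ${timeoutMs}ms` },
        });
      }, timeoutMs);
      this.pending.set(id, { resolve, timer });
    });
  }

  // Called when a response arrives; returns false for unknown or late ids.
  settle(id: string, response: BridgeResponse): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(id);
    entry.resolve(response);
    return true;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Resolving with an error object instead of rejecting lets tool handlers use a single `if (!result.success)` branch for timeouts, disconnects, and command failures alike.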
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description should disclose behavioral details like visibility requirements, scrolling behavior, or error handling for missing elements, but it only states the basic function, leaving important gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short sentences with no wasted words; they immediately communicate the action and the required input (CSS selector or XPath), satisfying conciseness and front-loading.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with six parameters and no output schema, the description is too brief: it omits the return value, the behavior when the element is missing, scrolling, and tab handling, leaving significant gaps for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
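
Since the description does not document the return value, the following sketch shows one way a caller might handle the two result shapes that the handler in the Implementation Reference produces: a single image item on success, or a text item with `isError: true` on failure. `ToolContent`, `ToolResult`, `base64ByteLength`, and `describeResult` are illustrative local names rather than SDK types, and the sketch assumes the image data is standard padded base64.

```typescript
// Content shapes taken from the handler's return statements; these are
// local illustrative types, not imports from an SDK.
type ToolContent =
  | { type: 'image'; data: string; mimeType: string }
  | { type: 'text'; text: string };

interface ToolResult {
  content: ToolContent[];
  isError?: boolean;
}

// Size in bytes of a standard padded base64 string, computed without decoding.
function base64ByteLength(b64: string): number {
  const padding = b64.endsWith('==') ? 2 : b64.endsWith('=') ? 1 : 0;
  return (b64.length / 4) * 3 - padding;
}

// Summarizes a result: the error text on failure, or type and size on success.
function describeResult(result: ToolResult): string {
  const first = result.content[0];
  if (result.isError || first === undefined || first.type === 'text') {
    return first !== undefined && first.type === 'text' ? first.text : 'Unknown error';
  }
  return `${first.mimeType}, ${base64ByteLength(first.data)} bytes`;
}
```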

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All parameters are covered by schema descriptions (100% coverage), so the description adds little value beyond clarifying that selector is for targeting the element; baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool captures a screenshot of a specific element, using a CSS selector or XPath, which distinguishes it from sibling tools like screenshot_full_page and screenshot_viewport.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for capturing a specific element rather than full page or viewport, but lacks explicit guidance on when to use this tool versus alternatives, prerequisites, or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/BrowserGenie/mcp'
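
The same request can be sketched in TypeScript with the built-in `fetch`. The helper names `serverUrl` and `fetchServerInfo` are hypothetical, the owner/name path pattern is generalized from the single URL above and is an assumption, and the response is typed as `unknown` because its shape is not documented here.

```typescript
const MCP_API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Builds the endpoint URL; the two-segment path is inferred from the
// curl example and may not hold for every server.
function serverUrl(owner: string, name: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Performs the GET request and returns the parsed JSON body as-is.
async function fetchServerInfo(owner: string, name: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, name));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `fetchServerInfo('BrowserGenie', 'mcp')` would issue the same GET request as the curl command.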

If you have feedback or need assistance with the MCP directory API, please join our Discord server.