Better Playwright MCP

by livoras

browserPressKey

Simulate keyboard key presses on web pages using Playwright, with options to target specific elements and adjust wait times for efficient browser automation.

Instructions

Press a keyboard key (original: 按键盘按键)

Input Schema

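Name | Required | Description | Default
key | Yes | Name of the key to press, such as 'Enter', 'Tab', or 'ArrowDown' |
pageId | Yes | Page ID |
ref | No | Optional: the element's xp reference value; if given, the key is pressed on that element |
waitForTimeout | No | Delay to wait after the action before capturing the snapshot (milliseconds, default 2000) |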
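
As a rough illustration of the schema above, the sketch below builds and checks an arguments object for this tool. The helper name `buildPressKeyArgs`, the sample `pageId` value, and the assumption that `pageId` is a string are all made up for the example and are not part of the server; the 2000 ms fallback mirrors the default stated in the table.

```javascript
// Hypothetical helper (not part of Better Playwright MCP): checks the
// arguments described in the table above before they are sent to the tool.
function buildPressKeyArgs({ key, pageId, ref, waitForTimeout } = {}) {
    if (typeof key !== 'string' || key.length === 0) {
        throw new TypeError("'key' is required, e.g. 'Enter', 'Tab', 'ArrowDown'");
    }
    // Assumption: pageId is a non-empty string; the table only says "Page ID".
    if (typeof pageId !== 'string' || pageId.length === 0) {
        throw new TypeError("'pageId' is required");
    }
    const args = { key, pageId };
    // ref is optional; include it only when the caller targets an element.
    if (ref !== undefined) args.ref = ref;
    // Fall back to the documented default of 2000 ms.
    args.waitForTimeout = waitForTimeout ?? 2000;
    return args;
}

const pressKeyArgs = buildPressKeyArgs({ key: 'Enter', pageId: 'page-1' });
console.log(JSON.stringify(pressKeyArgs));
```

Omitting `ref` leaves the key press untargeted, which matches the table's wording that `ref` is only needed to press the key on a specific element.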

Implementation Reference

  • The handler function that presses the specified key using the page's keyboard.press method. This is the core implementation of the tool logic.
    handle: async (tab, params, response) => {
        response.setIncludeSnapshot();
        response.addCode(`// Press ${params.key}`);
        response.addCode(`await page.keyboard.press('${params.key}');`);
        await tab.waitForCompletion(async () => {
            await tab.page.keyboard.press(params.key);
        });
    },
  • Tool schema defining the name 'browser_press_key' (maps to client 'browserPressKey'), input schema requiring 'key' string, and metadata.
    schema: {
        name: 'browser_press_key',
        title: 'Press a key',
        description: 'Press a key on the keyboard',
        inputSchema: z.object({
            key: z.string().describe('Name of the key to press or a character to generate, such as `ArrowLeft` or `a`'),
        }),
        type: 'destructive',
    },
  • lib/tools.js:32-49 (registration)
    Central registration of all browser tools by spreading exports from individual modules, including keyboard.js which provides browser_press_key.
    export const allTools = [
        ...common,
        ...console,
        ...dialogs,
        ...evaluate,
        ...files,
        ...form,
        ...install,
        ...keyboard,
        ...navigate,
        ...network,
        ...mouse,
        ...pdf,
        ...screenshot,
        ...snapshot,
        ...tabs,
        ...wait,
    ];
  • MCP backend registration: imports filteredTools, sets _tools, and exposes tool schemas via listTools() for the protocol.
    import { filteredTools } from './tools.js';
    import { toMcpTool } from './mcp/tool.js';
    export class BrowserServerBackend {
        _tools;
        _context;
        _sessionLog;
        _config;
        _browserContextFactory;
        constructor(config, factory) {
            this._config = config;
            this._browserContextFactory = factory;
            this._tools = filteredTools(config);
        }
        async initialize(server, clientVersion, roots) {
            let rootPath;
            if (roots.length > 0) {
                const firstRootUri = roots[0]?.uri;
                const url = firstRootUri ? new URL(firstRootUri) : undefined;
                rootPath = url ? fileURLToPath(url) : undefined;
            }
            this._sessionLog = this._config.saveSession ? await SessionLog.create(this._config, rootPath) : undefined;
            this._context = new Context({
                tools: this._tools,
                config: this._config,
                browserContextFactory: this._browserContextFactory,
                sessionLog: this._sessionLog,
                clientInfo: { ...clientVersion, rootPath },
            });
        }
        async listTools() {
            return this._tools.map(tool => toMcpTool(tool.schema));
        }
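
To see what the handler quoted above does without launching a browser, it can be run against small stand-in objects. In this sketch only the `handle` body comes from the excerpt; `createFakes`, `demoPressKey`, and the shape of the fake `tab` and `response` objects are assumptions made for illustration, and the real `waitForCompletion` presumably does more than invoke its callback.

```javascript
// The handler body is copied from the excerpt above; everything else is a stand-in.
const pressKeyTool = {
    handle: async (tab, params, response) => {
        response.setIncludeSnapshot();
        response.addCode(`// Press ${params.key}`);
        response.addCode(`await page.keyboard.press('${params.key}');`);
        await tab.waitForCompletion(async () => {
            await tab.page.keyboard.press(params.key);
        });
    },
};

// Minimal fakes that record what the handler does instead of driving a browser.
function createFakes() {
    const pressed = [];
    const tab = {
        page: { keyboard: { press: async key => { pressed.push(key); } } },
        // Assumption: this fake only runs the callback; the real method may also wait.
        waitForCompletion: async callback => { await callback(); },
    };
    const response = {
        includeSnapshot: false,
        code: [],
        setIncludeSnapshot() { this.includeSnapshot = true; },
        addCode(line) { this.code.push(line); },
    };
    return { tab, response, pressed };
}

async function demoPressKey(key) {
    const { tab, response, pressed } = createFakes();
    await pressKeyTool.handle(tab, { key }, response);
    return { pressed, code: response.code, includeSnapshot: response.includeSnapshot };
}

demoPressKey('Enter').then(result => console.log(result));
```

The recorded values show the three effects visible in the excerpt: a snapshot is requested, two lines of generated code are added to the response, and the key is pressed once through `page.keyboard.press`.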
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden for behavioral disclosure. The description only states the action ('press keyboard key') without mentioning any behavioral traits such as what happens after pressing (e.g., page navigation, form submission), error conditions, or performance implications. It lacks context about browser state changes or interaction effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at just five Chinese characters ('按键盘按键'), which translates to 'Press keyboard key'. It is front-loaded with no wasted words, making it easy to parse quickly. Every character serves the core purpose without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (browser automation with 4 parameters) and lack of annotations and output schema, the description is incomplete. It doesn't explain what the tool returns, how it interacts with browser state, or potential side effects. For a tool that performs actions in a browser environment, more context about behavior and outcomes is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 4 parameters thoroughly. The description adds no additional meaning beyond what's in the schema (e.g., it doesn't explain parameter relationships or usage examples). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate but doesn't detract either.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '按键盘按键' (Press keyboard key) is a tautology that essentially restates the tool name 'browserPressKey' in Chinese. It doesn't specify what resource is being acted upon (a browser page) or distinguish this from sibling tools like 'browserType' (which also involves keyboard input). The purpose is stated but lacks specificity and differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided about when to use this tool versus alternatives. For example, it doesn't explain when to use 'browserPressKey' versus 'browserType' (which types text) or other browser interaction tools. The description offers no context about appropriate use cases or exclusions.
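
One way to picture the missing guidance is a small routing rule on the client side. The sketch below is hypothetical and not taken from the server: `chooseKeyboardTool` and the `NAMED_KEYS` list are invented for the example, and it assumes that 'browserType' is the right choice for free text while 'browserPressKey' handles named keys, modifier combinations, and single characters.

```javascript
// Hypothetical routing rule, not part of Better Playwright MCP.
// Named keys use Playwright's key names; the list is illustrative, not exhaustive.
const NAMED_KEYS = new Set([
    'Enter', 'Tab', 'Escape', 'Backspace', 'Delete',
    'ArrowUp', 'ArrowDown', 'ArrowLeft', 'ArrowRight',
    'Home', 'End', 'PageUp', 'PageDown',
]);

function chooseKeyboardTool(input) {
    const isNamedKey = NAMED_KEYS.has(input);
    // Modifier combinations such as 'Control+A'.
    const isCombo = /^(Control|Shift|Alt|Meta)\+.+$/.test(input);
    // A single character, counted by code point.
    const isSingleChar = [...input].length === 1;
    return isNamedKey || isCombo || isSingleChar ? 'browserPressKey' : 'browserType';
}

console.log(chooseKeyboardTool('Enter'));
console.log(chooseKeyboardTool('hello world'));
```

A tool description that stated a rule of this kind in prose would address the reviewer's point about distinguishing the two tools.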

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/livoras/better-playwright-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.