MCP Selenium Server
A Model Context Protocol (MCP) server implementation for Selenium WebDriver, enabling browser automation through standardized MCP clients.
Features
- Start browser sessions with customizable options
- Navigate to URLs
- Find elements using various locator strategies
- Click, type, and interact with elements
- Perform mouse actions (hover, drag and drop)
- Handle keyboard input
- Take screenshots
- Upload files
- Support for headless mode
Supported Browsers
- Chrome
- Firefox
- MS Edge
Use with Goose
Option 1: One-click install
Copy and paste the link below into a browser address bar to add this extension to Goose Desktop:
Option 2: Add manually to desktop or CLI
- Name: Selenium MCP
- Description: automates browser interactions
- Command: npx -y @angiejones/mcp-selenium
Use with other MCP clients (e.g. Claude Desktop)
Development
To work on this project:
- Clone the repository
- Install dependencies: npm install
- Run the server: npm start
Installation
Installing via Smithery
To install MCP Selenium for Claude Desktop automatically via Smithery:
Manual Installation
Usage
Start the server by running:
Or use with NPX in your MCP configuration:
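A minimal sketch of such a configuration, assuming a client that reads an `mcpServers` map the way Claude Desktop does. The `selenium` entry name is an arbitrary label chosen for illustration; the command and arguments mirror the `npx -y @angiejones/mcp-selenium` invocation listed in the Goose section.

```javascript
// Sketch: build the MCP client configuration entry for this server.
// "selenium" is an arbitrary label; the client only needs command and args.
const config = {
  mcpServers: {
    selenium: {
      command: "npx",
      args: ["-y", "@angiejones/mcp-selenium"],
    },
  },
};

// Print the JSON to paste into the client's configuration file.
console.log(JSON.stringify(config, null, 2));
```

The printed JSON is what goes into the client's configuration file; where that file lives depends on the client.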
Tools
start_browser
Launches a browser session.
Parameters:
- browser (required): Browser to launch
  - Type: string
  - Enum: ["chrome", "firefox"]
- options: Browser configuration options
  - Type: object
  - Properties:
    - headless: Run browser in headless mode
      - Type: boolean
    - arguments: Additional browser arguments
      - Type: array of strings
Example:
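A minimal sketch of a start_browser call, assuming the client sends it as an MCP `tools/call` request over JSON-RPC 2.0. The request id and the `--no-sandbox` browser argument are illustrative values, not requirements of the server.

```javascript
// Sketch: MCP "tools/call" request that launches headless Chrome.
// The id and the "--no-sandbox" flag are illustrative assumptions.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "start_browser",
    arguments: {
      browser: "chrome",
      options: { headless: true, arguments: ["--no-sandbox"] },
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```

Most MCP clients build this envelope themselves, so in practice only the `arguments` object is written by hand or by the model.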
navigate
Navigates to a URL.
Parameters:
- url (required): URL to navigate to
  - Type: string
Example:
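A sketch of a navigate call in the MCP `tools/call` shape (JSON-RPC 2.0). The request id and the URL are placeholders.

```javascript
// Sketch: MCP "tools/call" request that opens a page.
// The URL is a placeholder; any absolute URL string fits the url parameter.
const request = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "navigate",
    arguments: { url: "https://www.example.com" },
  },
};

console.log(JSON.stringify(request, null, 2));
```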
find_element
Finds an element on the page.
Parameters:
- by (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the locator strategy
  - Type: string
- timeout: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000
Example:
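A sketch of a find_element call in the MCP `tools/call` shape, with an explicit timeout that overrides the 10000 ms default. The element id `search-input` is hypothetical.

```javascript
// Sketch: locate an element by id, waiting at most 5 seconds.
// "search-input" is a hypothetical element id used for illustration.
const request = {
  jsonrpc: "2.0",
  id: 3,
  method: "tools/call",
  params: {
    name: "find_element",
    arguments: { by: "id", value: "search-input", timeout: 5000 },
  },
};

console.log(JSON.stringify(request, null, 2));
```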
click_element
Clicks an element.
Parameters:
- by (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the locator strategy
  - Type: string
- timeout: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000
Example:
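A sketch of a click_element call in the MCP `tools/call` shape. The CSS selector is hypothetical, and `timeout` is omitted so the documented default of 10000 ms applies.

```javascript
// Sketch: click a button found by CSS selector.
// ".submit-button" is a hypothetical selector; timeout is left to its default.
const request = {
  jsonrpc: "2.0",
  id: 4,
  method: "tools/call",
  params: {
    name: "click_element",
    arguments: { by: "css", value: ".submit-button" },
  },
};

console.log(JSON.stringify(request, null, 2));
```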
send_keys
Sends keys to an element (typing).
Parameters:
- by (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the locator strategy
  - Type: string
- text (required): Text to enter into the element
  - Type: string
- timeout: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000
Example:
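A sketch of a send_keys call in the MCP `tools/call` shape, typing into a field located by its name attribute. The field name and the text are hypothetical.

```javascript
// Sketch: type a value into a form field found by its name attribute.
// "username" and "testuser" are hypothetical values for illustration.
const request = {
  jsonrpc: "2.0",
  id: 5,
  method: "tools/call",
  params: {
    name: "send_keys",
    arguments: { by: "name", value: "username", text: "testuser" },
  },
};

console.log(JSON.stringify(request, null, 2));
```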
get_element_text
Gets the text of an element.
Parameters:
- by (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the locator strategy
  - Type: string
- timeout: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000
Example:
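A sketch of a get_element_text call in the MCP `tools/call` shape. The selector `.message` is hypothetical.

```javascript
// Sketch: read the text of an element found by CSS selector.
// ".message" is a hypothetical selector used for illustration.
const request = {
  jsonrpc: "2.0",
  id: 6,
  method: "tools/call",
  params: {
    name: "get_element_text",
    arguments: { by: "css", value: ".message" },
  },
};

console.log(JSON.stringify(request, null, 2));
```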
hover
Moves the mouse to hover over an element.
Parameters:
- by (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the locator strategy
  - Type: string
- timeout: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000
Example:
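A sketch of a hover call in the MCP `tools/call` shape, for example to reveal a menu that only opens on mouse-over. The selector is hypothetical.

```javascript
// Sketch: move the mouse over an element found by CSS selector.
// ".dropdown-menu" is a hypothetical selector used for illustration.
const request = {
  jsonrpc: "2.0",
  id: 7,
  method: "tools/call",
  params: {
    name: "hover",
    arguments: { by: "css", value: ".dropdown-menu" },
  },
};

console.log(JSON.stringify(request, null, 2));
```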
drag_and_drop
Drags an element and drops it onto another element.
Parameters:
- by (required): Locator strategy for source element
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the source locator strategy
  - Type: string
- targetBy (required): Locator strategy for target element
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- targetValue (required): Value for the target locator strategy
  - Type: string
- timeout: Maximum time to wait for elements in milliseconds
  - Type: number
  - Default: 10000
Example:
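A sketch of a drag_and_drop call in the MCP `tools/call` shape. The source is described by `by`/`value` and the drop target by `targetBy`/`targetValue`; both element ids are hypothetical.

```javascript
// Sketch: drag one element onto another.
// "draggable" and "droppable" are hypothetical element ids.
const request = {
  jsonrpc: "2.0",
  id: 8,
  method: "tools/call",
  params: {
    name: "drag_and_drop",
    arguments: {
      by: "id",
      value: "draggable", // source element
      targetBy: "id",
      targetValue: "droppable", // drop target
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```

The two locator strategies are independent, so the source could be found by id while the target is found by CSS or XPath.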
double_click
Performs a double click on an element.
Parameters:
- by (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the locator strategy
  - Type: string
- timeout: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000
Example:
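A sketch of a double_click call in the MCP `tools/call` shape. The selector `.editable-text` is hypothetical.

```javascript
// Sketch: double-click an element found by CSS selector.
// ".editable-text" is a hypothetical selector used for illustration.
const request = {
  jsonrpc: "2.0",
  id: 9,
  method: "tools/call",
  params: {
    name: "double_click",
    arguments: { by: "css", value: ".editable-text" },
  },
};

console.log(JSON.stringify(request, null, 2));
```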
right_click
Performs a right click (context click) on an element.
Parameters:
- by (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the locator strategy
  - Type: string
- timeout: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000
Example:
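A sketch of a right_click call in the MCP `tools/call` shape. The selector `.context-menu-trigger` is hypothetical.

```javascript
// Sketch: open a context menu by right-clicking an element.
// ".context-menu-trigger" is a hypothetical selector used for illustration.
const request = {
  jsonrpc: "2.0",
  id: 10,
  method: "tools/call",
  params: {
    name: "right_click",
    arguments: { by: "css", value: ".context-menu-trigger" },
  },
};

console.log(JSON.stringify(request, null, 2));
```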
press_key
Simulates pressing a keyboard key.
Parameters:
- key (required): Key to press (e.g., 'Enter', 'Tab', 'a', etc.)
  - Type: string
Example:
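A sketch of a press_key call in the MCP `tools/call` shape, using `Enter`, one of the key names listed in the parameter description. The request id is illustrative.

```javascript
// Sketch: press the Enter key, e.g. to submit a focused form field.
// No locator is passed; key is the tool's only parameter.
const request = {
  jsonrpc: "2.0",
  id: 11,
  method: "tools/call",
  params: {
    name: "press_key",
    arguments: { key: "Enter" },
  },
};

console.log(JSON.stringify(request, null, 2));
```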
upload_file
Uploads a file using a file input element.
Parameters:
- by (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- value (required): Value for the locator strategy
  - Type: string
- filePath (required): Absolute path to the file to upload
  - Type: string
- timeout: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000
Example:
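A sketch of an upload_file call in the MCP `tools/call` shape. The element id and the file path are placeholders; the path must be absolute on the machine where the server runs.

```javascript
// Sketch: upload a file through a file input element.
// "file-input" and "/path/to/file.pdf" are placeholders for illustration.
const request = {
  jsonrpc: "2.0",
  id: 12,
  method: "tools/call",
  params: {
    name: "upload_file",
    arguments: {
      by: "id",
      value: "file-input",
      filePath: "/path/to/file.pdf", // must be an absolute path
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```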
take_screenshot
Captures a screenshot of the current page.
Parameters:
- outputPath (optional): Path where the screenshot should be saved. If not provided, the tool returns base64 data.
  - Type: string
Example:
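A sketch of take_screenshot calls in the MCP `tools/call` shape, covering both modes described above: saving to a file and returning base64 data. The helper `buildScreenshotCall` and the output path are hypothetical, introduced only for this illustration.

```javascript
// Sketch: build a take_screenshot request, with or without outputPath.
// buildScreenshotCall is a hypothetical helper, not part of the server.
function buildScreenshotCall(id, outputPath) {
  // Leave outputPath out entirely when the caller wants base64 data back.
  const args = outputPath === undefined ? {} : { outputPath };
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "take_screenshot", arguments: args },
  };
}

const toFile = buildScreenshotCall(13, "/path/to/screenshot.png");
const asBase64 = buildScreenshotCall(14);

console.log(JSON.stringify(toFile, null, 2));
console.log(JSON.stringify(asBase64, null, 2));
```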
close_session
Closes the current browser session and cleans up resources.
Parameters: None required
Example:
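A sketch of a close_session call in the MCP `tools/call` shape. The tool takes no parameters, so the arguments object is empty; the request id is illustrative.

```javascript
// Sketch: end the browser session started by start_browser.
// The tool needs no parameters, so arguments is an empty object.
const request = {
  jsonrpc: "2.0",
  id: 15,
  method: "tools/call",
  params: {
    name: "close_session",
    arguments: {},
  },
};

console.log(JSON.stringify(request, null, 2));
```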
License
MIT
Enables browser automation with Selenium WebDriver through MCP, supporting browser management, element location, and both basic and advanced user interactions.