The Selenium MCP Server enables AI assistants to perform comprehensive web browser automation through a standardized MCP interface.
Browser Management: Open, navigate, refresh, resize, and close browser sessions (Chrome, Firefox, Edge) with headless mode and custom arguments. Switch between windows/tabs and manage page history.
Element Interaction: Find elements using various locators (id, css, xpath, name, tag, class, link, partialLink), then click, double-click, right-click, type, clear, hover, drag and drop, and upload files.
Element State & Data: Check element visibility, retrieve text content and attribute values, and wait for elements to appear with configurable timeouts.
Form & Dropdown Handling: Select dropdown options by text or value and interact with form elements.
Advanced Actions: Execute custom JavaScript, simulate keyboard input, scroll to elements, capture screenshots, and switch into/out of iframes.
Multi-context Support: Manage multiple browser windows, tabs, and iframe contexts seamlessly.
Provides comprehensive web browser automation tools including multi-browser support (Chrome, Firefox, Edge), element interaction (click, type, hover, drag & drop), navigation control, wait strategies, and JavaScript execution for automated testing and web scraping.
Selenium MCP Server
This is a server implementation that bridges the gap between MCP clients (AI assistants) and Selenium WebDriver. It exposes Selenium WebDriver's functionalities as MCP tools, allowing AI models to utilize them for tasks like:
- Browser management (launching, navigating, closing browsers)
- Element interaction (clicking, typing, finding elements)
- Web scraping and automated testing
- Advanced operations like screenshots, cookie management, and JavaScript execution
In essence, the Selenium WebDriver MCP setup lets AI assistants use Selenium WebDriver for web automation by communicating with a dedicated Selenium MCP server over the Model Context Protocol. This supports automated web interactions, testing, and data extraction, all driven by the AI.
🚀 Overview
A Model Context Protocol (MCP) server for Selenium that provides comprehensive Selenium WebDriver automation tools for AI assistants and applications. This server enables automated web browser interactions, testing, and scraping through a standardized interface.
Built with TypeScript and modern ES modules, it offers type-safe browser automation capabilities through the Model Context Protocol.
✨ Key Features
- Multi-Browser Support: Chrome, Firefox, Safari, and Edge browser automation
- Comprehensive Element Interaction: Click, type, hover, drag & drop, file uploads
- Advanced Navigation: Forward, backward, refresh, window management
- Wait Strategies: Intelligent waiting for elements and page states
- Type Safety: Full TypeScript implementation with Zod validation
🤝 Integration
MCP Client Integration
Configure your MCP client to connect to the Selenium server:
Standard Configuration (applicable to Windsurf, Warp, Gemini CLI, etc.)
Installation in VS Code
Update your mcp.json in VS Code with the configuration below.
NOTE: If you're new to MCP servers, see the guide "Use MCP servers in VS Code".
Example 'stdio' type connection
Example 'http' type connection
After installation, the Selenium MCP server will be available for use with your GitHub Copilot agent in VS Code.
To install the Selenium MCP server using the VS Code CLI
To install the package using either npm or Smithery
Using npm:
Using Smithery
To install Selenium MCP for Claude Desktop automatically via
Claude Desktop Integration
Add to your Claude Desktop configuration:
Screenshot
Prompts
An example prompt to start AI Agent interaction:
Using Selenium MCP tools, navigate to <https://parabank.parasoft.com/>, click the 'Register' link, sign up using dynamic test data, and click Register. Then generate Selenium tests in <YOUR_FAVOURITE_PROGRAMMING_LANGUAGE> using POM, create tests using Cucumber features and steps, and execute the tests.
Note: For more prompts, see the examples directory of the project.
🛠️ MCP Available Tools
Browser Management Tools
Tool | Description | Parameters |
---|---|---|
browser_open | Open a new browser session | browser, options |
browser_navigate | Navigate to a URL | url |
browser_navigate_back | Navigate back in history | None |
browser_navigate_forward | Navigate forward in history | None |
browser_title | Get the current page title | None |
browser_refresh | Refresh the current page | None |
browser_get_url | Get the current page URL | None |
browser_get_page_source | Get the current page HTML source | None |
browser_maximize | Maximize the browser window | None |
browser_resize | Resize browser window | width, height |
browser_close | Close current browser session | None |
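As a rough illustration of how a client might drive these tools, the sketch below builds MCP `tools/call` requests for a short browser session. The JSON-RPC envelope follows the MCP specification; the argument values (the `"chrome"` browser name, a `headless` key inside `options`, and the example URL) are assumptions for illustration and may not match the server's actual schema.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a request for one tool; ids increase so responses can be matched to requests.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Assumed argument values: the browser name and the options shape are illustrative only.
const session: ToolCallRequest[] = [
  toolCall("browser_open", { browser: "chrome", options: { headless: true } }),
  toolCall("browser_navigate", { url: "https://example.com" }),
  toolCall("browser_title"),
  toolCall("browser_close"),
];

console.log(JSON.stringify(session, null, 2));
```

In practice an MCP client library usually builds this envelope for you; the sketch only makes the tool names and parameters from the table concrete.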
Cookie Management Tools
Tool | Description | Parameters |
---|---|---|
browser_get_cookies | Get all cookies from the current browser session | None |
browser_get_cookie_by_name | Get a specific cookie by name | cookie (cookie name) |
browser_add_cookie_by_name | Add a new cookie to the browser | cookie (cookie name), value |
browser_set_cookie_object | Set a cookie object in the browser | cookie (cookie object as string) |
browser_delete_cookie | Delete a specific cookie by name | value (cookie name to delete) |
browser_delete_cookies | Delete all cookies from the current browser session | None |
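Because browser_set_cookie_object takes its cookie as a string, a caller has to serialize the object first. The sketch below assumes the server expects JSON and that the fields follow the usual WebDriver cookie model (name, value, path, domain, secure, httpOnly); both points are assumptions, not confirmed by this README.

```typescript
// Cookie fields modelled on the WebDriver cookie object; which of them the
// server actually accepts is an assumption.
interface CookieObject {
  name: string;
  value: string;
  path?: string;
  domain?: string;
  secure?: boolean;
  httpOnly?: boolean;
}

// Serialize the cookie into the single string parameter the tool table describes.
function setCookieArgs(cookie: CookieObject): { cookie: string } {
  return { cookie: JSON.stringify(cookie) };
}

const cookieArgs = setCookieArgs({
  name: "session_id",
  value: "abc123",
  path: "/",
  secure: true,
});

console.log(cookieArgs.cookie);
```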
Window Management Tools
Tool | Description | Parameters |
---|---|---|
browser_switch_to_window | Switch to a different browser window by handle | windowHandle |
browser_switch_to_original_window | Switch back to the original browser window | None |
browser_switch_to_window_by_title | Switch to a window by its page title | title |
browser_switch_window_by_index | Switch to a window by its index position | index |
browser_switch_to_window_by_url | Switch to a window by its URL | url |
Element Interaction Tools
Tool | Description | Parameters |
---|---|---|
browser_find_element | Find an element on the page | by, value, timeout |
browser_find_elements | Find multiple elements on the page | by, value, timeout |
browser_click | Click on an element | by, value, timeout |
browser_type | Type text into an element | by, value, text, timeout |
browser_get_element_text | Get text content of element | by, value, timeout |
browser_file_upload | Upload file via input element | by, value, filePath, timeout |
browser_clear | Clear text from an element | by, value, timeout |
browser_get_attribute | Get element attribute value | by, value, attribute, timeout |
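Most element tools share the same by / value / timeout arguments, so it can help to check them before sending a call. The server validates input with Zod; the sketch below is a plain TypeScript stand-in, not the server's schema. The strategy list is taken from the locators named in this document, and treating it as complete is an assumption.

```typescript
// Locator strategy names mentioned in this document; assumed to be the full set.
const STRATEGIES = ["id", "css", "xpath", "name", "tag", "class", "link", "partialLink"] as const;

interface LocatorInput {
  by: string;
  value: string;
  timeout?: number;
}

// Return a list of problems; an empty list means the arguments look usable.
function checkLocatorArgs(input: LocatorInput): string[] {
  const problems: string[] = [];
  if (!(STRATEGIES as readonly string[]).includes(input.by)) {
    problems.push(`unknown locator strategy: ${input.by}`);
  }
  if (input.value.trim() === "") {
    problems.push("value must not be empty");
  }
  if (input.timeout !== undefined && !(input.timeout > 0)) {
    problems.push("timeout must be a positive number");
  }
  return problems;
}

const okProblems = checkLocatorArgs({ by: "css", value: "#username", timeout: 5000 });
const badProblems = checkLocatorArgs({ by: "label", value: "" });

console.log(okProblems);
console.log(badProblems);
```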
Element State Validation Tools
Tool | Description | Parameters |
---|---|---|
browser_element_is_displayed | Check if an element is visible on the page | by, value, timeout |
browser_element_is_enabled | Check if an element is enabled for interaction | by, value, timeout |
browser_element_is_selected | Check if an element is selected (checkboxes, radio buttons) | by, value, timeout |
Frame Management Tools
Tool | Description | Parameters |
---|---|---|
browser_switch_to_frame | Switch to an iframe element | by, value, timeout |
browser_switch_to_parent_frame | Switch to the parent frame (from nested iframe) | None |
browser_switch_to_default_content | Switch back to the main page content | None |
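In Selenium, locators resolve only within the current frame context, so steps that target elements inside an iframe need a frame switch before them and a switch back afterwards. The sketch below wraps a list of planned tool calls that way. The `Step` type, the `insideFrame` helper, and the example locators are hypothetical and exist only for illustration.

```typescript
// A planned tool call: tool name plus its arguments.
interface Step {
  tool: string;
  args: Record<string, unknown>;
}

// Wrap steps so they run inside the given iframe and end back at the main page.
function insideFrame(frame: { by: string; value: string }, steps: Step[]): Step[] {
  return [
    { tool: "browser_switch_to_frame", args: { by: frame.by, value: frame.value } },
    ...steps,
    { tool: "browser_switch_to_default_content", args: {} },
  ];
}

// Hypothetical page: a payment form hosted in an iframe with id "payment-frame".
const plan = insideFrame({ by: "id", value: "payment-frame" }, [
  { tool: "browser_type", args: { by: "name", value: "cardnumber", text: "4111111111111111" } },
  { tool: "browser_click", args: { by: "css", value: "button[type='submit']" } },
]);

console.log(plan.map((step) => step.tool).join(" -> "));
```

For nested iframes, browser_switch_to_parent_frame steps up one level instead of returning all the way to the main page.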
Advanced Action Tools
Tool | Description | Parameters |
---|---|---|
browser_hover | Hover over an element | by, value, timeout |
browser_double_click | Double-click on an element | by, value, timeout |
browser_right_click | Right-click (context menu) | by, value, timeout |
browser_drag_and_drop | Drag from source to target | by, value, targetBy, targetValue, timeout |
browser_wait_for_element | Wait for element to appear | by, value, timeout |
browser_execute_script | Execute JavaScript code | script, args |
browser_screenshot | Take a screenshot | filename (optional) |
browser_select_dropdown_by_text | Select dropdown option by visible text | by, value, text, timeout |
browser_select_dropdown_by_value | Select dropdown option by value | by, value, dropdownValue, timeout |
browser_key_press | Press a keyboard key in the browser | key, timeout |
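browser_drag_and_drop is the one tool here that takes two locators: by / value for the source and targetBy / targetValue for the drop target. A small helper, sketched below, keeps the two from being mixed up. The helper name and the example locators are hypothetical; only the parameter names come from the table above.

```typescript
interface Locator {
  by: string;
  value: string;
}

// Argument names as listed in the tool table.
interface DragAndDropArgs {
  by: string;
  value: string;
  targetBy: string;
  targetValue: string;
  timeout?: number;
}

// Map a source and a target locator onto the flat argument object.
function dragAndDropArgs(source: Locator, target: Locator, timeout?: number): DragAndDropArgs {
  const result: DragAndDropArgs = {
    by: source.by,
    value: source.value,
    targetBy: target.by,
    targetValue: target.value,
  };
  // Only include timeout when the caller supplied one.
  if (timeout !== undefined) {
    result.timeout = timeout;
  }
  return result;
}

const dnd = dragAndDropArgs(
  { by: "id", value: "card-1" },
  { by: "css", value: ".column.done" },
  5000,
);

console.log(dnd);
```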
Scrolling Tools
Tool | Description | Parameters |
---|---|---|
browser_scroll_to_element | Scroll to bring an element into view | by, value, timeout |
browser_scroll_to_top | Scroll to the top of the page | None |
browser_scroll_to_bottom | Scroll to the bottom of the page | None |
browser_scroll_to_coordinates | Scroll to specific coordinates | x, y |
browser_scroll_by_pixels | Scroll by specified number of pixels | x, y |
Form Interaction Tools
Tool | Description | Parameters |
---|---|---|
browser_select_checkbox | Select/check a checkbox | by, value, timeout |
browser_unselect_checkbox | Unselect/uncheck a checkbox | by, value, timeout |
browser_submit_form | Submit a form element | by, value, timeout |
browser_focus_element | Focus on a specific element | by, value, timeout |
browser_blur_element | Remove focus from a specific element | by, value, timeout |
Element Locator Strategies
- id: Find by element ID
- css: Find by CSS selector
- xpath: Find by XPath expression
- name: Find by name attribute
- tag: Find by HTML tag name
- class: Find by CSS class name
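Several of these strategies are shorthand for something a CSS selector can also express, which is useful to know when choosing between them. The sketch below rewrites the simple strategies as CSS selectors; it is an illustration of how they relate, not the server's implementation. The xpath strategy has no general CSS equivalent and is left out, and the `toCssSelector` helper is hypothetical.

```typescript
// Strategies that can be rewritten as a CSS selector.
type CssExpressible = "id" | "name" | "class" | "tag" | "css";

// Values are used as-is: quotes or characters that are special in CSS
// (such as ":" or ".") would need escaping in real use.
function toCssSelector(by: CssExpressible, value: string): string {
  switch (by) {
    case "id":
      return `[id="${value}"]`;
    case "name":
      return `[name="${value}"]`;
    case "class":
      return `.${value}`;
    case "tag":
      return value;
    case "css":
      return value;
  }
}

console.log(toCssSelector("id", "username"));
console.log(toCssSelector("class", "btn-primary"));
```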
📋 Requirements
- Node.js: Version 18.0.0 or higher
- Browsers: Chrome, Firefox, Safari, or Edge installed
- WebDrivers: Automatically managed by selenium-webdriver
- Operating System: Windows, macOS, or Linux
🚦 Development
Getting Started
Clone the repository
Install dependencies
Build the project
Running the Server
Production Mode
Development Mode (with auto-reload)
Direct Execution
Using as CLI Tool
After building, you can use the server as a global command:
📝 License
MIT License - see LICENSE file for details.
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Badges/Mentions
Built with ❤️ for the Model Context Protocol ecosystem