Selenium MCP Server

This server connects MCP clients (AI assistants) to Selenium WebDriver. It exposes Selenium WebDriver's functionality as MCP tools that AI models can use for tasks such as:

  • Browser management (launching, navigating, closing browsers)
  • Element interaction (clicking, typing, finding elements)
  • Web scraping and automated testing
  • Advanced operations like screenshots, cookie management, and JavaScript execution

In short, AI assistants communicate with a dedicated Selenium MCP server over the Model Context Protocol and use Selenium WebDriver through it to automate web interactions, testing, and data extraction.

🚀 Overview

A Model Context Protocol (MCP) server for Selenium that provides comprehensive Selenium WebDriver automation tools for AI assistants and applications. This server enables automated web browser interactions, testing, and scraping through a standardized interface.

Built with TypeScript and modern ES modules, it offers type-safe browser automation capabilities through the Model Context Protocol.

✨ Key Features

  • Multi-Browser Support: Chrome, Firefox, Safari, and Edge browser automation
  • Comprehensive Element Interaction: Click, type, hover, drag & drop, file uploads
  • Advanced Navigation: Forward, backward, refresh, window management
  • Wait Strategies: Intelligent waiting for elements and page states
  • Type Safety: Full TypeScript implementation with Zod validation

🤝 Integration

MCP Client Integration

Configure your MCP client to connect to the Selenium server:

Standard Configuration (applicable to Windsurf, Warp, Gemini CLI, etc.)

{ "servers": { "selenium-mcp": { "command": "npx", "args": ["-y", "selenium-webdriver-mcp@latest"] } } }

Installation in VS Code

Update your mcp.json in VS Code with the configuration below.

NOTE: If you're new to MCP servers, see "Use MCP servers in VS Code" in the VS Code documentation.

Example 'stdio' type connection

{ "servers": { "selenium-mcp": { "command": "npx", "args": [ "-y", "selenium-webdriver-mcp@latest" ], "type": "stdio" } }, "inputs": [] }

Example 'http' type connection

{ "servers": { "Selenium": { "url": "https://smithery.ai/server/@pshivapr/selenium-mcp", "type": "http" } }, "inputs": [] }
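The two connection types above differ in what the client needs to know: a 'stdio' entry names a command to spawn, while an 'http' entry names a URL to connect to. A minimal TypeScript sketch of that distinction follows. The type names and the `describeEntry` helper are hypothetical and only illustrate the shape of the two examples; they are not part of this package or of VS Code.

```typescript
// Hypothetical types modelling the two entry shapes shown above.
interface StdioServerEntry {
  type: "stdio";
  command: string;
  args: string[];
}

interface HttpServerEntry {
  type: "http";
  url: string;
}

type ServerEntry = StdioServerEntry | HttpServerEntry;

// Summarise how a client would reach the server described by an entry.
function describeEntry(entry: ServerEntry): string {
  switch (entry.type) {
    case "stdio":
      return `spawn: ${[entry.command, ...entry.args].join(" ")}`;
    case "http":
      return `connect: ${entry.url}`;
  }
}

const stdioEntry: ServerEntry = {
  type: "stdio",
  command: "npx",
  args: ["-y", "selenium-webdriver-mcp@latest"],
};

const httpEntry: ServerEntry = {
  type: "http",
  url: "https://smithery.ai/server/@pshivapr/selenium-mcp",
};

console.log(describeEntry(stdioEntry));
console.log(describeEntry(httpEntry));
```

Modelling the entry as a discriminated union on `type` lets the compiler reject an entry that mixes fields from both shapes.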

After installation, the Selenium MCP server will be available for use with your GitHub Copilot agent in VS Code.

To install the Selenium MCP server using the VS Code CLI:

# For VS Code
code --add-mcp '{\"name\":\"selenium-mcp\",\"command\": \"npx\",\"args\": [\"selenium-webdriver-mcp@latest\"]}'
# For VS Code Insiders
code-insiders --add-mcp '{\"name\":\"selenium-mcp\",\"command\": \"npx\",\"args\": [\"selenium-webdriver-mcp@latest\"]}'

To install the package, use either npm or Smithery.

Using npm:

npm install -g selenium-webdriver-mcp@latest

Using Smithery:

To install Selenium MCP for Claude Desktop automatically via Smithery:

npx @smithery/cli install @pshivapr/selenium-mcp --client claude

Claude Desktop Integration

Add to your Claude Desktop configuration:

{ "mcpServers": { "selenium-mcp": { "command": "npx", "args": ["-y", "selenium-webdriver-mcp@latest"] } } }
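The Claude Desktop snippet differs from the standard configuration only in its top-level key: "mcpServers" instead of "servers". The sketch below shows that re-keying in TypeScript. The `toClaudeDesktopConfig` helper and the `StdioEntry` type are hypothetical names introduced for illustration, and the sketch assumes the server entries themselves carry over unchanged, as they do in the two examples in this README.

```typescript
// Shape of a stdio server entry as it appears in the examples above.
interface StdioEntry {
  command: string;
  args: string[];
}

// Hypothetical helper: re-key a "servers" map under "mcpServers".
function toClaudeDesktopConfig(config: { servers: Record<string, StdioEntry> }): {
  mcpServers: Record<string, StdioEntry>;
} {
  return { mcpServers: { ...config.servers } };
}

const standardConfig = {
  servers: {
    "selenium-mcp": { command: "npx", args: ["-y", "selenium-webdriver-mcp@latest"] },
  },
};

const claudeConfig = toClaudeDesktopConfig(standardConfig);
console.log(JSON.stringify(claudeConfig, null, 2));
```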

Prompts

An example prompt to start AI Agent interaction:

Using selenium mcp tools, navigate to <https://parabank.parasoft.com/>, click the 'Register' link, sign up using dynamic test data, and click Register. Then generate Selenium tests in <YOUR_FAVOURITE_PROGRAMMING_LANGUAGE> using the page object model (POM), create Cucumber features and step definitions, and execute the tests.

Note: For more prompts, see the examples directory of the project.

🛠️ MCP Available Tools

Browser Management Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_open | Open a new browser session | browser, options |
| browser_navigate | Navigate to a URL | url |
| browser_navigate_back | Navigate back in history | None |
| browser_navigate_forward | Navigate forward in history | None |
| browser_title | Get the current page title | None |
| browser_refresh | Refresh the current page | None |
| browser_get_url | Get the current page URL | None |
| browser_get_page_source | Get the current page HTML source | None |
| browser_maximize | Maximize the browser window | None |
| browser_resize | Resize browser window | width, height |
| browser_close | Close current browser session | None |

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_get_cookies | Get all cookies from the current browser session | None |
| browser_get_cookie_by_name | Get a specific cookie by name | cookie (cookie name) |
| browser_add_cookie_by_name | Add a new cookie to the browser | cookie (cookie name), value |
| browser_set_cookie_object | Set a cookie object in the browser | cookie (cookie object as string) |
| browser_delete_cookie | Delete a specific cookie by name | value (cookie name to delete) |
| browser_delete_cookies | Delete all cookies from the current browser session | None |

Window Management Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_switch_to_window | Switch to a different browser window by handle | windowHandle |
| browser_switch_to_original_window | Switch back to the original browser window | None |
| browser_switch_to_window_by_title | Switch to a window by its page title | title |
| browser_switch_window_by_index | Switch to a window by its index position | index |
| browser_switch_to_window_by_url | Switch to a window by its URL | url |

Element Interaction Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_find_element | Find an element on the page | by, value, timeout |
| browser_find_elements | Find multiple elements on the page | by, value, timeout |
| browser_click | Click on an element | by, value, timeout |
| browser_type | Type text into an element | by, value, text, timeout |
| browser_get_element_text | Get text content of element | by, value, timeout |
| browser_file_upload | Upload file via input element | by, value, filePath, timeout |
| browser_clear | Clear text from an element | by, value, timeout |
| browser_get_attribute | Get element attribute value | by, value, attribute, timeout |

Element State Validation Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_element_is_displayed | Check if an element is visible on the page | by, value, timeout |
| browser_element_is_enabled | Check if an element is enabled for interaction | by, value, timeout |
| browser_element_is_selected | Check if an element is selected (checkboxes, radio buttons) | by, value, timeout |

Frame Management Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_switch_to_frame | Switch to an iframe element | by, value, timeout |
| browser_switch_to_parent_frame | Switch to the parent frame (from nested iframe) | None |
| browser_switch_to_default_content | Switch back to the main page content | None |

Advanced Action Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_hover | Hover over an element | by, value, timeout |
| browser_double_click | Double-click on an element | by, value, timeout |
| browser_right_click | Right-click (context menu) | by, value, timeout |
| browser_drag_and_drop | Drag from source to target | by, value, targetBy, targetValue, timeout |
| browser_wait_for_element | Wait for element to appear | by, value, timeout |
| browser_execute_script | Execute JavaScript code | script, args |
| browser_screenshot | Take a screenshot | filename (optional) |
| browser_select_dropdown_by_text | Select dropdown option by visible text | by, value, text, timeout |
| browser_select_dropdown_by_value | Select dropdown option by value | by, value, dropdownValue, timeout |
| browser_key_press | Press a keyboard key in the browser | key, timeout |

Scrolling Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_scroll_to_element | Scroll to bring an element into view | by, value, timeout |
| browser_scroll_to_top | Scroll to the top of the page | None |
| browser_scroll_to_bottom | Scroll to the bottom of the page | None |
| browser_scroll_to_coordinates | Scroll to specific coordinates | x, y |
| browser_scroll_by_pixels | Scroll by specified number of pixels | x, y |

Form Interaction Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_select_checkbox | Select/check a checkbox | by, value, timeout |
| browser_unselect_checkbox | Unselect/uncheck a checkbox | by, value, timeout |
| browser_submit_form | Submit a form element | by, value, timeout |
| browser_focus_element | Focus on a specific element | by, value, timeout |
| browser_blur_element | Remove focus from a specific element | by, value, timeout |

Element Locator Strategies

  • id: Find by element ID
  • css: Find by CSS selector
  • xpath: Find by XPath expression
  • name: Find by name attribute
  • tag: Find by HTML tag name
  • class: Find by CSS class name
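Most element tools take a `by` strategy from the list above together with a `value`. As a rough illustration, the TypeScript sketch below validates `by` against those six strategies and wraps a `browser_click` call in the JSON-RPC 2.0 `tools/call` envelope that MCP clients send. The `buildClickRequest` helper is hypothetical, and treating `timeout` as milliseconds is an assumption; this README does not state the unit. In practice the MCP client builds these requests for you.

```typescript
// Locator strategies listed in this README.
const LOCATOR_STRATEGIES = ["id", "css", "xpath", "name", "tag", "class"] as const;
type LocatorStrategy = (typeof LOCATOR_STRATEGIES)[number];

// Type guard: narrows an arbitrary string to a known strategy.
function isLocatorStrategy(candidate: string): candidate is LocatorStrategy {
  return (LOCATOR_STRATEGIES as readonly string[]).indexOf(candidate) !== -1;
}

// JSON-RPC 2.0 envelope for an MCP "tools/call" request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build a browser_click call after validating `by`.
function buildClickRequest(id: number, by: string, value: string, timeout?: number): ToolCallRequest {
  if (!isLocatorStrategy(by)) {
    throw new Error(`Unknown locator strategy: ${by}`);
  }
  const args: Record<string, unknown> = { by, value };
  if (timeout !== undefined) {
    args["timeout"] = timeout; // unit is an assumption (milliseconds)
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "browser_click", arguments: args },
  };
}

const clickRequest = buildClickRequest(1, "css", "button[type='submit']", 5000);
console.log(JSON.stringify(clickRequest));
```

Rejecting an unknown strategy before the request is sent gives a clearer error than a failed lookup inside the browser session.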

📋 Requirements

  • Node.js: Version 18.0.0 or higher
  • Browsers: Chrome, Firefox, Safari, or Edge installed
  • WebDrivers: Automatically managed by selenium-webdriver
  • Operating System: Windows, macOS, or Linux
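To check the Node.js requirement programmatically, a version comparison along the following lines could be used. This is a minimal sketch: `parseVersion` and `meetsMinimum` are hypothetical helpers, not part of this package, and only the numeric major.minor.patch parts are compared.

```typescript
// Parse "major.minor.patch" (an optional leading "v" is tolerated) into numbers.
function parseVersion(version: string): number[] {
  return version
    .replace(/^v/, "")
    .split(".")
    .map((part) => parseInt(part, 10) || 0);
}

// True when `actual` is greater than or equal to `minimum`.
function meetsMinimum(actual: string, minimum: string = "18.0.0"): boolean {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    const left = a[i] ?? 0;
    const right = m[i] ?? 0;
    if (left !== right) {
      return left > right;
    }
  }
  return true;
}

console.log(meetsMinimum("18.0.0"));  // true
console.log(meetsMinimum("16.20.2")); // false
```

In Node.js, the running version is available as `process.versions.node` and can be passed as the first argument.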

🚦 Development

Getting Started

Clone the repository
git clone https://github.com/pshivapr/selenium-mcp.git
cd selenium-mcp
Install dependencies
npm install
Build the project
npm run build

Running the Server

Production Mode
npm start
Development Mode (with auto-reload)
npm run dev
Direct Execution
node dist/index.js

Using as CLI Tool

After building, you can use the server as a global command:

npx selenium-webdriver-mcp@latest

📝 License

MIT License - see LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Built with ❤️ for the Model Context Protocol ecosystem
