Selenium MCP Server
A powerful Model Context Protocol (MCP) server that brings Selenium WebDriver automation to AI assistants. This server enables AI tools like Claude Desktop and Cursor AI to control web browsers programmatically, making web automation accessible through natural language commands.
📚 Table of Contents
- Features
- Quick Start
- Getting Started
- Client Integration
- Available Tools & Examples
- Advanced Configuration
- Environment Variables
- Testing Your Configuration
- FAQ / Troubleshooting
- Contributing
- Changelog
- Project Structure
- License
- Contact & Support
🚀 Features
- Multi-browser support: Chrome and Firefox (headless or full mode)
- Session management: Start, list, switch, and close browser sessions
- Navigation: Go to URLs, reload, and retrieve page info
- Element interaction: Find, click, type, hover, drag-and-drop, double/right click, upload files
- Advanced actions: Execute JavaScript, take screenshots, get/set element text
- File operations: Upload files, download files, take full-page screenshots
- Robust error handling
- Easy integration with any MCP-compatible client (Cursor AI, Claude Desktop, Google Gemini, etc.)
- PEP 517/518 compliant packaging
⚡ Quick Start
Add this to your MCP client config (e.g., Cursor AI):
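The config snippet itself is missing at this point, so what follows is a minimal sketch, not the project's official config. It assumes the `mcpServers` layout commonly used by Cursor AI and Claude Desktop, a hypothetical entry name `selenium`, and the `python -m selenium_mcp_server` launch command mentioned later in this README.

```python
import json

# Assumption: the client reads a top-level "mcpServers" mapping.
# "selenium" is a hypothetical entry name; any label should work.
config = {
    "mcpServers": {
        "selenium": {
            "command": "python",
            "args": ["-m", "selenium_mcp_server"],
        }
    }
}

# Print JSON that can be pasted into the client's config file.
print(json.dumps(config, indent=2))
```

If your client uses a different top-level key, keep the `command` and `args` values and adapt the wrapper.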
🏁 Getting Started
- Install: pip install selenium-mcp-server
- Run the server: python -m selenium_mcp_server
- Connect your MCP client (see Quick Start)
Windows users: If you see a .exe.deleteme error, delete any selenium-mcp-server.exe or .exe.deleteme files in your Python Scripts directory, then retry the install. You can always run the server with python -m selenium_mcp_server.
🤖 Client Integration
- Cursor AI: ~/.cursor/mcp_config.json
- Claude Desktop: ~/.config/claude-desktop/config.json (Linux/macOS) or %APPDATA%\claude-desktop\config.json (Windows)
- Other MCP Clients: See your client's documentation
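The locations above can be resolved programmatically. The sketch below only encodes the paths listed in this README; the function name `config_path` and the client keys `"cursor"` and `"claude"` are hypothetical, and your client version may store its config elsewhere, so check its documentation if the file is not found.

```python
import ntpath
import os
import platform
import posixpath


def config_path(client, system=None, home=None, appdata=None):
    """Return the config file path this README lists for `client`."""
    system = system or platform.system()  # "Windows", "Linux", "Darwin"
    join = ntpath.join if system == "Windows" else posixpath.join
    home = home or os.path.expanduser("~")

    if client == "cursor":
        return join(home, ".cursor", "mcp_config.json")
    if client == "claude":
        if system == "Windows":
            # Leave %APPDATA% unexpanded if the variable is not set.
            appdata = appdata or os.environ.get("APPDATA", "%APPDATA%")
            return join(appdata, "claude-desktop", "config.json")
        return join(home, ".config", "claude-desktop", "config.json")
    raise ValueError(f"unknown client: {client!r}")


print(config_path("cursor"))
print(config_path("claude"))
```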
🛠️ Available Tools & Examples
Browser Management
- Start Browser
- List Sessions
- Switch Session
- Close Session
Navigation & Page Info
- Navigate
- Get Page Info
Element Interaction
- Find Element
- Click Element
- Send Keys
- Get Element Text
- Hover
- Drag and Drop
- Double Click / Right Click
- Press Key
- Upload File
- Wait for Element
Advanced Actions
- Take Screenshot
- Execute Script
📊 Example Automation Flow
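The original flow diagram is missing here. As a rough illustration of how the tools listed above might be chained, the sketch below builds a sequence of MCP `tools/call` requests (JSON-RPC 2.0). The tool names and argument names are hypothetical; the real ones come from the server's own tool list, which an MCP client discovers with `tools/list`.

```python
import json


def tool_call(request_id, name, arguments):
    """Build one MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical tool and argument names, in the order a simple
# "open a page, read a heading, capture it" task might use them.
steps = [
    ("start_browser", {"browser": "firefox", "headless": True}),
    ("navigate", {"url": "https://example.com"}),
    ("find_element", {"by": "css", "value": "h1"}),
    ("get_element_text", {"by": "css", "value": "h1"}),
    ("take_screenshot", {}),
    ("close_session", {}),
]

flow = [tool_call(i, name, args) for i, (name, args) in enumerate(steps, start=1)]

for request in flow:
    print(json.dumps(request))
```

In normal use the AI client issues these calls for you in response to a natural language prompt; the sketch only shows what travels over the wire.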
⚙️ Advanced Configuration
You can configure the Selenium MCP Server in several ways:
Option 1: Installed Package (Recommended)
Option 2: Direct File Execution
- Windows:
- macOS/Linux:
Option 3: Console Script
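The config snippets for these options are missing from this section. As a hedged sketch, the three options could map to server entries like the ones below. The helper `server_entry` and its option keys are hypothetical. The console script name `selenium-mcp-server` is inferred from the .exe name in the Windows notes, and setting `PYTHONPATH` to the file's own directory is an assumption.

```python
import os


def server_entry(option, python="python", source_file=None):
    """Return a {"command", "args"} entry for one of the three options."""
    if option == "package":  # Option 1: installed package
        return {"command": python, "args": ["-m", "selenium_mcp_server"]}
    if option == "file":  # Option 2: direct file execution
        if not source_file:
            raise ValueError("Option 2 needs the path to the server file")
        return {
            "command": python,
            "args": [source_file],
            # Assumption: the modules live next to the server file.
            "env": {"PYTHONPATH": os.path.dirname(source_file)},
        }
    if option == "script":  # Option 3: console script
        return {"command": "selenium-mcp-server", "args": []}
    raise ValueError(f"unknown option: {option!r}")


print(server_entry("package"))
print(server_entry("script"))
```

For Option 2, pass the absolute path of the server file on your system, for example a Windows path on Windows and a POSIX path on macOS/Linux.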
🌐 Environment Variables
- PYTHONUNBUFFERED=1: Ensures Python output is not buffered
- SELENIUM_LOG_LEVEL=INFO: Sets logging level (DEBUG, INFO, WARNING, ERROR)
- PYTHONPATH: Points to the directory containing the Python modules (needed for direct file execution)
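When launching the server yourself rather than through a client, these variables can be set on the child process. The helper below is a small sketch; the name `build_env` is hypothetical, and the accepted log levels are only the four listed above.

```python
import os

VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")


def build_env(log_level="INFO", pythonpath=None, base=None):
    """Return a copy of the environment with the server variables set."""
    level = log_level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"log level must be one of {VALID_LEVELS}, got {log_level!r}")

    env = dict(os.environ if base is None else base)
    env["PYTHONUNBUFFERED"] = "1"
    env["SELENIUM_LOG_LEVEL"] = level
    if pythonpath:
        # Only needed when running the server file directly from source.
        env["PYTHONPATH"] = pythonpath
    return env


print(build_env("debug", base={}))
```

The result can be passed as `env=` to `subprocess.Popen`, or copied into the `env` block of a client config if your client supports one.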
🧪 Testing Your Configuration
After configuring, ask your MCP client to list the active browser sessions (the List Sessions tool).
You should get an empty list if no sessions are active.
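Before testing from the client, it can help to check the config file on disk. This is a sketch only: `check_config` is a hypothetical helper, and it assumes the `mcpServers` layout used by Cursor AI and Claude Desktop. It verifies that the file parses and that each server's command can be found.

```python
import json
import shutil


def check_config(path):
    """Return a list of problems found in an MCP client config file."""
    try:
        with open(path, encoding="utf-8") as fh:
            config = json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read config: {exc}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict) or not servers:
        return ["no servers defined under 'mcpServers'"]

    problems = []
    for name, entry in servers.items():
        command = entry.get("command") if isinstance(entry, dict) else None
        if not command:
            problems.append(f"{name}: missing 'command'")
        elif shutil.which(command) is None:
            problems.append(f"{name}: command not found on PATH: {command}")
    return problems
```

An empty list means the file is readable and every command resolves; it does not prove the server starts correctly.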
❓ FAQ / Troubleshooting
Q: I see "0 tools enabled" in Cursor AI.
- Make sure the package is installed: pip install selenium-mcp-server (or pip install -e . for development)
- Verify the module works by running python -m selenium_mcp_server
- Check if the entry point works by running the selenium-mcp-server console script
- Try using the console script entry point in your config.
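The first three checks can also be run from Python, in the same interpreter your client is configured to use. This is a sketch: `diagnose` is a hypothetical helper, and the console script name `selenium-mcp-server` is inferred from the .exe name mentioned elsewhere in this README.

```python
import importlib.util
import shutil


def diagnose(module="selenium_mcp_server", script="selenium-mcp-server"):
    """Report whether the module imports and the console script is on PATH."""
    return {
        # True if `python -m <module>` has something to run.
        "module_importable": importlib.util.find_spec(module) is not None,
        # Full path to the console script, or None if it is not on PATH.
        "console_script": shutil.which(script),
    }


print(diagnose())
```

If `module_importable` is False, the package is not installed for this interpreter; if `console_script` is None, use the `python -m` form in your config instead.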
Q: "Module not found" errors
- Make sure you've installed the package: pip install selenium-mcp-server or pip install -e .
- Check that PYTHONPATH points to the correct directory if running from source
- Verify the file paths are correct for your system
Q: "Command not found" errors
- Ensure Python is in your system PATH
- Try using the full path to Python: C:\Python312\python.exe (Windows) or /usr/bin/python3 (Linux/macOS)
Q: Permission errors
- On Windows, try running your MCP client as administrator
- On Linux/macOS, check file permissions: chmod +x src/selenium_mcp_server.py
Q: I get a .exe.deleteme error on Windows when upgrading.
- Close all terminals, delete any selenium-mcp-server.exe or .exe.deleteme files in your Python Scripts directory, and retry the install. You can always run the server with python -m selenium_mcp_server.
Q: How do I check the server version?
- The server prints its version on startup. You can also check with pip show selenium-mcp-server.
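The installed version can also be read from Python using the standard library. A minimal sketch, assuming the distribution name is `selenium-mcp-server` as in the pip commands above; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist="selenium-mcp-server"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```

A result of None means the package is not installed in the interpreter that ran the snippet, which is a common cause of "Module not found" errors.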
Q: Can I use this with headless browsers?
- Yes! The server supports both headless and full browser modes.
Q: How do I contribute or report issues?
- See the Contributing section below.
🤝 Contributing
- Fork the repo and create a feature branch
- Make your changes (see src/selenium_mcp_server/)
- Add/modify tests if needed
- Open a pull request with a clear description
- For issues, use the GitHub Issues tab
🗒️ Changelog
- 1.1.6: Improved error handling, updated dependencies, enhanced Windows compatibility
- 1.1.5: Simplified packaging, removed legacy scripts, improved docs
- 1.1.4 and earlier: Initial releases, core MCP and Selenium features
Project Structure
- All source code is in src/selenium_mcp_server/
- No unnecessary files or scripts in the root or src directory
- The package is fully PEP 517/518 compliant and ready for PyPI distribution
License
MIT License - feel free to use this in your own projects.
Contact & Support
- For questions, open a GitHub Issue
- For discussions, feature requests, or help, use GitHub Discussions or Issues
- Maintained by Krishna Pollu
Note: This server is designed for legitimate automation tasks. Please respect websites' terms of service and robots.txt files when using this tool.