The WebDriverIO MCP Server enables AI assistants to automate web browsers and mobile applications through a unified interface.
Browser Automation
Launch and manage sessions with Chrome, Firefox, Edge, or Safari (headed/headless, custom dimensions)
Attach to existing Chrome instances via remote debugging port
Navigate URLs, click elements, fill forms, scroll pages, and capture screenshots
Retrieve visible/interactable elements and accessibility tree for page analysis
Manage cookies (get, set, delete) with full attribute control
Emulate mobile/tablet devices (iPhone 15, Pixel 7, etc.) with viewport, DPR, user-agent, and touch event emulation
Execute arbitrary JavaScript in the browser context
Mobile App Automation (iOS/Android via Appium)
Start and manage native app sessions (.app/.ipa/.apk) with configurable state preservation (noReset/fullReset)
Perform touch gestures: tap, swipe, drag-and-drop
Control app lifecycle and check app state (installed, running, background, foreground)
Switch between native and webview contexts for hybrid app testing
Control device orientation, keyboard visibility, and GPS geolocation
Support diverse selectors: CSS, XPath, Accessibility ID, UiAutomator (Android), iOS Predicates
Automatically grant permissions and handle system alerts
Execute platform-specific mobile commands
Session Management & Recording
Maintain one active session at a time; close or detach while preserving state
All tool calls are automatically recorded and exportable as runnable WebDriverIO JavaScript scripts via MCP resources (wdio://sessions, wdio://session/current/steps)
Prerequisites for Mobile: A running Appium server with platform-specific drivers (XCUITest for iOS, UiAutomator2/Espresso for Android) and configured devices/emulators.
WebDriverIO MCP Server
A Model Context Protocol (MCP) server that enables AI assistants to interact with web browsers and mobile applications using WebDriverIO. Automate Chrome, Firefox, Edge, and Safari browsers plus iOS and Android apps—all through a unified interface.
Installation
Add the following configuration to your MCP client settings:
Standard config (works in most clients):
{
"mcpServers": {
"wdio-mcp": {
"command": "npx",
"args": ["-y", "@wdio/mcp@latest"]
}
}
}
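If the client config already lists other servers, the entry needs to be merged in rather than pasted over the file. The following is a minimal sketch, assuming the config is plain JSON with an mcpServers object; addWdioMcp is a hypothetical helper and not part of @wdio/mcp.

```javascript
// Hypothetical helper: returns a copy of an MCP client config with the
// wdio-mcp entry added, leaving any other configured servers untouched.
function addWdioMcp(config) {
  const existing = config.mcpServers ?? {};
  return {
    ...config,
    mcpServers: {
      ...existing,
      'wdio-mcp': { command: 'npx', args: ['-y', '@wdio/mcp@latest'] },
    },
  };
}

// Example: a config that already contains another (made-up) server.
const merged = addWdioMcp({ mcpServers: { other: { command: 'other-server' } } });
console.log(JSON.stringify(merged, null, 2));
```

The input object is not mutated, so the result can be compared with the original before it is written back to disk.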
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS), %APPDATA%\Claude\claude_desktop_config.json (Windows), or ~/.config/Claude/claude_desktop_config.json (Linux):
{
"mcpServers": {
"wdio-mcp": {
"command": "npx",
"args": ["-y", "@wdio/mcp@latest"]
}
}
}
For Claude Code, run:
claude mcp add wdio-mcp -- npx -y @wdio/mcp@latest
Add to your VS Code settings.json or cline_mcp_settings.json file:
{
"mcpServers": {
"wdio-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@wdio/mcp@latest"]
}
}
}
Go to Cursor Settings → MCP → Add new MCP Server, or create .cursor/mcp.json:
{
"mcpServers": {
"wdio-mcp": {
"command": "npx",
"args": ["-y", "@wdio/mcp@latest"]
}
}
}
Use the Codex CLI:
codex mcp add wdio-mcp npx "@wdio/mcp@latest"
Or edit ~/.codex/config.toml:
[mcp_servers.wdio-mcp]
command = "npx"
args = ["@wdio/mcp@latest"]
Go to Advanced settings → Extensions → Add custom extension, or run:
goose configure
Or edit ~/.config/goose/config.yaml:
extensions:
  wdio-mcp:
    name: WebDriverIO MCP
    cmd: npx
    args: [-y, "@wdio/mcp@latest"]
    enabled: true
    type: stdio
Edit ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"wdio-mcp": {
"command": "npx",
"args": ["-y", "@wdio/mcp@latest"]
}
}
}
Edit Zed settings (~/.config/zed/settings.json):
{
"context_servers": {
"wdio-mcp": {
"source": "custom",
"command": "npx",
"args": ["-y", "@wdio/mcp@latest"]
}
}
}
For VS Code, run:
code --add-mcp '{"name":"wdio-mcp","command":"npx","args":["-y","@wdio/mcp@latest"]}'
⚠️ Restart Required: After adding the configuration, fully restart your MCP client to apply the changes.
Option 2: Global Installation
If you prefer to install globally:
npm install -g @wdio/mcp
Then use wdio-mcp as the command:
{
"mcpServers": {
"wdio-mcp": {
"command": "wdio-mcp"
}
}
}
📖 Need help? Follow the MCP install guide.
Prerequisites For Mobile App Automation
Appium Server: Install globally with
npm install -g appium
Platform Drivers:
iOS:
appium driver install xcuitest
(requires Xcode on macOS)
Android:
appium driver install uiautomator2
(requires Android Studio)
Devices/Emulators:
iOS Simulator (macOS) or physical device
Android Emulator or physical device
For iOS Real Devices: You'll need the device's UDID (Unique Device Identifier)
Find UDID on macOS: Connect device → Open Finder → Select device → Click device name/model to reveal UDID
Find UDID on Windows: Connect device → iTunes or Apple Devices app → Click device icon → Click "Serial Number" to reveal UDID
Xcode method: Window → Devices and Simulators → Select device → UDID shown as "Identifier"
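Before passing a UDID to a session, it can help to confirm that the copied value has a plausible shape. This is a small sketch, assuming two common formats (40 hexadecimal characters, or 8 hexadecimal characters, a hyphen, and 16 more); looksLikeUdid is a hypothetical helper, and a passing check does not prove the device exists.

```javascript
// Two UDID shapes are assumed here:
//  - 40 hexadecimal characters
//  - 8 hex characters, a hyphen, then 16 hex characters
const UDID_PATTERNS = [/^[0-9a-f]{40}$/i, /^[0-9a-f]{8}-[0-9a-f]{16}$/i];

// Returns true when the value matches one of the assumed shapes.
function looksLikeUdid(value) {
  const trimmed = String(value).trim();
  return UDID_PATTERNS.some((pattern) => pattern.test(trimmed));
}

console.log(looksLikeUdid('00008030-001234567890ABCD')); // true
console.log(looksLikeUdid('My iPhone'));                 // false
```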
Start the Appium server before using mobile features:
appium
# Server runs at http://127.0.0.1:4723 by default
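To confirm the server is reachable before starting a mobile session, one option is to query the WebDriver status endpoint. The sketch below assumes Appium answers GET /status with a JSON body shaped like { value: { ready: true } } and that a global fetch is available (Node 18 or later); isAppiumReady is a hypothetical helper.

```javascript
// Sketch: ask the Appium server whether it is ready, via GET /status.
// The response shape { value: { ready: boolean } } is an assumption here.
// fetchImpl can be swapped for a stub when no server is available.
async function isAppiumReady(baseUrl = 'http://127.0.0.1:4723', fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(`${baseUrl}/status`);
    if (!response.ok) return false;
    const body = await response.json();
    return body?.value?.ready === true;
  } catch {
    return false; // connection refused, missing fetch, invalid JSON, ...
  }
}
```

Calling isAppiumReady() with no arguments checks the default address shown above; any failure is reported as false rather than thrown.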
BrowserStack
Run browser and mobile app tests on BrowserStack real devices and browsers without any local setup.
Prerequisites
Set your credentials as environment variables:
export BROWSERSTACK_USERNAME=your_username
export BROWSERSTACK_ACCESS_KEY=your_access_key
Or add them to your MCP client config:
{
"mcpServers": {
"wdio-mcp": {
"command": "npx",
"args": ["-y", "@wdio/mcp@latest"],
"env": {
"BROWSERSTACK_USERNAME": "your_username",
"BROWSERSTACK_ACCESS_KEY": "your_access_key"
}
}
}
}
Browser Sessions
Run a browser on a specific OS/version combination:
start_session({
provider: 'browserstack',
platform: 'browser',
browser: 'chrome', // chrome | firefox | edge | safari
browserVersion: 'latest', // default: latest
os: 'Windows', // e.g. "Windows", "OS X"
osVersion: '11', // e.g. "11", "Sequoia"
reporting: {
project: 'My Project',
build: 'v1.2.0',
session: 'Login flow'
}
})
Mobile App Sessions
Test on BrowserStack real devices. First upload your app (or use an existing bs:// URL):
// Upload a local .apk or .ipa (returns a bs:// URL)
upload_app({ path: '/path/to/app.apk' })
// Start a session with the returned URL
start_session({
provider: 'browserstack',
platform: 'android', // android | ios
app: 'bs://abc123...', // bs:// URL or custom_id from upload
deviceName: 'Samsung Galaxy S23',
platformVersion: '13.0',
reporting: {
project: 'My Project',
build: 'v1.2.0',
session: 'Checkout flow'
}
})
Use list_apps to see previously uploaded apps:
list_apps() // own uploads, sorted by date
list_apps({ sortBy: 'app_name' })
list_apps({ organizationWide: true }) // all uploads in your org
BrowserStack Local
To test against URLs that are only accessible on your local machine or internal network, enable the BrowserStack Local tunnel:
start_session({
provider: 'browserstack',
platform: 'browser',
browser: 'chrome',
browserstackLocal: true // starts tunnel automatically
})
Reporting Labels
All session types support reporting labels that appear in the BrowserStack Automate dashboard:
Field | Description |
project | Group sessions under a project name |
build | Tag sessions with a build/version label |
session | Name for the individual test session |
BrowserStack Tools
Tool | Description |
upload_app | Upload a local .apk or .ipa file and return its bs:// URL |
list_apps | List apps previously uploaded to your BrowserStack account |
Features
Browser Automation
Session Management: Start and close browser sessions (Chrome, Firefox, Edge, Safari) with headless/headed modes
Navigation & Interaction: Navigate URLs, click elements, fill forms, and retrieve content
Page Analysis: Get visible elements, accessibility trees, take screenshots
Cookie Management: Get, set, and delete cookies
Scrolling: Smooth scrolling with configurable distances
Attach to running Chrome: Connect to an existing Chrome window via --remote-debugging-port; ideal for testing authenticated or pre-configured sessions
Device emulation: Apply mobile/tablet presets (iPhone 15, Pixel 7, etc.) to simulate responsive layouts without a physical device
Session Recording: All tool calls are automatically recorded and exportable as runnable WebdriverIO JS
Mobile App Automation (iOS/Android)
Native App Testing: Test iOS (.app/.ipa) and Android (.apk) apps via Appium
Touch Gestures: Tap, swipe, long-press, drag-and-drop
App Lifecycle: Launch, background, terminate, check app state
Context Switching: Seamlessly switch between native and webview contexts for hybrid apps
Device Control: Rotate, lock/unlock, geolocation, keyboard control, notifications
Cross-Platform Selectors: Accessibility IDs, XPath, UiAutomator (Android), Predicates (iOS)
Available Tools
Session Management
Tool | Description |
start_session | Start a browser or app session |
| Launch a new Chrome instance with remote debugging enabled (for use with |
close_session | Close or detach from the current session (supports detach) |
emulate_device | Emulate a mobile/tablet device preset (viewport, DPR, UA, touch); requires BiDi session |
Navigation & Page Interaction (Web & Mobile)
Tool | Description |
| Navigate to a URL |
| Get visible, interactable elements on the page. Supports |
| Scroll in a direction (up/down) by specified pixels |
| Execute arbitrary JavaScript in the browser context |
| Switch to a different browser tab by index or URL |
Element Interaction (Web & Mobile)
Tool | Description |
| Click an element |
| Type text into input fields |
Cookie Management (Web)
Tool | Description |
| Set a cookie with name, value, and optional attributes |
| Delete all cookies or a specific cookie |
Mobile Gestures (iOS/Android)
Tool | Description |
| Tap an element by selector or coordinates |
| Swipe in a direction (up/down/left/right) |
| Drag from one location to another |
Context Switching (Hybrid Apps)
Tool | Description |
| Switch between native and webview contexts |
Device Control (iOS/Android)
Tool | Description |
| Rotate to portrait or landscape |
| Hide on-screen keyboard |
| Set device GPS location |
MCP Resources (read-only, no tool call needed)
Resource | Description |
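wdio://sessions | Index of all recorded sessions |
wdio://session/current/steps | Step log for the active session |
wdio://session/current/code | Generated runnable WebdriverIO JS for the active session |
wdio://session/{sessionId}/steps | Step log for any past session by ID |
wdio://session/{sessionId}/code | Generated JS for any past session by ID |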
| Interactable elements (viewport-only by default) |
| Accessibility tree |
| Screenshot (base64) |
| Browser cookies |
| Open browser tabs |
| Native/webview contexts (mobile) |
| Currently active context (mobile) |
| Mobile app state |
| Device geolocation |
| Resolved WebDriver capabilities for the active session |
| BrowserStack Local binary download URL and start command |
Usage Examples
Real-World Test Cases
Example 1: Testing Demo Android App (Book Scanning)
Test the Demo Android app at C:\Users\demo-liveApiGbRegionNonMinifiedRelease-3018788.apk on emulator-5554:
1. Start the app with auto-grant permissions
2. Get visible elements on the onboarding screen
3. Tap "Skip" to bypass onboarding
4. Verify main screen loads
5. Take a screenshot
Example 2: Testing World of Books E-commerce Site
You are a Testing expert, and want to assess the basic workflows of worldofbooks.com:
- Open World of Books (accept all cookies)
- Get visible elements to see navigation structure
- Search for a fiction book
- Choose one and validate if there are NEW and used book options
- Report your findings at the end
Browser Automation
Basic web testing prompt:
You are a Testing expert, and want to assess the basic workflows of a web application:
- Open World of Books (accept all cookies)
- Search for a fiction book
- Choose one and validate if there are NEW and used book options
- Report your findings at the end
Browser configuration options:
// Default settings (headed mode, 1280x1080)
start_session({platform: 'browser'})
// Firefox
start_session({platform: 'browser', browser: 'firefox'})
// Edge
start_session({platform: 'browser', browser: 'edge'})
// Safari (headed only; requires macOS)
start_session({platform: 'browser', browser: 'safari'})
// Headless mode
start_session({platform: 'browser', headless: true})
// Custom dimensions
start_session({platform: 'browser', windowWidth: 1920, windowHeight: 1080})
// Pass custom capabilities (e.g. Chrome extensions, profile, prefs)
start_session({
platform: 'browser',
headless: false,
capabilities: {
'goog:chromeOptions': {
args: ['--user-data-dir=/tmp/wdio-mcp-profile', '--load-extension=/path/to/unpacked-extension']
}
}
})
Attach to a running Chrome instance:
// First, launch Chrome with remote debugging enabled:
//
// macOS (must quit Chrome first — open -a ignores args if Chrome is already running):
// pkill -x "Google Chrome" && sleep 1
// /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
// --remote-debugging-port=9222 \
// --user-data-dir=/tmp/chrome-debug &
//
// Linux:
// google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug &
//
// Verify it's ready: curl http://localhost:9222/json/version
start_session({attach: true})
start_session({attach: true, port: 9333})
start_session({attach: true, port: 9222, navigationUrl: 'https://app.example.com'})
Device emulation (requires BiDi session):
// Device emulation (requires BiDi session)
start_browser({capabilities: {webSocketUrl: true}})
emulate_device() // list available presets
emulate_device({device: 'iPhone 15'}) // activate emulation
emulate_device({device: 'Pixel 7'}) // switch device
emulate_device({device: 'reset'}) // restore desktop defaults
Mobile App Automation
Testing an iOS app on simulator:
Test my iOS app located at /path/to/MyApp.app on iPhone 15 Pro simulator:
1. Start the app session
2. Tap the login button
3. Enter "testuser" in the username field
4. Take a screenshot of the home screen
5. Close the session
Preserving app state between sessions:
Test my Android app without resetting data:
1. Start app session with noReset: true and fullReset: false
2. App launches with existing login state and user data preserved
3. Run test scenarios
4. Close session (app remains installed with data intact)
Testing an iOS app on real device:
Test my iOS app on my physical iPhone:
1. Start app session with:
- platform: iOS
- appPath: /path/to/MyApp.ipa
- deviceName: My iPhone
- udid: 00008030-001234567890ABCD (your device's UDID)
- platformVersion: 17.0
2. Run your test scenario
3. Close the session
Testing an Android app:
Test my Android app /path/to/app.apk on the Pixel_6_API_34 emulator:
1. Start the app with auto-grant permissions
2. Get visible elements (use inViewportOnly: false to see all elements)
3. Swipe up to scroll
4. Tap on the "Settings" button using text matching
5. Verify the settings screen is displayed
Advanced element detection:
Test my app and debug layout issues:
1. Start the app session
2. Get visible elements with includeContainers: true to see the layout hierarchy
3. Analyze ViewGroup, FrameLayout, and ScrollView containers
4. Use inViewportOnly: false to find off-screen elements that need scrolling
Hybrid app testing (switching contexts):
Test my hybrid app:
1. Start the Android app session
2. Tap "Open Web" button in native context
3. List available contexts
4. Switch to WEBVIEW context
5. Click the login button using CSS selector
6. Switch back to NATIVE_APP context
7. Verify we're back on the home screen
Important Notes
⚠️ Session Management:
Only one session (browser OR app) can be active at a time
Always close sessions when done to free system resources
To switch between browser and mobile, close the current session first
Use close_session({ detach: true }) to disconnect without terminating the session on the Appium server
State preservation can be controlled with the noReset and fullReset parameters during session creation
Sessions created with noReset: true or without appPath will automatically detach on close
⚠️ Task Planning:
Break complex automation into smaller, focused operations
Claude may consume message limits quickly with extensive automation
⚠️ Mobile Automation:
Appium server must be running before starting mobile sessions
Ensure emulators/simulators are running and devices are connected
iOS automation requires macOS with Xcode installed
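iOS Real Devices: Testing on physical iOS devices requires the device's UDID, a unique identifier that is 40 hexadecimal characters on older devices and 25 characters on newer ones (for example, 00008030-001234567890ABCD). See the Prerequisites section for how to find your UDID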
Selector Syntax Quick Reference
Web (CSS/XPath):
CSS: button.my-class, #element-id
XPath: //button[@class='my-class']
Text: button=Exact text, a*=Contains text
Mobile (Cross-Platform):
Accessibility ID: ~loginButton (works on both iOS and Android)
Android UiAutomator: android=new UiSelector().text("Login")
iOS Predicate: -ios predicate string:label == "Login" AND visible == 1
XPath: //android.widget.Button[@text="Login"]
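The prefixes in this reference are enough to tell the selector strategies apart. The sketch below is a rough classifier built only from the forms listed above; selectorStrategy and its return labels are hypothetical, and it is not the matching logic used by WebDriverIO or the server.

```javascript
// Rough classifier for the selector forms in the quick reference.
// The rules and returned labels are assumptions for illustration only.
function selectorStrategy(selector) {
  if (selector.startsWith('~')) return 'accessibility id';
  if (selector.startsWith('android=')) return 'android uiautomator';
  if (selector.startsWith('-ios predicate string:')) return 'ios predicate';
  if (selector.startsWith('/') || selector.startsWith('(')) return 'xpath';
  // tag=Exact text or tag*=Partial text (the tag may be omitted)
  if (/^[a-z][\w-]*\*?=|^\*?=/i.test(selector)) return 'text';
  return 'css';
}

const samples = [
  '~loginButton',
  'android=new UiSelector().text("Login")',
  "//button[@class='my-class']",
  'a*=Contains text',
  '#element-id',
];
for (const sample of samples) {
  console.log(`${sample} -> ${selectorStrategy(sample)}`);
}
```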
Advanced Features
App State Preservation
State Preservation with noReset/fullReset:
Control app state when creating new sessions using the noReset and fullReset parameters:
noReset | fullReset | Behavior |
true | false | Preserve state: App stays installed, data preserved |
false | false | Clear app data but keep app installed (default) |
false | true | Full reset: Uninstall and reinstall app (clean slate) |
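The three behaviors can be written as a small decision function. This is a sketch, assuming both parameters default to false; resetBehavior is a hypothetical helper, and the combination with both flags set to true is not covered by the table, so it is flagged rather than interpreted.

```javascript
// Maps the noReset/fullReset flags to the behavior named in the table.
// Both flags are assumed to default to false.
function resetBehavior({ noReset = false, fullReset = false } = {}) {
  if (noReset && fullReset) return 'invalid: not covered by the table';
  if (noReset) return 'preserve state';
  if (fullReset) return 'full reset';
  return 'clear app data (default)';
}

console.log(resetBehavior({ noReset: true }));   // preserve state
console.log(resetBehavior());                    // clear app data (default)
console.log(resetBehavior({ fullReset: true })); // full reset
```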
Example with state preservation:
// Preserve login state between test runs
start_session({
platform: 'android',
appPath: '/path/to/app.apk',
deviceName: 'emulator-5554',
noReset: true, // Don't reset app state
fullReset: false, // Don't uninstall
autoGrantPermissions: true,
capabilities: {
'appium:chromedriverExecutable': '/path/to/chromedriver',
'appium:autoWebview': true
}
})
// App launches with existing user data, login tokens, preferences intact
Detach from Sessions:
The close_session tool supports a detach parameter that disconnects from the session without terminating it on the Appium server:
// Detach without killing the session
close_session({detach: true})
// Standard session termination (closes the app and removes session)
close_session({detach: false}) // or just close_session()
Sessions created with noReset: true or without appPath will automatically detach on close.
This is particularly useful when:
Preserving app state for manual testing continuation
Debugging multi-step workflows (leave session running between tool invocations)
Testing scenarios where you want the app to remain installed and in current state
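The detach rules for app sessions can be summarized in one predicate. The sketch below is one reading of the rules stated above, not the server's implementation; willDetach is a hypothetical function, and how an explicit detach: false interacts with noReset: true is an assumption here (automatic detach wins).

```javascript
// Sketch of the close-time decision for app sessions. Field names mirror the
// start_session/close_session parameters; the function itself is hypothetical.
function willDetach(session, closeOptions = {}) {
  if (closeOptions.detach === true) return true; // explicit detach
  if (session.noReset === true) return true;     // created with noReset: true
  if (!session.appPath) return true;             // created without appPath
  return false;                                  // standard termination
}

console.log(willDetach({ appPath: '/path/to/app.apk' }));                   // false
console.log(willDetach({ appPath: '/path/to/app.apk', noReset: true }));    // true
console.log(willDetach({ appPath: '/path/to/app.apk' }, { detach: true })); // true
```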
Smart Element Detection
Platform-specific element classification: Automatically identifies interactable elements vs layout containers
Android: Button, EditText, CheckBox vs ViewGroup, FrameLayout, ScrollView
iOS: Button, TextField, Switch vs View, StackView, CollectionView
Multiple locator strategies: Each element provides accessibility ID, resource ID, text, XPath, and platform-specific selectors
Viewport filtering: Control whether to get only visible elements or all elements including off-screen
Layout debugging: Optionally include container elements to understand UI hierarchy
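As an illustration of the classification idea, the sketch below sorts class names into interactable elements and containers using only the names listed above. Matching on the end of the class name is an assumption of this example; KINDS and classifyElement are hypothetical and do not reflect the server's actual detection code.

```javascript
// Class-name fragments taken from the lists above; matching on the end of the
// class name is an assumption of this sketch, not the server's actual logic.
const KINDS = {
  android: {
    interactable: ['Button', 'EditText', 'CheckBox'],
    container: ['ViewGroup', 'FrameLayout', 'ScrollView'],
  },
  ios: {
    interactable: ['Button', 'TextField', 'Switch'],
    container: ['View', 'StackView', 'CollectionView'],
  },
};

// Returns 'interactable', 'container', or 'unknown' for a class name.
function classifyElement(platform, className) {
  const kinds = KINDS[platform];
  if (!kinds) return 'unknown';
  const endsWithAny = (names) => names.some((name) => className.endsWith(name));
  if (endsWithAny(kinds.interactable)) return 'interactable';
  if (endsWithAny(kinds.container)) return 'container';
  return 'unknown';
}

console.log(classifyElement('android', 'android.widget.Button'));      // interactable
console.log(classifyElement('android', 'android.widget.FrameLayout')); // container
console.log(classifyElement('ios', 'XCUIElementTypeSwitch'));          // interactable
```

Filtering out the 'container' results corresponds to the default view; keeping them corresponds to the layout-debugging option.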
Automatic Permission & Alert Handling
Both iOS and Android sessions now support automatic handling of system permissions and alerts:
autoGrantPermissions (default: true): Automatically grants app permissions (camera, location, etc.)
autoAcceptAlerts (default: true): Automatically accepts system alerts and dialogs
autoDismissAlerts (optional): Set to true to dismiss alerts instead of accepting them
This eliminates the need to manually handle permission popups during automated testing.
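For readers who want to see how these options could translate into session capabilities, here is a minimal sketch. The defaults follow the text above and the appium:-prefixed names are standard Appium capabilities, but how the server actually maps its parameters onto them is an assumption; alertCapabilities is a hypothetical helper.

```javascript
// Builds Appium capabilities from the three options described above.
// The mapping is illustrative; the server may handle these differently.
function alertCapabilities({
  autoGrantPermissions = true,
  autoAcceptAlerts = true,
  autoDismissAlerts = false,
} = {}) {
  const caps = { 'appium:autoGrantPermissions': autoGrantPermissions };
  if (autoDismissAlerts) {
    caps['appium:autoDismissAlerts'] = true; // dismiss instead of accept
  } else {
    caps['appium:autoAcceptAlerts'] = autoAcceptAlerts;
  }
  return caps;
}

console.log(alertCapabilities());
console.log(alertCapabilities({ autoDismissAlerts: true }));
```

Only one of the accept/dismiss capabilities is emitted, so the two can never contradict each other in the resulting object.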
Technical Details
Built with: TypeScript, WebDriverIO, Appium
Browser Support: Chrome, Firefox, Edge (headed/headless, automated driver management), Safari (headed only; macOS)
Mobile Support: iOS (XCUITest) and Android (UiAutomator2/Espresso)
Protocol: Model Context Protocol (MCP) for Claude Desktop integration
Session Model: Single active session (browser or mobile app)
Data Format: TOON (Token-Oriented Object Notation) for efficient LLM communication
Element Detection: XML-based page source parsing with intelligent filtering and multi-strategy locator generation
Session Recording & Code Export
Every tool call is automatically recorded to a session history. You can inspect sessions and export runnable code via MCP resources — no extra tool calls needed:
wdio://sessions - lists all recorded sessions with type, timestamps, and step count
wdio://session/current/steps - step log for the active session
wdio://session/current/code - generated runnable WebdriverIO JS for the active session
wdio://session/{sessionId}/steps - step log for any past session by ID
wdio://session/{sessionId}/code - generated JS for any past session by ID
The generated script reconstructs the full session — including capabilities, navigation, clicks, and inputs — as a standalone import { remote } from 'webdriverio' file. For BrowserStack sessions it includes the full try/catch/finally with automatic session result marking.
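To make the idea of replaying a step log concrete, the sketch below turns a list of recorded steps into WebdriverIO source text. It is an illustration only: the step shape ({ tool, params }) and the tool names navigate, click, and type are placeholders invented for this example, and the server's real generator and output format may differ.

```javascript
// Hypothetical step shape: { tool, params }. The tool names used here
// (navigate, click, type) are placeholders, not the server's real tool names.
function generateScript(capabilities, steps) {
  const lines = [
    "import { remote } from 'webdriverio'",
    '',
    `const browser = await remote({ capabilities: ${JSON.stringify(capabilities)} })`,
    'try {',
  ];
  for (const { tool, params } of steps) {
    switch (tool) {
      case 'navigate':
        lines.push(`  await browser.url(${JSON.stringify(params.url)})`);
        break;
      case 'click':
        lines.push(`  await browser.$(${JSON.stringify(params.selector)}).click()`);
        break;
      case 'type':
        lines.push(
          `  await browser.$(${JSON.stringify(params.selector)}).setValue(${JSON.stringify(params.value)})`
        );
        break;
      default:
        lines.push(`  // unsupported step: ${tool}`);
    }
  }
  lines.push('} finally {', '  await browser.deleteSession()', '}');
  return lines.join('\n');
}

console.log(
  generateScript({ browserName: 'chrome' }, [
    { tool: 'navigate', params: { url: 'https://example.com' } },
    { tool: 'click', params: { selector: '#login' } },
  ])
);
```

The emitted text uses top-level await, so it would need to be saved as an ES module (for example with an .mjs extension) to run.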
Troubleshooting
Browser automation not working?
Ensure Chrome, Firefox, Edge, or Safari is installed (Safari requires macOS)
Try restarting Claude Desktop completely
Check that no other WebDriver instances are running
Mobile automation not working?
Verify Appium server is running: appium
Check device/emulator is running: adb devices (Android) or Xcode Devices (iOS)
Ensure correct platform drivers are installed
Verify app path is correct and accessible
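When checking Android devices from a script, the text printed by adb devices can be parsed into records. This is a small sketch, assuming the usual output of a header line followed by one "serial state" line per device; parseAdbDevices is a hypothetical helper.

```javascript
// Parses the text printed by `adb devices` into { serial, state } records.
// Header, blank lines, and "* daemon ..." notices are skipped.
function parseAdbDevices(output) {
  return output
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && !line.startsWith('*') && !line.startsWith('List of devices'))
    .map((line) => {
      const [serial, state] = line.split(/\s+/);
      return { serial, state };
    });
}

const sampleOutput = 'List of devices attached\nemulator-5554\tdevice\n\n';
console.log(parseAdbDevices(sampleOutput));
```

A device is usable when its state is device; states such as offline or unauthorized usually mean the emulator is still booting or USB debugging has not been approved.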
Found issues or have suggestions? Please share your feedback!