WebDriverIO MCP Server
Official
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | `{ "listChanged": true }` |
| resources | `{ "listChanged": true }` |
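Under the MCP specification, `listChanged: true` signals that the server may send a notification when its tool or resource list changes, after which a client would typically request the list again. The sketch below shows only that mapping; the helper name `refetchMethodFor` is hypothetical and no particular client library is assumed.

```typescript
// Hypothetical helper: given the method of an incoming MCP notification,
// return the JSON-RPC request method a client would send to refresh its
// cached list, or null when the notification is not a list-changed signal.
function refetchMethodFor(notificationMethod: string): string | null {
  switch (notificationMethod) {
    case "notifications/tools/list_changed":
      return "tools/list";
    case "notifications/resources/list_changed":
      return "resources/list";
    default:
      return null;
  }
}
```

A client that caches the tool list could call this helper from its notification handler and re-issue the returned request.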
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| start_session | Starts a browser or mobile automation session. Only one session can be active at a time; starting a new one closes the existing session first. Use platform "browser" with a browser name, or "ios"/"android" with deviceName. Set attach: true to connect to a running Chrome via CDP instead of launching a new browser. |
| close_session | Closes the current session or detaches without terminating. Detach preserves app state on the Appium server; sessions with noReset: true auto-detach by default. Closing a browser attach session terminates chromedriver but the Chrome process spawned by launch_chrome remains running. |
| launch_chrome | Launches Chrome with remote debugging enabled. Wipes and recreates a temporary profile directory on each call. Mode "newInstance" (default) runs alongside existing Chrome; "freshSession" starts with an empty profile. Set copyProfileFiles to copy cookies/logins from your Default profile; changes do not sync back. After launch, call start_session with attach: true to connect. Spawns a detached Chrome process that persists if the server exits. |
| emulate_device | Emulates a mobile or tablet device in the current browser session by setting viewport, DPR, user-agent, and touch events. Requires a BiDi-enabled session (start_session with capabilities: { webSocketUrl: true }). Omit device to list available presets. Pass "reset" to restore desktop defaults. Changes persist for all subsequent tool calls until reset or session close. Browser-only. |
| navigate | Loads a URL in the current tab and waits for the page load event. Resets page state: DOM, JS runtime, timers, and frame context are destroyed. Use instead of clicking links when the target URL is known. |
| switch_tab | Focuses a browser tab by window handle or 0-based index. All subsequent tool calls operate on the active tab. Provide handle OR index; use get_tabs to find them. Browser-only; use switch_context for mobile webviews. |
| switch_frame | Switches WebDriver frame context into an iframe by CSS/XPath selector, or back to top-level if selector is omitted. Changes persist: all subsequent click_element, set_value, and get_elements calls operate within the switched frame until you switch back. Waits up to 5s for the iframe. Browser-only. |
| scroll | Scrolls the page vertically by a pixel amount. Browser-only; for mobile scrolling use swipe. Only supports up/down; no horizontal scrolling. |
| click_element | Waits for an element, scrolls it into view, and fires element.click(). May trigger navigation, form submission, or modals. Browser sessions only; on iOS element.click() is silently ignored, so use tap_element instead. Default timeout: 3000ms. |
| set_value | Clears an input or textarea then types the given text character by character. Always replaces existing content because clearValue() runs first. Triggers input, change, and key events which may fire validation or autocomplete. Scrolls into view by default. |
| set_cookie | Sets a browser cookie on the active session. The browser must already be on the target domain; cookies cannot be set cross-domain. Use to inject session tokens or feature flags without login flows. |
| delete_cookies | Deletes all cookies or a single cookie by name from the current browser session. Irreversible: deleted cookies cannot be recovered. |
| tap_element | Taps a matched element via element.tap() or at absolute screen coordinates (x, y). No scroll-into-view or wait; the element must already be visible on screen. Use instead of click_element on iOS where element.click() is ignored. Provide selector OR both x and y. Mobile-only. |
| swipe | Performs a full-screen swipe gesture. Direction refers to content movement: "up" scrolls content upward (finger moves down). For browser scrolling use scroll; for dragging a specific element use drag_and_drop. No error if content cannot scroll further. Mobile-only. |
| drag_and_drop | Drags an element to another element or to relative x/y offsets. x and y are offsets from the source element, not absolute screen coordinates (unlike tap_element). Provide targetSelector OR both x and y. Mobile-only. |
| switch_context | Switches between native and webview automation contexts in a hybrid mobile app. In NATIVE_APP context, use accessibility IDs; in WEBVIEW_* context, use CSS/XPath. Changes persist for all subsequent commands. Accepts context name or 1-based index. Use get_contexts to discover available targets. Mobile-only. |
| rotate_device | Rotates a mobile device to portrait or landscape orientation. Waits for the OS rotation animation to complete. Use to test orientation-dependent layouts. Mobile-only; no effect in browser sessions. |
| hide_keyboard | Dismisses the on-screen keyboard on mobile. Call after text entry when the keyboard obscures elements. No-op if already hidden. Mobile-only. |
| set_geolocation | Overrides GPS coordinates for the session. Affects navigator.geolocation in browsers and location services on mobile. Location permissions must already be granted to the app. |
| execute_script | Executes arbitrary JavaScript in browser page context or Appium mobile: commands. Can read/modify DOM, trigger events, terminate apps, or run Android shell commands; use only when no dedicated tool covers the action. Browser: pass JS in script, use 'return' for values, string args matching selectors auto-resolve to elements. Mobile: use 'mobile: ' syntax in script with args array (e.g. "mobile: pressKey", "mobile: activateApp"). Prefer click_element/set_value/get_elements for standard interactions. |
| get_elements | Returns interactable elements on the current page with selectors, text, and bounding boxes. Supports filtering by element type, viewport visibility, and pagination. Use when the wdio://session/current/elements resource does not return desired elements. |
| list_apps | Lists apps uploaded to BrowserStack App Automate. Reads BROWSERSTACK_USERNAME and BROWSERSTACK_ACCESS_KEY from environment. |
| upload_app | Uploads a local .apk or .ipa to BrowserStack App Automate. Returns a bs:// URL for use in start_session. |
| get_screenshot | Takes a screenshot of the current page or screen and returns a base64-encoded image, resized and compressed for model context limits. |
| get_accessibility_tree | Returns the page accessibility tree with roles, names, and selectors. Browser-only. Supports filtering by ARIA roles and pagination via limit/offset. |
| get_tabs | Lists all browser tabs with handle, title, URL, and which is active. Use before switch_tab to find the target handle or index. Browser-only. |
| get_contexts | Returns available automation contexts and the currently active one. Use before switch_context to discover NATIVE_APP and WEBVIEW_* targets. Mobile-only. |
| get_app_state | Returns the current state of a mobile app: not installed, not running, background, or foreground. Mobile-only. |
| get_cookies | Returns all cookies for the current session, or a single cookie by name. Use to verify auth state, session tokens, or feature flags after login flows. |
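The attach flow described in the table (launch Chrome, attach a session, drive the page, close) can be sketched as a sequence of MCP `tools/call` requests. This is an illustration under assumptions: `platform` and `attach` come from the descriptions above, while the `url` and `selector` argument names, the example URL, and the `toolCall` helper are hypothetical and may not match the server's actual input schemas.

```typescript
// Shape of an MCP JSON-RPC "tools/call" request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: builds one tools/call request with an increasing id.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
}

// Requests a client might send, in order, for the Chrome attach flow.
const attachFlow: ToolCallRequest[] = [
  toolCall("launch_chrome", {}),
  toolCall("start_session", { platform: "browser", attach: true }),
  toolCall("navigate", { url: "https://example.com" }), // `url` is an assumed argument name
  toolCall("click_element", { selector: "a" }), // `selector` is an assumed argument name
  toolCall("close_session", {}),
];
```

Because only one session can be active at a time, a second `start_session` in the same sequence would close the first, so the flow keeps a single session from attach to close.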
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| sessions | JSON index of all browser and app sessions with metadata and step counts |
| session-current-steps | JSON step log for the currently active session |
| session-current-code | Generated WebdriverIO JS code for the currently active session |
| browserstack-local-binary | BrowserStack Local binary download URL and daemon setup instructions for the current platform. MUST be read and followed before using browserstackLocal: true in start_session. |
| session-current-capabilities | Raw capabilities returned by the WebDriver/Appium server for the current session. Use for debugging — shows the actual values the driver accepted, including defaults applied by BrowserStack or Appium. |
| session-current-elements | Interactable elements on the current page. Prefer this over a screenshot: it returns ready-to-use selectors, is faster, and uses far fewer tokens. Use a screenshot only for visual verification or debugging. |
| session-current-accessibility | Accessibility tree for the current page. Returns all elements by default. |
| session-current-screenshot | Screenshot of the current page |
| session-current-cookies | Cookies for the current session |
| session-current-contexts | Available contexts (NATIVE_APP, WEBVIEW) |
| session-current-context | Currently active context |
| session-current-geolocation | Current device geolocation |
| session-current-tabs | Browser tabs in the current session |
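Resources are read with the MCP `resources/read` request, addressed by URI. The sketch below builds such a request for the elements resource, using the `wdio://session/current/elements` URI quoted in the get_elements tool description; the URIs of the other resources are not listed here, so none are guessed. The `readResource` helper is hypothetical.

```typescript
// Shape of an MCP JSON-RPC "resources/read" request.
interface ResourceReadRequest {
  jsonrpc: "2.0";
  id: number;
  method: "resources/read";
  params: { uri: string };
}

// URI taken from the get_elements tool description.
const ELEMENTS_URI = "wdio://session/current/elements";

// Hypothetical helper: builds a resources/read request for a given URI.
function readResource(id: number, uri: string): ResourceReadRequest {
  return { jsonrpc: "2.0", id: id, method: "resources/read", params: { uri: uri } };
}

const readElements = readResource(1, ELEMENTS_URI);
```

Reading this resource first and falling back to the get_elements tool only when filtering or pagination is needed matches the preference stated in the table.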
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/webdriverio/mcp'
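The same request can be sketched in TypeScript using the standard `fetch` API. The response schema is not described here, so the result is left as `unknown`; the names `MCP_API_BASE`, `serverUrl`, `fetchServer`, `namespace`, and `slug` are hypothetical labels for the two path segments in the curl URL.

```typescript
const MCP_API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Builds the directory API URL for one server from its two path segments.
function serverUrl(namespace: string, slug: string): string {
  return MCP_API_BASE + "/" + encodeURIComponent(namespace) + "/" + encodeURIComponent(slug);
}

// Fetches the server record and returns the parsed JSON body.
// Rejects when the HTTP status is not in the 2xx range.
function fetchServer(namespace: string, slug: string): Promise<unknown> {
  return fetch(serverUrl(namespace, slug)).then((res) => {
    if (!res.ok) {
      throw new Error("Request failed with status " + res.status);
    }
    return res.json() as Promise<unknown>;
  });
}
```

Calling `fetchServer("webdriverio", "mcp")` would request the same URL as the curl command; callers should validate the returned value before reading fields from it.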