webdriverio

WebDriverIO MCP Server

Official

Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default

No arguments

Capabilities

Features and capabilities supported by this server

Capability | Details
tools
{
  "listChanged": true
}
resources
{
  "listChanged": true
}

Tools

Functions exposed to the LLM to take actions

Name | Description
start_session

Starts a browser or mobile automation session. Only one active session at a time — starting a new one closes the existing session first. Use platform "browser" with a browser name, or "ios"/"android" with deviceName. Set attach: true to connect to a running Chrome via CDP instead of launching a new browser.

close_session

Closes the current session or detaches without terminating. Detach preserves app state on the Appium server — sessions with noReset: true auto-detach by default. Closing a browser attach session terminates chromedriver but the Chrome process spawned by launch_chrome remains running.

launch_chrome

Launches Chrome with remote debugging enabled. Wipes and recreates a temporary profile directory on each call. Mode "newInstance" (default) runs alongside existing Chrome; "freshSession" starts with an empty profile. Set copyProfileFiles to copy cookies/logins from your Default profile — changes do not sync back. After launch, call start_session with attach: true to connect. Spawns a detached Chrome process that persists if the server exits.
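
The launch-then-attach sequence described above can be sketched as two MCP `tools/call` requests. This is a minimal sketch, not the server's documented schema: the `mode`, `platform`, and `attach` names come from the descriptions, while the `browser` key is an assumed name for the browser-name argument.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: launch Chrome with remote debugging ("newInstance" is the stated default).
launch = tool_call("launch_chrome", {"mode": "newInstance"})

# Step 2: attach to the running Chrome instead of launching a new browser.
# The "browser" key is an assumption; check the tool's input schema.
attach = tool_call(
    "start_session",
    {"platform": "browser", "browser": "chrome", "attach": True},
)

print(json.dumps([launch, attach], indent=2))
```

An MCP client would send these in order over its transport; the sketch only builds the payloads.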

emulate_device

Emulates a mobile or tablet device in the current browser session by setting viewport, DPR, user-agent, and touch events. Requires a BiDi-enabled session (start_session with capabilities: { webSocketUrl: true }). Omit device to list available presets. Pass "reset" to restore desktop defaults. Changes persist for all subsequent tool calls until reset or session close. Browser-only.

navigate

Loads a URL in the current tab and waits for the page load event. Resets page state — DOM, JS runtime, timers, and frame context are destroyed. Use instead of clicking links when the target URL is known.

switch_tab

Focuses a browser tab by window handle or 0-based index. All subsequent tool calls operate on the active tab. Provide handle OR index — use get_tabs to find them. Browser-only; use switch_context for mobile webviews.

switch_frame

Switches WebDriver frame context into an iframe by CSS/XPath selector, or back to top-level if selector is omitted. Changes persist — all subsequent click_element, set_value, get_elements calls operate within the switched frame until you switch back. Waits up to 5s for the iframe. Browser-only.

scroll

Scrolls the page vertically by a pixel amount. Browser-only — for mobile scrolling use swipe. Only supports up/down; no horizontal scrolling.

click_element

Waits for an element, scrolls it into view, and fires element.click(). May trigger navigation, form submission, or modals. Browser sessions only — on iOS element.click() is silently ignored; use tap_element instead. Default timeout: 3000ms.

set_value

Clears an input or textarea then types the given text character by character. Always replaces existing content — clearValue() runs first. Triggers input, change, and key events which may fire validation or autocomplete. Scrolls into view by default.

set_cookie

Sets a browser cookie on the active session. The browser must already be on the target domain — cookies cannot be set cross-domain. Use to inject session tokens or feature flags without login flows.
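
Because cookies cannot be set cross-domain, a client may want to check the current page's host before calling set_cookie. The helper below is a hypothetical client-side pre-check using the usual suffix-matching rule for cookie domains; it is not taken from the server's code, and the server may validate differently.

```python
from urllib.parse import urlsplit


def cookie_domain_matches(current_url: str, cookie_domain: str) -> bool:
    """Return True if the page host equals the cookie domain or is a subdomain of it."""
    host = (urlsplit(current_url).hostname or "").lower()
    domain = cookie_domain.lstrip(".").lower()
    return bool(domain) and (host == domain or host.endswith("." + domain))


# Navigate to the target domain first if this returns False.
print(cookie_domain_matches("https://app.example.com/login", "example.com"))
print(cookie_domain_matches("https://example.org/", "example.com"))
```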

delete_cookies

Deletes all cookies or a single cookie by name from the current browser session. Irreversible — deleted cookies cannot be recovered.

tap_element

Taps a matched element via element.tap() or at absolute screen coordinates (x, y). No scroll-into-view or wait — element must already be visible on screen. Use instead of click_element on iOS where element.click() is ignored. Provide selector OR both x and y. Mobile-only.
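
The "selector OR both x and y" rule can be enforced before a call is sent. The function below is an illustrative sketch: `tap_arguments` is a hypothetical helper, and the argument keys mirror the names used in the description rather than a confirmed schema.

```python
from typing import Optional


def tap_arguments(selector: Optional[str] = None,
                  x: Optional[int] = None,
                  y: Optional[int] = None) -> dict:
    """Validate and build tap_element arguments: a selector, or x and y together."""
    if (x is None) != (y is None):
        raise ValueError("x and y must be provided together")
    has_coords = x is not None and y is not None
    if (selector is not None) == has_coords:
        # Either both forms were given or neither was.
        raise ValueError("provide selector OR both x and y")
    if selector is not None:
        return {"selector": selector}
    return {"x": x, "y": y}


print(tap_arguments(selector="~Login"))
print(tap_arguments(x=120, y=480))
```

The same pattern would apply to drag_and_drop (targetSelector OR both x and y), keeping in mind that its x and y are offsets from the source element rather than absolute coordinates.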

swipe

Performs a full-screen swipe gesture. Direction is content movement — "up" scrolls content upward (finger moves down). For browser scrolling use scroll; for dragging a specific element use drag_and_drop. No error if content cannot scroll further. Mobile-only.

drag_and_drop

Drags an element to another element or to relative x/y offsets. x and y are offsets from the source element, not absolute screen coordinates (unlike tap_element). Provide targetSelector OR both x and y. Mobile-only.

switch_context

Switches between native and webview automation contexts in a hybrid mobile app. In NATIVE_APP context, use accessibility IDs; in WEBVIEW_* context, use CSS/XPath. Changes persist for all subsequent commands. Accepts context name or 1-based index. Use get_contexts to discover available targets. Mobile-only.
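
Note that switch_context uses a 1-based index while switch_tab uses a 0-based one. The sketch below shows how a name or 1-based index might be resolved against a context list such as the one get_contexts returns; `resolve_context` and the sample context names are hypothetical.

```python
from typing import List, Union


def resolve_context(contexts: List[str], target: Union[str, int]) -> str:
    """Resolve a context by name or by 1-based index."""
    if isinstance(target, int):
        if not 1 <= target <= len(contexts):
            raise IndexError(f"context index must be between 1 and {len(contexts)}")
        return contexts[target - 1]  # convert 1-based to Python's 0-based
    if target not in contexts:
        raise ValueError(f"unknown context: {target}")
    return target


# Hypothetical list for illustration only.
contexts = ["NATIVE_APP", "WEBVIEW_com.example.app"]
print(resolve_context(contexts, 1))
print(resolve_context(contexts, "WEBVIEW_com.example.app"))
```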

rotate_device

Rotates a mobile device to portrait or landscape orientation. Waits for the OS rotation animation to complete. Use to test orientation-dependent layouts. Mobile-only; no effect in browser sessions.

hide_keyboard

Dismisses the on-screen keyboard on mobile. Call after text entry when the keyboard obscures elements. No-op if already hidden. Mobile-only.

set_geolocation

Overrides GPS coordinates for the session. Affects navigator.geolocation in browsers and location services on mobile. Location permissions must already be granted to the app.

execute_script

Executes arbitrary JavaScript in browser page context or Appium mobile: commands. Can read/modify DOM, trigger events, terminate apps, or run Android shell commands — use only when no dedicated tool covers the action. Browser: pass JS in script, use 'return' for values, string args matching selectors auto-resolve to elements. Mobile: use 'mobile: ' syntax in script with args array (e.g. "mobile: pressKey", "mobile: activateApp"). Prefer click_element/set_value/get_elements for standard interactions.
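
The two calling styles for execute_script differ only in what goes into `script` and `args`. A minimal sketch of both payloads follows; the helper names are hypothetical, `com.example.app` is a placeholder, and the `appId` key assumes the Android (UiAutomator2) form of `mobile: activateApp`.

```python
from typing import Any, List, Optional


def execute_script_arguments(script: str, args: Optional[List[Any]] = None) -> dict:
    """Build an execute_script arguments object from a script and optional args."""
    payload: dict = {"script": script}
    if args:
        payload["args"] = args
    return payload


def is_mobile_command(script: str) -> bool:
    """Appium extension commands are prefixed with 'mobile:'."""
    return script.strip().startswith("mobile:")


# Browser: plain JavaScript, using 'return' to get a value back.
browser_call = execute_script_arguments("return document.title")

# Mobile: an Appium 'mobile:' command with its parameters in the args array.
mobile_call = execute_script_arguments(
    "mobile: activateApp", [{"appId": "com.example.app"}]
)

print(browser_call, is_mobile_command(browser_call["script"]))
print(mobile_call, is_mobile_command(mobile_call["script"]))
```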

get_elements

Returns interactable elements on the current page with selectors, text, and bounding boxes. Supports filtering by element type, viewport visibility, and pagination. Use when the wdio://session/current/elements resource does not return the desired elements.

list_apps

Lists apps uploaded to BrowserStack App Automate. Reads BROWSERSTACK_USERNAME and BROWSERSTACK_ACCESS_KEY from the environment.

upload_app

Uploads a local .apk or .ipa to BrowserStack App Automate. Returns a bs:// URL for use in start_session.

get_screenshot

Takes a screenshot of the current page or screen and returns a base64-encoded image, resized and compressed for model context limits.
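
On the client side, the base64 payload has to be decoded before it can be saved or inspected. The sketch below decodes a string and sniffs the image type from its magic bytes; the description does not say which format the server emits, so both PNG and JPEG are checked, and the sample data is a synthetic header rather than a real screenshot.

```python
import base64


def decode_screenshot(b64_data: str) -> bytes:
    """Decode a base64-encoded image payload, rejecting malformed input."""
    return base64.b64decode(b64_data, validate=True)


def sniff_image_type(data: bytes) -> str:
    """Guess the image type from its leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return "unknown"


# Synthetic stand-in: a PNG signature followed by padding bytes.
sample = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8).decode("ascii")
print(sniff_image_type(decode_screenshot(sample)))
```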

get_accessibility_tree

Returns the page accessibility tree with roles, names, and selectors. Browser-only. Supports filtering by ARIA roles and pagination via limit/offset.
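
Pagination via limit/offset usually means looping until a short page comes back. The sketch below illustrates that loop with a stand-in fetch function in place of a real tool call; `paginate`, `fake_fetch`, and the node shape are hypothetical.

```python
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int, int], List[dict]],
             limit: int = 50) -> Iterator[dict]:
    """Yield every item by requesting pages of `limit` items at increasing offsets."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # a short (or empty) page means we reached the end
            break
        offset += limit


# Stand-in for get_accessibility_tree: slices a local list instead of calling the server.
NODES = [{"role": "button", "name": f"Item {i}"} for i in range(120)]


def fake_fetch(limit: int, offset: int) -> List[dict]:
    return NODES[offset:offset + limit]


all_nodes = list(paginate(fake_fetch, limit=50))
print(len(all_nodes))  # 120
```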

get_tabs

Lists all browser tabs with handle, title, URL, and which is active. Use before switch_tab to find the target handle or index. Browser-only.

get_contexts

Returns available automation contexts and the currently active one. Use before switch_context to discover NATIVE_APP and WEBVIEW_* targets. Mobile-only.

get_app_state

Returns the current state of a mobile app: not installed, not running, background, or foreground. Mobile-only.

get_cookies

Returns all cookies for the current session, or a single cookie by name. Use to verify auth state, session tokens, or feature flags after login flows.

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description
sessions

JSON index of all browser and app sessions with metadata and step counts.

session-current-steps

JSON step log for the currently active session.

session-current-code

Generated WebdriverIO JS code for the currently active session.

browserstack-local-binary

BrowserStack Local binary download URL and daemon setup instructions for the current platform. MUST be read and followed before using browserstackLocal: true in start_session.

session-current-capabilities

Raw capabilities returned by the WebDriver/Appium server for the current session. Use for debugging: shows the actual values the driver accepted, including defaults applied by BrowserStack or Appium.

session-current-elements

Interactable elements on the current page. Prefer this over a screenshot: it returns ready-to-use selectors, is faster, and uses far fewer tokens. Only use a screenshot for visual verification or debugging.

session-current-accessibility

Accessibility tree for the current page. Returns all elements by default.

session-current-screenshot

Screenshot of the current page.

session-current-cookies

Cookies for the current session.

session-current-contexts

Available contexts (NATIVE_APP, WEBVIEW).

session-current-context

Currently active context.

session-current-geolocation

Current device geolocation.

session-current-tabs

Browser tabs in the current session.
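
Resources are read with an MCP `resources/read` request that names a URI. The sketch below builds one for the elements resource, using the `wdio://session/current/elements` URI mentioned in the get_elements description; URIs for the other resources are not given on this page and should be taken from the server's resource listing rather than guessed.

```python
import json


def read_resource(uri: str, request_id: int = 1) -> dict:
    """Build an MCP `resources/read` JSON-RPC request for the given resource URI."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


request = read_resource("wdio://session/current/elements")
print(json.dumps(request, indent=2))
```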

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/webdriverio/mcp'
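
The same request can be made from Python's standard library. This is a sketch that assumes the endpoint returns JSON, which the page does not state; `build_request` and `fetch_server_info` are illustrative names, and the network call happens only when `fetch_server_info` is invoked.

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/webdriverio/mcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request equivalent to the curl command above."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = API_URL) -> dict:
    """Send the request and parse the body as JSON (assumed response format)."""
    with urllib.request.urlopen(build_request(url), timeout=10) as response:
        return json.load(response)


request = build_request()
print(request.get_method(), request.full_url)
```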

If you have feedback or need assistance with the MCP directory API, please join our Discord server.