Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | `{"listChanged": false}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |
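As a rough illustration of how a client might read these flags, the sketch below loads the capability object from the table and checks individual flags. The helper names `supports` and `should_cache_tool_list` are hypothetical and not part of this server; the caching policy is only one possible client-side choice, assuming that `"listChanged": false` means the server will not send `notifications/tools/list_changed`.

```python
import json

# Capability object exactly as listed in the table above.
CAPABILITIES = json.loads("""
{
  "tools": {"listChanged": false},
  "prompts": {"listChanged": false},
  "resources": {"subscribe": false, "listChanged": false},
  "experimental": {}
}
""")


def supports(capabilities: dict, feature: str, flag: str) -> bool:
    """Return True only if the feature is declared and the flag is explicitly true.

    Hypothetical helper: missing features or flags are treated as unsupported.
    """
    return bool(capabilities.get(feature, {}).get(flag, False))


def should_cache_tool_list(capabilities: dict) -> bool:
    """Hypothetical client policy: cache the tool list when tools are declared
    and the server does not announce list changes."""
    return "tools" in capabilities and not supports(capabilities, "tools", "listChanged")
```

With the values above, a client following this policy could fetch the tool list once per session and would not attempt `resources/subscribe`, since `subscribe` is `false`.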
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| list_monitors | List all connected monitors with resolution, position, and active workspace. |
| list_workspaces | List all active workspaces with window counts. |
| switch_workspace | Switch to a workspace by name or number. Args: workspace: Workspace name or number (e.g. "1", "3", "special:scratchpad") |
| get_cursor_position | Get the current cursor position in absolute layout coordinates. |
| list_windows | List all open windows with class, title, size, and position. Args: workspace: Filter to a specific workspace number; monitor: Filter to a specific monitor name |
| get_active_window | Get details about the currently focused window. |
| focus_window | Focus a window by class or title. Args: target: Window selector — "class:firefox", "title:My Document", etc. |
| close_window | Close a window (sends WM_CLOSE — apps can show save dialogs). Args: target: Window selector (e.g. "class:firefox"). If omitted, closes the active window. |
| move_window | Move a window to a position or workspace. Args: target: Window selector (if omitted, moves the active window); x: Target X position in pixels; y: Target Y position in pixels; workspace: Target workspace name/number to move the window to |
| resize_window | Resize a window to exact pixel dimensions. Args: width: Target width in pixels; height: Target height in pixels; target: Window selector (if omitted, resizes the active window) |
| toggle_fullscreen | Toggle fullscreen for the active window. Args: mode: "fullscreen" for real fullscreen, "maximize" for maximized (keeps bar) |
| toggle_floating | Toggle floating mode for a window. Args: target: Window selector. If omitted, toggles the active window. |
| clipboard_read | Read the current clipboard contents as text. |
| clipboard_write | Write text to the clipboard. Args: text: The text to copy to the clipboard |
| launch_app | Launch an application (detached, via Hyprland). Args: command: The command to run (e.g. "firefox", "kitty", "nautilus ~/Documents") |
| mouse_move | Move the mouse cursor to absolute layout coordinates. Uses Hyprland's native movecursor, which is pixel-accurate and has no acceleration issues. Args: x: Target X coordinate; y: Target Y coordinate |
| mouse_click | Click the mouse at a position (or current position if no coordinates given). Args: button: "left", "right", or "middle"; x: X coordinate to click at (optional, clicks at current position if omitted); y: Y coordinate to click at (optional); double: Whether to double-click |
| mouse_scroll | Scroll the mouse wheel. Args: direction: "up" or "down"; amount: Number of scroll steps (default 3); x: X coordinate to scroll at (optional); y: Y coordinate to scroll at (optional) |
| mouse_drag | Drag from one position to another. Args: start_x: Starting X coordinate; start_y: Starting Y coordinate; end_x: Ending X coordinate; end_y: Ending Y coordinate; button: Mouse button to hold during drag ("left", "right", "middle") |
| type_text | Type text as if from a keyboard. Args: text: The text to type; delay_ms: Delay between keystrokes in milliseconds (0 = instant) |
| key_press | Press a key combination. Uses Hyprland's native sendshortcut, which can target specific windows without focusing them. Args: keys: Key combo string like "ctrl+c", "alt+F4", "super+1", "Return"; target: Optional window selector to send the key to, e.g. "class:firefox" (if omitted, sends to the active window) |
| send_shortcut | Send a keyboard shortcut via Hyprland (can target specific windows). Args: mods: Modifier keys (e.g. "CTRL", "SUPER SHIFT", "ALT CTRL", or "" for none); key: Key name (e.g. "c", "F4", "Return", "space"); target: Optional window selector, e.g. "class:firefox" (empty = active window) |
| screenshot | Take a screenshot and return it as an inline image, together with a coordinate mapping guide for converting image pixel positions to absolute screen coordinates for the mouse tools. Supports three capture modes: monitor, window, or region. Args: monitor: Capture a specific monitor (e.g. "DP-1"), default: all monitors; window: Capture a specific window by selector (e.g. "class:firefox"); region: Capture a region as "X,Y WxH" (e.g. "100,200 800x600"); max_width: Maximum output width in pixels (default 1024, lower = smaller output); quality: JPEG quality 1-100 (default 60, lower = smaller output); include_cursor: Whether to include the cursor in the screenshot |
| find_text_on_screen | Find text on screen using OCR. Takes a screenshot, runs OCR, and returns the locations of all occurrences of the target text. Coordinates are in absolute screen space, ready to pass to mouse_click. Args: target: Text to find (case-insensitive, supports multi-word); monitor: Limit search to a specific monitor; window: Limit search to a specific window (e.g. "class:discord"); region: Limit search to a region "X,Y WxH"; scope: "auto" (default) captures just the active window for better accuracy, "full" captures the entire desktop |
| click_text | Find text on screen and click it: screenshot + OCR + click in one call. By default, searches only the active window for better accuracy and speed. Args: target: Text to find and click (case-insensitive); button: Mouse button ("left", "right", "middle"); double: Whether to double-click; monitor: Limit search to a specific monitor; window: Limit search to a specific window (e.g. "class:discord"); region: Limit search to a region "X,Y WxH"; occurrence: Which match to click if multiple found (1 = first/best, 2 = second, etc.); scope: "auto" (default) searches the active window, "full" searches the entire desktop |
| type_into | Find a text input field, click it, type text, and optionally submit. Combines focus + OCR + click + type + Enter into one action. Searches for placeholder text in the active window to find the input field. Args: text: The text to type; input_hint: Placeholder or label text near the input field to click on, e.g. "Type a message", "Search", "Message" (if omitted, tries common placeholders); submit: Whether to press Enter after typing (default False); window: Target a specific window, e.g. "class:signal" (default: active window) |
| screenshot_with_ocr | Take a screenshot AND run OCR, returning both the image and extracted text. More efficient than calling screenshot + find_text_on_screen separately. The text includes screen coordinates for every detected word. Args: monitor: Capture a specific monitor; window: Capture a specific window (e.g. "class:discord"); region: Capture a region as "X,Y WxH"; max_width: Maximum output width for the image (default 1024); quality: JPEG quality for the image (default 60); scope: "auto" (default) captures the active window, "full" captures the entire desktop |
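The tools in the table above are invoked through MCP `tools/call` requests. The following is a minimal sketch of building such a request, formatting a region string in the `"X,Y WxH"` form the table describes, and converting a pixel position in a downscaled screenshot back to absolute coordinates. All helper names are hypothetical. The conversion assumes the image is a uniformly scaled capture of the given region; the actual coordinate mapping guide returned by `screenshot` may differ and should take precedence.

```python
import re


def build_tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def format_region(x: int, y: int, width: int, height: int) -> str:
    """Format a capture region as "X,Y WxH", e.g. "100,200 800x600"."""
    return f"{x},{y} {width}x{height}"


_REGION_RE = re.compile(r"^(-?\d+),(-?\d+) (\d+)x(\d+)$")


def parse_region(region: str) -> tuple[int, int, int, int]:
    """Parse "X,Y WxH" into (x, y, width, height); raise ValueError if malformed."""
    match = _REGION_RE.match(region)
    if match is None:
        raise ValueError(f"expected 'X,Y WxH', got {region!r}")
    x, y, width, height = (int(group) for group in match.groups())
    return x, y, width, height


def image_to_screen(px: int, py: int, region: str, image_width: int) -> tuple[int, int]:
    """Map an image pixel to absolute coordinates.

    Assumption (not confirmed by the source): the image is the captured region
    scaled uniformly so that its width equals `image_width`.
    """
    x, y, width, _height = parse_region(region)
    scale = width / image_width
    return round(x + px * scale), round(y + py * scale)


# Usage: capture a region, then click a point found in the returned image.
region = format_region(100, 200, 800, 600)
capture = build_tool_call(1, "screenshot", {"region": region, "max_width": 400})
click_x, click_y = image_to_screen(50, 25, region, image_width=400)
click = build_tool_call(2, "mouse_click", {"button": "left", "x": click_x, "y": click_y})
```

Here the 800-pixel-wide region is assumed to be returned as a 400-pixel-wide image, so each image pixel spans two screen pixels and the point (50, 25) in the image maps to (200, 250) in absolute coordinates.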
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |