Glama

keyboard

Send keyboard input to a window: type text or press key combinations like Ctrl+C, with automatic focus and safety guards.

Instructions

Purpose: Send keyboard input to a window: 'type' for text, 'press' for key combos.

Details: action='type' inserts text (auto-clipboard for non-ASCII / IME-safe). action='press' sends key combos like 'ctrl+c'/'alt+tab'. Pass windowTitle to auto-focus and auto-guard (verifies identity, foreground, modal) before input. Omitting windowTitle acts on the active window (unguarded).

Prefer: Use windowTitle to auto-focus before injection. Set lensId to enable perception guards. Use desktop_act({action:'setValue'}) for form fields backed by UIA ValuePattern.

Caveats:
win+r/win+x/win+s/win+l blocked for security.
action='type' does not handle IME composition for CJK; use use_clipboard=true or desktop_act({action:'setValue'}) instead.
Non-ASCII punctuation (em-dash etc.) auto-routes via clipboard to prevent Chrome address-bar hijack; pass forceKeystrokes:true to disable.
Background mode (PostMessage/WM_CHAR) auto-engages for known terminal windows (Windows Terminal / cmd / PowerShell) so keystrokes survive user-side foreground changes; DTM_BG_AUTO=1 enables it globally.
Foreground-path keystrokes for non-terminal apps run with a per-chunk foreground guard (Phase B): when the user grabs focus mid-stream, the call aborts with FocusLostDuringType and returns context.typed/context.remaining so the caller can re-focus and resume; pass abortOnFocusLoss:false to disable.

Examples:
keyboard({action:'type', text:'hello', windowTitle:'Notepad'}) → text injected (guarded)
keyboard({action:'type', text:'hello'}) → text injected (unguarded)
keyboard({action:'press', keys:'ctrl+c'}) → copy
keyboard({action:'press', keys:'escape', windowTitle:'Dialog'}) → dismiss dialog
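Two of the caveats above can be checked on the caller side before a call is sent. The following is a minimal TypeScript sketch, not part of the server: `isBlockedCombo` and `hasNonAscii` are hypothetical helper names, and the way the server normalizes `keys` is an assumption (here: lower-cased with whitespace removed).

```typescript
// Hypothetical caller-side pre-flight helpers; not part of the server's API.

// Key combos the description lists as blocked for security.
const BLOCKED_COMBOS: ReadonlySet<string> = new Set(["win+r", "win+x", "win+s", "win+l"]);

// Assumption: combos compare case-insensitively and ignore whitespace.
function isBlockedCombo(keys: string): boolean {
  const normalized = keys.toLowerCase().replace(/\s+/g, "");
  return BLOCKED_COMBOS.has(normalized);
}

// True when the text contains any character outside 7-bit ASCII. Per the
// description, such text may be routed via the clipboard unless
// forceKeystrokes:true is passed.
function hasNonAscii(text: string): boolean {
  return /[^\x00-\x7F]/.test(text);
}
```

A caller could use `isBlockedCombo` to skip a 'press' call that is expected to be rejected, and `hasNonAscii` to anticipate that a 'type' call may touch the clipboard.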

Input Schema

Name | Required | Description | Default

No arguments

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses behaviors extensively: auto-clipboard for non-ASCII, background mode for terminals, the per-chunk foreground guard (Phase B) that aborts on focus loss, security blocks (win+r etc.), and IME handling. This is exceptionally transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with sections (Purpose, Details, Prefer, Caveats, Examples) but is quite verbose. It is front-loaded with core purpose, but could be more concise. Every sentence adds value, but overall length may overwhelm.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (two action variants, many parameters, no output schema), the description is remarkably complete. It covers behavior for different actions, parameter interactions (e.g., abortOnFocusLoss), security blocks, error scenarios (FocusLostDuringType), and alternatives (desktop_act). Very comprehensive.
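The FocusLostDuringType error scenario mentioned above implies a resume step on the caller's side. A minimal TypeScript sketch of that step follows. The description names `context.typed` and `context.remaining` but does not give their types or the name of the error field, so the `FocusLostResult` shape (including `code`) and the helper `buildResumeCall` are assumptions made for illustration.

```typescript
// Assumed shapes: the description names these fields but not their types.
interface TypeArgs {
  action: "type";
  text: string;
  windowTitle?: string;
}

interface FocusLostResult {
  code: "FocusLostDuringType"; // assumed field name for the error identifier
  context: { typed: string; remaining: string }; // assumed to be strings
}

// Build the follow-up call that types only the text not yet delivered.
// Returns null when nothing remains to be typed.
function buildResumeCall(original: TypeArgs, failure: FocusLostResult): TypeArgs | null {
  if (failure.context.remaining.length === 0) return null;
  return { ...original, text: failure.context.remaining };
}

// Example: the user grabbed focus after "hello " had been typed.
const original: TypeArgs = { action: "type", text: "hello world", windowTitle: "Notepad" };
const failure: FocusLostResult = {
  code: "FocusLostDuringType",
  context: { typed: "hello ", remaining: "world" },
};
const resume = buildResumeCall(original, failure);
```

Keeping `windowTitle` in the resumed call means the follow-up would go through the same auto-focus and guard path as the first attempt.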

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so every parameter is described in the schema. The description adds context beyond the schema: it explains auto-clipboard routing, the rationale for use_clipboard, the default logic of abortOnFocusLoss, and gives examples. It enriches the schema, but the schema already covers the basics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: "Send keyboard input to a window: 'type' for text, 'press' for key combos." It uses specific verbs and resources, and distinguishes between two actions (type vs press). Sibling tools like mouse_click and desktop_act serve different purposes, so there is no confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: "Prefer: Use windowTitle to auto-focus before injection. Set lensId to enable perception guards. Use desktop_act({action:'setValue'}) for form fields backed by UIA ValuePattern." It also lists caveats and blocked keys. However, it does not state explicitly when not to use the tool relative to its siblings, though the guidance is otherwise clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Harusame64/desktop-touch-mcp'
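The same request can be sketched in TypeScript. This assumes a runtime with a global `fetch` (browsers, Node 18 or later) and that the endpoint returns JSON; the response schema is not described here, so it is left as `unknown`. The names `serverUrl` and `fetchServer` are hypothetical helpers, not part of any Glama client library.

```typescript
// Base URL taken from the curl example above; the two path segments are the
// owner and server name shown in that URL.
const MCP_API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Build the server URL, escaping each path segment.
function serverUrl(owner: string, server: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(server)}`;
}

// Assumption: the endpoint responds with JSON on success.
async function fetchServer(owner: string, server: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, server));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `fetchServer("Harusame64", "desktop-touch-mcp")` would issue the same GET request as the curl command.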

If you have feedback or need assistance with the MCP directory API, please join our Discord server.