Glama

hyprland-mcp

by alderban107

type_into

Automatically locate text input fields using OCR, click to focus, type specified text, and optionally submit with Enter for desktop automation in Hyprland environments.

Instructions

Find a text input field, click it, type text, and optionally submit.

Combines focus + OCR + click + type + Enter into one action. Searches for placeholder text in the active window to find the input field.

Args:

- text: The text to type.
- input_hint: Placeholder or label text near the input field to click on (e.g. "Type a message", "Search", "Message"). If omitted, tries common placeholders.
- submit: Whether to press Enter after typing (default: False).
- window: Target a specific window (e.g. "class:signal"). Default: active window.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| text | Yes | The text to type | (none) |
| input_hint | No | Placeholder or label text near the input field | (tries common placeholders) |
| submit | No | Whether to press Enter after typing | False |
| window | No | Window selector, e.g. "class:signal" | active window |
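As a concrete illustration, a client invoking this tool might send an argument payload shaped like the following. The field names mirror the schema above; the values themselves are hypothetical:

```python
# Hypothetical argument payload for the type_into tool.
# Only "text" is required; the rest fall back to their defaults.
arguments = {
    "text": "Hello from Hyprland!",
    "input_hint": "Type a message",  # placeholder text used to locate the field
    "submit": True,                  # press Enter after typing
    "window": "class:signal",        # focus this window before capturing
}

# Validate against the schema's required set before sending.
required = {"text"}
assert required <= arguments.keys()
print(sorted(arguments))
```

Omitting `input_hint` makes the tool fall back to its built-in list of common placeholders, so supplying it is only necessary when the target field has unusual placeholder text.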

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| result | Yes | Status string describing the match found and action taken | (none) |

Implementation Reference

  • Handler function for the 'type_into' MCP tool. It uses OCR to locate an input field, clicks it, types the provided text, and optionally submits. (`hyprctl` and `_auto_scope_capture` are module-level helpers referenced from the surrounding package.)

```python
async def type_into(
    text: str,
    input_hint: str | None = None,
    submit: bool = False,
    window: str | None = None,
) -> str:
    """Find a text input field, click it, type text, and optionally submit.

    Combines focus + OCR + click + type + Enter into one action. Searches for
    placeholder text in the active window to find the input field.

    Args:
        text: The text to type
        input_hint: Placeholder or label text near the input field to click on
                    (e.g. "Type a message", "Search", "Message"). If omitted,
                    tries common placeholders.
        submit: Whether to press Enter after typing (default False)
        window: Target a specific window (e.g. "class:signal"). Default: active window.
    """
    from . import ocr, input as inp
    import asyncio

    # Focus the window if specified
    if window:
        await hyprctl.dispatch("focuswindow", window)
        await asyncio.sleep(0.1)

    # Capture the relevant region; origin_x/origin_y are the capture's
    # offset in absolute screen coordinates.
    png_bytes, origin_x, origin_y = await _auto_scope_capture(
        None, window, None,
    )

    boxes = ocr.extract_boxes(png_bytes)

    # Try to find the input field by hint text or common placeholders
    hints = [input_hint] if input_hint else [
        "Type a message",
        "Message",
        "Search",
        "Type here",
        "Write a message",
        "Enter message",
        "Say something",
    ]

    match = None
    used_hint = None
    for hint in hints:
        matches = ocr.find_text(boxes, hint)
        if matches:
            match = matches[0]
            used_hint = hint
            break

    if not match:
        # Surface what OCR actually saw so the caller can pick a better hint.
        all_text = ocr.extract_text(png_bytes)
        preview = all_text[:500] + "..." if len(all_text) > 500 else all_text
        tried = ", ".join(f'"{h}"' for h in hints)
        return (
            f"Could not find input field. Tried: {tried}\n\n"
            f"OCR detected text:\n{preview}"
        )

    # Click the center of the matched text
    screen_x = match["x"] + origin_x + match["w"] // 2
    screen_y = match["y"] + origin_y + match["h"] // 2

    await inp.move_cursor(screen_x, screen_y)
    await inp.click("left")
    await asyncio.sleep(0.1)

    # Type the text
    await inp.type_text(text)

    result = f"Found '{used_hint}' at ({screen_x},{screen_y}), typed {len(text)} chars"

    # Submit if requested
    if submit:
        await asyncio.sleep(0.05)
        await hyprctl.dispatch("sendshortcut", ", Return, ")
        result += ", pressed Enter"

    return result
```
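The click-target arithmetic in the handler (offset the window-relative OCR box by the capture origin, then take the box centre) can be isolated as a small pure function and checked directly. The box and origin values below are made up for illustration:

```python
def box_center_on_screen(box: dict, origin_x: int, origin_y: int) -> tuple[int, int]:
    """Translate a window-relative OCR bounding box into the absolute
    screen coordinate at its centre, mirroring the math in type_into."""
    return (
        box["x"] + origin_x + box["w"] // 2,
        box["y"] + origin_y + box["h"] // 2,
    )

# Example: a 120x24 box at (40, 500) inside a capture whose top-left
# corner sits at screen position (960, 0).
print(box_center_on_screen({"x": 40, "y": 500, "w": 120, "h": 24}, 960, 0))
# → (1060, 512)
```

Integer floor division keeps the result on the pixel grid; for odd widths or heights the centre lands half a pixel left or up, which is harmless for a click target.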
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and does well by disclosing key behavioral traits: it performs OCR to find fields, searches the active window by default, tries common placeholders if input_hint is omitted, and includes a submit option. It doesn't mention error handling, performance characteristics, or platform dependencies, but covers core functionality adequately.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: the first sentence states the core purpose, followed by key behavioral context, then a well-structured parameter section. Every sentence adds value with zero waste, making it easy to scan and understand.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (4 parameters, no annotations, but has output schema), the description is mostly complete. It explains what the tool does, when to use it, and all parameters. Since an output schema exists, it doesn't need to explain return values. Minor gaps include lack of error cases or performance notes, but it's sufficient for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must fully compensate. It successfully adds meaning for all 4 parameters: explains 'text' is the text to type, 'input_hint' is placeholder/label text for finding the field with examples, 'submit' controls Enter press with default, and 'window' targets specific windows with an example. This goes well beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs (find, click, type, submit) and resources (text input field). It distinguishes from siblings like 'type_text' by explaining it combines multiple actions (focus + OCR + click + type + Enter) and searches for placeholder text, making the scope and differentiation explicit.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool (to type text into an input field with optional submission) and implies alternatives by mentioning it 'combines' multiple actions, suggesting it could replace separate tools like 'click_text' + 'type_text'. However, it lacks explicit when-not-to-use guidance or named alternatives from the sibling list.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/alderban107/hyprland-mcp'
```

If you have feedback or need assistance with the MCP directory API, please join our Discord server.