Server for MCP Protocol with On-Screen UI Interaction

  • Why this server?

    Provides screenshot and OCR capabilities for macOS, allowing the AI to 'see' the screen.

    Security: A, License: A, Quality: A
    Provides screenshot and OCR capabilities for macOS.
    1
    35
    10
    JavaScript
    MIT License
    • Apple
  • Why this server?

    Automates interactions with SAP GUI, allowing precise control through actions like clicking, typing, and scrolling in the SAP interface.

    Security: -, License: A, Quality: -
    Automates interactions with SAP GUI using the Model Context Protocol, allowing precise control of SAP transactions through tools like clicking, typing, scrolling, and transaction management.
    9
    Python
    MIT License
  • Why this server?

    Enables control over Windows system operations, including mouse and keyboard control, window management, and screen capture.

    Security: -, License: A, Quality: -
    A Windows control server built using nut.js and Model Context Protocol (MCP), providing programmatic control over Windows system operations including mouse, keyboard, window management, and screen capture functionality.
    189
    43
    TypeScript
    MIT License
  • Why this server?

    Enables the AI to control web browsers and interact with web pages.

    Security: A, License: F, Quality: A
    A server that enables browser automation using Playwright, allowing LLMs to interact with web pages, capture screenshots, and execute JavaScript in a browser environment.
    12
    15,918
    1
    TypeScript
  • Why this server?

    Facilitates the capture of screenshots of web pages and local HTML files.

    Security: -, License: F, Quality: -
    Captures screenshots of web pages and local HTML files through a simple MCP tool interface built on Puppeteer, with configurable options for dimensions and output paths.
    1
    0
    4
    JavaScript
  • Why this server?

    Enables browser automation through Selenium, allowing interaction with web elements and performing user actions.

    Security: -, License: A, Quality: -
    Enables browser automation using the Selenium WebDriver through MCP, supporting browser management, element location, and both basic and advanced user interactions.
    175
    21
    JavaScript
    MIT License
  • Why this server?

    Lets users send live webcam images to MCP clients, with support for capturing images and screenshots and a webcam view for visual input.

    Security: A, License: A, Quality: A
    Lets users send live webcam images to Claude Desktop or other MCP clients, with support for capturing images and screenshots and a webcam view for visual input.
    2
    143
    15
    TypeScript
    MIT License
    • Apple
  • Why this server?

    An AI-powered development toolkit for Cursor providing intelligent coding assistance through advanced reasoning, UI screenshot analysis, and code review tools.

    Security: -, License: A, Quality: -
    1,679
    240
    TypeScript
    MIT License
  • Why this server?

    Automates operation of the on-screen GUI.

  • Why this server?

    Enables AI agents to control web browsers using natural language, featuring automated browsing, form filling, vision-based element detection, and structured JSON responses for systematic browser control.

    Security: A, License: F, Quality: A
    Enables AI agents to interact with web browsers using natural language, featuring automated browsing, form filling, vision-based element detection, and structured JSON responses for systematic browser control.
    1
    23
    Python
    • Linux
    • Apple