Glama
eva-wanxin-git

Windows Automation MCP Server

move_mouse

Move the mouse cursor to specific screen coordinates (X and Y positions) to automate pointer positioning in Windows applications.

Instructions

Move the mouse to the specified position.

Input Schema

Name     Required   Description                           Default
x        Yes        X coordinate
y        Yes        Y coordinate
smooth   No         Whether to move smoothly (optional)

Implementation Reference

  • The primary handler function that executes the mouse movement logic using robotjs, supporting smooth or direct movement.
    moveMouse(x, y, smooth = false) {
      try {
        if (smooth) {
          this.robot.moveMouseSmooth(x, y);
        } else {
          this.robot.moveMouse(x, y);
        }
        // message: 'Mouse moved'
        return { success: true, position: { x, y }, message: '鼠标已移动' };
      } catch (error) {
        return { success: false, error: error.message };
      }
    }
  • Tool definition including name, description, and input schema for validating arguments x, y, and optional smooth.
    {
      name: 'move_mouse',
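      // description: 'Move the mouse to the specified position'
      description: '移动鼠标到指定位置',
      inputSchema: {
        type: 'object',
        properties: {
          x: { type: 'number', description: 'X 坐标' }, // X coordinate
          y: { type: 'number', description: 'Y 坐标' }, // Y coordinate
          // smooth: whether to move smoothly (optional)
          smooth: { type: 'boolean', description: '是否平滑移动(可选)' },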
        },
        required: ['x', 'y'],
      },
    },
  • Registers 'move_mouse' as a supported tool by including it in the canHandle check list.
    canHandle(toolName) {
      const tools = ['move_mouse', 'mouse_click', 'type_text', 'press_key', 
                     'get_mouse_position', 'get_screen_size'];
      return tools.includes(toolName);
    }
  • Dispatch in executeTool method that routes 'move_mouse' calls to the moveMouse handler.
    case 'move_mouse':
      return this.moveMouse(args.x, args.y, args.smooth);
  • Primary registration point where the tool definitions are provided for the MCP protocol.
    getToolDefinitions() {
      return [
        {
          name: 'move_mouse',
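          // description: 'Move the mouse to the specified position'
          description: '移动鼠标到指定位置',
          inputSchema: {
            type: 'object',
            properties: {
              x: { type: 'number', description: 'X 坐标' }, // X coordinate
              y: { type: 'number', description: 'Y 坐标' }, // Y coordinate
              // smooth: whether to move smoothly (optional)
              smooth: { type: 'boolean', description: '是否平滑移动(可选)' },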
            },
            required: ['x', 'y'],
          },
        },
        {
          name: 'mouse_click',
          // description: 'Mouse click'
          description: '鼠标点击',
          inputSchema: {
            type: 'object',
            properties: {
              // button: button type
              button: { type: 'string', enum: ['left', 'right', 'middle'], description: '按钮类型' },
              // double: whether to double-click (optional)
              double: { type: 'boolean', description: '是否双击(可选)' },
            },
          },
        },
        {
          name: 'type_text',
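          // description: 'Type text'
          description: '输入文本',
          inputSchema: {
            type: 'object',
            properties: {
              // text: the text to type
              text: { type: 'string', description: '要输入的文本' },
              // delay: delay between characters (milliseconds, optional)
              delay: { type: 'number', description: '每个字符之间的延迟(毫秒,可选)' },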
            },
            required: ['text'],
          },
        },
        {
          name: 'press_key',
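          // description: 'Press a keyboard key'
          description: '按下键盘按键',
          inputSchema: {
            type: 'object',
            properties: {
              // key: key name (e.g. enter, tab, escape)
              key: { type: 'string', description: '按键名称(如 enter, tab, escape)' },
              // modifiers: modifier keys (e.g. control, shift, alt)
              modifiers: { type: 'array', items: { type: 'string' }, description: '修饰键(如 control, shift, alt)' },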
            },
            required: ['key'],
          },
        },
        {
          name: 'get_mouse_position',
          // description: 'Get the current mouse position'
          description: '获取当前鼠标位置',
          inputSchema: {
            type: 'object',
            properties: {},
          },
        },
        {
          name: 'get_screen_size',
          // description: 'Get the screen size'
          description: '获取屏幕尺寸',
          inputSchema: {
            type: 'object',
            properties: {},
          },
        },
      ];
    }
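
The pieces listed above can be wired together roughly as follows. This is a minimal sketch, not the server's actual source: the class name `MouseToolsSketch` and the `fakeRobot` stub are hypothetical, standing in for the real class and for the robotjs module so that the example runs without native dependencies. The success message is written in English here for readability.

```javascript
// Hypothetical stand-in for robotjs: records calls instead of moving the pointer.
const fakeRobot = {
  calls: [],
  moveMouse(x, y) { this.calls.push(['moveMouse', x, y]); },
  moveMouseSmooth(x, y) { this.calls.push(['moveMouseSmooth', x, y]); },
};

// Hypothetical class name; only the parts relevant to move_mouse are shown.
class MouseToolsSketch {
  constructor(robot) {
    this.robot = robot;
  }

  // Reports whether this handler knows the tool.
  canHandle(toolName) {
    return ['move_mouse'].includes(toolName);
  }

  // Routes a tool call to its handler method.
  executeTool(toolName, args) {
    switch (toolName) {
      case 'move_mouse':
        return this.moveMouse(args.x, args.y, args.smooth);
      default:
        return { success: false, error: `Unknown tool: ${toolName}` };
    }
  }

  // Same logic as the handler shown above; `smooth` falls back to false
  // when the caller omits it.
  moveMouse(x, y, smooth = false) {
    try {
      if (smooth) {
        this.robot.moveMouseSmooth(x, y);
      } else {
        this.robot.moveMouse(x, y);
      }
      return { success: true, position: { x, y }, message: 'Mouse moved' };
    } catch (error) {
      return { success: false, error: error.message };
    }
  }
}

const tools = new MouseToolsSketch(fakeRobot);
const result = tools.executeTool('move_mouse', { x: 100, y: 200, smooth: true });
console.log(result);
```

Passing the robot object into the constructor is a choice made for this sketch; it keeps the dispatch logic testable without touching the real pointer.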
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden but offers minimal behavioral context. It states the action ('move mouse') but doesn't disclose whether this requires elevated permissions, affects system state, has rate limits, or what happens if coordinates are out of bounds. For a tool that interacts with system input, this lack of transparency is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
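
One way a caller could guard against the out-of-bounds case mentioned above is to check the target point before calling the tool. The sketch below is an illustration only: `checkBounds` is a hypothetical helper, the `{ width, height }` shape is assumed for the screen size, and it assumes a single display whose origin is the top-left corner. It says nothing about how the server itself treats out-of-range coordinates.

```javascript
// Hypothetical helper: checks a target point against an assumed
// { width, height } screen size before a move_mouse call is made.
function checkBounds(x, y, screenSize) {
  if (!Number.isFinite(x) || !Number.isFinite(y)) {
    return { ok: false, reason: 'x and y must be finite numbers' };
  }
  if (x < 0 || y < 0 || x >= screenSize.width || y >= screenSize.height) {
    return {
      ok: false,
      reason: `(${x}, ${y}) is outside ${screenSize.width}x${screenSize.height}`,
    };
  }
  return { ok: true };
}

// Example values, chosen for illustration.
const primaryScreen = { width: 1920, height: 1080 };
console.log(checkBounds(100, 200, primaryScreen));
console.log(checkBounds(5000, 200, primaryScreen));
```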

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's function. It's front-loaded with the core action and contains no redundant or unnecessary information. Every word earns its place, making it highly concise and well-structured for its purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (system interaction with 3 parameters), lack of annotations, and no output schema, the description is incomplete. It doesn't explain what happens after the move (e.g., whether it returns success/failure, if the mouse stays at the position), nor does it cover error conditions or behavioral nuances. For a tool that could have side effects, this is inadequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
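
For reference, the handler in the Implementation Reference returns `{ success: true, position, message }` on success and `{ success: false, error }` on failure. The sketch below shows one way such a value could be mapped to an MCP tool result with `content` and `isError` fields. The adapter `toToolResult` is hypothetical; how this server actually serializes its results is not shown on this page.

```javascript
// Hypothetical adapter: wraps the handler's return value in the
// { content, isError } shape used for MCP tool results.
function toToolResult(handlerResult) {
  return {
    content: [{ type: 'text', text: JSON.stringify(handlerResult) }],
    isError: handlerResult.success !== true,
  };
}

const okResult = toToolResult({ success: true, position: { x: 10, y: 20 }, message: 'Mouse moved' });
const failedResult = toToolResult({ success: false, error: 'robot unavailable' });
console.log(okResult, failedResult);
```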

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear parameter descriptions in Chinese: 'X 坐标' (X coordinate), 'Y 坐标' (Y coordinate), and '是否平滑移动' (whether to move smoothly). The tool description adds no parameter semantics beyond what the schema provides. Since the schema adequately documents the parameters, the baseline score of 3 is appropriate: no extra value is added, but no gap exists either.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
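
The constraints the schema does express (`x` and `y` are required numbers, `smooth` is an optional boolean) can be checked by hand before a call. The following is a minimal sketch with a hypothetical function name, `validateMoveMouseArgs`; it mirrors the schema shown above and is not a substitute for a full JSON Schema validator.

```javascript
// Hypothetical validator mirroring the move_mouse input schema:
// x and y are required numbers, smooth is an optional boolean.
function validateMoveMouseArgs(args) {
  if (args === null || typeof args !== 'object') {
    return ['arguments must be an object'];
  }
  const errors = [];
  for (const key of ['x', 'y']) {
    if (!(key in args)) {
      errors.push(`${key} is required`);
    } else if (typeof args[key] !== 'number' || Number.isNaN(args[key])) {
      errors.push(`${key} must be a number`);
    }
  }
  if ('smooth' in args && typeof args.smooth !== 'boolean') {
    errors.push('smooth must be a boolean');
  }
  return errors;
}

console.log(validateMoveMouseArgs({ x: 100, y: 200 }));
console.log(validateMoveMouseArgs({ x: '100' }));
```

Returning a list of messages rather than throwing lets a caller report every problem in one pass.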

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '移动鼠标到指定位置' (Move mouse to specified position) clearly states the verb ('move') and resource ('mouse'), making the purpose immediately understandable. It distinguishes from siblings like 'mouse_click' (which clicks) and 'get_mouse_position' (which reads position). However, it doesn't specify coordinate system or screen reference, which could help differentiate from potential alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention when to prefer 'mouse_click' (for clicking) or 'get_mouse_position' (for reading), nor does it specify prerequisites like needing mouse control permissions or appropriate screen context. The agent must infer usage from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
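
As an illustration of how the sibling tools might be combined, the sketch below builds a sequence of MCP `tools/call` requests: read the screen size, move the pointer, then click. The helper `buildToolCall`, the request ids, and the coordinates are assumptions made for the example; the page itself does not prescribe this order.

```javascript
// Hypothetical helper: builds a JSON-RPC 2.0 request for the MCP tools/call method.
function buildToolCall(id, name, args = {}) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// Illustrative sequence; the coordinates are arbitrary example values.
const sequence = [
  buildToolCall(1, 'get_screen_size'),
  buildToolCall(2, 'move_mouse', { x: 960, y: 540, smooth: true }),
  buildToolCall(3, 'mouse_click', { button: 'left' }),
];

console.log(JSON.stringify(sequence, null, 2));
```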

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/eva-wanxin-git/windows-automation-mcp'
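
The same request can be made from JavaScript. This is a sketch under stated assumptions: it relies on a runtime with a global `fetch` (for example Node 18 or later), the helper names `serverUrl` and `fetchServerInfo` are hypothetical, and the response is assumed to be JSON, since its shape is not described here.

```javascript
const API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Hypothetical helper: builds the endpoint URL used in the curl command above.
function serverUrl(owner, name) {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Hypothetical helper: performs the GET request and parses the body,
// assuming the API responds with JSON.
async function fetchServerInfo(owner, name) {
  const response = await fetch(serverUrl(owner, name));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(serverUrl('eva-wanxin-git', 'windows-automation-mcp'));
```

To run the request itself, call `fetchServerInfo('eva-wanxin-git', 'windows-automation-mcp')` and handle the returned promise.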

If you have feedback or need assistance with the MCP directory API, please join our Discord server.