Skip to main content
Glama

interact_page

Execute sequential browser actions on recorded pages to simulate user interactions like clicking, typing, scrolling, and navigating, with all actions captured in video output.

Instructions

Perform actions on a recording page (scroll, click, hover, type, press, select, wait, navigate). Actions are performed sequentially and recorded in the video.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
sessionIdYesSession ID from record_page
actionsYesArray of actions to perform sequentially

Implementation Reference

  • Implementation of the interactWithPage function which performs actions on the browser context.
    export async function interactWithPage(sessionId, actions) {
      const session = sessions.get(sessionId);
      if (!session) throw new Error(`Session ${sessionId} not found. Active sessions: ${[...sessions.keys()].join(', ') || 'none'}`);
    
      const { page } = session;
      const results = [];
    
      for (const action of actions) {
        switch (action.type) {
          case 'wait':
            await new Promise(r => setTimeout(r, (action.ms || 1000)));
            results.push(`Waited ${action.ms || 1000}ms`);
            break;
          case 'scroll':
            await page.evaluate(({ x, y }) => window.scrollBy(x, y), { x: action.x || 0, y: action.y || 300 });
            results.push(`Scrolled by (${action.x || 0}, ${action.y || 300})`);
            break;
          case 'click':
            await page.click(action.selector, { timeout: 5000 });
            results.push(`Clicked ${action.selector}`);
            break;
          case 'hover':
            if (action.x !== undefined && action.y !== undefined) {
              await page.mouse.move(action.x, action.y, { steps: 10 });
              results.push(`Hovered at (${action.x}, ${action.y})`);
            } else {
              await page.hover(action.selector, { timeout: 5000 });
              results.push(`Hovered ${action.selector}`);
            }
            break;
          case 'type':
            if (action.selector) {
              await page.click(action.selector, { timeout: 5000 });
            }
            await page.keyboard.type(action.text || '', { delay: action.delay || 80 });
            results.push(`Typed "${action.text}" ${action.selector ? 'in ' + action.selector : ''}`);
            break;
          case 'press':
            await page.keyboard.press(action.key || 'Enter');
            results.push(`Pressed ${action.key || 'Enter'}`);
            break;
          case 'select':
            await page.selectOption(action.selector, action.value, { timeout: 5000 });
            results.push(`Selected "${action.value}" in ${action.selector}`);
            break;
          case 'navigate':
            await page.goto(action.url, { waitUntil: 'load', timeout: 30_000 });
            results.push(`Navigated to ${action.url}`);
            break;
          default:
            results.push(`Unknown action: ${action.type}`);
        }
      }
    
      return results;
    }
  • src/index.js:43-74 (registration)
    Registration of the interact_page MCP tool in index.js.
    mcp.tool(
      'interact_page',
      'Perform actions on a recording page (scroll, click, hover, type, press, select, wait, navigate). Actions are performed sequentially and recorded in the video.',
      {
        sessionId: z.string().describe('Session ID from record_page'),
        actions: z.array(z.object({
          type: z.enum(['wait', 'scroll', 'click', 'hover', 'type', 'press', 'select', 'navigate']).describe('Action type'),
          ms: z.number().optional().describe('Wait duration in ms (for wait action)'),
          x: z.number().optional().describe('Scroll X pixels (for scroll) or hover X coordinate (for hover)'),
          y: z.number().optional().describe('Scroll Y pixels (for scroll) or hover Y coordinate (for hover)'),
          selector: z.string().optional().describe('CSS selector (for click/hover/type/select)'),
          url: z.string().optional().describe('URL (for navigate)'),
          text: z.string().optional().describe('Text to type (for type action)'),
          delay: z.number().optional().describe('Typing delay between characters in ms (for type action, default 80)'),
          key: z.string().optional().describe('Key to press, e.g. "Enter", "Tab", "Escape", "Control+a" (for press action)'),
          value: z.string().optional().describe('Option value to select (for select action on <select> elements)')
        })).describe('Array of actions to perform sequentially')
      },
      async ({ sessionId, actions }) => {
        try {
          const results = await interactWithPage(sessionId, actions);
          return {
            content: [{
              type: 'text',
              text: `Actions completed:\n${results.map((r, i) => `${i + 1}. ${r}`).join('\n')}`
            }]
          };
        } catch (err) {
          return { content: [{ type: 'text', text: `Error: ${err.message}` }], isError: true };
        }
      }
    );

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mcpware/pagecast'

If you have feedback or need assistance with the MCP directory API, please join our Discord server