
scrape_with_interaction

Extract web content after performing user interactions like clicks and scrolls to capture dynamically loaded data for development workflows.

Instructions

Scrape content after user interactions (click, scroll, etc.)

Input Schema

Name          Required  Description                      Default
url           Yes       URL to scrape                    (none)
interactions  Yes       List of interactions to perform  (none)
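A hypothetical call payload for this tool might look like the following. The URL and the `#load-more` selector are illustrative assumptions, not values from the server itself; only the `type` values (`click`, `scroll`, `wait`) and the field names come from the input schema above.

```typescript
// Illustrative arguments for scrape_with_interaction.
// The target URL and CSS selector are placeholders.
const args = {
  url: "https://example.com/products",
  interactions: [
    // Click a (hypothetical) "load more" button, then give it 2s to load
    { type: "click", selector: "#load-more", timeout: 2000 },
    // Scroll to the bottom to trigger lazy-loaded content
    { type: "scroll", timeout: 1000 },
    // Final settle time before extraction
    { type: "wait", timeout: 500 },
  ],
};
```

Each interaction is executed in order, so the sequence above loads extra items before the page content is extracted.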

Implementation Reference

  • Tool registration in the webScrapingTools array, defining the 'scrape_with_interaction' tool's name, description, and input schema for the URL and interactions array:

    {
      name: 'scrape_with_interaction',
      description: 'Scrape content after user interactions (click, scroll, etc.)',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'URL to scrape',
          },
          interactions: {
            type: 'array',
            items: {
              type: 'object',
              properties: {
                type: {
                  type: 'string',
                  enum: ['click', 'scroll', 'wait'],
                },
                selector: {
                  type: 'string',
                  description: 'CSS selector for click action',
                },
                timeout: {
                  type: 'number',
                  description: 'Timeout in milliseconds',
                },
              },
            },
            description: 'List of interactions to perform',
          },
        },
        required: ['url', 'interactions'],
      },
    },
  • Handler dispatch in the handleWebScrapingTool function, which extracts the interactions array from params, calls DynamicScraper.scrapeWithInteraction, and formats the result:

    case 'scrape_with_interaction': {
      const interactions = params.interactions as Array<{
        type: 'click' | 'scroll' | 'wait';
        selector?: string;
        timeout?: number;
      }>;
      const data = await dynamicScraper.scrapeWithInteraction(config, interactions);
      return Formatters.formatScrapedData(data);
    }
  • Core implementation in the DynamicScraper class: it launches a headless Chromium browser via Playwright, navigates to the URL, executes the sequence of interactions (click on a selector, scroll to the bottom of the page, or wait), then extracts the page title, whitespace-normalized text, and full HTML:

    async scrapeWithInteraction(
      config: ScrapingConfig,
      interactions: Array<{ type: 'click' | 'scroll' | 'wait'; selector?: string; timeout?: number }>
    ): Promise<ScrapedData> {
      const browser = await this.getBrowser();
      const page = await browser.newPage();
      try {
        if (config.headers) {
          await page.setExtraHTTPHeaders(config.headers);
        }
        await page.goto(config.url, {
          waitUntil: 'networkidle',
          timeout: config.timeout || 30000,
        });

        // Perform interactions
        for (const interaction of interactions) {
          switch (interaction.type) {
            case 'click':
              if (interaction.selector) {
                await page.click(interaction.selector);
                await page.waitForTimeout(interaction.timeout || 1000);
              }
              break;
            case 'scroll':
              await page.evaluate(() => {
                if (typeof window !== 'undefined' && document.body) {
                  window.scrollTo(0, document.body.scrollHeight);
                }
              });
              await page.waitForTimeout(interaction.timeout || 1000);
              break;
            case 'wait':
              await page.waitForTimeout(interaction.timeout || 1000);
              break;
          }
        }

        // Extract content after interactions
        const title = await page.title();
        const text = await page.evaluate(() => {
          return document.body.innerText.replace(/\s+/g, ' ').trim();
        });
        const html = await page.content();

        return {
          url: config.url,
          title,
          text,
          html,
          scrapedAt: new Date(),
        };
      } finally {
        await page.close();
      }
    }
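Note that the handler casts params.interactions without checking it, so a malformed interaction type is silently skipped by the switch statement. A minimal validation helper, sketched below, could reject bad input earlier; this function is illustrative and not part of the server's source, but it enforces exactly the constraints declared in the input schema (the click/scroll/wait enum, plus the practical requirement that a click carries a selector).

```typescript
// Illustrative helper (not in the server source): validates a raw
// interactions value against the enum from the tool's input schema.
type Interaction = {
  type: 'click' | 'scroll' | 'wait';
  selector?: string;
  timeout?: number;
};

function validateInteractions(raw: unknown): Interaction[] {
  if (!Array.isArray(raw)) {
    throw new Error('interactions must be an array');
  }
  return raw.map((item, i) => {
    // Reject types outside the schema's enum instead of silently skipping them
    if (!item || !['click', 'scroll', 'wait'].includes(item.type)) {
      throw new Error(`interactions[${i}].type must be click, scroll, or wait`);
    }
    // A click without a selector would be a no-op in the scraper's switch
    if (item.type === 'click' && typeof item.selector !== 'string') {
      throw new Error(`interactions[${i}] of type click requires a selector`);
    }
    return item as Interaction;
  });
}
```

Running this before the cast in the dispatch case would turn a silently ignored interaction into an explicit error returned to the caller.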

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/code-alchemist01/development-tools-mcp-Server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.