# scrape_with_interaction
Extract web content after performing user interactions such as clicks and scrolls, capturing dynamically loaded data for development workflows.
## Instructions
Scrape content after user interactions (click, scroll, etc.)
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to scrape | |
| interactions | Yes | List of interactions to perform | |
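As a sketch of how the two parameters fit together, a call to this tool might supply arguments like the following (the URL, selector, and timeout values are illustrative, not taken from the codebase):

```typescript
// Hypothetical arguments for a scrape_with_interaction call.
const args = {
  url: 'https://example.com/products',
  interactions: [
    { type: 'click', selector: '#load-more' }, // click a "load more" button
    { type: 'scroll', timeout: 2000 },         // scroll to the bottom, then wait 2s
    { type: 'wait', timeout: 1500 },           // pause for late-loading content
  ],
};
```

Interactions run in array order, so the sequence above clicks first, then scrolls, then waits.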
## Implementation Reference
- **src/tools/web-scraping.ts:162-195 (registration)**: Tool registration in the `webScrapingTools` array, defining the `scrape_with_interaction` tool with its name, description, and input schema for the URL and interactions array.

  ```typescript
  {
    name: 'scrape_with_interaction',
    description: 'Scrape content after user interactions (click, scroll, etc.)',
    inputSchema: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'URL to scrape',
        },
        interactions: {
          type: 'array',
          items: {
            type: 'object',
            properties: {
              type: {
                type: 'string',
                enum: ['click', 'scroll', 'wait'],
              },
              selector: {
                type: 'string',
                description: 'CSS selector for click action',
              },
              timeout: {
                type: 'number',
                description: 'Timeout in milliseconds',
              },
            },
          },
          description: 'List of interactions to perform',
        },
      },
      required: ['url', 'interactions'],
    },
  },
  ```
- **src/tools/web-scraping.ts:346-354 (handler)**: Handler dispatch in the `handleWebScrapingTool` function that extracts `interactions` from `params`, calls `DynamicScraper.scrapeWithInteraction`, and formats the result.

  ```typescript
  case 'scrape_with_interaction': {
    const interactions = params.interactions as Array<{
      type: 'click' | 'scroll' | 'wait';
      selector?: string;
      timeout?: number;
    }>;
    const data = await dynamicScraper.scrapeWithInteraction(config, interactions);
    return Formatters.formatScrapedData(data);
  }
  ```
- **src/scrapers/dynamic-scraper.ts:160-217 (handler)**: Core implementation in the `DynamicScraper` class: launches a headless Chromium browser with Playwright, navigates to the URL, executes the sequence of interactions (click on a selector, scroll to the bottom, or wait), then extracts the page title, cleaned text, and full HTML.

  ```typescript
  async scrapeWithInteraction(
    config: ScrapingConfig,
    interactions: Array<{ type: 'click' | 'scroll' | 'wait'; selector?: string; timeout?: number }>
  ): Promise<ScrapedData> {
    const browser = await this.getBrowser();
    const page = await browser.newPage();

    try {
      if (config.headers) {
        await page.setExtraHTTPHeaders(config.headers);
      }

      await page.goto(config.url, {
        waitUntil: 'networkidle',
        timeout: config.timeout || 30000,
      });

      // Perform interactions
      for (const interaction of interactions) {
        switch (interaction.type) {
          case 'click':
            if (interaction.selector) {
              await page.click(interaction.selector);
              await page.waitForTimeout(interaction.timeout || 1000);
            }
            break;
          case 'scroll':
            await page.evaluate(() => {
              if (typeof window !== 'undefined' && document.body) {
                window.scrollTo(0, document.body.scrollHeight);
              }
            });
            await page.waitForTimeout(interaction.timeout || 1000);
            break;
          case 'wait':
            await page.waitForTimeout(interaction.timeout || 1000);
            break;
        }
      }

      // Extract content after interactions
      const title = await page.title();
      const text = await page.evaluate(() => {
        return document.body.innerText.replace(/\s+/g, ' ').trim();
      });
      const html = await page.content();

      return {
        url: config.url,
        title,
        text,
        html,
        scrapedAt: new Date(),
      };
    } finally {
      await page.close();
    }
  }
  ```