scrape_by_selector

Extract specific webpage content using CSS selectors for static or dynamic elements to support development workflows.

Instructions

Scrape content using CSS selector

Input Schema

Name	Required	Description
`url`	Yes	URL to scrape
`selector`	Yes	CSS selector
`useBrowser`	No	Use browser for dynamic content
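
For illustration, a call to this tool might carry arguments like the ones below. This is a sketch that assumes the standard MCP `tools/call` JSON-RPC request shape; the `ToolCallRequest` type name is hypothetical, and the URL and selector values are made-up placeholders.

```typescript
// Hypothetical type describing an MCP tools/call request for this tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: { url: string; selector: string; useBrowser?: boolean };
  };
}

// Example values are placeholders, not taken from the server's documentation.
const request: ToolCallRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "scrape_by_selector",
    arguments: {
      url: "https://example.com/docs",
      selector: "main h2",
      // useBrowser is optional and left out here.
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```

Only `url` and `selector` are required, so the smallest valid call omits `useBrowser` entirely.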

Implementation Reference

src/scrapers/static-scraper.ts:150-182 (handler)
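Core handler function. It validates the selector and the scraping config, fetches the HTML using axios, parses it with cheerio, and returns the trimmed, non-empty text of every element matching the CSS selector. Failures during fetching or parsing are rethrown with a "Failed to scrape:" prefix.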
async scrapeBySelector(config: ScrapingConfig, selector: string): Promise<string[]> {
  if (!Validators.isValidSelector(selector)) {
    throw new Error('Invalid CSS selector');
  }
  const validation = Validators.validateScrapingConfig(config);
  if (!validation.valid) {
    throw new Error(`Invalid scraping config: ${validation.errors.join(', ')}`);
  }
  try {
    const response = await axios.get(config.url, {
      headers: config.headers || {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
      },
      timeout: config.timeout || 30000,
    });
    const $ = cheerio.load(response.data);
    const results: string[] = [];
    $(selector).each((_, element) => {
      const text = $(element).text().trim();
      if (text) {
        results.push(text);
      }
    });
    return results;
  } catch (error) {
    throw new Error(`Failed to scrape: ${error instanceof Error ? error.message : String(error)}`);
  }
}
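
The shape of the handler's return value is easiest to see on sample data. The sketch below reproduces only the trim-and-skip-empty step on plain strings, which stand in for the text of each matched element; `collectText` is a hypothetical helper and not part of the project.

```typescript
// Hypothetical helper mirroring the loop inside scrapeBySelector:
// trim each match's text and keep only the non-empty results.
function collectText(rawTexts: string[]): string[] {
  const results: string[] = [];
  for (const raw of rawTexts) {
    const text = raw.trim();
    if (text) {
      results.push(text);
    }
  }
  return results;
}

// Stand-ins for $(element).text() on three matched elements.
const matched = ["  Install  ", "\n\t", "Usage"];

// The whitespace-only match is dropped, so two strings remain.
console.log(collectText(matched));
```

One consequence of this filtering is that the length of the returned array can be smaller than the number of elements the selector matched.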
src/tools/web-scraping.ts:112-134 (registration)
Registers the 'scrape_by_selector' tool in the webScrapingTools array, including name, description, and input schema.
{
  name: 'scrape_by_selector',
  description: 'Scrape content using CSS selector',
  inputSchema: {
    type: 'object',
    properties: {
      url: {
        type: 'string',
        description: 'URL to scrape',
      },
      selector: {
        type: 'string',
        description: 'CSS selector',
      },
      useBrowser: {
        type: 'boolean',
        description: 'Use browser for dynamic content',
        default: false,
      },
    },
    required: ['url', 'selector'],
  },
},
src/tools/web-scraping.ts:329-339 (handler)
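Tool handler case in handleWebScrapingTool. For non-browser scraping it dispatches to staticScraper.scrapeBySelector. In browser mode it loads the page through dynamicScraper.scrapeDynamicContent but does not extract by selector; it returns a placeholder message object instead.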
case 'scrape_by_selector': {
  const selector = params.selector as string;
  if (config.useBrowser) {
    // For browser, we'd need to use page.evaluate
    await dynamicScraper.scrapeDynamicContent(config);
    // Simplified - would extract by selector in real implementation
    return { message: 'Selector extraction with browser requires page.evaluate', selector };
  } else {
    return await staticScraper.scrapeBySelector(config, selector);
  }
}
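
Because the browser branch returns a message object rather than extracted text, a caller may see two different result shapes. The sketch below shows one way to tell them apart; the `SelectorResult` type and `describeResult` function are hypothetical names, and how the MCP layer serializes these values further is not shown in the source.

```typescript
// Hypothetical union of the two values the handler case can return.
type SelectorResult = string[] | { message: string; selector: string };

// Distinguish real extraction results from the browser-mode placeholder.
function describeResult(result: SelectorResult): string {
  if (Array.isArray(result)) {
    return `${result.length} match(es)`;
  }
  return `no extraction for "${result.selector}": ${result.message}`;
}

// Static mode: an array of extracted strings (sample values).
console.log(describeResult(["Install", "Usage"]));

// Browser mode: the placeholder object returned by the handler.
console.log(
  describeResult({
    message: "Selector extraction with browser requires page.evaluate",
    selector: "main h2",
  }),
);
```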
src/tools/web-scraping.ts:115-133 (schema)
JSON schema defining the input parameters for the scrape_by_selector tool.
inputSchema: {
  type: 'object',
  properties: {
    url: {
      type: 'string',
      description: 'URL to scrape',
    },
    selector: {
      type: 'string',
      description: 'CSS selector',
    },
    useBrowser: {
      type: 'boolean',
      description: 'Use browser for dynamic content',
      default: false,
    },
  },
  required: ['url', 'selector'],
},
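
As a rough illustration of what this schema implies for incoming arguments, the sketch below checks the two required string fields and applies the `false` default for `useBrowser`. The `ScrapeBySelectorArgs` type and `parseArgs` function are hypothetical; the server may rely on a schema validator or on its own `Validators` class instead.

```typescript
// Hypothetical normalized form of the tool's arguments.
interface ScrapeBySelectorArgs {
  url: string;
  selector: string;
  useBrowser: boolean;
}

// Hypothetical validator: enforces the required fields and types from
// the schema and fills in the default for useBrowser.
function parseArgs(input: Record<string, unknown>): ScrapeBySelectorArgs {
  for (const key of ["url", "selector"]) {
    if (typeof input[key] !== "string") {
      throw new Error(`Missing or non-string required field: ${key}`);
    }
  }
  if (input.useBrowser !== undefined && typeof input.useBrowser !== "boolean") {
    throw new Error("useBrowser must be a boolean");
  }
  return {
    url: input.url as string,
    selector: input.selector as string,
    useBrowser: (input.useBrowser as boolean | undefined) ?? false,
  };
}

const args = parseArgs({ url: "https://example.com", selector: ".title" });
console.log(args.useBrowser); // false
```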

Development Tools MCP Server
