scrape_by_selector

Extract specific webpage content using CSS selectors, for either static pages or dynamically rendered elements, to support development workflows.

Instructions

Scrape content using CSS selector

Input Schema

Name        Required  Description                      Default
url         Yes       URL to scrape                    -
selector    Yes       CSS selector                     -
useBrowser  No        Use browser for dynamic content  false
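
An invocation supplying these parameters could look like the following JSON-RPC `tools/call` request (the envelope follows the MCP convention; the URL and selector values are purely illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "scrape_by_selector",
    "arguments": {
      "url": "https://example.com",
      "selector": "h2.article-title",
      "useBrowser": false
    }
  }
}
```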

Implementation Reference

  • Core handler function that fetches the HTML with axios, parses it with cheerio, and extracts trimmed text from every element matching the CSS selector:

    async scrapeBySelector(config: ScrapingConfig, selector: string): Promise<string[]> {
      if (!Validators.isValidSelector(selector)) {
        throw new Error('Invalid CSS selector');
      }
      const validation = Validators.validateScrapingConfig(config);
      if (!validation.valid) {
        throw new Error(`Invalid scraping config: ${validation.errors.join(', ')}`);
      }
      try {
        const response = await axios.get(config.url, {
          headers: config.headers || {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
          },
          timeout: config.timeout || 30000,
        });
        const $ = cheerio.load(response.data);
        const results: string[] = [];
        $(selector).each((_, element) => {
          const text = $(element).text().trim();
          if (text) {
            results.push(text);
          }
        });
        return results;
      } catch (error) {
        throw new Error(`Failed to scrape: ${error instanceof Error ? error.message : String(error)}`);
      }
    }
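
The collection step inside the `each` loop can be isolated as a small pure function. The helper below is hypothetical (not part of the server); it mirrors the loop's behavior on the raw text of each matched element:

```typescript
// Hypothetical helper mirroring the $(selector).each(...) loop above:
// trim each matched element's text and drop entries that end up empty.
function collectTexts(rawTexts: string[]): string[] {
  const results: string[] = [];
  for (const raw of rawTexts) {
    const text = raw.trim();
    if (text) {
      results.push(text);
    }
  }
  return results;
}
```

This is why whitespace-only elements never appear in the returned array: the trim happens before the truthiness check.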
  • Registers the 'scrape_by_selector' tool in the webScrapingTools array with its name, description, and input schema:

    {
      name: 'scrape_by_selector',
      description: 'Scrape content using CSS selector',
      inputSchema: {
        type: 'object',
        properties: {
          url: { type: 'string', description: 'URL to scrape' },
          selector: { type: 'string', description: 'CSS selector' },
          useBrowser: {
            type: 'boolean',
            description: 'Use browser for dynamic content',
            default: false,
          },
        },
        required: ['url', 'selector'],
      },
    },
  • Tool handler case in handleWebScrapingTool that dispatches to staticScraper.scrapeBySelector for static scraping, or returns a stub in browser mode:

    case 'scrape_by_selector': {
      const selector = params.selector as string;
      if (config.useBrowser) {
        // For browser, we'd need to use page.evaluate
        await dynamicScraper.scrapeDynamicContent(config);
        // Simplified - would extract by selector in real implementation
        return { message: 'Selector extraction with browser requires page.evaluate', selector };
      } else {
        return await staticScraper.scrapeBySelector(config, selector);
      }
    }
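
The browser branch above is a stub. One way the selector extraction could be completed is with Puppeteer's `page.$$eval`; the sketch below is an assumption, not part of the server (the `scrapeSelectorWithBrowser` name and the minimal `PageLike` shape are invented for illustration):

```typescript
// Minimal structural type standing in for a Puppeteer Page;
// only the $$eval method is assumed here.
interface PageLike {
  $$eval<T>(
    selector: string,
    fn: (els: { textContent: string | null }[]) => T
  ): Promise<T>;
}

// Sketch: collect trimmed, non-empty text from every element matching
// the selector, evaluated in the page context via $$eval.
async function scrapeSelectorWithBrowser(
  page: PageLike,
  selector: string
): Promise<string[]> {
  return page.$$eval(selector, (elements) =>
    elements
      .map((el) => (el.textContent ?? "").trim())
      .filter((text) => text.length > 0)
  );
}
```

This keeps the browser path's return shape consistent with the static path's string array.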

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/code-alchemist01/development-tools-mcp-Server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.