code-alchemist01

Development Tools MCP Server

extract_tables

Extract structured table data from web pages for analysis, supporting both static content and dynamic JavaScript-rendered pages.

Instructions

Extract table data from a web page

Input Schema

Name        Required  Description                      Default
url         Yes       URL to scrape
useBrowser  No        Use browser for dynamic content  false
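
As a rough illustration of the input schema, the arguments could be typed and validated along the following lines. This is a sketch, not code from the server: the ExtractTablesArgs interface and the parseExtractTablesArgs helper are hypothetical names. The only facts taken from the schema are that url is a required string and that useBrowser is an optional boolean defaulting to false.

```typescript
// Hypothetical argument type mirroring the documented input schema.
interface ExtractTablesArgs {
  url: string;
  useBrowser: boolean;
}

// Hypothetical helper: validates raw tool arguments and applies the
// documented default (useBrowser = false).
function parseExtractTablesArgs(raw: unknown): ExtractTablesArgs {
  if (typeof raw !== 'object' || raw === null) {
    throw new TypeError('arguments must be an object');
  }
  const fields = raw as Record<string, unknown>;
  const url: unknown = fields['url'];
  const flag: unknown = fields['useBrowser'];

  if (typeof url !== 'string' || url.length === 0) {
    throw new TypeError('"url" is required and must be a non-empty string');
  }

  let useBrowser = false; // schema default
  if (flag !== undefined) {
    if (typeof flag !== 'boolean') {
      throw new TypeError('"useBrowser" must be a boolean when provided');
    }
    useBrowser = flag;
  }
  return { url, useBrowser };
}
```

Calling parseExtractTablesArgs({ url: 'https://example.com' }) yields useBrowser set to false, which selects the static scraping path.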

Implementation Reference

  • Registration of the 'extract_tables' tool including name, description, and input schema.
    {
      name: 'extract_tables',
      description: 'Extract table data from a web page',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'URL to scrape',
          },
          useBrowser: {
            type: 'boolean',
            description: 'Use browser for dynamic content',
            default: false,
          },
        },
        required: ['url'],
      },
    },
  • Dispatcher logic in handleWebScrapingTool for the 'extract_tables' tool, routing to static or dynamic scraper based on useBrowser flag.
    case 'extract_tables': {
      if (config.useBrowser) {
        const data = await dynamicScraper.scrapeDynamicContent(config);
        return data.tables;
      } else {
        return await staticScraper.extractTables(config);
      }
    }
  • The extractTables method in the StaticScraper class, which calls scrapeHTML and returns the parsed tables, falling back to an empty array when none are present.
    async extractTables(config: ScrapingConfig): Promise<TableData[]> {
      const data = await this.scrapeHTML(config);
      return data.tables || [];
    }
  • Core helper logic inside scrapeHTML method for parsing HTML tables using Cheerio, extracting captions, headers, and rows into TableData format.
    const tables: TableData[] = [];
    $('table').each((_, tableElement) => {
      const table: TableData = {
        headers: [],
        rows: [],
      };
    
      // Extract caption
      const caption = $(tableElement).find('caption').text().trim();
      if (caption) {
        table.caption = caption;
      }
    
      // Extract headers
      $(tableElement)
        .find('thead th, thead td, tr:first-child th, tr:first-child td')
        .each((_, header) => {
          table.headers.push($(header).text().trim());
        });
    
      // Extract rows
      $(tableElement)
        .find('tbody tr, tr')
        .each((_, row) => {
          const rowData: string[] = [];
          $(row)
            .find('td, th')
            .each((_, cell) => {
              rowData.push($(cell).text().trim());
            });
          if (rowData.length > 0) {
            table.rows.push(rowData);
          }
        });
    
      if (table.headers.length > 0 || table.rows.length > 0) {
        tables.push(table);
      }
    });
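
The static path falls back to an empty array when no tables are found, while the dynamic branch of the dispatcher returns data.tables as-is. Whether data.tables can be undefined is not visible in the excerpts, so the following is only a sketch under that assumption: it routes on useBrowser and normalizes both paths to an array. The TableSource interface and the extractTablesVia function are hypothetical stand-ins for the real scraper classes, and the TableData and ScrapingConfig shapes are inferred from the snippets above.

```typescript
// Shapes inferred from the snippets; the real types may carry more fields.
interface TableData {
  caption?: string;
  headers: string[];
  rows: string[][];
}

interface ScrapingConfig {
  url: string;
  useBrowser?: boolean;
}

// Hypothetical abstraction over the static and dynamic scrapers.
interface TableSource {
  getTables(config: ScrapingConfig): Promise<TableData[] | undefined>;
}

// Picks a source based on useBrowser and always resolves to an array.
async function extractTablesVia(
  config: ScrapingConfig,
  staticSource: TableSource,
  browserSource: TableSource,
): Promise<TableData[]> {
  const source = config.useBrowser ? browserSource : staticSource;
  const tables = await source.getTables(config);
  return tables ?? [];
}
```

With this shape, callers can iterate over the result without checking for undefined, whichever path was taken.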
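
For analysis it is often handier to have one object per row than parallel headers and rows arrays. The sketch below is not part of the server: tableToRecords is a hypothetical helper that assumes the TableData shape used in the parsing snippet. Because the row selector 'tbody tr, tr' matches every tr in the table, a header row can also show up as the first entry of rows. The helper drops a first row that is identical to headers, which is an assumption about what callers want rather than documented behavior.

```typescript
// Shape inferred from the parsing snippet.
interface TableData {
  caption?: string;
  headers: string[];
  rows: string[][];
}

// Hypothetical helper: converts a TableData into one record per data row.
function tableToRecords(table: TableData): Record<string, string>[] {
  const { headers, rows } = table;

  // True when a row repeats the header cells exactly.
  const isHeaderRow = (row: string[]): boolean =>
    headers.length > 0 &&
    row.length === headers.length &&
    row.every((cell, i) => cell === headers[i]);

  const first = rows[0];
  const dataRows = first !== undefined && isHeaderRow(first) ? rows.slice(1) : rows;

  return dataRows.map((row) => {
    const record: Record<string, string> = {};
    row.forEach((cell, i) => {
      // Fall back to a positional key when a header is missing or empty.
      const key = headers[i] || `column${i + 1}`;
      record[key] = cell;
    });
    return record;
  });
}

const records = tableToRecords({
  headers: ['Name', 'Age'],
  rows: [['Name', 'Age'], ['Ada', '36']],
});
// records is [{ Name: 'Ada', Age: '36' }]
```

Tables whose header cells are duplicated or empty would need a different keying strategy, since later columns overwrite earlier ones that share a key.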
