get_content

Extract HTML or text content from web pages or specific elements to enable automated data collection and content analysis.

Instructions

Get the HTML or text content of the page or a specific element

Input Schema

Name | Required | Description | Default
selector | No | Element selector (returns full page content if not specified) |
type | No | Content type to return | text
tabId | No | Tab ID to operate on (uses active tab if not specified) |
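
As a rough illustration of the table above, the sketch below shows three argument objects a client might send and how the documented defaults would resolve. The `GetContentArgs` type and the `resolveArgs` helper are hypothetical names introduced here for illustration, and `tabId` is assumed to be a string, which the source does not confirm.

```typescript
// Hypothetical type mirroring the input table; it is not part of the server's code.
type ContentType = 'html' | 'text';

interface GetContentArgs {
  selector?: string;  // omitted: the full page is used
  type?: ContentType; // omitted: defaults to 'text'
  tabId?: string;     // omitted: the active tab is used (string type is an assumption)
}

// Hypothetical helper: describes what a call targets once the defaults apply.
function resolveArgs(args: GetContentArgs): { target: string; type: ContentType; tab: string } {
  return {
    target: args.selector ?? 'full page',
    type: args.type ?? 'text',
    tab: args.tabId ?? 'active tab',
  };
}

const examples: GetContentArgs[] = [
  {},                                 // text of the whole page, active tab
  { selector: 'main', type: 'html' }, // HTML of the element matching 'main'
  { selector: 'h1', tabId: 'tab-2' }, // text of the element matching 'h1' in a given tab
];

for (const args of examples) {
  console.log(resolveArgs(args));
}
```

All three parameters are optional, so an empty object is a valid call.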

Implementation Reference

  • The main handler function for the 'get_content' tool. It retrieves HTML or text content from the entire page or a specific element using Puppeteer page methods.
    async ({ selector, type, tabId }) => {
      const pageResult = await getPageForOperation(tabId);
      if (!pageResult.success) {
        return handleResult(pageResult);
      }
      const page = pageResult.data;
      const contentType = type ?? 'text';

      try {
        if (selector) {
          // Get content of specific element
          const element = await page.$(selector);
          if (!element) {
            return handleResult(err(selectorNotFound(selector)));
          }
          const content = await element.evaluate((el, t) => {
            return t === 'html' ? el.innerHTML : el.textContent ?? '';
          }, contentType);
          return handleResult(ok({ content, selector }));
        } else {
          // Get full page content
          let content: string;
          if (contentType === 'html') {
            content = await page.content();
          } else {
            content = await page.evaluate(() => document.body.innerText);
          }
          return handleResult(ok({ content }));
        }
      } catch (error) {
        return handleResult(err(normalizeError(error)));
      }
    }
  • Zod schema defining the input parameters for the 'get_content' tool: selector (optional), type ('html' or 'text', defaulting to 'text'), and tabId (optional).
    export const getContentSchema = z.object({
      selector: z
        .string()
        .optional()
        .describe('Element selector (returns full page content if not specified)'),
      type: z
        .enum(['html', 'text'])
        .optional()
        .default('text')
        .describe('Content type to return'),
      tabId: tabIdSchema,
    });
  • Registration of the 'get_content' tool on the MCP server using server.tool(), including name, description, input schema, and handler.
    server.tool(
      'get_content',
      'Get the HTML or text content of the page or a specific element',
      getContentSchema.shape,
      async ({ selector, type, tabId }) => {
        const pageResult = await getPageForOperation(tabId);
        if (!pageResult.success) {
          return handleResult(pageResult);
        }
        const page = pageResult.data;
        const contentType = type ?? 'text';

        try {
          if (selector) {
            // Get content of specific element
            const element = await page.$(selector);
            if (!element) {
              return handleResult(err(selectorNotFound(selector)));
            }
            const content = await element.evaluate((el, t) => {
              return t === 'html' ? el.innerHTML : el.textContent ?? '';
            }, contentType);
            return handleResult(ok({ content, selector }));
          } else {
            // Get full page content
            let content: string;
            if (contentType === 'html') {
              content = await page.content();
            } else {
              content = await page.evaluate(() => document.body.innerText);
            }
            return handleResult(ok({ content }));
          }
        } catch (error) {
          return handleResult(err(normalizeError(error)));
        }
      }
    );
  • src/server.ts:25 (registration)
    Top-level call to registerContentTools which includes the get_content tool registration.
    registerContentTools(server);
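
The handler's branching can be exercised without a browser. The sketch below is a simplified stand-in, not the project's code: `PageLike`, `ElementLike`, `Result`, and `getContent` are hypothetical names, the `Result` shape is only inferred from the `success` and `data` fields the handler reads (the `error` field is an assumption), and the stub page returns canned strings instead of calling Puppeteer.

```typescript
// Result shape inferred from the handler's use of `success` and `data`; `error` is assumed.
type Result<T> = { success: true; data: T } | { success: false; error: string };

function ok<T>(data: T): Result<T> {
  return { success: true, data };
}

function err(error: string): Result<never> {
  return { success: false, error };
}

type ContentType = 'html' | 'text';

// Minimal stand-ins for the parts of a Puppeteer page that the handler touches.
interface ElementLike {
  innerHTML: string;
  textContent: string | null;
}

interface PageLike {
  find(selector: string): Promise<ElementLike | null>; // plays the role of page.$
  html(): Promise<string>;                              // plays the role of page.content()
  bodyText(): Promise<string>;                          // plays the role of document.body.innerText
}

async function getContent(
  page: PageLike,
  selector?: string,
  type: ContentType = 'text',
): Promise<Result<{ content: string; selector?: string }>> {
  if (selector) {
    // Element branch: a missing element is reported as an error result.
    const el = await page.find(selector);
    if (!el) return err(`Selector not found: ${selector}`);
    const content = type === 'html' ? el.innerHTML : (el.textContent ?? '');
    return ok({ content, selector });
  }
  // Full-page branch: serialized document for 'html', body text otherwise.
  const content = type === 'html' ? await page.html() : await page.bodyText();
  return ok({ content });
}

// Stub page containing a single known element.
const stubPage: PageLike = {
  find: async (s) => (s === 'h1' ? { innerHTML: '<b>Hi</b>', textContent: 'Hi' } : null),
  html: async () => '<html><body><h1><b>Hi</b></h1></body></html>',
  bodyText: async () => 'Hi',
};

async function demo(): Promise<void> {
  console.log(await getContent(stubPage, 'h1', 'html')); // element HTML, selector echoed back
  console.log(await getContent(stubPage));               // full-page text, no selector field
  console.log(await getContent(stubPage, '.missing'));   // error result
}

void demo();
```

One detail worth noting from the original handler: the element branch reads `textContent`, while the full-page branch reads `innerText`. These can differ on real pages, since `innerText` reflects rendered text and `textContent` includes text from hidden nodes. Likewise, `page.content()` returns the whole serialized document, whereas the element branch returns only the element's `innerHTML`.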

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/andytango/puppeteer-mcp'