get_content
Extract HTML or text content from web pages or specific elements to enable automated data collection and content analysis.
Instructions
Get the HTML or text content of the page or a specific element
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| selector | No | Element selector (returns full page content if not specified) | |
| type | No | Content type to return (`html` or `text`) | text |
| tabId | No | Tab ID to operate on (uses active tab if not specified) | |
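
As a rough illustration of how these parameters combine, the sketch below models the arguments as a TypeScript type and applies the documented default (`type` falls back to `text`). The `GetContentArgs` type and the `withDefaults` helper are hypothetical names introduced only for this example; they are not part of the tool's source, and the shape of `tabId` is left open because its schema is not shown here.

```typescript
// Hypothetical type mirroring the parameter table; not taken from the tool's source.
type ContentType = 'html' | 'text';

interface GetContentArgs {
  selector?: string;  // omit to request the full page
  type?: ContentType; // falls back to 'text' when omitted
  tabId?: unknown;    // shape is defined by tabIdSchema, which is not shown here
}

// Apply the documented default for `type`, leaving the other fields untouched.
function withDefaults(args: GetContentArgs): GetContentArgs & { type: ContentType } {
  return { ...args, type: args.type ?? 'text' };
}

// Full page as text: no selector, default type.
const fullPageText = withDefaults({});

// Inner HTML of a single element.
const articleHtml = withDefaults({ selector: 'article', type: 'html' });
```

With no arguments at all, the call targets the full page of the active tab and returns text.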
Implementation Reference
- src/tools/content.ts:53-91 (handler) The main handler function for the 'get_content' tool. It retrieves HTML or text content from the entire page or a specific element using Puppeteer page methods.

      async ({ selector, type, tabId }) => {
        const pageResult = await getPageForOperation(tabId);
        if (!pageResult.success) {
          return handleResult(pageResult);
        }
        const page = pageResult.data;
        const contentType = type ?? 'text';
        try {
          if (selector) {
            // Get content of specific element
            const element = await page.$(selector);
            if (!element) {
              return handleResult(err(selectorNotFound(selector)));
            }
            const content = await element.evaluate((el, t) => {
              return t === 'html' ? el.innerHTML : el.textContent ?? '';
            }, contentType);
            return handleResult(ok({ content, selector }));
          } else {
            // Get full page content
            let content: string;
            if (contentType === 'html') {
              content = await page.content();
            } else {
              content = await page.evaluate(() => document.body.innerText);
            }
            return handleResult(ok({ content }));
          }
        } catch (error) {
          return handleResult(err(normalizeError(error)));
        }
      }

- src/schemas.ts:116-120 (schema) Zod schema defining the input parameters for the 'get_content' tool: selector (optional), type (html/text), tabId (optional).

      export const getContentSchema = z.object({
        selector: z.string().optional().describe('Element selector (returns full page content if not specified)'),
        type: z.enum(['html', 'text']).optional().default('text').describe('Content type to return'),
        tabId: tabIdSchema,
      });

- src/tools/content.ts:49-92 (registration) Registration of the 'get_content' tool on the MCP server using server.tool(), including name, description, input schema, and handler.

      server.tool(
        'get_content',
        'Get the HTML or text content of the page or a specific element',
        getContentSchema.shape,
        async ({ selector, type, tabId }) => {
          const pageResult = await getPageForOperation(tabId);
          if (!pageResult.success) {
            return handleResult(pageResult);
          }
          const page = pageResult.data;
          const contentType = type ?? 'text';
          try {
            if (selector) {
              // Get content of specific element
              const element = await page.$(selector);
              if (!element) {
                return handleResult(err(selectorNotFound(selector)));
              }
              const content = await element.evaluate((el, t) => {
                return t === 'html' ? el.innerHTML : el.textContent ?? '';
              }, contentType);
              return handleResult(ok({ content, selector }));
            } else {
              // Get full page content
              let content: string;
              if (contentType === 'html') {
                content = await page.content();
              } else {
                content = await page.evaluate(() => document.body.innerText);
              }
              return handleResult(ok({ content }));
            }
          } catch (error) {
            return handleResult(err(normalizeError(error)));
          }
        }
      );

- src/server.ts:25-25 (registration) Top-level call to registerContentTools, which includes the get_content tool registration.

      registerContentTools(server);
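
The handler's branching can be easier to follow without a browser attached. The sketch below is a simplified model, not the tool's implementation: `PageLike`, `ElementLike`, `Result`, and `getContent` are hypothetical stand-ins for the Puppeteer page, element handle, and result helpers referenced above, and the error message text is an assumption.

```typescript
// Hypothetical stand-ins for the Puppeteer objects the handler touches.
interface ElementLike {
  innerHTML: string;
  textContent: string | null;
}

interface PageLike {
  find(selector: string): ElementLike | null; // stands in for page.$()
  html: string;                               // stands in for page.content()
  bodyText: string;                           // stands in for document.body.innerText
}

// Assumed result shape; the real ok()/err() helpers are not shown in this reference.
type Result =
  | { success: true; data: { content: string; selector?: string } }
  | { success: false; error: string };

function getContent(
  page: PageLike,
  type: 'html' | 'text' = 'text',
  selector?: string,
): Result {
  if (selector) {
    // Element branch: a missing element is an error, not empty content.
    const el = page.find(selector);
    if (!el) {
      return { success: false, error: `Selector not found: ${selector}` };
    }
    const content = type === 'html' ? el.innerHTML : el.textContent ?? '';
    return { success: true, data: { content, selector } };
  }
  // Full-page branch: serialized document for 'html', rendered body text otherwise.
  return { success: true, data: { content: type === 'html' ? page.html : page.bodyText } };
}

// A tiny fake page containing a single heading.
const page: PageLike = {
  find: (s) => (s === 'h1' ? { innerHTML: '<em>Hi</em>', textContent: 'Hi' } : null),
  html: '<html><body><h1><em>Hi</em></h1></body></html>',
  bodyText: 'Hi',
};

const fullText = getContent(page);
const headingHtml = getContent(page, 'html', 'h1');
const missing = getContent(page, 'text', '#missing');
```

One detail carried over from the handler: in text mode the element branch reads `textContent`, while the full-page branch reads `innerText`, so the two can differ for hidden elements and whitespace.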