get_content

Extract rendered HTML content from webpages for web scraping and content analysis. Use this tool to retrieve fully loaded page content with options to wait for specific elements or conditions before extraction.

Instructions

Extract rendered HTML content from a webpage

Input Schema

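Name	Required	Description	Default
`url`	Yes	URL of the page to extract (string)	(none)
`waitForSelector`	No	Object with `selector` (string) and `timeout` (number)	(none)
`waitForFunction`	No	Object with `fn` (string) and `timeout` (number)	(none)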
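To make the schema concrete, the sketch below shows what argument objects for this tool might look like. The `GetContentArgs` interface and the `hasRequiredFields` helper are hypothetical names introduced for illustration and are not part of the server; the example URLs, selector, and timeout value are placeholders.

```typescript
// Hypothetical type mirroring the registered input schema; not exported by the server.
interface GetContentArgs {
  url: string;
  waitForSelector?: { selector?: string; timeout?: number };
  waitForFunction?: { fn?: string; timeout?: number };
}

// Smallest valid call: only `url` is required.
const minimalArgs: GetContentArgs = { url: "https://example.com" };

// Call that waits for an element before extraction (placeholder values).
const waitingArgs: GetContentArgs = {
  url: "https://example.com/products",
  waitForSelector: { selector: "#product-list", timeout: 5000 },
};

// Mirrors `required: ['url']` from the registered schema.
function hasRequiredFields(args: Partial<GetContentArgs>): boolean {
  return typeof args.url === "string";
}
```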

Implementation Reference

src/index.ts:345-368 (handler)
Primary MCP server handler for the 'get_content' tool. Checks that arguments are present, calls BrowserlessClient.getContent, and formats the response as three MCP text content blocks: a success message, the page title, and the extracted HTML. Throws an error if the request fails.
case 'get_content': {
  if (!args) throw new Error('Arguments are required');
  const result = await this.client!.getContent(args as any);
  if (result.success && result.data) {
    return {
      content: [
        {
          type: 'text',
          text: `Content extracted successfully from ${result.data.url}`,
        },
        {
          type: 'text',
          text: `Title: ${result.data.title}`,
        },
        {
          type: 'text',
          text: result.data.html,
        },
      ],
    };
  } else {
    throw new Error(result.error || 'Failed to get content');
  }
}
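The response-formatting step of this handler can be isolated as a pure function, which makes the shape of the returned content blocks easier to see. This is a minimal sketch: `toContentBlocks`, `TextBlock`, `ContentData`, and `Result` are hypothetical names, and the `url`, `title`, and `html` fields are inferred from how the handler reads `result.data`.

```typescript
// Hypothetical types; field names are inferred from the handler above.
interface TextBlock { type: "text"; text: string }
interface ContentData { url: string; title: string; html: string }
interface Result { success: boolean; data?: ContentData; error?: string }

// Turns a client result into MCP-style text blocks, or throws on failure.
function toContentBlocks(result: Result): { content: TextBlock[] } {
  if (result.success && result.data) {
    return {
      content: [
        { type: "text", text: `Content extracted successfully from ${result.data.url}` },
        { type: "text", text: `Title: ${result.data.title}` },
        { type: "text", text: result.data.html },
      ],
    };
  }
  // Same fallback message as the handler when no error text is available.
  throw new Error(result.error || "Failed to get content");
}
```

A successful result always yields three blocks in a fixed order, so a caller that wants only the HTML can read the last block.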
src/client.ts:113-124 (helper)
BrowserlessClient helper method that implements the core logic by making an HTTP POST request to the Browserless server '/content' endpoint to extract webpage content.
async getContent(request: ContentRequest): Promise<BrowserlessResponse<ContentResponse>> {
  try {
    const response: AxiosResponse<ContentResponse> = await this.httpClient.post('/content', request);
    return {
      success: true,
      data: response.data,
    };
  } catch (error) {
    return this.handleError(error);
  }
}
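The helper wraps the HTTP call in a success/error envelope instead of letting exceptions propagate. The sketch below illustrates that pattern without depending on axios. `Envelope`, `PostFn`, and `getContentSketch` are hypothetical names, the injected `post` function stands in for the HTTP client, and the `catch` branch is only an assumption about what `handleError` does, since its body is not shown in the source.

```typescript
// Hypothetical envelope type; the real BrowserlessResponse may differ.
type Envelope<T> = { success: true; data: T } | { success: false; error: string };

// Hypothetical stand-in for the HTTP client's post method.
type PostFn<T> = (path: string, body: unknown) => Promise<{ data: T }>;

async function getContentSketch<T>(
  post: PostFn<T>,
  request: { url: string },
): Promise<Envelope<T>> {
  try {
    const response = await post("/content", request);
    return { success: true, data: response.data };
  } catch (error) {
    // Assumed behaviour of handleError: convert the failure into an error envelope.
    const message = error instanceof Error ? error.message : String(error);
    return { success: false, error: message };
  }
}
```

Because failures come back as values, the calling handler decides whether to throw, which matches the `result.success` check in the primary handler.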
src/index.ts:110-134 (registration)
Tool registration in the ListTools response, defining the name, description, and input schema for 'get_content'.
{
  name: 'get_content',
  description: 'Extract rendered HTML content from a webpage',
  inputSchema: {
    type: 'object',
    properties: {
      url: { type: 'string' },
      waitForSelector: {
        type: 'object',
        properties: {
          selector: { type: 'string' },
          timeout: { type: 'number' },
        },
      },
      waitForFunction: {
        type: 'object',
        properties: {
          fn: { type: 'string' },
          timeout: { type: 'number' },
        },
      },
    },
    required: ['url'],
  },
},
src/types.ts:151-166 (schema)
Zod schema definition for ContentRequest type used in getContent requests, providing detailed input validation.
export const ContentRequestSchema = z.object({
  url: z.string(),
  gotoOptions: z.object({
    waitUntil: z.string().optional(),
    timeout: z.number().optional(),
  }).optional(),
  waitForSelector: WaitForSelectorSchema.optional(),
  waitForFunction: WaitForFunctionSchema.optional(),
  waitForTimeout: z.number().optional(),
  addScriptTag: z.array(ScriptTagSchema).optional(),
  headers: z.record(z.string()).optional(),
  cookies: z.array(CookieSchema).optional(),
  viewport: ViewportSchema.optional(),
});

export type ContentRequest = z.infer<typeof ContentRequestSchema>;
src/simple-server.ts:91-110 (handler)
Alternative simple MCP server handler for 'get_content' that makes a direct axios call to the Browserless /content endpoint with a 15-second timeout. It forwards only `url` and, when present, `waitForSelector`.
case 'get_content': {
  if (!args?.url) throw new Error('URL is required');
  const response = await axios.post(`${this.browserlessUrl}/content`, {
    url: args.url,
    ...(args.waitForSelector ? { waitForSelector: args.waitForSelector } : {}),
  }, { timeout: 15000 });
  return {
    content: [
      {
        type: 'text',
        text: `Content extracted successfully from ${args.url}`,
      },
      {
        type: 'text',
        text: response.data,
      },
    ],
  };
}
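The request body in this handler is built with a conditional spread, so optional keys appear only when the caller supplied them. The sketch below extracts that step into a standalone function to show its effect; `SimpleArgs` and `buildSimpleRequestBody` are hypothetical names introduced for illustration.

```typescript
// Hypothetical argument shape for the simple handler.
interface SimpleArgs {
  url?: string;
  waitForSelector?: unknown;
  waitForFunction?: unknown;
}

// Builds the POST body the same way the simple handler does.
function buildSimpleRequestBody(args: SimpleArgs | undefined): Record<string, unknown> {
  if (!args?.url) throw new Error("URL is required");
  return {
    url: args.url,
    // Spreading an empty object adds no keys, so waitForSelector is omitted when unset.
    ...(args.waitForSelector ? { waitForSelector: args.waitForSelector } : {}),
  };
}
```

As written in the source, this path does not forward `waitForFunction`, so a caller relying on that option would need the primary handler instead.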
