get_content

Extract rendered HTML content from webpages for web scraping and content analysis. Use this tool to retrieve fully loaded page content with options to wait for specific elements or conditions before extraction.

Instructions

Extract rendered HTML content from a webpage

Input Schema

Name             Required  Description  Default
url              Yes       -            -
waitForSelector  No        -            -
waitForFunction  No        -            -
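
The parameter list above gives names only. Going by the input schema registered for the tool (quoted under Implementation Reference), a call might pass arguments shaped like the sketch below. The `GetContentArgs` interface name, the example URL, the selector, and the timeout value are illustrative assumptions, and the timeout unit is assumed to be milliseconds, which the source does not state.

```typescript
// Hypothetical type mirroring the registered inputSchema for 'get_content'.
interface GetContentArgs {
  url: string; // the only field listed as required
  waitForSelector?: { selector?: string; timeout?: number };
  waitForFunction?: { fn?: string; timeout?: number };
}

// Example arguments: wait for an element before the HTML is extracted.
const exampleArgs: GetContentArgs = {
  url: 'https://example.com',
  waitForSelector: { selector: '#main', timeout: 5000 }, // unit assumed to be ms
};

// Shape of an MCP tools/call request carrying those arguments.
const toolCall = {
  method: 'tools/call',
  params: { name: 'get_content', arguments: exampleArgs },
};

console.log(JSON.stringify(toolCall, null, 2));
```

Only `url` is mandatory; both wait options can be omitted when the page needs no extra settling time.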

Implementation Reference

  • Primary MCP server handler for the 'get_content' tool. Validates arguments, calls BrowserlessClient.getContent, and formats the response as MCP content blocks including extracted HTML.
    case 'get_content': {
      if (!args) throw new Error('Arguments are required');
      const result = await this.client!.getContent(args as any);
      if (result.success && result.data) {
        return {
          content: [
            {
              type: 'text',
              text: `Content extracted successfully from ${result.data.url}`,
            },
            {
              type: 'text',
              text: `Title: ${result.data.title}`,
            },
            {
              type: 'text',
              text: result.data.html,
            },
          ],
        };
      } else {
        throw new Error(result.error || 'Failed to get content');
      }
    }
  • BrowserlessClient helper method that implements the core logic by making an HTTP POST request to the Browserless server '/content' endpoint to extract webpage content.
    async getContent(request: ContentRequest): Promise<BrowserlessResponse<ContentResponse>> {
      try {
        const response: AxiosResponse<ContentResponse> = await this.httpClient.post('/content', request);
        return {
          success: true,
          data: response.data,
        };
      } catch (error) {
        return this.handleError(error);
      }
    }
  • src/index.ts:110-134 (registration)
    Tool registration in the ListTools response, defining the name, description, and input schema for 'get_content'.
    {
      name: 'get_content',
      description: 'Extract rendered HTML content from a webpage',
      inputSchema: {
        type: 'object',
        properties: {
          url: { type: 'string' },
          waitForSelector: {
            type: 'object',
            properties: {
              selector: { type: 'string' },
              timeout: { type: 'number' },
            },
          },
          waitForFunction: {
            type: 'object',
            properties: {
              fn: { type: 'string' },
              timeout: { type: 'number' },
            },
          },
        },
        required: ['url'],
      },
    },
  • Zod schema definition for ContentRequest type used in getContent requests, providing detailed input validation.
    export const ContentRequestSchema = z.object({
      url: z.string(),
      gotoOptions: z.object({
        waitUntil: z.string().optional(),
        timeout: z.number().optional(),
      }).optional(),
      waitForSelector: WaitForSelectorSchema.optional(),
      waitForFunction: WaitForFunctionSchema.optional(),
      waitForTimeout: z.number().optional(),
      addScriptTag: z.array(ScriptTagSchema).optional(),
      headers: z.record(z.string()).optional(),
      cookies: z.array(CookieSchema).optional(),
      viewport: ViewportSchema.optional(),
    });

    export type ContentRequest = z.infer<typeof ContentRequestSchema>;
  • Alternative, simpler MCP server handler for 'get_content' that calls the Browserless /content endpoint directly with axios.
    case 'get_content': {
      if (!args?.url) throw new Error('URL is required');
      const response = await axios.post(`${this.browserlessUrl}/content`, {
        url: args.url,
        ...(args.waitForSelector ? { waitForSelector: args.waitForSelector } : {}),
      }, { timeout: 15000 });
      return {
        content: [
          {
            type: 'text',
            text: `Content extracted successfully from ${args.url}`,
          },
          {
            type: 'text',
            text: response.data,
          },
        ],
      };
    }
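
The request and response handling in the excerpts above can be sketched without a network call. The following is a minimal illustration, not the repository's code: the interfaces are inferred from how the excerpts use `result.success`, `result.data`, and `result.error`, and `buildContentBody` and `toMcpContent` are hypothetical helper names introduced here.

```typescript
// Types inferred from how the excerpts use them; the real definitions
// in the repository may differ.
interface ContentResponse {
  url: string;
  title: string;
  html: string;
}

interface BrowserlessResponse<T> {
  success: boolean;
  data?: T;
  error?: string;
}

interface TextBlock {
  type: 'text';
  text: string;
}

interface ToolArgs {
  url: string;
  waitForSelector?: { selector?: string; timeout?: number };
  waitForFunction?: { fn?: string; timeout?: number };
}

// Hypothetical helper: builds the POST body the way the alternative handler
// does, forwarding waitForSelector only when it was supplied.
function buildContentBody(args: ToolArgs): Record<string, unknown> {
  return {
    url: args.url,
    ...(args.waitForSelector ? { waitForSelector: args.waitForSelector } : {}),
  };
}

// Hypothetical helper: turns the client's result envelope into MCP text
// blocks, or throws, mirroring the two branches of the primary handler.
function toMcpContent(result: BrowserlessResponse<ContentResponse>): { content: TextBlock[] } {
  if (result.success && result.data) {
    return {
      content: [
        { type: 'text', text: `Content extracted successfully from ${result.data.url}` },
        { type: 'text', text: `Title: ${result.data.title}` },
        { type: 'text', text: result.data.html },
      ],
    };
  }
  throw new Error(result.error || 'Failed to get content');
}

// waitForFunction is supplied here but is not copied into the body.
const body = buildContentBody({
  url: 'https://example.com',
  waitForFunction: { fn: '() => document.readyState === "complete"' },
});
console.log(Object.keys(body));

const blocks = toMcpContent({
  success: true,
  data: { url: 'https://example.com', title: 'Example', html: '<html></html>' },
});
console.log(blocks.content.length);
```

As written in the excerpt, the alternative handler forwards only `url` and `waitForSelector`, so a `waitForFunction` argument would have no effect on that code path. The primary handler passes the whole argument object to the client.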

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Lizzard-Solutions/browserless-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.