on_page_content_parsing
Parse webpage content to extract structured data including links, headings, and text for SEO analysis and content processing.
Instructions
This endpoint parses the content of any page you specify and returns its structured content, including link URLs, anchor text, headings, and body text.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL of the page to parse | |
| enable_javascript | No | Enable JavaScript rendering | |
| custom_js | No | Custom JavaScript code to execute | |
| custom_user_agent | No | Custom User-Agent header | |
| accept_language | No | Accept-Language header value | |
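
To illustrate how the arguments in the table map onto a request, here is a minimal sketch that builds the one-element task array the handler posts to the endpoint. `ContentParsingArgs` and `buildContentParsingTask` are hypothetical names introduced for this example, and the URL scheme check is an assumption of the sketch, not something the tool is known to enforce. `markdown_view: true` mirrors what the handler under Implementation Reference sends.

```typescript
// Hypothetical type mirroring the input schema table above.
interface ContentParsingArgs {
  url: string;
  enable_javascript?: boolean;
  custom_js?: string;
  custom_user_agent?: string;
  accept_language?: string;
}

// Builds the single-task array used as the POST body.
// Optional fields that were not supplied are left out of the task.
function buildContentParsingTask(
  args: ContentParsingArgs
): Record<string, unknown>[] {
  // Assumption of this sketch: reject anything that is not an http(s) URL.
  if (!args.url || !/^https?:\/\//i.test(args.url)) {
    throw new Error("url is required and must start with http:// or https://");
  }

  const task: Record<string, unknown> = { url: args.url, markdown_view: true };

  const optionalKeys: (keyof ContentParsingArgs)[] = [
    "enable_javascript",
    "custom_js",
    "custom_user_agent",
    "accept_language",
  ];
  for (const key of optionalKeys) {
    if (args[key] !== undefined) {
      task[key] = args[key];
    }
  }

  return [task];
}
```

Calling `buildContentParsingTask({ url: "https://example.com", enable_javascript: true })` yields an array with one task object containing `url`, `markdown_view`, and `enable_javascript`.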
Implementation Reference
- The `handle` method implements the tool's core logic: it sends a POST request to the DataForSEO `/v3/on_page/content_parsing/live` endpoint with the provided parameters, then validates and formats the response.

  ```typescript
  async handle(params: {
    url: string;
    enable_javascript?: boolean;
    custom_js?: string;
    custom_user_agent?: string;
    accept_language?: string;
  }): Promise<any> {
    try {
      // The API expects an array of tasks; this tool sends exactly one.
      const response = await this.dataForSEOClient.makeRequest('/v3/on_page/content_parsing/live', 'POST', [{
        url: params.url,
        enable_javascript: params.enable_javascript,
        custom_js: params.custom_js,
        custom_user_agent: params.custom_user_agent,
        accept_language: params.accept_language,
        markdown_view: true
      }]);
      // Log the raw response to stderr.
      console.error(JSON.stringify(response));
      if (defaultGlobalToolConfig.fullResponse || this.supportOnlyFullResponse()) {
        // Full-response mode: return the result of the first task.
        const data = response as DataForSEOFullResponse;
        this.validateResponseFull(data);
        const result = data.tasks[0].result;
        return this.formatResponse(result);
      } else {
        // Default mode: return only the page rendered as markdown.
        const data = response as DataForSEOResponse;
        this.validateResponse(data);
        const result = data.items[0].page_as_markdown;
        return this.formatResponse(result);
      }
    } catch (error) {
      return this.formatErrorResponse(error);
    }
  }
  ```
- Defines the input schema using Zod for parameters: `url` (required), and optional flags for JavaScript, custom JS, user agent, and accept language.

  ```typescript
  getParams(): z.ZodRawShape {
    return {
      url: z.string().describe("URL of the page to parse"),
      enable_javascript: z.boolean().optional().describe("Enable JavaScript rendering"),
      custom_js: z.string().optional().describe("Custom JavaScript code to execute"),
      custom_user_agent: z.string().optional().describe("Custom User-Agent header"),
      accept_language: z.string().optional().describe("Accept-Language header value"),
    };
  }
  ```
- src/core/modules/onpage/onpage-api.module.ts:6-21 (registration): The `getTools()` method in `OnPageApiModule` instantiates the `ContentParsingTool` and registers it in a tool map using its name, description, params, and wrapped handler.

  ```typescript
  getTools(): Record<string, ToolDefinition> {
    const tools = [
      new ContentParsingTool(this.dataForSEOClient),
      new InstantPagesTool(this.dataForSEOClient),
      // Add more tools here
    ];

    return tools.reduce((acc, tool) => ({
      ...acc,
      [tool.getName()]: {
        description: tool.getDescription(),
        params: tool.getParams(),
        handler: (params: any) => tool.handle(params),
      },
    }), {});
  }
  ```
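
The registration pattern described above can be tried in isolation with a small sketch. Everything here is a stand-in: the `ToolDefinition` and `Tool` shapes are assumed for the example and may differ from the project's real types, `StubContentParsingTool` makes no network call, and `buildToolMap` is a hypothetical free function that repeats the same `reduce` used in `getTools()`.

```typescript
// Assumed minimal shapes; the project's real types may differ.
interface ToolDefinition {
  description: string;
  params: Record<string, unknown>;
  handler: (params: any) => Promise<unknown>;
}

interface Tool {
  getName(): string;
  getDescription(): string;
  getParams(): Record<string, unknown>;
  handle(params: any): Promise<unknown>;
}

// Stub standing in for ContentParsingTool; it only echoes the URL back.
class StubContentParsingTool implements Tool {
  getName(): string {
    return "on_page_content_parsing";
  }
  getDescription(): string {
    return "Parse webpage content (stub)";
  }
  getParams(): Record<string, unknown> {
    return { url: "string" };
  }
  async handle(params: { url: string }): Promise<unknown> {
    return { parsed: params.url };
  }
}

// Folds a list of tools into a map keyed by tool name,
// wrapping each tool's handle method as the entry's handler.
function buildToolMap(tools: Tool[]): Record<string, ToolDefinition> {
  return tools.reduce<Record<string, ToolDefinition>>((acc, tool) => ({
    ...acc,
    [tool.getName()]: {
      description: tool.getDescription(),
      params: tool.getParams(),
      handler: (params: any) => tool.handle(params),
    },
  }), {});
}

const toolMap = buildToolMap([new StubContentParsingTool()]);
```

A caller can then look a tool up by name and invoke it, for example `toolMap["on_page_content_parsing"].handler({ url: "https://example.com" })`, which in this stub resolves to an object echoing the URL. Wrapping `handle` in an arrow function keeps `this` bound to the tool instance when the handler is called from the map.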