playwright_get_text_content
Extract the text of visible elements on the current page using Playwright automation. The tool collects the unique text snippets and `value` attributes it finds so they can be inspected or analyzed.
Instructions
Get the text content of all elements
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
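Because the tool takes no arguments, a call only needs the tool name and an empty `arguments` object. The sketch below builds such a request as a JSON-RPC 2.0 `tools/call` message, the shape MCP clients send. The helper name `build_tool_call` is hypothetical and only for illustration; a real client library would normally construct and send this message for you.

```python
import json


def build_tool_call(request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for this tool.

    `build_tool_call` is an illustrative helper, not part of the server.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "playwright_get_text_content",
            "arguments": {},  # the input schema defines no properties
        },
    }
    return json.dumps(request)


if __name__ == "__main__":
    print(build_tool_call(1))
```

The tool reads from the most recently created session, so a session has to exist before this call returns page text.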
Implementation Reference
- src/playwright_server/server.py:281-322 (handler)
  The GetTextContentToolHandler class implements the core logic for the 'playwright_get_text_content' tool. It uses the page of the most recently created session and runs a custom JavaScript snippet on it. The snippet visits every visible element that has at most three descendant elements, collects the unique trimmed `innerText` values (up to 1000 characters each) and `value` attributes, and the handler returns them in a single text content item. If no session exists, the handler returns a message asking the caller to create one first.

  ```python
  class GetTextContentToolHandler(ToolHandler):
      async def handle(self, name: str, arguments: dict | None) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
          if not self._sessions:
              return [types.TextContent(type="text", text="No active session. Please create a new session first.")]
          # Use the most recently created session
          session_id = list(self._sessions.keys())[-1]
          page = self._sessions[session_id]["page"]
          # text_contents = await page.locator('body').all_inner_texts()

          async def get_unique_texts_js(page):
              unique_texts = await page.evaluate('''() => {
                  var elements = Array.from(document.querySelectorAll('*')); // Select all elements first, then filter
                  var uniqueTexts = new Set();

                  for (var element of elements) {
                      if (element.offsetWidth > 0 || element.offsetHeight > 0) { // Check whether the element is visible
                          var childrenCount = element.querySelectorAll('*').length;
                          if (childrenCount <= 3) {
                              var innerText = element.innerText ? element.innerText.trim() : '';
                              if (innerText && innerText.length <= 1000) {
                                  uniqueTexts.add(innerText);
                              }
                              var value = element.getAttribute('value');
                              if (value) {
                                  uniqueTexts.add(value);
                              }
                          }
                      }
                  }
                  //console.log( Array.from(uniqueTexts));
                  return Array.from(uniqueTexts);
              }
              ''')
              return unique_texts

          # Usage example
          text_contents = await get_unique_texts_js(page)
          return [types.TextContent(type="text", text=f"Text content of all elements: {text_contents}")]
  ```
- The tool schema definition for 'playwright_get_text_content' in the list_tools handler, specifying the name, description, and empty input schema (no required arguments).

  ```python
  types.Tool(
      name="playwright_get_text_content",
      description="Get the text content of all elements",
      inputSchema={
          "type": "object",
          "properties": {
          },
      }
  ),
  ```
- src/playwright_server/server.py:334-344 (registration)
  The tool_handlers dictionary registers an instance of GetTextContentToolHandler under the 'playwright_get_text_content' key. The call_tool handler uses this dictionary to dispatch tool calls.

  ```python
  tool_handlers = {
      "playwright_navigate": NavigateToolHandler(),
      "playwright_screenshot": ScreenshotToolHandler(),
      "playwright_click": ClickToolHandler(),
      "playwright_fill": FillToolHandler(),
      "playwright_evaluate": EvaluateToolHandler(),
      "playwright_click_text": ClickTextToolHandler(),
      "playwright_get_text_content": GetTextContentToolHandler(),
      "playwright_get_html_content": GetHtmlContentToolHandler(),
      "playwright_new_session": NewSessionToolHandler(),
  }
  ```
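The filtering rules that the handler applies in the browser can be easier to follow when restated outside of JavaScript. The sketch below is a plain-Python approximation of those rules, not code from the server: the `Element` dataclass and `collect_unique_texts` function are hypothetical stand-ins for DOM elements and the injected script, and Python's `strip()` and `len()` differ slightly from JavaScript's `trim()` and `length` on unusual whitespace and non-BMP characters.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_DESCENDANTS = 3     # mirrors `childrenCount <= 3` in the handler's script
MAX_TEXT_LENGTH = 1000  # mirrors `innerText.length <= 1000`


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element (not a Playwright type)."""
    offset_width: int
    offset_height: int
    descendant_count: int
    inner_text: Optional[str] = None
    value: Optional[str] = None


def collect_unique_texts(elements: List[Element]) -> List[str]:
    """Return unique texts and values in first-seen order."""
    seen = {}  # dict keys keep insertion order, like a JavaScript Set
    for el in elements:
        # Skip elements with no rendered width and no rendered height.
        if not (el.offset_width > 0 or el.offset_height > 0):
            continue
        # Skip containers; only near-leaf elements are kept.
        if el.descendant_count > MAX_DESCENDANTS:
            continue
        text = (el.inner_text or "").strip()
        if text and len(text) <= MAX_TEXT_LENGTH:
            seen.setdefault(text, None)
        if el.value:
            seen.setdefault(el.value, None)
    return list(seen)


if __name__ == "__main__":
    sample = [
        Element(100, 20, 0, "  Sign in  "),
        Element(0, 0, 0, "hidden"),               # not visible
        Element(800, 600, 42, "whole page text"),  # too many descendants
        Element(120, 30, 0, "", value="Search"),   # value attribute only
        Element(100, 20, 1, "Sign in"),            # duplicate text
    ]
    print(collect_unique_texts(sample))  # ['Sign in', 'Search']
```

Limiting the descendant count keeps large containers such as `body` from contributing their entire text, which would duplicate what their leaf elements already provide.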