extract_text_content
Extract text content from web pages using CSS selectors to target specific elements for data collection and analysis.
Instructions
Extract text content from the current page, optionally using a CSS selector.
Args:
selector: Optional CSS selector to target specific elements
context: Optional context object for logging (ignored)
Returns:
Extracted text content
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
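| selector | No | Optional CSS selector to target specific elements | None |
| context | No | Optional context object for logging (ignored) | None |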
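
For orientation, MCP clients invoke a tool with a JSON-RPC `tools/call` request whose `arguments` object carries the fields listed in the table. The sketch below builds such a payload in Python. The helper name `build_tool_call`, the request `id`, and the `h1` selector are illustrative assumptions, not part of this server.

```python
import json


def build_tool_call(selector=None, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' request for extract_text_content.

    Hypothetical helper for illustration only; it is not part of the server.
    """
    arguments = {}
    if selector is not None:
        # Leave the key out entirely to request text from the whole page body.
        arguments["selector"] = selector
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "extract_text_content", "arguments": arguments},
    }


# Example: ask for the text of every <h1> on the currently loaded page.
request = build_tool_call(selector="h1")
print(json.dumps(request, indent=2))
```

In practice an MCP client library would usually construct and send this message itself; the sketch only shows how the optional `selector` argument maps onto the request.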
Implementation Reference
- src/mcp_web_browser/server.py:133-168 (handler): The main handler function for the `extract_text_content` tool. It is decorated with `@mcp.tool()`, which also handles registration and infers the schema from the type hints and docstring. It extracts visible text from the current Playwright page, either from the entire body or from the elements matching a CSS selector.

  ```python
  @mcp.tool()
  async def extract_text_content(
      selector: Optional[str] = None,
      context: Optional[Any] = None
  ) -> str:
      """
      Extract text content from the current page, optionally using a CSS selector.

      Args:
          selector: Optional CSS selector to target specific elements
          context: Optional context object for logging (ignored)

      Returns:
          Extracted text content
      """
      global _current_page

      if not _current_page:
          raise ValueError("No page is currently loaded. Use browse_to first.")

      try:
          if selector:
              # If selector is provided, extract text from matching elements
              elements = await _current_page.query_selector_all(selector)
              text_content = "\n".join([await el.inner_text() for el in elements])
              print(f"Extracted text from selector: {selector}", file=sys.stderr)
          else:
              # If no selector, extract all visible text from the page
              text_content = await _current_page.inner_text('body')

          return text_content
      except Exception as e:
          print(f"Error extracting text: {e}", file=sys.stderr)
          raise ValueError(f"Error extracting text: {e}")
  ```
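
To make the handler's two branches concrete without launching a browser, the following sketch replays the same logic against stand-in objects. `FakePage`, `FakeElement`, and `extract` are hypothetical names introduced here for illustration. They mimic only the three Playwright calls the handler uses (`query_selector_all`, the element's `inner_text`, and the page's `inner_text`) and are not part of the server.

```python
import asyncio


class FakeElement:
    """Hypothetical stand-in for a Playwright element handle."""

    def __init__(self, text):
        self._text = text

    async def inner_text(self):
        return self._text


class FakePage:
    """Hypothetical stand-in for a Playwright page (illustration only)."""

    def __init__(self, matches, body):
        self._matches = matches  # maps a selector to the texts of its matches
        self._body = body

    async def query_selector_all(self, selector):
        return [FakeElement(text) for text in self._matches.get(selector, [])]

    async def inner_text(self, selector):
        # The stub ignores the selector and always returns the body text.
        return self._body


async def extract(page, selector=None):
    # Same branching as the handler: texts of matching elements are joined
    # with newlines; without a selector the whole body text is returned.
    if selector:
        elements = await page.query_selector_all(selector)
        return "\n".join([await el.inner_text() for el in elements])
    return await page.inner_text("body")


page = FakePage({"li": ["alpha", "beta"]}, body="Title\nalpha\nbeta")
with_selector = asyncio.run(extract(page, "li"))   # one line per matching element
no_match = asyncio.run(extract(page, "table"))     # empty string, not an error
whole_page = asyncio.run(extract(page))            # full body text
```

One consequence worth noting: a selector that matches nothing yields an empty string rather than raising, because joining an empty list produces `""`. Callers that need to distinguish "no match" from "matched but empty" would have to check for that themselves.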