extract_text_content
Extract text from the currently loaded web page, optionally narrowing the result to the elements that match a CSS selector. Part of the MCP Web Browser Server.
Instructions
Extract text content from the current page, optionally using a CSS selector.
Args:
selector: Optional CSS selector to target specific elements
context: Optional context object for logging (ignored)
Returns:
Extracted text content
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
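| context | No | Optional context object for logging (ignored) | None |
| selector | No | Optional CSS selector to target specific elements | None |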
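
As a rough illustration of how these arguments are passed, the sketch below builds a JSON-RPC `tools/call` request of the kind an MCP client sends. The helper name `build_extract_request`, the request `id`, and the example selector are assumptions made for illustration; only the tool name and the `selector` argument come from this page.

```python
import json
from typing import Any, Dict, Optional


def build_extract_request(
    selector: Optional[str] = None, request_id: int = 1
) -> Dict[str, Any]:
    """Build a tools/call request for extract_text_content (hypothetical helper)."""
    arguments: Dict[str, Any] = {}
    # Both tool arguments are optional, so the selector is sent only when given.
    if selector is not None:
        arguments["selector"] = selector
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "extract_text_content", "arguments": arguments},
    }


if __name__ == "__main__":
    # Targeted extraction: only elements matching the selector.
    print(json.dumps(build_extract_request("article h1"), indent=2))
    # Whole-page extraction: no arguments at all.
    print(json.dumps(build_extract_request(), indent=2))
```

Omitting `selector` leaves `arguments` empty, which corresponds to the whole-page branch of the tool.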
Implementation Reference
- src/mcp_web_browser/server.py:133-168 (handler) The handler function, decorated with @mcp.tool(), implements the logic to extract text content from the current page using Playwright, optionally with a CSS selector.

      @mcp.tool()
      async def extract_text_content(
          selector: Optional[str] = None,
          context: Optional[Any] = None
      ) -> str:
          """
          Extract text content from the current page, optionally using a CSS selector.

          Args:
              selector: Optional CSS selector to target specific elements
              context: Optional context object for logging (ignored)

          Returns:
              Extracted text content
          """
          global _current_page

          if not _current_page:
              raise ValueError("No page is currently loaded. Use browse_to first.")

          try:
              if selector:
                  # If selector is provided, extract text from matching elements
                  elements = await _current_page.query_selector_all(selector)
                  text_content = "\n".join([await el.inner_text() for el in elements])
                  print(f"Extracted text from selector: {selector}", file=sys.stderr)
              else:
                  # If no selector, extract all visible text from the page
                  text_content = await _current_page.inner_text('body')

              return text_content
          except Exception as e:
              print(f"Error extracting text: {e}", file=sys.stderr)
              raise ValueError(f"Error extracting text: {e}")

- Function signature and docstring defining the input schema (selector: Optional[str], context: Optional[Any]) and output (str).

      async def extract_text_content(
          selector: Optional[str] = None,
          context: Optional[Any] = None
      ) -> str:
          """
          Extract text content from the current page, optionally using a CSS selector.

          Args:
              selector: Optional CSS selector to target specific elements
              context: Optional context object for logging (ignored)

          Returns:
              Extracted text content
          """
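
To make the two branches of the handler concrete without launching a browser, here is a minimal sketch that mirrors its logic against a stand-in page object. `FakeElement`, `FakePage`, and `extract_text` are hypothetical names invented for this example; they imitate only the two Playwright methods the handler calls (`query_selector_all` and `inner_text`) and are not part of the server.

```python
import asyncio
from typing import Dict, List, Optional


class FakeElement:
    """Stand-in for a Playwright element handle (assumption for illustration)."""

    def __init__(self, text: str) -> None:
        self._text = text

    async def inner_text(self) -> str:
        return self._text


class FakePage:
    """Stand-in for a Playwright page with canned selector matches."""

    def __init__(self, body: str, matches: Dict[str, List[str]]) -> None:
        self._body = body
        self._matches = matches

    async def query_selector_all(self, selector: str) -> List[FakeElement]:
        # Unknown selectors match nothing, like a selector with no hits on a real page.
        return [FakeElement(text) for text in self._matches.get(selector, [])]

    async def inner_text(self, selector: str) -> str:
        # This fake only models the 'body' lookup used by the handler.
        return self._body


async def extract_text(page: FakePage, selector: Optional[str] = None) -> str:
    """Mirror the handler's branching: per-element text joined by newlines, or the whole body."""
    if selector:
        elements = await page.query_selector_all(selector)
        return "\n".join([await el.inner_text() for el in elements])
    return await page.inner_text("body")


page = FakePage(body="Title\nFirst\nSecond", matches={"p": ["First", "Second"]})

with_selector = asyncio.run(extract_text(page, "p"))   # two matching elements
no_match = asyncio.run(extract_text(page, "h2"))       # selector matches nothing
whole_page = asyncio.run(extract_text(page))           # no selector given
```

Under these assumptions, a selector with several matches yields one line per element, while a selector with no matches yields an empty string rather than an error, because joining an empty list produces `""`. An empty-string selector is falsy, so it takes the whole-page branch.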