# webpage_scrape
Extracts and returns webpage content for a given URL, optionally converted to Markdown, enabling LLMs to access and process web-based information.
## Instructions
Scrape webpage by url
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| includeMarkdown | No | Include Markdown in the response. | false |
| url | Yes | The URL to scrape. | |
## Input Schema (JSON Schema)
```json
{
  "properties": {
    "includeMarkdown": {
      "anyOf": [
        {
          "type": "boolean"
        },
        {
          "type": "null"
        }
      ],
      "default": false,
      "title": "Includemarkdown"
    },
    "url": {
      "title": "Url",
      "type": "string"
    }
  },
  "required": [
    "url"
  ],
  "title": "WebpageRequest",
  "type": "object"
}
```
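For illustration, here is a minimal sketch (not the package's own code) of validating tool arguments against an equivalent Pydantic model. The model below mirrors the JSON Schema as printed above, with `includeMarkdown` as a nullable boolean.

```python
# Minimal sketch: a standalone Pydantic model equivalent to the JSON Schema
# above, used to validate example tool arguments before dispatch.
from typing import Optional
from pydantic import BaseModel, Field

class WebpageRequest(BaseModel):
    url: str = Field(..., description="The url to scrape")
    includeMarkdown: Optional[bool] = Field(False, description="Include markdown in the response")

# Valid arguments pass validation and serialize cleanly.
req = WebpageRequest(url="https://example.com", includeMarkdown=True)
print(req.model_dump(exclude_none=True))  # {'url': 'https://example.com', 'includeMarkdown': True}

# Omitting the required `url` field raises a ValidationError.
try:
    WebpageRequest(includeMarkdown=True)
except Exception as exc:
    print(type(exc).__name__)  # ValidationError
```

Note that the model source quoted in the Implementation Reference below accepts `includeMarkdown` as the string `'true'` or `'false'` rather than a boolean, so the schema dump and the quoted code disagree on that field's type.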
## Implementation Reference
- src/serper_mcp_server/core.py:20-22 (handler): Executes the webpage scrape by POSTing the request to the scrape.serper.dev API endpoint via the `fetch_json` utility (a hedged usage sketch follows this list).

  ```python
  async def scape(request: WebpageRequest) -> Dict[str, Any]:
      url = "https://scrape.serper.dev"
      return await fetch_json(url, request)
  ```
- (schema): Pydantic model defining the input schema for the `webpage_scrape` tool, including `url` and the optional `includeMarkdown` flag.

  ```python
  class WebpageRequest(BaseModel):
      url: str = Field(..., description="The url to scrape")
      includeMarkdown: Optional[str] = Field(
          "false",
          pattern=r"^(true|false)$",
          description="Include markdown in the response (boolean value as string: 'true' or 'false')",
      )
  ```
- src/serper_mcp_server/server.py:54-58 (registration): Registers the `webpage_scrape` tool with the MCP server in the `list_tools` handler, providing its name, description, and input schema.

  ```python
  tools.append(Tool(
      name=SerperTools.WEBPAGE_SCRAPE,
      description="Scrape webpage by url",
      inputSchema=WebpageRequest.model_json_schema(),
  ))
  ```
- src/serper_mcp_server/server.py:68-71 (handler): Dispatch logic in the `call_tool` MCP handler that validates arguments with `WebpageRequest`, calls the `scape` function, and returns the JSON result as `TextContent`.

  ```python
  if name == SerperTools.WEBPAGE_SCRAPE.value:
      request = WebpageRequest(**arguments)
      result = await scape(request)
      return [TextContent(text=json.dumps(result, indent=2), type="text")]
  ```
- src/serper_mcp_server/core.py:25-40 (helper): Shared utility that performs the HTTP POST to the Serper APIs, attaching the API key header and applying an SSL context and timeout.

  ```python
  async def fetch_json(url: str, request: BaseModel) -> Dict[str, Any]:
      payload = request.model_dump(exclude_none=True)
      headers = {
          'X-API-KEY': SERPER_API_KEY,
          'Content-Type': 'application/json'
      }
      ssl_context = ssl.create_default_context(cafile=certifi.where())
      connector = aiohttp.TCPConnector(ssl=ssl_context)
      timeout = aiohttp.ClientTimeout(total=AIOHTTP_TIMEOUT)
      async with aiohttp.ClientSession(connector=connector, timeout=timeout) as session:
          async with session.post(url, headers=headers, json=payload) as response:
              response.raise_for_status()
              return await response.json()
  ```
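To tie the pieces together, here is a minimal sketch of invoking the handler directly, outside the MCP server. The import path, the availability of `scape` and `WebpageRequest` from `core.py`, the `SERPER_API_KEY` environment variable, and the shape of the response are assumptions based on the references above, not a confirmed API.

```python
# Minimal sketch: calling the handler directly, bypassing MCP dispatch.
# Assumes `scape` and `WebpageRequest` are importable from core.py as
# referenced above, and that SERPER_API_KEY is set in the environment.
import asyncio

from serper_mcp_server.core import scape, WebpageRequest  # assumed import path

async def main() -> None:
    request = WebpageRequest(url="https://example.com", includeMarkdown="true")
    result = await scape(request)          # POSTs to https://scrape.serper.dev
    print(result.get("text", "")[:200])    # response shape is an assumption

asyncio.run(main())
```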