crawl_web_page
Extract and process data from multiple web pages at once. Provide a list of URLs and the Hubble MCP Server's integrated crawling tool fetches and returns their contents.
Instructions
Crawl web pages.
args:
url_list: List[str], list of web page URLs to crawl
returns:
dict[Any] | None: crawl results
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url_list | Yes | List of web page URLs to crawl | |
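As an illustration, the tool's arguments and the payload the handler derives from them might look like the following (the URLs are placeholders, and `arguments` is simply the JSON object an MCP client would pass):

```python
# Hypothetical example arguments for crawl_web_page; the URLs are placeholders.
arguments = {
    "url_list": [
        "https://example.com/page1",
        "https://example.com/page2",
    ]
}

# The handler forwards the list to the Hubble API under the key "urls".
payload = {"urls": arguments["url_list"]}
print(payload)
```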
Implementation Reference
- `data_api.py:402-421` (handler) — The main handler function for the `crawl_web_page` tool. It is registered via the `@mcp.tool()` decorator and crawls web pages by sending a POST request to the external Hubble API's `/web_crawl` endpoint with the provided list of URLs.

```python
@mcp.tool()
# Note: the original passed exceptions=(Exception), which is just Exception,
# not a tuple; the trailing comma makes it a one-element tuple.
@async_retry(exceptions=(Exception,), tries=2, delay=0.3)
async def crawl_web_page(url_list: List[str]) -> dict[Any] | None:
    '''
    Crawl web pages.
    args:
        url_list: List[str], list of web page URLs to crawl
    returns:
        dict[Any] | None: crawl results
    '''
    async with httpx.AsyncClient() as client:
        headers = {"X-API-Key": HUBBLE_API_KEY}
        response = await client.post(
            f"{HUBBLE_API_URL}/web_crawl",
            headers=headers,
            json={"urls": url_list},
            timeout=30.0,
        )
        response.raise_for_status()
        # The annotation promises a dict, so parse the JSON body
        # (the original returned response.text, a str).
        return response.json()
```
- `data_api.py:402-402` (registration) — The `@mcp.tool()` decorator registers the `crawl_web_page` function as an MCP tool.

```python
@mcp.tool()
```
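The `async_retry` decorator used on the handler is not shown in this reference. Its actual implementation may differ, but a minimal sketch of such a decorator (hypothetical, not the Hubble code) with the same signature could look like:

```python
import asyncio
import functools

def async_retry(exceptions=(Exception,), tries=2, delay=0.3):
    """Retry an async function up to `tries` times, sleeping `delay`
    seconds between attempts. Hypothetical stand-in for the decorator
    referenced above, not the actual Hubble implementation."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(tries):
                try:
                    return await func(*args, **kwargs)
                except exceptions as exc:
                    last_exc = exc
                    if attempt < tries - 1:
                        await asyncio.sleep(delay)
            raise last_exc
        return wrapper
    return decorator

# Usage: a coroutine that fails once, then succeeds on the retry.
calls = {"n": 0}

@async_retry(exceptions=(RuntimeError,), tries=2, delay=0.01)
async def flaky():
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient failure")
    return "ok"

result = asyncio.run(flaky())
print(result, calls["n"])  # -> ok 2
```

Wrapping the network call this way means a single transient HTTP failure (a timeout, a dropped connection) is absorbed by the second attempt rather than surfacing to the MCP client.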