get_url_content_info
Extract webpage content from a specified URL. Use this tool to retrieve text and data directly from web addresses, enabling real-time information access and analysis.
Instructions
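Fetch web page content from the specified URL.
Parameters:
url (str): Address of the web page to extract content from
Returns:
str: Web page content extracted from the URL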
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Address of the web page to extract content from | |
Input Schema (JSON Schema)
{
  "properties": {
    "url": {
      "title": "Url",
      "type": "string"
    }
  },
  "required": [
    "url"
  ],
  "title": "get_url_content_infoArguments",
  "type": "object"
}
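
To illustrate how a client might use this schema, the sketch below checks an arguments object against it and wraps it in a JSON-RPC 2.0 `tools/call` request, the message shape MCP uses for tool invocation. The helper names `check_arguments` and `build_call_request` are hypothetical, and the check is hand-rolled to cover only the keywords this schema uses; it is not a general JSON Schema validator.

```python
from __future__ import annotations

import json

# JSON Schema for the tool's arguments, copied from the section above.
SCHEMA = {
    "properties": {"url": {"title": "Url", "type": "string"}},
    "required": ["url"],
    "title": "get_url_content_infoArguments",
    "type": "object",
}


def check_arguments(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments pass.

    Hypothetical helper: handles only "required" and "type": "string".
    """
    problems = []
    for name in SCHEMA["required"]:
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, spec in SCHEMA["properties"].items():
        if (
            name in arguments
            and spec["type"] == "string"
            and not isinstance(arguments[name], str)
        ):
            problems.append(f"argument {name} must be a string")
    return problems


def build_call_request(url: str, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 tools/call request for get_url_content_info."""
    arguments = {"url": url}
    problems = check_arguments(arguments)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": "get_url_content_info", "arguments": arguments},
        }
    )


if __name__ == "__main__":
    # https://example.com is a placeholder URL used only for illustration.
    print(build_call_request("https://example.com"))
```

In practice an MCP client library would normally build and send this request for you; the sketch only shows what the schema constrains.
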
Implementation Reference
- mcp2tavily.py:123-133 (handler) The main handler function for the 'get_url_content_info' tool, decorated with @mcp.tool(). It takes a URL and delegates to the internal _get_url_content helper function.

      @mcp.tool()
      def get_url_content_info(url: str) -> str:
          """Fetch web page content from the specified URL.

          Parameters:
              url (str): Address of the web page to extract content from

          Returns:
              str: Web page content extracted from the URL
          """
          return _get_url_content(url)
- mcp2tavily.py:62-99 (helper) Internal helper function that performs the core logic of extracting and formatting content from the given URL using the TavilyClient.extract() method.

      def _get_url_content(url: str) -> str:
          """Internal function to get content from a specific URL using Tavily API"""
          try:
              tavily_client = TavilyClient(api_key=API_KEY)
              logger.info(f"Attempting to extract content from URL: {url}")
              response = tavily_client.extract(url)
              # logger.info(f"Raw API response: {response}")  # use logger instead of print

              # Process the returned data structure
              results = response.get('results', [])
              if not results:
                  logger.error(f"No results found in response: {response}")
                  return "No content found for this URL. API response contains no results."

              # Take the raw content of the first result
              first_result = results[0]
              # logger.info(f"First result structure: {list(first_result.keys())}")  # log key names only, to keep logs small
              content = first_result.get('raw_content', '')
              if not content:
                  logger.error("No raw_content found in first result")
                  return "No raw content available in the API response"

              # Ensure the response text is UTF-8 encoded
              content = content.encode('utf-8').decode('utf-8')

              # Prepend some metadata to the output
              metadata = f"URL: {url}\n"
              metadata += f"Content length: {len(content)} characters\n"
              metadata += "---\n\n"

              logger.info(f"Successfully extracted content with length: {len(content)}")
              return f"{metadata}{content}"
          except Exception as e:
              logger.exception("Detailed error while extracting URL content")
              return f"Error getting content from URL: {str(e)}"
- mcp2tavily.py:123-123 (registration) The @mcp.tool() decorator registers the get_url_content_info function as an MCP tool.

      @mcp.tool()
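
Because the helper reports failures as ordinary strings instead of raising, a caller has to inspect the returned text to tell success from failure. The sketch below shows one way to do that, based only on the message prefixes and the metadata header (`URL:`, `Content length:`, `---`) visible in the helper above. `ExtractResult` and `parse_tool_output` are hypothetical names, not part of mcp2tavily.py.

```python
from dataclasses import dataclass
from typing import Optional

# Prefixes of the plain-string failure messages returned by _get_url_content.
ERROR_PREFIXES = (
    "Error getting content from URL:",
    "No content found for this URL.",
    "No raw content available",
)

# Separator the helper writes between the metadata header and the content.
HEADER_SEPARATOR = "---\n\n"


@dataclass
class ExtractResult:
    """Hypothetical container for a parsed tool result."""

    ok: bool
    url: Optional[str] = None
    content: str = ""
    error: Optional[str] = None


def parse_tool_output(text: str) -> ExtractResult:
    """Split the tool's string output into metadata and content.

    Successful output always starts with "URL: ", so the error prefixes
    cannot collide with a successful result.
    """
    if text.startswith(ERROR_PREFIXES):
        return ExtractResult(ok=False, error=text)

    # The first occurrence of the separator ends the metadata header.
    header, sep, body = text.partition(HEADER_SEPARATOR)
    if not sep or not header.startswith("URL: "):
        return ExtractResult(ok=False, error="unrecognized output format")

    url = header.splitlines()[0][len("URL: "):]
    return ExtractResult(ok=True, url=url, content=body)


if __name__ == "__main__":
    # Sample string shaped like the helper's success output (made-up content).
    sample = "URL: https://example.com\nContent length: 5 characters\n---\n\nhello"
    print(parse_tool_output(sample))
```

Matching on message prefixes is brittle if the server's wording changes, so treat this as a stopgap. A structured error channel on the server side would be a more robust design.
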