mcp2everything

MCP2Tavily

get_url_content_info

Extract webpage content from any URL to analyze text, structure, and information for research or data processing tasks.

Instructions

Fetch webpage content from the specified URL.

Parameters:
    url (str): the address of the webpage whose content should be extracted

Returns:
    str: the webpage content extracted from the URL

Input Schema

Name    Required    Description    Default
url     Yes         —              —
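For reference, the schema table above corresponds to a minimal JSON Schema along these lines (reconstructed from the table; the exact schema the server emits may differ, e.g. it may include a title or description field):

```json
{
  "type": "object",
  "properties": {
    "url": {
      "type": "string"
    }
  },
  "required": ["url"]
}
```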

Implementation Reference

  • The handler function for the 'get_url_content_info' tool. It is registered via the @mcp.tool() decorator and delegates the actual work to the _get_url_content helper function.
    @mcp.tool()
    def get_url_content_info(url: str) -> str:
        """Fetch webpage content from the specified URL.

        Parameters:
            url (str): the address of the webpage whose content should be extracted

        Returns:
            str: the webpage content extracted from the URL
        """
        return _get_url_content(url)
  • The supporting utility function that contains the core implementation for fetching and processing URL content using the Tavily API's extract method.
    def _get_url_content(url: str) -> str:
        """Internal function to get content from a specific URL using Tavily API"""
        try:
            tavily_client = TavilyClient(api_key=API_KEY)
            logger.info(f"Attempting to extract content from URL: {url}")
            
            response = tavily_client.extract(url)
        # logger.info(f"Raw API response: {response}")  # use logger instead of print
            
        # Process the returned data structure
            results = response.get('results', [])
            if not results:
                logger.error(f"No results found in response: {response}")
                return "No content found for this URL. API response contains no results."
                
        # Get the raw content of the first result
            first_result = results[0]
        # logger.info(f"First result structure: {list(first_result.keys())}")  # log only the key names to keep logs small
            
            content = first_result.get('raw_content', '')
            if not content:
                logger.error("No raw_content found in first result")
                return "No raw content available in the API response"
            
        # Round-trip through UTF-8 to ensure the text is encodable
            content = content.encode('utf-8').decode('utf-8')
            
        # Prepend some metadata to the output
            metadata = f"URL: {url}\n"
            metadata += f"Content length: {len(content)} characters\n"
            metadata += "---\n\n"
            
            logger.info(f"Successfully extracted content with length: {len(content)}")
            return f"{metadata}{content}"
            
        except Exception as e:
        logger.exception("Detailed error while extracting URL content")
            return f"Error getting content from URL: {str(e)}"
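The response-parsing portion of the helper can be exercised in isolation, without a network call or API key. A minimal sketch, assuming the Tavily extract endpoint returns a dict shaped like `{'results': [{'raw_content': ...}]}` (the stubbed response below is illustrative, not real API output):

```python
def parse_extract_response(url: str, response: dict) -> str:
    """Replicates the result-parsing logic of _get_url_content
    for a pre-fetched response dict."""
    results = response.get('results', [])
    if not results:
        return "No content found for this URL. API response contains no results."
    content = results[0].get('raw_content', '')
    if not content:
        return "No raw content available in the API response"
    metadata = f"URL: {url}\n"
    metadata += f"Content length: {len(content)} characters\n"
    metadata += "---\n\n"
    return f"{metadata}{content}"

# Stubbed response for illustration -- not real Tavily output.
stub = {"results": [{"raw_content": "Hello, world."}]}
print(parse_extract_response("https://example.com", stub))
```

Isolating the parsing step like this makes the three return shapes (content with metadata, empty-results message, missing-raw_content message) easy to verify independently of the Tavily client.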
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool fetches content from a URL but doesn't mention critical behaviors like error handling (e.g., for invalid URLs), authentication needs, rate limits, or whether it performs web scraping or uses APIs. This leaves significant gaps in understanding how the tool operates.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
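One concrete consequence of this gap: the implementation returns error messages as ordinary strings instead of raising, so a caller can only detect failure by matching the message text. A fragile sketch, assuming the prefixes seen in the handler above are stable (they are implementation details, not a documented contract):

```python
# Error-message prefixes observed in _get_url_content's return paths.
ERROR_PREFIXES = (
    "Error getting content from URL:",
    "No content found for this URL.",
    "No raw content available",
)

def looks_like_tool_error(result: str) -> bool:
    """Heuristic check for the in-band error strings the tool
    returns instead of raising exceptions."""
    return result.startswith(ERROR_PREFIXES)
```

Documenting these error strings (or switching to structured errors) in the tool description would spare agents this kind of guesswork.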

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with the purpose stated first, followed by parameter and return sections. It uses minimal sentences that directly convey information without waste, though the structure could be slightly improved by integrating usage context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of web content fetching (which involves network operations, potential errors, and format variations), the description is incomplete. No annotations or output schema exist to supplement it, and it lacks details on return value format (e.g., HTML text, structured data), error cases, or behavioral traits, making it inadequate for safe and effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It adds meaning by specifying that the 'url' parameter is a '网页地址' (webpage address) for content extraction, which clarifies its purpose beyond the schema's generic 'string' type. However, it doesn't detail format constraints (e.g., must be a valid HTTP/HTTPS URL) or examples, leaving some ambiguity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
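Since neither the schema nor the description constrains the url format, a client could pre-validate before invoking the tool. A minimal sketch using the standard library; note that restricting to http/https is an assumption on our part, not something the tool documents:

```python
from urllib.parse import urlparse

def is_plausible_http_url(url: str) -> bool:
    """Accept only absolute http(s) URLs with a network location."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Stating such a constraint (and an example value) in the parameter description itself would remove the ambiguity the review identifies.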

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: "fetch webpage content from a specified URL" (从指定URL获取网页内容). It specifies a verb (fetch) and a resource (webpage content). However, it does not differentiate the tool from siblings such as 'get_url_content', 'search_web', or 'search_web_info'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus its siblings. There's no mention of alternatives, exclusions, or specific contexts for use. The agent must infer usage from the purpose alone, which is insufficient for distinguishing between similar tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
