Glama
mcp2everything

MCP2Tavily

get_url_content

Extract content from any URL using the Tavily API to retrieve web page information for analysis or processing.

Instructions

Get the content from a specific URL using Tavily API

Args:
    url (str): The URL to extract content from
    
Returns:
    str: The extracted content from the URL

Input Schema

Name    Required    Description    Default
url     Yes         —              —
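The schema above is generated from the handler's type hints. A hedged sketch of what the derived JSON Schema likely looks like (the "Url" title is taken from the Parameters review below, not verified against live output):

```python
# Hypothetical sketch of the JSON Schema derived from the `url: str` type hint.
input_schema = {
    "type": "object",
    "properties": {
        "url": {"title": "Url", "type": "string"},
    },
    "required": ["url"],
}

print(input_schema["required"])  # the single parameter, and it is mandatory
```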

Implementation Reference

  • Handler function for the 'get_url_content' tool, registered via @mcp.tool() decorator. It defines the input schema via type hints and docstring, and delegates to the internal _get_url_content implementation.
    @mcp.tool()
    def get_url_content(url: str) -> str:
        """Get the content from a specific URL using Tavily API
        
        Args:
            url (str): The URL to extract content from
            
        Returns:
            str: The extracted content from the URL
        """
        return _get_url_content(url)
  • The main helper function containing the tool logic: uses TavilyClient.extract() to fetch URL content, processes it with UTF-8 handling, adds metadata, and includes error handling.
    def _get_url_content(url: str) -> str:
        """Internal function to get content from a specific URL using Tavily API"""
        try:
            tavily_client = TavilyClient(api_key=API_KEY)
            logger.info(f"Attempting to extract content from URL: {url}")
            
            response = tavily_client.extract(url)
            # logger.info(f"Raw API response: {response}")  # use logger instead of print
            
            # Handle the returned data structure
            results = response.get('results', [])
            if not results:
                logger.error(f"No results found in response: {response}")
                return "No content found for this URL. API response contains no results."
                
            # Get the raw content from the first result
            first_result = results[0]
            # logger.info(f"First result structure: {list(first_result.keys())}")  # log only the key names to keep logs small
            
            content = first_result.get('raw_content', '')
            if not content:
                logger.error("No raw_content found in first result")
                return "No raw content available in the API response"
            
            # Ensure the response text is UTF-8 encoded
            content = content.encode('utf-8').decode('utf-8')
            
            # Add some metadata to the output
            metadata = f"URL: {url}\n"
            metadata += f"Content length: {len(content)} characters\n"
            metadata += "---\n\n"
            
            logger.info(f"Successfully extracted content with length: {len(content)}")
            return f"{metadata}{content}"
            
        except Exception as e:
            logger.exception("Detailed error while extracting URL content")
            return f"Error getting content from URL: {str(e)}"
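The response handling above can be exercised without a live API call. A minimal sketch, assuming a Tavily-style payload shape with a `results` list of dicts carrying `raw_content` (the helper name and canned payload are hypothetical):

```python
def parse_extract_response(url: str, response: dict) -> str:
    """Mirrors the parsing logic of _get_url_content, minus the API call."""
    results = response.get('results', [])
    if not results:
        return "No content found for this URL. API response contains no results."
    content = results[0].get('raw_content', '')
    if not content:
        return "No raw content available in the API response"
    metadata = (
        f"URL: {url}\n"
        f"Content length: {len(content)} characters\n"
        "---\n\n"
    )
    return f"{metadata}{content}"

# Canned payload standing in for TavilyClient.extract()'s return value
sample = {"results": [{"url": "https://example.com", "raw_content": "Hello, world"}]}
print(parse_extract_response("https://example.com", sample))
```

Separating the parsing from the network call this way also makes the empty-results and empty-content branches easy to unit test.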
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool extracts content via Tavily API but lacks details on rate limits, authentication needs, error handling, or content limitations (e.g., text-only extraction, handling of dynamic pages). This leaves significant gaps for a tool that interacts with external APIs and web content.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
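Since the docstring doubles as the published tool description, one way to close that gap is in the docstring itself. A hypothetical revision; the stated limits and failure modes are illustrative placeholders, not documented Tavily behavior, and the body is omitted:

```python
def get_url_content(url: str) -> str:
    """Get the content from a specific URL using Tavily API.

    Read-only: performs a single outbound fetch via Tavily's hosted
    extractor and has no other side effects. Requires a valid Tavily
    API key and is subject to that key's rate limits. Returns plain
    extracted text, not rendered HTML; JavaScript-heavy pages may
    yield empty results. On failure, returns an error string rather
    than raising.

    Args:
        url (str): The URL to extract content from

    Returns:
        str: URL and length metadata followed by the extracted content
    """
    raise NotImplementedError  # description sketch only; body omitted
```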

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and concise, with zero wasted words. It front-loads the core purpose in the first sentence, followed by clear 'Args' and 'Returns' sections. Every sentence earns its place by directly contributing to understanding the tool's function and parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of web content extraction and the lack of annotations and output schema, the description is incomplete. It doesn't explain what 'extracted content' entails (e.g., full HTML, cleaned text, metadata), potential errors (e.g., invalid URLs, timeouts), or API-specific behaviors. For a tool with external dependencies and no structured output, more context is needed to ensure reliable use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds minimal semantic value beyond the input schema. It documents the single parameter 'url' as 'The URL to extract content from,' which matches the schema's 'Url' title. With 0% schema description coverage, the description compensates slightly by confirming the parameter's purpose, but it doesn't provide additional context like URL format requirements or examples. The baseline is 3 due to the single parameter, but it doesn't fully address the coverage gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get the content from a specific URL using Tavily API.' It specifies the verb ('Get'), resource ('content from a specific URL'), and method ('using Tavily API'). However, it doesn't explicitly differentiate from sibling tools like 'get_url_content_info' or 'search_web', which likely have related but distinct functions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools such as 'get_url_content_info' (which might return metadata) or 'search_web' (which might perform broader searches), leaving the agent without context for tool selection. Usage is implied only by the tool's name and basic purpose.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
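The "use X instead of Y when Z" pattern could be appended to the description. A hypothetical sketch, assuming the sibling tools named in this review exist with the semantics suggested here:

```python
# Hypothetical selection guidance to append to the tool description;
# the sibling tool semantics are assumptions, not documented behavior.
USAGE_GUIDANCE = (
    "Use get_url_content when you already have a specific URL and need its "
    "full text. Prefer search_web when you only have a query and no URL. "
    "Prefer get_url_content_info when page metadata is sufficient and the "
    "full body would waste tokens."
)
print(USAGE_GUIDANCE)
```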
