firecrawlsearchagent_firecrawl_scrape_url

Extract full content from a specific URL using this tool. It retrieves complete raw web page data for analysis or integration into workflows, with customizable wait time for page loading.

Instructions

Scrape full contents from a specific URL. This provides complete raw web contents from individual web pages.

Input Schema

| Name      | Required | Description                                        | Default |
| --------- | -------- | -------------------------------------------------- | ------- |
| url       | Yes      | The URL to scrape and analyze                      | —       |
| wait_time | No       | Time to wait for the page to load, in milliseconds | 5000    |
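
The schema above can be exercised with a minimal sketch. The helper function below is illustrative (it is not part of the server); only the parameter names and the 5000 ms default come from the schema:

```python
# Illustrative sketch: build an arguments dict matching the input schema above.
# Only "url" is required; "wait_time" falls back to the documented 5000 ms
# default when omitted. build_scrape_arguments is a hypothetical helper.
def build_scrape_arguments(url, wait_time=None):
    arguments = {"url": url}
    if wait_time is not None:
        arguments["wait_time"] = wait_time
    return arguments

args = build_scrape_arguments("https://example.com")
# The server side applies the default when wait_time is absent.
effective_wait = args.get("wait_time", 5000)
```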

Implementation Reference

  • The MCP call_tool handler that executes any registered tool, including 'firecrawlsearchagent_firecrawl_scrape_url', by proxying to the remote Mesh API after looking up the agent and tool details.
    async def call_tool(name: str, arguments: dict) -> List[types.TextContent]:
        """Call the specified tool with the given arguments."""
        try:
            if name not in self.tool_registry:
                raise ValueError(f"Unknown tool: {name}")
    
            tool_info = self.tool_registry[name]
            result = await self.execute_tool(
                agent_id=tool_info["agent_id"],
                tool_name=tool_info["tool_name"],
                tool_arguments=arguments,
            )
    
            # Convert result to TextContent
            return [types.TextContent(type="text", text=str(result))]
        except Exception as e:
            logger.error(f"Error calling tool {name}: {e}")
            raise ValueError(f"Failed to call tool {name}: {str(e)}") from e
  • Dynamic registration of tools from agent metadata into tool_registry, constructing tool names like 'firecrawlsearchagent_firecrawl_scrape_url' via f"{agent_id.lower()}_{tool_name}". Tools are enabled per agent based on config.json.
    for tool in agent_data.get("tools", []):
        if tool.get("type") == "function":
            function_data = tool.get("function", {})
            tool_name = function_data.get("name")
    
            if not tool_name:
                continue
    
            # Check if this tool is enabled based on configuration
            if not self.is_tool_enabled(agent_id, tool_name):
                logger.debug(
                    f"Skipping tool {tool_name} for agent {agent_id} (not in config)"  # noqa: E501
                )
                tools_skipped += 1
                continue
    
            # Create a unique tool ID
            tool_id = f"{agent_id.lower()}_{tool_name}"
    
            # Get parameters or create default schema
            parameters = function_data.get("parameters", {})
            if not parameters:
                parameters = {
                    "type": "object",
                    "properties": {},
                    "required": [],
                }
    
            # Store tool info
            tool_registry[tool_id] = {
                "agent_id": agent_id,
                "tool_name": tool_name,
                "description": function_data.get("description", ""),
                "parameters": parameters,
            }
            tools_enabled += 1
            logger.debug(f"Enabled tool: {tool_id}")
  • Provides the input schema for all tools, including the target tool, from the loaded agent metadata parameters.
    @app.list_tools()
    async def list_tools() -> List[types.Tool]:
        """List all available tools."""
        return [
            types.Tool(
                name=tool_id,
                description=tool_info["description"],
                inputSchema=tool_info["parameters"],
            )
            for tool_id, tool_info in self.tool_registry.items()
        ]
  • Helper function that actually calls the remote Mesh API to execute the tool on the agent, used by the handler.
    async def execute_tool(
        self, agent_id: str, tool_name: str, tool_arguments: Dict[str, Any]
    ) -> Dict[str, Any]:
        """Execute a tool on a mesh agent.
    
        Args:
            agent_id: ID of the agent to execute the tool on
            tool_name: Name of the tool to execute
            tool_arguments: Arguments to pass to the tool
    
        Returns:
            Tool execution result
    
        Raises:
            ToolExecutionError: If there's an error executing the tool
        """
        request_data = {
            "agent_id": agent_id,
            "input": {"tool": tool_name, "tool_arguments": tool_arguments},
        }
    
        # Add API key if available
        if Config.HEURIST_API_KEY:
            request_data["api_key"] = Config.HEURIST_API_KEY
    
        try:
            result = await call_mesh_api(
                "mesh_request", method="POST", json=request_data
            )
            return result.get("data", result)  # Prefer the 'data' field if it exists
        except MeshApiError as e:
            # Re-raise API errors with clearer context
            raise ToolExecutionError(str(e)) from e
        except Exception as e:
            logger.error(f"Error calling {agent_id} tool {tool_name}: {e}")
            raise ToolExecutionError(
                f"Failed to call {agent_id} tool {tool_name}: {str(e)}"
            ) from e
  • Low-level helper to make HTTP calls to the Mesh API endpoint, used by execute_tool.
    async def call_mesh_api(
    path: str, method: str = "GET", json: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        """Helper function to call the mesh API endpoint.
    
        Args:
            path: API path to call
            method: HTTP method to use
            json: Optional JSON payload
    
        Returns:
            API response as dictionary
    
        Raises:
            MeshApiError: If there's an error calling the API
        """
        async with aiohttp.ClientSession() as session:
            url = f"{Config.HEURIST_API_ENDPOINT}/{path}"
            try:
                headers = {}
                if Config.HEURIST_API_KEY:
                    headers["X-HEURIST-API-KEY"] = Config.HEURIST_API_KEY
    
                async with session.request(
                    method, url, json=json, headers=headers
                ) as response:
                    if response.status != 200:
                        error_text = await response.text()
                        raise MeshApiError(f"Mesh API error: {error_text}")
                    return await response.json()
            except aiohttp.ClientError as e:
                logger.error(f"Error calling mesh API: {e}")
                raise MeshApiError(f"Failed to connect to mesh API: {str(e)}") from e
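
End to end, execute_tool wraps the tool call in a Mesh API request body before POSTing it to the mesh_request endpoint. A sketch of that payload for this tool, assuming the shape shown in the execute_tool snippet (the URL value is a placeholder, and build_mesh_request is a hypothetical helper, not server code):

```python
# Sketch of the request body execute_tool builds, per the snippet above.
# build_mesh_request is illustrative; the url argument is placeholder data.
def build_mesh_request(agent_id, tool_name, tool_arguments, api_key=None):
    request_data = {
        "agent_id": agent_id,
        "input": {"tool": tool_name, "tool_arguments": tool_arguments},
    }
    if api_key:  # mirrors the Config.HEURIST_API_KEY check in execute_tool
        request_data["api_key"] = api_key
    return request_data

payload = build_mesh_request(
    "FirecrawlSearchAgent",
    "firecrawl_scrape_url",
    {"url": "https://example.com", "wait_time": 5000},
)
```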
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden but offers minimal behavioral insight. It mentions 'complete raw web contents' but doesn't cover potential limitations (e.g., authentication needs, rate limits, error handling, or what 'raw' entails). For a web scraping tool with zero annotation coverage, this is inadequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with two sentences that directly address the tool's function without any fluff. It's front-loaded with the core purpose and efficiently specifies the scope.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of web scraping (no annotations, no output schema), the description is insufficient. It lacks details on return format, error cases, performance expectations, or how it differs from sibling tools, leaving significant gaps for agent understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents both parameters. The description adds no parameter-specific information beyond what's in the schema, meeting the baseline score of 3 for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('scrape full contents') and resource ('from a specific URL'), with 'complete raw web contents from individual web pages' providing specific scope. However, it doesn't explicitly differentiate from sibling tools like 'firecrawlsearchagent_firecrawl_extract_web_data' or 'firecrawlsearchagent_firecrawl_web_search', which prevents a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools or contexts where other scraping/search tools might be more appropriate, leaving the agent without usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
