tool_deep_dive

Research topics by searching and scraping multiple sources to build aggregated reports. Provides structured web context for developer workflows.

Instructions

Research a topic from multiple sources.

Searches and scrapes multiple pages to build a report.

Args:
  • topic: Topic to research.
  • depth: Number of sources (1-10, default 3).

Returns: Aggregated research report.

Input Schema

| Name  | Required | Description | Default |
| ----- | -------- | ----------- | ------- |
| topic | Yes      |             |         |
| depth | No       |             |         |
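A call that satisfies this schema can be sketched as a plain dict. The helper below is hypothetical (not part of the server); its clamp mirrors the documented 1-10 range for `depth`, not the server's actual validation:

```python
def build_arguments(topic: str, depth: int = 3) -> dict:
    """Build an arguments payload for tool_deep_dive.

    Hypothetical helper: `topic` is required, `depth` is optional
    and clamped to the documented 1-10 range.
    """
    if not topic:
        raise ValueError("topic is required")
    return {"topic": topic, "depth": min(max(depth, 1), 10)}

args = build_arguments("Python async/await tutorial", depth=5)
```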

Output Schema

| Name   | Required | Description | Default |
| ------ | -------- | ----------- | ------- |
| result | Yes      |             |         |

Implementation Reference

  • The actual implementation of the tool_deep_dive functionality, which uses DDG search and Scraper tools to aggregate a research report.
```python
async def deep_dive(topic: str, depth: int = 3, *, parallel: bool = True) -> str:
    """Research a topic by searching and scraping multiple sources.

    Args:
        topic: Topic to research.
        depth: Number of sources to scrape (1-10).
        parallel: Whether to scrape sources in parallel (faster).

    Returns:
        Aggregated research report with sources.

    Example:
        >>> report = await deep_dive("Python async/await tutorial", depth=5)
    """
    import asyncio

    depth = min(max(depth, 1), 10)

    # Search for sources
    results = await _ddg.search(topic, limit=depth * 2)  # Get more to filter

    # Filter to unique domains for diversity
    from urllib.parse import urlparse

    seen_domains = set()
    filtered_results = []
    for r in results:
        domain = urlparse(r.url).netloc
        if domain not in seen_domains and len(filtered_results) < depth:
            seen_domains.add(domain)
            filtered_results.append(r)

    results = filtered_results

    if not results:
        return f"# Research: {topic}\n\nNo results found for this topic."

    # Build report header
    report_lines = [
        f"# Research: {topic}\n",
        f"*Analyzed {len(results)} sources*\n",
        "## Sources\n",
    ]

    # Add sources list
    for i, r in enumerate(results, 1):
        report_lines.append(f"{i}. [{r.title}]({r.url})")

    report_lines.append("\n## Content\n")

    # Scrape sources (parallel or sequential)
    async def fetch_source(i: int, r) -> tuple[int, str, str, str | None]:
        """Fetch a single source."""
        try:
            doc = await _scraper.fetch(r.url)
            # Extract content without header
            lines = doc.content.split("\n")
            start = 0
            for j, line in enumerate(lines):
                if line.startswith("> Source:"):
                    start = j + 1
                    break
            content = "\n".join(lines[start:]).strip()
            # Truncate if too long
            if len(content) > 3000:
                content = content[:3000] + "\n\n*[Content truncated...]*"
            return (i, r.title, r.url, content)
        except Exception:
            return (i, r.title, r.url, None)

    if parallel:
        # Fetch all sources concurrently
        tasks = [fetch_source(i, r) for i, r in enumerate(results, 1)]
        fetched = await asyncio.gather(*tasks, return_exceptions=False)
    else:
        # Fetch sequentially
        fetched = []
        for i, r in enumerate(results, 1):
            result = await fetch_source(i, r)
            fetched.append(result)

    # Build content sections
    successful = 0
    for i, title, url, content in fetched:
        report_lines.append(f"### Source {i}: {title}\n")
        report_lines.append(f"> {url}\n")

        if content:
            report_lines.append(content)
            successful += 1
        else:
            report_lines.append("*Failed to fetch content*")

        report_lines.append("\n")

    # Add summary footer
    report_lines.append(
        f"---\n\n*Successfully retrieved {successful}/{len(results)} sources*"
    )

    return "\n".join(report_lines)
```
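The unique-domain filter in the implementation can be exercised in isolation. `SearchResult` here is a hypothetical stand-in for whatever `_ddg.search` actually returns; only the `url` and `title` attributes are assumed:

```python
from typing import NamedTuple
from urllib.parse import urlparse


class SearchResult(NamedTuple):
    """Hypothetical stand-in for a DDG search hit."""
    title: str
    url: str


def filter_unique_domains(results: list[SearchResult], depth: int) -> list[SearchResult]:
    """Keep at most `depth` results, one per domain, preserving order."""
    seen_domains: set[str] = set()
    filtered: list[SearchResult] = []
    for r in results:
        domain = urlparse(r.url).netloc
        if domain not in seen_domains and len(filtered) < depth:
            seen_domains.add(domain)
            filtered.append(r)
    return filtered


hits = [
    SearchResult("A", "https://docs.python.org/3/library/asyncio.html"),
    SearchResult("B", "https://docs.python.org/3/library/asyncio-task.html"),
    SearchResult("C", "https://realpython.com/async-io-python/"),
    SearchResult("D", "https://superfastpython.com/asyncio/"),
]
diverse = filter_unique_domains(hits, depth=3)  # one hit per domain: A, C, D
```

The second `docs.python.org` hit is dropped even though the depth budget is not yet exhausted, which is why the implementation over-fetches (`limit=depth * 2`) before filtering.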
  • Registration of the tool_deep_dive tool using the @mcp.tool() decorator.
```python
@mcp.tool()
async def tool_deep_dive(topic: str, depth: int = 3) -> str:
    """Research a topic from multiple sources.

    Searches and scrapes multiple pages to build a report.

    Args:
        topic: Topic to research.
        depth: Number of sources (1-10, default 3).

    Returns:
        Aggregated research report.
    """
    return await deep_dive(topic, depth)
```
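The content-extraction step inside `fetch_source` (dropping everything up to a `> Source:` header line, then truncating at 3,000 characters) can also be sketched standalone. This mirrors the logic shown above but is not the server's code:

```python
def extract_content(page: str, limit: int = 3000) -> str:
    """Strip a leading '> Source:' header block and truncate long pages,
    mirroring the extraction logic inside fetch_source."""
    lines = page.split("\n")
    start = 0
    for j, line in enumerate(lines):
        if line.startswith("> Source:"):
            start = j + 1
            break
    content = "\n".join(lines[start:]).strip()
    if len(content) > limit:
        content = content[:limit] + "\n\n*[Content truncated...]*"
    return content


doc = "# Page Title\n> Source: https://example.com\nBody text here."
```

Pages without a `> Source:` line pass through untouched, since `start` stays at 0.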
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'searches and scrapes multiple pages' and 'builds a report,' which gives some context, but lacks details on permissions, rate limits, error handling, or what 'scrapes' entails (e.g., potential blocking or ethical considerations). For a tool with no annotations, this is insufficient to fully understand its behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, starting with a high-level purpose followed by details. Sentences are efficient, with no wasted words. The structure could be slightly improved by integrating the 'Args' and 'Returns' sections more seamlessly into the flow, but it remains clear and concise overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (research and scraping from multiple sources), the lack of annotations, and the presence of an output schema (which documents return values separately), the description is moderately complete. It covers the basic operation and parameters but lacks depth on behavioral aspects and usage context. Because an output schema exists, the description does not need to explain return values, but more guidance on when to use the tool would enhance completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It adds meaning by explaining 'topic: Topic to research' and 'depth: Number of sources (1-10, default 3),' which clarifies the parameters beyond the bare schema. However, it doesn't provide examples, constraints beyond the range, or details on how 'depth' affects the research process, leaving some gaps in understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Research a topic from multiple sources' and 'Searches and scrapes multiple pages to build a report.' It specifies the verb (research/search/scrape) and resource (multiple sources/pages). However, it doesn't explicitly differentiate from siblings like tool_search_web or tool_scrape_url, which appear to handle similar operations individually.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives is provided. The description implies usage for aggregated research, but it doesn't specify scenarios, prerequisites, or exclusions compared to siblings such as tool_search_web (for searching) or tool_summarize_page (for summarizing). This leaves the agent without clear direction on tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/Y4NN777/devlens-mcp'
```
