search_brave_with_summary

Search the web using Brave Search API and get summarized results to quickly find relevant information.

Instructions

Search the web using Brave Search API

Input Schema

| Name  | Required | Description | Default |
|-------|----------|-------------|---------|
| query | Yes      |             |         |

Implementation Reference

  • The handler function for the 'search_brave_with_summary' tool, registered via @mcp.tool(). It takes a query string and delegates execution to the internal _do_search_with_summary helper.
    def search_brave_with_summary(query: str) -> str:
        """Search the web using Brave Search API """
        return _do_search_with_summary(query)
  • Core implementation logic for the search tool. Performs web search via Brave API, attempts to use built-in summarizer or generates one by fetching and extracting content from top results, then formats results with summary and links.
    def _do_search_with_summary(query: str) -> str:
        """Internal function to handle the search logic with summary support"""
        try:
            query = query.encode('utf-8').decode('utf-8')  # no-op round-trip; only validates UTF-8 encodability
            url = "https://api.search.brave.com/res/v1/web/search"
            
            headers = {
                "Accept": "application/json",
                "X-Subscription-Token": API_KEY
            }
            
            params = {
                "q": query,
                "count": 5,
                "result_filter": "web",
                "enable_summarizer": True,
                "format": "json"
            }
            
            response = requests.get(url, headers=headers, params=params)
            response.raise_for_status()
            data = response.json()
            
            logger.debug("API Response Structure:")
            logger.debug(f"Response Keys: {list(data.keys())}")
            
            # Process the search results
            summary_text = ""
            search_results = []
            
            # Pull out the web search results
            if 'web' in data and 'results' in data['web']:
                results = data['web']['results']
                
                # Get the summary
                if 'summarizer' in data:
                    logger.debug("Found official summarizer data")
                    summary = data.get('summarizer', {})
                    summary_text = summary.get('text', '')
                else:
                    logger.debug("No summarizer found, generating summary from top results")
                    # Use the content of the top two results as the summary
                    try:
                        summaries = []
                        for result in results[:2]:  # only process the top two results
                            url = result.get('url')
                            if url:
                                logger.debug(f"Fetching content from: {url}")
                                content = _get_url_content_direct(url)
                                # Extract the text content from the HTML
                                raw_content = content.split('---\n\n')[-1]
                                text_content = _extract_text_from_html(raw_content)
                                if text_content:
                                    # Add title and source information
                                    title = result.get('title', 'No title')
                                    date = result.get('age', '') or result.get('published_time', '')
                                    summaries.append(f"### {title}")
                                    if date:
                                        summaries.append(f"Published: {date}")
                                    summaries.append(text_content)
                        
                        if summaries:
                            summary_text = "\n\n".join([
                                "Generated summary from top results:",
                                *summaries
                            ])
                            logger.debug("Successfully generated summary from content")
                        else:
                            summary_text = results[0].get('description', '')
                    except Exception as e:
                        logger.error(f"Error generating summary from content: {str(e)}")
                        summary_text = results[0].get('description', '')
                
                # Format the search results for display
                for result in results:
                    title = result.get('title', 'No title').encode('utf-8').decode('utf-8')
                    url = result.get('url', 'No URL')
                    description = result.get('description', 'No description').encode('utf-8').decode('utf-8')
                    search_results.append(f"- {title}\n  URL: {url}\n  Description: {description}\n")
            
            # Assemble the output
            output = []
            if summary_text:
                output.append(f"Summary:\n{summary_text}\n")
            if search_results:
                output.append("Search Results:\n" + "\n".join(search_results))
            
            logger.debug(f"Has summary: {bool(summary_text)}")
            logger.debug(f"Number of results: {len(search_results)}")
            
            return "\n".join(output) if output else "No results found for your query."
            
        except Exception as e:
            logger.error(f"Search error: {str(e)}")
            logger.exception("Detailed error trace:")
            return f"Error performing search: {str(e)}"
  • Helper function to extract meaningful text from HTML content, removing scripts/styles/etc., prioritizing article/main content, cleaning and truncating to ~1000 chars. Used for generating summaries from fetched pages.
    def _extract_text_from_html(html_content: str) -> str:
        """从HTML内容中提取有意义的文本"""
        try:
            from bs4 import BeautifulSoup
            soup = BeautifulSoup(html_content, 'html.parser')
            
            # Remove unwanted elements; '.advertisement' is a CSS class
            # selector and must go through select(), not the tag-name list
            for element in soup(['script', 'style', 'header', 'footer', 'nav', 'aside', 'iframe']):
                element.decompose()
            for element in soup.select('.ad, .advertisement'):
                element.decompose()
            
            # Prefer the main article content
            article = soup.find('article')
            if article:
                content = article
            else:
                # Try common main-content containers; class/id selectors
                # require select_one() rather than find()
                content = soup.find('main') or soup.select_one('.content, #content, .post-content, .article-content')
                if not content:
                    content = soup
            
            # Get the text
            text = content.get_text(separator='\n')
            
            # Clean up the text
            lines = []
            for line in text.split('\n'):
                line = line.strip()
                # Skip empty and very short lines
                if line and len(line) > 30:
                    lines.append(line)
            
            # Join the text, keeping it within 1000 characters
            cleaned_text = ' '.join(lines)
            if len(cleaned_text) > 1000:
                # Try to truncate at a sentence boundary
                end_pos = cleaned_text.rfind('. ', 0, 1000)
                if end_pos > 0:
                    cleaned_text = cleaned_text[:end_pos + 1]
                else:
                    cleaned_text = cleaned_text[:1000]
            
            return cleaned_text
            
        except Exception as e:
            logger.error(f"Error extracting text from HTML: {str(e)}")
            # If the HTML cannot be parsed, return part of the raw content
            text = html_content.replace('<', ' <').replace('>', '> ').split()
            return ' '.join(text)[:500]
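The sentence-boundary truncation above is duplicated in both helpers and could live in one small function. A minimal sketch, with an illustrative name and the same 1000-character default as the code:

```python
def truncate_at_sentence(text: str, limit: int = 1000) -> str:
    """Cut text to at most `limit` characters, preferring to break
    at the last '. ' sentence boundary before the limit."""
    if len(text) <= limit:
        return text
    end_pos = text.rfind('. ', 0, limit)
    # Fall back to a hard cut when no sentence boundary is found
    return text[:end_pos + 1] if end_pos > 0 else text[:limit]
```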
  • Helper function to directly fetch webpage content via requests, extract main text using BeautifulSoup similar to _extract_text_from_html, add metadata, truncate. Called when no API summarizer available.
    def _get_url_content_direct(url: str) -> str:
        """Internal function to get content directly using requests"""
        try:
            logger.debug(f"Directly fetching content from URL: {url}")
            response = requests.get(url, timeout=10, headers={
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
            })
            response.raise_for_status()
            
            # Fall back to the detected encoding when the server
            # does not declare a charset in the Content-Type header
            if 'charset' not in response.headers.get('content-type', '').lower():
                response.encoding = response.apparent_encoding
                
            try:
                from bs4 import BeautifulSoup
                soup = BeautifulSoup(response.text, 'html.parser')
                
                # Remove unwanted elements; '.advertisement' is a CSS class
                # selector and must go through select(), not the tag-name list
                for element in soup(['script', 'style', 'header', 'footer', 'nav', 'aside', 'iframe']):
                    element.decompose()
                for element in soup.select('.ad, .advertisement'):
                    element.decompose()
                
                # Try to locate the main content area
                main_content = None
                possible_content_elements = [
                    soup.find('article'),
                    soup.find('main'),
                    soup.find(class_='content'),
                    soup.find(id='content'),
                    soup.find(class_='post-content'),
                    soup.find(class_='article-content'),
                    soup.find(class_='entry-content'),
                    soup.find(class_='main-content'),
                    soup.select_one('div[class*="content"]'),  # any div whose class contains "content"
                ]
                
                for element in possible_content_elements:
                    if element:
                        main_content = element
                        break
                
                if not main_content:
                    main_content = soup
                
                text = main_content.get_text(separator='\n')
                
                lines = []
                for line in text.split('\n'):
                    line = line.strip()
                    if line and len(line) > 30:
                        lines.append(line)
                
                cleaned_text = ' '.join(lines)
                if len(cleaned_text) > 1000:
                    end_pos = cleaned_text.rfind('. ', 0, 1000)
                    if end_pos > 0:
                        cleaned_text = cleaned_text[:end_pos + 1]
                    else:
                        cleaned_text = cleaned_text[:1000]
                
                metadata = f"URL: {url}\n"
                metadata += f"Content Length: {len(response.text)} characters\n"
                metadata += f"Content Type: {response.headers.get('content-type', 'Unknown')}\n"
                metadata += "---\n\n"
                
                return f"{metadata}{cleaned_text}"
                
            except Exception as e:
                logger.error(f"Error extracting text from HTML: {str(e)}")
                return f"Error extracting text: {str(e)}"
            
        except Exception as e:
            logger.error(f"Error fetching URL content directly: {str(e)}")
            return f"Error getting content: {str(e)}"
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It only states the basic action without detailing traits such as rate limits, authentication needs, response format, or whether it's read-only or mutative. For a web search tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with zero waste, clearly stating the tool's function without unnecessary words. It is appropriately sized and front-loaded, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (web search with potential for rich outputs), lack of annotations, no output schema, and low schema coverage, the description is incomplete. It doesn't explain return values, error handling, or behavioral nuances, leaving the agent with insufficient context for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 1 parameter with 0% description coverage, meaning the schema provides no details about the 'query' parameter. The description adds no semantic information beyond implying a search query is needed, failing to compensate for the schema's lack of documentation. This results in inadequate parameter guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the action ('Search the web') and the resource ('using Brave Search API'), which conveys the basic purpose. However, it does not distinguish this tool from siblings like 'brave_search_summary' or 'search_news': it never mentions unique features such as summary generation or a web-versus-news focus. This lack of differentiation keeps it at a minimally viable level.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description offers no guidance on when to use this tool versus alternatives like 'brave_search_summary' or 'search_news'. It doesn't mention any context, prerequisites, or exclusions, leaving the agent without direction on tool selection. This absence of usage information results in a low score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
