search
Search Baidu and return formatted results. Optional parameters cap the number of results and enable deep mode, which also fetches the web content behind each result.
Instructions
Search Baidu and return formatted results.
Args:
query: The search query string
max_results: Maximum number of results to return (default: 6)
deep_mode: Deep search the web content (default: False)
ctx: MCP context for logging
Input Schema
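| Name | Required | Description | Default |
|---|---|---|---|
| deep_mode | No | Deep search the web content | False |
| max_results | No | Maximum number of results to return | 6 |
| query | Yes | The search query string | |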
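As a rough illustration of how these parameters combine, the sketch below assembles an arguments payload for the `search` tool. `build_search_arguments` is a hypothetical helper, not part of the server, and its validation rules (non-empty query, at least one result) are assumptions made for the example. Only the parameter names and defaults come from the schema above.

```python
from typing import Any, Dict


def build_search_arguments(
    query: str, max_results: int = 6, deep_mode: bool = False
) -> Dict[str, Any]:
    """Hypothetical helper: assemble arguments for the `search` tool.

    Defaults mirror the documented ones (max_results=6, deep_mode=False).
    The validation below is an assumption, not documented server behavior.
    """
    if not query or not query.strip():
        raise ValueError("query is required and must be non-empty")
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    return {"query": query, "max_results": max_results, "deep_mode": deep_mode}


# Only `query` is required; the other fields fall back to their defaults.
default_args = build_search_arguments("Python asyncio tutorial")

# Ask for fewer results, but fetch the content behind each one.
deep_args = build_search_arguments("Python asyncio tutorial", max_results=3, deep_mode=True)
```

How the payload is actually sent depends on the MCP client in use.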
Implementation Reference
- src/baidu_mcp_server/server.py:861-878 (handler) Primary MCP tool handler for `search`. Registers the tool, runs the Baidu search via `BaiduSearcher`, and formats the results as a string for the LLM. Any exception is printed to stderr and returned to the caller as an error message string.

      @mcp.tool()
      async def search(query: str, ctx: Context, max_results: int = 6, deep_mode: bool = False) -> str:
          """
          Search Baidu and return formatted results.

          Args:
              query: The search query string
              max_results: Maximum number of results to return (default: 6)
              deep_mode: Deep search the web content (default: False)
              ctx: MCP context for logging
          """
          try:
              results = await searcher.search(query, ctx, max_results, deep_mode)
              return searcher.format_results_for_llm(results)
          except Exception as e:
              traceback.print_exc(file=sys.stderr)
              return f"An error occurred while searching: {str(e)}"

- Dataclass schema for individual search results returned by the internal search logic.

      @dataclass
      class SearchResult:
          title: str
          link: str
          snippet: str
          position: int

- Public `search` method of the `BaiduSearcher` class; it delegates directly to `_perform_search`. Its own `max_results` default is 10, but the tool handler always passes the value explicitly, so the tool's default of 6 applies.

      @handle_errors
      async def search(
          self,
          query: str,
          ctx: Context,
          max_results: int = 10,
          deep_mode: bool = False,
          max_retries: int = 2,
      ) -> List[SearchResult]:
          return await self._perform_search(
              query=query,
              max_results=max_results,
              deep_mode=deep_mode,
              max_retries=max_retries,
              ctx=ctx,
          )

- Main helper implementing the search logic: fetches Baidu pages using browser or curl_cffi, parses results, optionally deep-fetches content.

      async def _perform_search(
          self,
          query: str,
          max_results: int,
          deep_mode: bool,
          max_retries: int,
          ctx: Optional[Context] = None,
      ) -> List[SearchResult]:
          await self._log_ctx(ctx, "info", f"Searching Baidu for: {query}")

          params = {"word": query}
          results: List[Dict[str, Any]] = []
          seen_urls: Set[str] = set()
          page = 0

          user_agent = self.HEADERS.get("User-Agent")
          extra_headers = {
              key: value for key, value in self.HEADERS.items() if key.lower() != "user-agent"
          }
          _, browser_context = await _ensure_browser(
              user_agent=user_agent, extra_headers=extra_headers or None
          )

          if browser_context is None:
              if CurlAsyncSession is None:
                  await self._log_ctx(
                      ctx,
                      "error",
                      "Playwright is unavailable and curl_cffi is not installed; unable to execute search.",
                  )
                  return []
              await self._log_ctx(
                  ctx,
                  "warning",
                  "Playwright unavailable; using curl_cffi fallback HTTP client.",
              )

          while len(results) < max_results:
              params["pn"] = page * 10
              page += 1

              html = await self._request_with_retries(browser_context, params, max_retries)
              if html is None:
                  await self._log_ctx(
                      ctx, "error", "Failed to fetch search results from Baidu"
                  )
                  break

              page_results = self._parse_search_page(html, seen_urls)
              if not page_results:
                  break
              results.extend(page_results)

          limited_results = results[:max_results]

          if deep_mode and limited_results:
              tasks = [
                  self.process_result(result, idx + 1)
                  for idx, result in enumerate(limited_results)
              ]
              enriched_results = await asyncio.gather(*tasks, return_exceptions=True)
              search_results: List[SearchResult] = []
              for item in enriched_results:
                  if isinstance(item, Exception):
                      logger.error("Deep fetch failed for a result: %s", item, exc_info=True)
                      continue
                  if isinstance(item, SearchResult):
                      search_results.append(item)
          else:
              search_results = [
                  self._create_search_result(result, idx + 1)
                  for idx, result in enumerate(limited_results)
              ]

          await self._log_ctx(ctx, "info", f"Successfully found {len(search_results)} results")
          return search_results