deep_research_google
Aggregate and score Google search results from multiple related queries. Remove duplicates and return top relevant content for comprehensive research.
Instructions
Performs deep research across multiple search terms using only Google. It aggregates results from multiple Google searches, scores them by relevance, and returns the most relevant content with duplicates removed.
Args:
- `search_terms` (List[str]): List of search terms to research. The LLM should provide multiple related search queries for comprehensive coverage.
- `num_results_per_term` (int): Number of results to fetch per search term.
- `top_k_per_term` (int): Number of top-scored results to keep per search term.
- `include_urls` (bool): Whether to include URLs in the results.
Returns: Dict containing aggregated research results from all search terms (Google only), with duplicates removed.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| search_terms | Yes | List of related search queries to research | |
| num_results_per_term | No | Number of results to fetch per search term | 10 |
| top_k_per_term | No | Number of top-scored results to keep per search term | 3 |
| include_urls | No | Whether to include URLs in the results | True |
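A hypothetical argument payload for a `deep_research_google` call. The parameter names and defaults come from the schema above; the example query strings are invented:

```python
# Hypothetical arguments for deep_research_google
# (names/defaults from the input schema; query strings are made up).
args = {
    "search_terms": [
        "retrieval augmented generation",
        "RAG evaluation metrics",
    ],
    "num_results_per_term": 10,  # default
    "top_k_per_term": 3,         # default
    "include_urls": True,        # default
}
```

Providing several related queries (rather than one broad query) is what lets the aggregation and scoring steps produce comprehensive coverage.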
Output Schema
The tool returns a Dict with the following fields (taken from the implementation's return statement):

| Field | Description |
|---|---|
| search_terms | The search terms that were queried |
| backends | The backends used (always ["google"] for this tool) |
| search_summary | Per-term, per-backend counts of kept results |
| total_unique_results | Number of unique results after URL deduplication |
| content | Fetched page content as Markdown |
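An illustrative sketch of the returned Dict. The field names come from the implementation's return statement; the values here are invented:

```python
# Illustrative return value shape (field names from the implementation;
# all values below are made up for the example).
result = {
    "search_terms": ["example query"],
    "backends": ["google"],
    "search_summary": {"example query": {"google": 3}},
    "total_unique_results": 3,
    "content": "...markdown content fetched from the result pages...",
}
```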
Implementation Reference
- src/mcp_local_rag/main.py:238-262 (handler): MCP tool handler for 'deep_research_google'. Delegates to _deep_research_internal with backends=["google"]. Registered via the @mcp.tool() decorator.
```python
@mcp.tool()
def deep_research_google(search_terms: List[str], num_results_per_term: int = 10,
                         top_k_per_term: int = 3, include_urls: bool = True) -> Dict:
    """
    Perform deep research across multiple search terms using ONLY Google.
    Aggregates results from multiple Google searches, scores them by relevance,
    and returns the most relevant content with duplicates removed.

    Args:
        search_terms (List[str]): List of search terms to research. The LLM should
            provide multiple related search queries for comprehensive coverage.
        num_results_per_term (int): Number of results to fetch per search term.
        top_k_per_term (int): Number of top scored results to keep per search term.
        include_urls (bool): Whether to include URLs in the results.

    Returns:
        Dict containing aggregated research results from all search terms
        (Google only), with duplicates removed.
    """
    return _deep_research_internal(
        search_terms=search_terms,
        backends=["google"],
        num_results_per_term=num_results_per_term,
        top_k_per_term=top_k_per_term,
        include_urls=include_urls,
    )
```

- src/mcp_local_rag/main.py:112-177 (helper): Internal helper function that executes the actual deep search logic. Used by deep_research_google, deep_research_ddgs, and deep_research. Accepts a backends list and performs multi-term web search via ddgs, scores results with embeddings, deduplicates, and fetches content.
```python
def _deep_research_internal(search_terms: List[str], backends: List[str],
                            num_results_per_term: int = 5, top_k_per_term: int = 3,
                            include_urls: bool = True) -> Dict:
    """
    Internal function to perform deep research across multiple search terms
    with the given backend engines in ddgs.

    Args:
        search_terms (List[str]): List of search terms to perform deep research on.
        backends (List[str]): List of search backends to use.
        num_results_per_term (int): Number of results to fetch per search term per engine.
        top_k_per_term (int): Number of top-scored results to keep per search term per engine.
        include_urls (bool): Whether to include URLs in the results.

    Returns:
        Dict containing aggregated research results from all search terms and engines.
    """
    # lazy load
    from ddgs import DDGS
    from .utils.fetch import fetch_all_content
    from .utils.tools import sort_by_score

    ddgs = DDGS()
    all_results = []
    search_summary = {}

    # search each term on all specified backends
    for term in search_terms:
        search_summary[term] = {backend: 0 for backend in backends}
        for backend in backends:
            try:
                if backend == "duckduckgo":
                    results = ddgs.text(term, max_results=num_results_per_term)
                else:
                    results = ddgs.text(term, max_results=num_results_per_term, backend=backend)
                if results:
                    scored_results = sort_by_score(add_score_to_dict(term, results))
                    top_results = scored_results[0:top_k_per_term]
                    all_results.extend(top_results)
                    search_summary[term][backend] = len(top_results)
            except Exception as e:
                print(f"Error searching {backend} for '{term}': {e}")

    # remove duplicates and keep high scores
    seen_urls = {}
    unique_results = []
    for result in all_results:
        url = result.get('href', '')
        if url:
            # Keep the result with the highest score for duplicate URLs
            if url not in seen_urls or result.get('score', 0) > seen_urls[url].get('score', 0):
                if url in seen_urls:
                    # Replace lower scored duplicate
                    unique_results.remove(seen_urls[url])
                seen_urls[url] = result
                unique_results.append(result)

    # fetch content from final list of results
    md_content = fetch_all_content(unique_results, include_urls)

    return {
        "search_terms": search_terms,
        "backends": backends,
        "search_summary": search_summary,
        "total_unique_results": len(unique_results),
        "content": md_content
    }
```

- src/mcp_local_rag/main.py:237-239 (schema): Type annotations define the input schema: search_terms (List[str]), num_results_per_term (int), top_k_per_term (int), include_urls (bool). Returns Dict.
```python
@mcp.tool()
def deep_research_google(search_terms: List[str], num_results_per_term: int = 10,
                         top_k_per_term: int = 3, include_urls: bool = True) -> Dict:
```

- src/mcp_local_rag/main.py:238-238 (registration): The tool is registered via the @mcp.tool() decorator from the FastMCP framework, making 'deep_research_google' available as an MCP tool.
```python
@mcp.tool()
```

- Helper utility sort_by_score: sorts results by their cosine similarity score in descending order.
```python
def sort_by_score(results: List[Dict]) -> List[Dict]:
    """Sort results by similarity score."""
    return sorted(results, key=lambda x: x['score'], reverse=True)
```
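The deduplication step in `_deep_research_internal` keeps only the highest-scoring result for each URL, and drops results without an `href`. A minimal standalone sketch of that step, using invented sample data:

```python
# Sketch of the URL-deduplication step from _deep_research_internal:
# for duplicate URLs, only the result with the highest score survives.
def dedupe_keep_best(results):
    seen = {}     # url -> best result seen so far
    unique = []   # deduplicated results, in first-seen order
    for r in results:
        url = r.get('href', '')
        if not url:
            continue  # results without a URL are dropped
        if url not in seen or r.get('score', 0) > seen[url].get('score', 0):
            if url in seen:
                # replace the lower-scored duplicate already kept
                unique.remove(seen[url])
            seen[url] = r
            unique.append(r)
    return unique

# Invented sample hits: two duplicates of one URL plus one unique URL.
hits = [
    {'href': 'https://a.example', 'score': 0.4},
    {'href': 'https://a.example', 'score': 0.9},
    {'href': 'https://b.example', 'score': 0.7},
]
deduped = dedupe_keep_best(hits)
```

Note that because the lower-scored duplicate is removed and the winner re-appended, a replaced URL moves to the position where its best-scoring copy was encountered.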