Search Web Context
search_web_context
Retrieves summarized educational web content from trusted CS education sites to supplement CAIE exam explanations. Use for conceptual depth beyond mark schemes.
Instructions
Get educational web content to supplement CAIE exam explanations.
Returns summarized content from trusted CS education sites. Use when the student needs conceptual explanations beyond what mark schemes provide.
Web content is supplementary — always prioritize official CAIE mark scheme points first. Returns: source title, URL, domain, and key educational content (max 800 chars per source).
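As a minimal sketch of the "mark scheme first" guidance, a caller might assemble an explanation like this. The function `compose_explanation` and the sample data are hypothetical and not part of the server; only the result keys (`title`, `domain`, `key_content`) follow the fields the tool is documented to return.

```python
from typing import Any


def compose_explanation(
    mark_scheme_points: list[str],
    web_results: list[dict[str, Any]],
) -> str:
    """Hypothetical helper: mark scheme points first, web context second."""
    lines = ["Mark scheme points:"]
    lines += [f"- {point}" for point in mark_scheme_points]
    if web_results:
        lines.append("Supplementary web context:")
        for r in web_results:
            # Keys mirror the fields the tool is documented to return.
            lines.append(f"- {r['title']} ({r['domain']}): {r['key_content']}")
    return "\n".join(lines)


# Made-up sample data for illustration only.
sample_results = [
    {
        "title": "Stacks",
        "url": "https://example.org/stacks",
        "domain": "example.org",
        "key_content": "A stack adds and removes items at the same end.",
    }
]
print(compose_explanation(["A stack is a LIFO structure"], sample_results))
```

Keeping the mark scheme points in their own leading section makes it harder for supplementary material to displace the points that actually earn marks.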
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
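| query | Yes | Search query string. | |
| subject | No | Optional subject filter; sent upstream only when set. | DEFAULT_SUBJECT |
| num_results | No | Number of results to request; clamped to 1-10. | 5 |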
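The way these inputs become upstream request parameters can be sketched as follows, mirroring the handler shown in the implementation reference. `build_params` is a hypothetical standalone name, the default of `None` for `subject` stands in for the server's `DEFAULT_SUBJECT` (whose value is not shown on this page), and `"9618"` is a made-up subject value.

```python
from typing import Any, Optional


def build_params(
    query: str,
    subject: Optional[str] = None,  # assumption: stands in for DEFAULT_SUBJECT
    num_results: int = 5,
) -> dict[str, Any]:
    # Clamp num_results into the 1-10 range, as the handler does.
    params: dict[str, Any] = {
        "q": query,
        "num_results": max(1, min(num_results, 10)),
    }
    # The subject filter is only included when it is set.
    if subject:
        params["subject"] = subject
    return params


print(build_params("binary search", num_results=50))
# {'q': 'binary search', 'num_results': 10}
print(build_params("stacks", subject="9618", num_results=0))
# {'q': 'stacks', 'num_results': 1, 'subject': '9618'}
```

Out-of-range values are clamped silently rather than rejected, so a caller asking for 50 results simply receives at most 10.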
Implementation Reference
- mcp_server.py:1291-1295 (registration): Registration of the 'search_web_context' tool with the FastMCP decorator. Title: 'Search Web Context'; tags: 'search' and 'enhanced'; annotated as read-only and idempotent.

      @mcp.tool(
          title="Search Web Context",
          tags={"search", "enhanced"},
          annotations={"readOnlyHint": True, "idempotentHint": True},
      )

- mcp_server.py:1296-1357 (handler): Handler function for 'search_web_context'. Takes a query, an optional subject filter, and num_results (clamped to 1-10). Calls the upstream API at /search/web-context, de-duplicates results by domain, extracts educational content (up to 800 chars per source), and returns a ToolResult with both a text summary and a structured payload.

      def search_web_context(
          query: str,
          subject: Optional[str] = DEFAULT_SUBJECT,
          num_results: int = 5,
      ) -> ToolResult:
          """Get educational web content to supplement CAIE exam explanations.

          Returns summarized content from trusted CS education sites. Use when the
          student needs conceptual explanations beyond what mark schemes provide.

          Web content is supplementary; always prioritize official CAIE mark scheme
          points first. Returns: source title, URL, domain, and key educational
          content (max 800 chars per source).
          """
          params: dict[str, Any] = {
              "q": query,
              "num_results": max(1, min(num_results, 10)),
          }
          if subject:
              params["subject"] = subject

          try:
              data = _api_get("/search/web-context", params)
          except Exception as exc:
              logger.error("search_web_context failed: %s", exc)
              error_payload = _error_from_exception(exc, "/search/web-context")
              raise ToolError(error_payload.get("error", {}).get("message", "Web search failed."))

          results = data.get("results", []) if isinstance(data, dict) else []

          # De-duplicate by domain: keep only the most relevant per domain
          seen_domains: set[str] = set()
          curated_results: list[dict[str, Any]] = []
          for r in results:
              if not isinstance(r, dict):
                  continue
              domain = r.get("domain", "")
              if domain in seen_domains:
                  continue
              seen_domains.add(domain)
              key_content = _extract_educational_content(r.get("content", ""), max_chars=800)
              curated_results.append({
                  "title": r.get("title", ""),
                  "url": r.get("url", ""),
                  "domain": domain,
                  "key_content": key_content,
              })

          # Build concise text summary
          content_lines = [f"Web context for '{query}' from {len(curated_results)} sources:"]
          for i, r in enumerate(curated_results, 1):
              content_lines.append(f"\n[{i}] {r['title']} ({r['domain']})")
              content_lines.append(r["key_content"])
          if not curated_results:
              content_lines.append("No web content found for this query.")

          payload = {
              "ok": True,
              "query": query,
              "returned": len(curated_results),
              "results": curated_results,
          }
          return ToolResult(content="\n".join(content_lines), structured_content=payload)

- mcp_server.py:1146-1191 (helper): Helper function used by search_web_context to strip navigation, ads, and page chrome from web page scrapes, keeping only educational content. Truncates at a sentence boundary (max 800 chars).

      def _extract_educational_content(content: str, max_chars: int = 800) -> str:
          """Extract educational content from a web page scrape, stripping nav/ads/chrome."""
          if not content:
              return ""

          # Remove common navigation/menu patterns
          nav_patterns = [
              r"(?i)(?:^|\n)\s*\*\s*(?:Courses|Tutorials|Interview Prep|Sign In|DSA Python|"
              r"Interview Corner|Puzzles|Aptitude|System Design|Must Do|Quizzes|"
              r"Interview Questions|DSA Tutorial|Data Types|Examples|Practice|"
              r"Data Science|NumPy|Pandas|Django|Flask|Projects|Advanced DSA)\s*(?:\n|$)",
              r"(?i)Open In App\s*\n",
              r"(?i)Jump to content\s*\n",
              r"\*\*\s*\n",  # Standalone bold markers
              r"(?i)^\s*\d+\s+languages?\s*$",  # Language count lines
              r"(?i)^\s*\*\s*(?:Español|فارسی|한국어|Italiano|עברית)\s*$",  # Other language links
              r"(?i)Article Tags:.*$",
              r"(?i)Comment\s*$",
              r"(?i)Improve\s*$",
              r"(?i)\d+\s*Likes?\s*$",
              r"(?i)Like\s*$",
              r"(?i)Report\s*$",
              r"(?i)Suggest changes\s*$",
              r"(?i)Last Updated\s*:\s*\d+\s+\w+,?\s*\d{4}",
              r"(?i)geeksforgeeks\s*\n",
          ]
          cleaned = content
          for pattern in nav_patterns:
              cleaned = re.sub(pattern, "\n", cleaned, flags=re.MULTILINE)

          # Collapse multiple blank lines
          cleaned = re.sub(r"\n{3,}", "\n\n", cleaned).strip()

          # Truncate to max_chars at a sentence boundary
          if len(cleaned) > max_chars:
              # Try to cut at sentence boundary
              truncated = cleaned[:max_chars]
              last_period = truncated.rfind(".")
              last_newline = truncated.rfind("\n")
              cut_at = max(last_period, last_newline)
              if cut_at > max_chars * 0.5:
                  cleaned = truncated[:cut_at + 1].rstrip()
              else:
                  cleaned = truncated.rstrip() + "..."

          return cleaned

- mcp_server.py:1296-1308 (schema): Input schema/signature for search_web_context: accepts query (str), optional subject (str), and num_results (int, default 5, clamped to 1-10). Returns ToolResult.

      def search_web_context(
          query: str,
          subject: Optional[str] = DEFAULT_SUBJECT,
          num_results: int = 5,
      ) -> ToolResult:
          """Get educational web content to supplement CAIE exam explanations.

          Returns summarized content from trusted CS education sites. Use when the
          student needs conceptual explanations beyond what mark schemes provide.

          Web content is supplementary; always prioritize official CAIE mark scheme
          points first. Returns: source title, URL, domain, and key educational
          content (max 800 chars per source).
          """
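
The two curation steps described above, per-domain de-duplication and sentence-boundary truncation, can be tried in isolation with the sketch below. The names `dedupe_by_domain` and `truncate_at_sentence` are hypothetical standalone versions written for illustration, the sample data is made up, and the assumption that upstream results arrive ordered by relevance is inferred from the handler's comment rather than confirmed.

```python
from typing import Any


def dedupe_by_domain(results: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first result seen for each domain, preserving input order."""
    seen: set[str] = set()
    kept: list[dict[str, Any]] = []
    for r in results:
        domain = r.get("domain", "")
        if domain in seen:
            continue
        seen.add(domain)
        kept.append(r)
    return kept


def truncate_at_sentence(text: str, max_chars: int = 800) -> str:
    """Cut at the last period or newline if it lies past the halfway mark."""
    if len(text) <= max_chars:
        return text
    truncated = text[:max_chars]
    cut_at = max(truncated.rfind("."), truncated.rfind("\n"))
    if cut_at > max_chars * 0.5:
        # A boundary late enough in the text: keep whole sentences only.
        return truncated[:cut_at + 1].rstrip()
    # No usable boundary: hard cut and mark the truncation.
    return truncated.rstrip() + "..."


# Made-up sample data for illustration only.
results = [
    {"title": "Queues (1)", "domain": "example.org"},
    {"title": "Queues (2)", "domain": "example.com"},
    {"title": "Queues (3)", "domain": "example.org"},
]
print([r["title"] for r in dedupe_by_domain(results)])
# ['Queues (1)', 'Queues (2)']

text = "A queue is FIFO. Items leave in arrival order. More detail follows here."
print(truncate_at_sentence(text, 50))
# A queue is FIFO. Items leave in arrival order.
print(truncate_at_sentence(text, 40))
# A queue is FIFO. Items leave in arrival...
```

The halfway threshold is a trade-off: cutting at a sentence boundary that falls too early would discard most of the budget, so in that case the text is cut hard and suffixed with an ellipsis instead.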