# check_links

Crawl all internal links on a page and identify dead links (404/5xx responses) using HEAD requests. Helps maintain site integrity by detecting broken pages automatically.
## Instructions
Crawl all internal links on the current page and check for dead links (404/5xx).
Sends HEAD requests to each internal link found on the page. This can take several seconds on pages with many links.
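For illustration, the tool can be invoked from any MCP client. The sketch below uses the official Python MCP SDK; the server launch command (`python -m argus.mcp_server`) is an assumption and should be replaced with however the Argus server is actually started.

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Assumption: the server is launched as "python -m argus.mcp_server";
    # substitute the real launch command for your setup.
    params = StdioServerParameters(command="python", args=["-m", "argus.mcp_server"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # check_links takes no arguments (see Input Schema below)
            result = await session.call_tool("check_links", arguments={})
            # The tool returns a single text block containing the link report
            print(result.content[0].text)


asyncio.run(main())
```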
## Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| *(no arguments)* | | | |
## Output Schema

| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | Human-readable report string: counts of OK, dead, and external links, plus one line per dead link with its HTTP status. | |
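As an illustration, a run might produce a report like the following (URLs and counts are hypothetical):

```text
Checked 12 internal links on https://example.com/docs
 OK: 10
 Dead: 2
 External (not checked): 5

 [HTTP 404] https://example.com/docs/old-page
 [HTTP 500] https://example.com/api/status

Total bugs in session: 2
```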
## Implementation Reference
- argus/browser.py:442-462 (handler): Core handler that checks internal links via HEAD requests (falling back to GET on 405) and returns status results marking each link ok or dead.
```python
async def check_links(self, links: List[Dict]) -> List[Dict]:
    """Check internal link status via HEAD requests (falls back to GET on 405)."""
    results = []
    checked: set = set()
    for link in links:
        href = link.get("href", "")
        if not href or href in checked or not link.get("isInternal"):
            continue
        checked.add(href)
        try:
            resp = await self._context.request.head(href, timeout=5000)
            # 405 = server doesn't support HEAD, retry with GET
            if resp.status == 405:
                resp = await self._context.request.get(href, timeout=5000)
            # 403 from context.request often means anti-bot, not a real dead link
            # Mark as ok since the page loaded fine in the browser
            is_ok = resp.ok or resp.status == 403
            results.append({"href": href, "status": resp.status, "ok": is_ok})
        except Exception:
            results.append({"href": href, "status": 0, "ok": False})
    return results
```
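The shapes of the link records going in and the result records coming out, inferred from the handler above (field values shown purely as an illustration):

```python
# Each entry from state.links carries at least these keys:
link = {"href": "https://example.com/about", "isInternal": True}

# check_links() returns one record per internal link checked:
result = {"href": "https://example.com/about", "status": 200, "ok": True}
# status == 0 with ok == False means the request itself failed (timeout, DNS, etc.)
```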
- argus/mcp_server.py:545-578 (registration): Registered as an MCP tool via the @mcp.tool() decorator; formats and returns the result string.

```python
@mcp.tool()
async def check_links() -> str:
    """Crawl all internal links on the current page and check for dead links (404/5xx).

    Sends HEAD requests to each internal link found on the page.
    This can take several seconds on pages with many links.
    """
    s = _require_session()
    state = await s.browser.get_state()
    s._last_elements = state.elements
    link_results = await s.browser.check_links(state.links)
    new_bugs = s.detector.process_dead_links(link_results, state.url, s.steps)
    if new_bugs:
        ss_path = await _auto_screenshot(s, "dead_links", f"Dead links on {state.url}")
        for bug in new_bugs:
            bug.screenshot_path = ss_path
        s.bugs.extend(new_bugs)
    dead = [r for r in link_results if not r["ok"]]
    alive = [r for r in link_results if r["ok"]]
    external = [l for l in state.links if not l.get("isInternal")]
    lines = [f"Checked {len(link_results)} internal links on {state.url}"]
    lines.append(f" OK: {len(alive)}")
    lines.append(f" Dead: {len(dead)}")
    lines.append(f" External (not checked): {len(external)}")
    if dead:
        lines.append("")
        for r in dead:
            lines.append(f" [HTTP {r['status']}] {r['href']}")
    lines.append(f"\nTotal bugs in session: {len(s.bugs)}")
    return "\n".join(lines)
```
- argus/detector.py:323-326 (helper): Helper that processes link-check results and creates Bug objects for dead links, aggregated by domain.

```python
# ── Dead link crawling ────────────────────────────────────────────
def process_dead_links(
```
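The body of process_dead_links is not reproduced here. As a rough, hypothetical sketch of the grouping idea the reference describes (one issue per domain rather than one per dead URL), not the project's actual implementation:

```python
from collections import defaultdict
from typing import Dict, List
from urllib.parse import urlparse


def group_dead_links_by_domain(results: List[Dict]) -> Dict[str, List[Dict]]:
    """Illustrative only: bucket failed link checks by domain so each domain
    can be reported as a single issue instead of one per dead URL."""
    by_domain: Dict[str, List[Dict]] = defaultdict(list)
    for r in results:
        if not r["ok"]:
            by_domain[urlparse(r["href"]).netloc].append(r)
    return dict(by_domain)
```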
- argus/mcp_server.py:569-576 (schema): Output format; builds a human-readable report string with counts (OK, dead, external) and a line for each dead link.

```python
lines = [f"Checked {len(link_results)} internal links on {state.url}"]
lines.append(f" OK: {len(alive)}")
lines.append(f" Dead: {len(dead)}")
lines.append(f" External (not checked): {len(external)}")
if dead:
    lines.append("")
    for r in dead:
        lines.append(f" [HTTP {r['status']}] {r['href']}")
```
- argus/mcp_server.py:686-707 (helper): Reuse of check_links within the crawl_site tool during automated site crawling.

```python
link_results = await s.browser.check_links(state.links)
new_bugs.extend(s.detector.process_dead_links(link_results, state.url, s.steps))

# Performance
perf = await s.browser.get_performance()
new_bugs.extend(s.detector.process_performance(perf, state.url, s.steps))

# Screenshot
if new_bugs:
    page_name = state.url.split("/")[-1] or "index"
    ss_path = await _auto_screenshot(s, f"crawl_{page_name}", f"Crawl: {state.url}")
    for bug in new_bugs:
        bug.screenshot_path = ss_path
    s.bugs.extend(new_bugs)
page_results.append((state.url, len(new_bugs)))

# Discover new internal links to visit (deduplicated by path)
for link in state.links:
    href = link.get("href", "")
    if link.get("isInternal") and _normalize(href) not in visited_paths and href not in to_visit:
        to_visit.append(href)
```
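The crawl deduplicates pages by normalized path via `_normalize`, whose implementation is not shown here. A typical normalization, offered purely as an assumption about what such a helper might do, drops the query string, fragment, and trailing slash:

```python
from urllib.parse import urlparse


def normalize_path(href: str) -> str:
    """Hypothetical stand-in for _normalize: reduce a URL to a comparable
    path by ignoring query string, fragment, and any trailing slash."""
    path = urlparse(href).path or "/"
    return path.rstrip("/") or "/"
```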