validate_url
Validates Tactual's predicted navigation paths by simulating a screen reader on the page. Returns an accuracy ratio comparing predicted vs actual step counts for each target.
Instructions
Validate Tactual's predicted navigation paths against a virtual screen reader. Runs analyze_url internally, then for each worst finding drives @guidepup/virtual-screen-reader over the captured DOM (via jsdom) to check: (a) is the target reachable at all, and (b) how many virtual SR announcements does it take to reach it? Compares to Tactual's predicted step count. Returns an accuracy ratio per target and a mean across all validated targets — closer to 1.0 means Tactual's predictions match this virtual-screen-reader run, not a guarantee of full real-AT fidelity.
Requires (optional deps): jsdom + @guidepup/virtual-screen-reader. Installed with tactual if optionalDependencies were honored; otherwise run npm install jsdom @guidepup/virtual-screen-reader in your project.
When to use: closing the predicted-vs-actual loop. If Tactual's predictions diverge a lot from the virtual SR, either the profile weights need calibration or the page has structural patterns the analyzer doesn't model. Use sparingly — this adds the analyze_url cost plus jsdom parsing + virtual SR navigation time.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to analyze and validate | |
| profile | No | AT profile ID (default: nvda-desktop-v0). Use list_profiles to see options. | |
| maxTargets | No | Maximum findings to validate (worst-first). Higher = slower but more signal. | |
| strategy | No | Navigation strategy for the virtual SR. 'linear' uses Tab/Shift-Tab (keyboard flow); 'semantic' uses heading/landmark skip commands (screen-reader flow). Semantic is more representative for NVDA/JAWS/VoiceOver users. | semantic |
| timeout | No | Page load timeout in ms | |
| waitTime | No | Additional wait after load (ms) | |
| channel | No | Browser channel: chrome, chrome-beta, msedge | |
| stealth | No | Apply anti-bot-detection defaults | |
| storageState | No | Path to a Playwright storageState JSON (for authenticated pages). Must be within the current working directory. |
Implementation Reference
- src/mcp/tools/validate-url.ts:59-88 (handler)MCP tool handler function that destructures the Zod-validated input, calls runValidateUrl(), and formats the result as MCP content. Catches ValidateUrlError and returns isError: true on failure.
async ({ url, profile, maxTargets, strategy, timeout, waitTime, channel, stealth, storageState }) => { try { const result = await runValidateUrl({ url, profileId: profile, maxTargets, strategy, timeout, waitTime, channel, stealth, storageState, restrictStorageStateToCwd: true, useSharedBrowserPool: true, }); return { content: [{ type: "text" as const, text: JSON.stringify(result, null, 2) }], }; } catch (err) { const text = err instanceof ValidateUrlError ? err.message : `validate_url failed: ${err instanceof Error ? err.message : String(err)}`; return { content: [{ type: "text" as const, text }], isError: true, }; } }, ); - src/mcp/tools/validate-url.ts:25-57 (schema)Zod input schema for validate_url tool, defining parameters: url, profile, maxTargets, strategy, timeout, waitTime, channel, stealth, storageState.
inputSchema: { url: z.string().describe("URL to analyze and validate"), profile: z .string() .optional() .describe("AT profile ID (default: nvda-desktop-v0). Use list_profiles to see options."), maxTargets: z .number() .int() .min(1) .max(50) .default(10) .describe("Maximum findings to validate (worst-first). Higher = slower but more signal."), strategy: z .enum(["linear", "semantic"]) .default("semantic") .describe( "Navigation strategy for the virtual SR. 'linear' uses Tab/Shift-Tab (keyboard flow); " + "'semantic' uses heading/landmark skip commands (screen-reader flow). Semantic is more " + "representative for NVDA/JAWS/VoiceOver users.", ), timeout: z.number().default(30000).describe("Page load timeout in ms"), waitTime: z.number().optional().describe("Additional wait after load (ms)"), channel: z.string().optional().describe("Browser channel: chrome, chrome-beta, msedge"), stealth: z.boolean().optional().describe("Apply anti-bot-detection defaults"), storageState: z .string() .optional() .describe( "Path to a Playwright storageState JSON (for authenticated pages). " + "Must be within the current working directory.", ), }, - src/mcp/tools/validate-url.ts:5-89 (registration)Registration function that calls server.registerTool('validate_url', ...) with schema and handler. Exported as registerValidateUrl.
export function registerValidateUrl(server: McpServer): void { server.registerTool( "validate_url", { description: "Validate Tactual's predicted navigation paths against a virtual screen reader. " + "Runs analyze_url internally, then for each worst finding drives " + "@guidepup/virtual-screen-reader over the captured DOM (via jsdom) to check: " + "(a) is the target reachable at all, and (b) how many virtual SR announcements " + "does it take to reach it? Compares to Tactual's predicted step count. " + "Returns an accuracy ratio per target and a mean across all validated targets — " + "closer to 1.0 means Tactual's predictions match this virtual-screen-reader run, " + "not a guarantee of full real-AT fidelity.\n\n" + "**Requires** (optional deps): jsdom + @guidepup/virtual-screen-reader. " + "Installed with tactual if optionalDependencies were honored; otherwise run " + "`npm install jsdom @guidepup/virtual-screen-reader` in your project.\n\n" + "**When to use**: closing the predicted-vs-actual loop. If Tactual's predictions " + "diverge a lot from the virtual SR, either the profile weights need calibration " + "or the page has structural patterns the analyzer doesn't model. Use sparingly — " + "this adds the analyze_url cost plus jsdom parsing + virtual SR navigation time.", inputSchema: { url: z.string().describe("URL to analyze and validate"), profile: z .string() .optional() .describe("AT profile ID (default: nvda-desktop-v0). Use list_profiles to see options."), maxTargets: z .number() .int() .min(1) .max(50) .default(10) .describe("Maximum findings to validate (worst-first). Higher = slower but more signal."), strategy: z .enum(["linear", "semantic"]) .default("semantic") .describe( "Navigation strategy for the virtual SR. 'linear' uses Tab/Shift-Tab (keyboard flow); " + "'semantic' uses heading/landmark skip commands (screen-reader flow). Semantic is more " + "representative for NVDA/JAWS/VoiceOver users.", ), timeout: z.number().default(30000).describe("Page load timeout in ms"), waitTime: z.number().optional().describe("Additional wait after load (ms)"), channel: z.string().optional().describe("Browser channel: chrome, chrome-beta, msedge"), stealth: z.boolean().optional().describe("Apply anti-bot-detection defaults"), storageState: z .string() .optional() .describe( "Path to a Playwright storageState JSON (for authenticated pages). " + "Must be within the current working directory.", ), }, }, async ({ url, profile, maxTargets, strategy, timeout, waitTime, channel, stealth, storageState }) => { try { const result = await runValidateUrl({ url, profileId: profile, maxTargets, strategy, timeout, waitTime, channel, stealth, storageState, restrictStorageStateToCwd: true, useSharedBrowserPool: true, }); return { content: [{ type: "text" as const, text: JSON.stringify(result, null, 2) }], }; } catch (err) { const text = err instanceof ValidateUrlError ? err.message : `validate_url failed: ${err instanceof Error ? err.message : String(err)}`; return { content: [{ type: "text" as const, text }], isError: true, }; } }, ); } - src/mcp/index.ts:34-35 (registration)Top-level registration call: registerValidateUrl(server) invoked in createMcpServer().
registerAnalyzeUrl(server); registerValidateUrl(server); - src/core/url-validation.ts:26-66 (helper)Core URL validation logic used internally by the pipeline. Validates allowed protocols (http, https, file), blocks dangerous protocols (javascript, data, vbscript, blob), checks hostname, and prevents embedded credentials.
export function validateUrl(input: string): ValidationResult { const trimmed = input.trim(); if (!trimmed) { return { valid: false, error: "URL is empty" }; } // Block obviously dangerous protocols before parsing const lower = trimmed.toLowerCase(); for (const proto of BLOCKED_PROTOCOLS) { if (lower.startsWith(proto)) { return { valid: false, error: `Blocked protocol: ${proto}` }; } } let parsed: URL; try { parsed = new URL(trimmed); } catch { return { valid: false, error: `Invalid URL: "${trimmed}"` }; } if (!ALLOWED_PROTOCOLS.has(parsed.protocol)) { return { valid: false, error: `Unsupported protocol "${parsed.protocol}" — only http:, https:, and file: are allowed`, }; } // For http/https, require a hostname if ((parsed.protocol === "http:" || parsed.protocol === "https:") && !parsed.hostname) { return { valid: false, error: "URL is missing a hostname" }; } // Block credentials in URLs (potential phishing) if (parsed.username || parsed.password) { return { valid: false, error: "URLs with embedded credentials are not allowed" }; } return { valid: true, url: parsed.href }; }