check_response
Scan AI-generated responses for security vulnerabilities like canary token leaks and sensitive data exposure to prevent information disclosure.
Instructions
Check an AI response for security issues: canary token leaks and sensitive data exposure.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| content | Yes | Response content to check | |
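
As a rough illustration of the schema above, a client might wrap `content` in an MCP `tools/call` request along these lines. The helper name `buildCheckResponseCall`, the `ToolCallRequest` interface, and the sample text are assumptions made for this sketch; only the tool name `check_response` and the `content` argument come from this page.

```typescript
// Hypothetical shape of a JSON-RPC 2.0 `tools/call` request for this tool.
interface ToolCallRequest {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: string; arguments: { content: string } }
}

// Hypothetical helper: wraps the response text in a check_response call.
function buildCheckResponseCall(id: number, content: string): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'check_response', arguments: { content } },
  }
}

// Sample input: the AI response text to be scanned.
const request = buildCheckResponseCall(1, 'Here is the summary you asked for.')
```

Since `content` is the only required argument and has no default, the sketch always sends it explicitly.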
Implementation Reference
- src/core/engine.ts:407-442 (handler) The handler implementation of `check_response` that inspects AI output for canary leaks and sensitive information.
  ```typescript
  checkResponse(content: string): ResponseCheckResult {
    const canaryLeak = this._canaryToken ? content.includes(this._canaryToken) : false
    if (canaryLeak) {
      this.log.write({
        level: 'CRITICAL',
        layer: 'L6',
        action: 'block',
        detail: this.locale === 'zh'
          ? '检测到系统提示词泄露!Canary token 出现在输出中'
          : 'System prompt exfiltration detected! Canary token found in output',
        pattern: 'canary_leak',
      })
    }

    const [, findings] = redactSensitive(content)
    const hasSensitiveData = findings.length > 0
    const summary = findings.map(f => `${f.name}(${f.count})`).join(', ')
    if (hasSensitiveData) {
      for (const f of findings) {
        this.log.write({
          level: 'HIGH',
          layer: 'L6',
          action: 'audit',
          detail: this.locale === 'zh'
            ? `AI 回复含敏感数据: ${f.name}: ${f.count} 处 — 已记录审计日志,回复正常发送`
            : `Sensitive data in AI response: ${f.name}: ${f.count} occurrence(s) — audited, response sent as-is`,
          pattern: f.id,
        })
      }
      this.markSensitiveData('llm_response', summary)
    }

    return { canaryLeak, sensitiveData: { hasSensitiveData, findings, summary } }
  }
  ```
- src/mcp-server.ts:211-213 (registration) MCP server tool registration for `check_response`, mapping the tool to the `guard.checkResponse` method.
  ```typescript
  case 'check_response': {
    const result = guard.checkResponse(String(args.content || ''))
    return {
  ```
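
The handler only logs and reports; what to do with the result is left to the caller. A minimal sketch of one way a caller might act on it is shown here. The `Finding` and `CheckResult` shapes are inferred from the handler's return statement, and the real `ResponseCheckResult` type may differ. `verdictFor`, `summarize`, and the sample findings are hypothetical and not part of the project.

```typescript
// Shapes inferred from the handler's return value (assumption).
interface Finding { id: string; name: string; count: number }
interface CheckResult {
  canaryLeak: boolean
  sensitiveData: { hasSensitiveData: boolean; findings: Finding[]; summary: string }
}

type Verdict = 'block' | 'audit' | 'pass'

// Hypothetical policy mirroring the handler's log actions:
// 'block' for a canary leak, 'audit' for sensitive data.
function verdictFor(result: CheckResult): Verdict {
  if (result.canaryLeak) return 'block'
  if (result.sensitiveData.hasSensitiveData) return 'audit'
  return 'pass'
}

// Same summary format as the handler: `name(count)` joined by ", ".
function summarize(findings: Finding[]): string {
  return findings.map(f => `${f.name}(${f.count})`).join(', ')
}

// Made-up sample data for illustration only.
const findings: Finding[] = [
  { id: 'email', name: 'Email', count: 2 },
  { id: 'api_key', name: 'API Key', count: 1 },
]
const example: CheckResult = {
  canaryLeak: false,
  sensitiveData: { hasSensitiveData: true, findings, summary: summarize(findings) },
}
// example.sensitiveData.summary is "Email(2), API Key(1)"
// verdictFor(example) is 'audit'
```

Checking `canaryLeak` first reflects the handler's own severity ordering: a canary leak is logged as `CRITICAL`, while sensitive data is logged as `HIGH` and the response is still sent.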