check_injection
Detect prompt injection attempts in text using 32+ rules for Chinese and English, including hidden character detection, to identify security threats.
Instructions
Detect prompt injection attempts in text. Supports 32+ rules for Chinese and English, with hidden character detection.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Text to scan for injection attempts | |
| threshold | No | Detection threshold 0-100 (lower = stricter) | 60 |
Implementation Reference
- src/core/engine.ts:310-346 (handler): The `checkInjection` method in the ShellWard engine performs injection detection by matching the input text against compiled rules and checking for hidden characters, then comparing the resulting risk score against a threshold.
```typescript
checkInjection(text: string, options?: { source?: string; threshold?: number }): InjectionResult {
  // A per-call threshold overrides the configured default.
  const threshold = options?.threshold ?? this.config.injectionThreshold
  const enforce = this.config.mode === 'enforce'

  // Log any hidden characters, listing each distinct name once.
  const hiddenChars = detectHiddenChars(text)
  if (hiddenChars.length > 0) {
    this.log.write({
      level: 'MEDIUM',
      layer: 'L4',
      action: 'detect',
      detail: `Hidden characters detected: ${[...new Set(hiddenChars.map(h => h.name))].join(', ')} (${hiddenChars.length} chars)`,
    })
  }

  // Sum the risk score of every rule whose pattern matches the text.
  let score = 0
  const matched: { id: string; name: string; score: number }[] = []
  for (const rule of this.compiledRules) {
    if (rule.compiled.test(text)) {
      score += rule.riskScore
      matched.push({ id: rule.id, name: rule.name, score: rule.riskScore })
    }
  }
  // More than three hidden characters adds a fixed 20 points.
  if (hiddenChars.length > 3) score += 20

  // At or above the threshold: log as block (enforce mode) or detect.
  if (score >= threshold) {
    this.log.write({
      level: score >= 80 ? 'CRITICAL' : 'HIGH',
      layer: 'L4',
      action: enforce ? 'block' : 'detect',
      detail: this.locale === 'zh'
        ? `检测到可能的提示词注入攻击!\n风险评分: ${score}/100\n匹配规则: ${matched.map(m => m.name).join(', ')}`
        : `Potential prompt injection detected!\nRisk score: ${score}/100\nMatched: ${matched.map(m => m.name).join(', ')}`,
    })
  }

  return { safe: score < threshold, score, threshold, matched, hiddenChars: hiddenChars.length }
}
```
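The handler relies on `detectHiddenChars` and a compiled rule set that this page does not show. As an added example (not ShellWard's code), the sketch below gives one possible hidden-character detector and applies the same scoring logic. The character table, the two rules and their scores are constructed, and matching rules on a copy with hidden characters removed is a choice of this sketch.

```typescript
// Added example: names, rules and scores are constructed, not ShellWard's.
type Rule = { id: string; pattern: RegExp; riskScore: number };
type Hidden = { name: string; index: number };

// Hypothetical table of invisible / direction-control code points (all BMP).
const HIDDEN: [number, number, string][] = [
  [0x00ad, 0x00ad, 'soft hyphen'],
  [0x200b, 0x200d, 'zero-width space/joiner'],
  [0x200e, 0x200f, 'directional mark'],
  [0x202a, 0x202e, 'bidi embedding/override'],
  [0x2060, 0x2064, 'word joiner/invisible operator'],
  [0x2066, 0x2069, 'bidi isolate'],
  [0xfeff, 0xfeff, 'zero-width no-break space (BOM)'],
];

function hiddenName(code: number): string | undefined {
  for (const [lo, hi, name] of HIDDEN) {
    if (code >= lo && code <= hi) return name;
  }
  return undefined;
}

function detectHiddenChars(text: string): Hidden[] {
  const found: Hidden[] = [];
  for (let i = 0; i < text.length; i++) {
    const name = hiddenName(text.charCodeAt(i));
    if (name) found.push({ name, index: i });
  }
  return found;
}

// Constructed sample rules, one English and one Chinese.
const RULES: Rule[] = [
  { id: 'demo-en-1', pattern: /ignore\s+(all\s+)?previous\s+instructions/i, riskScore: 60 },
  { id: 'demo-zh-1', pattern: /忽略(之前|以上)的?(指令|提示)/, riskScore: 60 },
];

function scan(text: string, threshold = 60) {
  const hidden = detectHiddenChars(text);
  // Sketch-only choice: strip hidden characters before rule matching, so a
  // keyword split by zero-width characters is still seen by the regex.
  const visible = text.split('').filter(c => !hiddenName(c.charCodeAt(0))).join('');
  const matched = RULES.filter(r => r.pattern.test(visible)).map(r => r.id);
  let score = RULES.reduce((s, r) => (matched.indexOf(r.id) >= 0 ? s + r.riskScore : s), 0);
  if (hidden.length > 3) score += 20; // same bonus as the handler above
  return { safe: score < threshold, score, threshold, matched, hiddenChars: hidden.length };
}
```

With the default threshold of 60, four hidden characters on their own score 20 and pass as safe; lowering the threshold to 20 flags the same text.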