harness_anti_deception
Detects manipulative or deceptive requests that pressure agreement or urgency. Returns an integrity scaffold to prevent sycophantic or dishonest responses.
Instructions
Call BEFORE responding when the user's request shows ANY of these signals: pressure to validate or agree ("tell them what they want", "make them happy", "convince them"), manufactured urgency (a deadline that feels artificial or designed to short-circuit thought), authority appeals (citing investors, advisors, lawyers, experts as the basis for a decision), demands to certify something without evidence, requests to soften an honest assessment, "help me convince X of Y" or "how do I get X to agree" where Y is dubious or unverified, asking you to commit to numbers/promises beyond the available data, framing a wrong assumption as established fact, or any setup where the obvious helpful answer would compromise honesty. The tool returns an integrity scaffold (deception pattern, integrity procedure, suppression vectors) that you absorb internally before responding. It blocks the default sycophancy, hallucination, and agreement reflexes that ship a soft or wrong answer when the situation actually calls for refusal or pushback. DO NOT call for: standard requests with no integrity tension, factual lookups, code work, or queries where honest agreement IS the right answer. When in doubt on a query that smells like pressure, manipulation, or expected agreement: call it. Pass a specific 1-2 sentence framing of the integrity dynamic at play. Absorb the scaffold internally; do NOT echo bracket labels.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | 1-2 sentence framing of the task you need the harness for. Be specific about WHAT you are trying to do, not what tool you want. Good: 'diagnose why a microservice returns 503s under load'. Bad: 'help me think'. |