← Back to prompt tester
Instruction Precedence Claim
instruction_precedence_claim severity: high
The prompt tries to redefine which instructions outrank others.
What it means
The prompt tries to redefine which instructions outrank others.
Why it matters
This is one of the clearest and most direct signatures of prompt injection.
Examples
The following instructions take priority over previous rules.
How detection works
- The lexical predicate extractor scans normalized text for precedence-override phrases such as `ignore previous`, `disregard previous`, `forget previous`, and `above instructions` and emits `instruction_precedence_claim` with lexical evidence when they appear.
- The frame-based extractor also derives `instruction_precedence_claim` from override frames linked to protected directive targets like `prior_instructions`, carrying linked directive ids forward into the predicate fact.
- During normalization, certain `policy_override_request` hits are promoted into `instruction_precedence_claim` when the span explicitly contains patterns like `ignore previous`, and duplicate hits are collapsed by predicate + span offsets before SMT processing.
Caveats
- Intermediate reasoning signals may need surrounding context for correct interpretation.
Mitigation
- Use fixed precedence rules and ignore attacker claims about priority.