CALYREX: Cross-Attention LaYeR EXtended Transformers for System Prompt Anchoring

Li Lixing

arXiv:2605.09737·cs.LG·May 12, 2026

TL;DR

CALYREX introduces a cross-attention mechanism in transformers to better anchor system prompts, improving instruction-following and safety in large language models, especially at larger scales.

Contribution

It proposes a novel cross-attention architecture that isolates system prompts, with empirical evidence showing improved instruction adherence and safety over standard models.

Findings

01

CALYREX improves instruction-following accuracy by 7.4% on IFEval.

02

It reduces the multi-turn jailbreaking attack success rate by 13%.

03

Inserting the system-prompt cross-attention at the final eighth of layers is optimal in a placement ablation on a 1.5B backbone, a result confirmed by activation analysis.
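
To make the "final eighth of layers" placement concrete, here is a minimal sketch that computes which layer indices that would cover for a given depth. The function name `final_eighth_layers` is hypothetical, and rounding up when the depth is not divisible by 8 is an assumption; the summary above does not say how the paper handles that case.

```python
def final_eighth_layers(num_layers: int) -> list[int]:
    """Return 0-based indices of the last eighth of a layer stack.

    Assumption (not from the paper): the count is rounded up, so at
    least one layer is always selected.
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    count = -(-num_layers // 8)  # ceiling division
    return list(range(num_layers - count, num_layers))


# Example: a 32-layer stack would place the mechanism in layers 28..31.
print(final_eighth_layers(32))
```

Under this rounding assumption, a 32-layer stack selects four layers and a 28-layer stack also selects four.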

Abstract

Modern large language models (LLMs) rely on system prompts to establish behavioral constraints and safety rules. Standard causal self-attention treats privileged instructions and untrusted user content with equal structural priority, a mismatch that leaves models vulnerable to prompt injection and instruction erosion over extended contexts. We propose CALYREX (Cross-Attention LaYeR EXtended transformers), which uses cross-attention between the input and the system prompt to structurally isolate and anchor the rules. A placement ablation on a 1.5B backbone identifies insertion at the final eighth of layers as optimal, confirmed by mechanistic activation analysis showing that behavioral constraints are naturally concentrated there. At 8B scale, controlling for training data, backbone, and parameter budget, CALYREX yields $+7.4\%$ on instruction-following (IFEval) and $+16.3\%$ on multi-turn…
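
The core idea in the abstract, input tokens attending to a separately held system prompt, can be illustrated with a generic single-head cross-attention step. This is a sketch of textbook cross-attention with a residual connection, not the paper's implementation: the function `cross_attention`, the weight names `w_q`, `w_k`, `w_v`, `w_o`, and the absence of multi-head splitting, gating, or normalization are all assumptions made for illustration.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(hidden, system, w_q, w_k, w_v, w_o):
    """Single-head cross-attention from input tokens to system-prompt tokens.

    hidden: (T, d) hidden states of the input sequence (queries).
    system: (S, d) system-prompt representations (keys and values).
    Returns the updated hidden states (T, d) and attention weights (T, S).
    Hypothetical layout; the paper's actual design may differ.
    """
    q = hidden @ w_q                           # queries come from the input
    k = system @ w_k                           # keys come from the system prompt
    v = system @ w_v                           # values come from the system prompt
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product scores
    weights = softmax(scores, axis=-1)         # each input token's weights sum to 1
    return hidden + (weights @ v) @ w_o, weights  # residual connection


rng = np.random.default_rng(0)
d = 8
hidden = rng.normal(size=(5, d))   # 5 input tokens
system = rng.normal(size=(3, d))   # 3 system-prompt tokens
w_q, w_k, w_v, w_o = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))
out, weights = cross_attention(hidden, system, w_q, w_k, w_v, w_o)
print(out.shape, weights.shape)
```

Because keys and values come only from the system prompt, user tokens cannot overwrite or dilute them in this pathway, which is one plausible reading of "structurally isolate and anchor".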
