MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs

Rui Wen; Mark Russinovich; Andrew Paverd; Jun Sakuma; Ahmed Salem

arXiv:2605.15172 · cs.CR · May 15, 2026

TL;DR

MetaBackdoor shows that positional encoding in Transformer-based LLMs can be exploited as a stealthy backdoor trigger. The attack activates without any change to the input text, which extends the backdoor threat model beyond content-based triggers.

Contribution

Introduces MetaBackdoor, a new class of backdoor attacks that uses positional information, rather than textual content, as the trigger, expanding the threat model of LLM security beyond content-based methods.

Findings

01. Length-based positional triggers can activate backdoors stealthily.

02. Backdoored LLMs can leak sensitive internal information.

03. Positional triggers can be combined with content-based backdoors for enhanced stealth.

Abstract

Backdoor attacks pose a serious security threat to large language models (LLMs), which are increasingly deployed as general-purpose assistants in safety- and privacy-critical applications. Existing LLM backdoors rely primarily on content-based triggers, requiring explicit modification of the input text. In this work, we show that this assumption is unnecessary and limiting. We introduce MetaBackdoor, a new class of backdoor attacks that exploits positional information as the trigger, without modifying textual content. Our key insight is that Transformer-based LLMs necessarily encode token positions to process ordered sequences. As a result, length-correlated positional structure is reflected in the model's internal computation and can be used as an effective non-content trigger signal. We demonstrate that even a simple length-based positional trigger is sufficient to activate stealthy…
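
The observation that token positions carry length-correlated information can be illustrated with a small sketch. It assumes the classic sinusoidal positional encoding of the original Transformer; the models studied in the paper may use a different scheme (for example rotary or learned position embeddings), and the function names and the dimension `d_model = 8` are hypothetical choices for illustration. The sketch only shows that the position signal attached to the final token differs between a short and a long input even when the text content is ignored; it is not the paper's attack.

```python
import math


def sinusoidal_encoding(position: int, d_model: int = 8) -> list[float]:
    """Sinusoidal positional encoding for a single position.

    Assumption: the classic sin/cos scheme, with an even d_model.
    This is an illustrative stand-in, not the paper's setup.
    """
    enc = []
    for i in range(0, d_model, 2):
        # Each pair of dimensions shares one frequency.
        angle = position / (10000 ** (i / d_model))
        enc.append(math.sin(angle))
        enc.append(math.cos(angle))
    return enc


def last_position_encoding(num_tokens: int, d_model: int = 8) -> list[float]:
    """Encoding of the final token, whose 0-based index is num_tokens - 1."""
    return sinusoidal_encoding(num_tokens - 1, d_model)


# Two inputs of different length: the final token's position signal differs,
# regardless of what the tokens themselves are.
short_input = last_position_encoding(10)
long_input = last_position_encoding(200)
distance = math.dist(short_input, long_input)
print(f"distance between final-position encodings: {distance:.3f}")
```

Because the encoding depends only on the position index, any computation that reads the final token's position has access to a signal that varies with input length, which is the kind of non-content signal the abstract describes.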
