Selective Rotary Position Embedding

Sajad Movahedi; Timur Carstensen; Arshia Afzal; Frank Hutter; Antonio Orvieto; Volkan Cevher

arXiv:2511.17388 · cs.CL · April 27, 2026

TL;DR

This paper introduces Selective RoPE, an input-dependent rotary position embedding mechanism that generalizes existing methods and improves language modeling and sequence task performance.

Contribution

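The paper proposes Selective RoPE, which enables rotations by arbitrary, input-dependent angles in both linear and softmax transformers. It also shows that softmax attention carries an implicit positional structure, and it reports improved performance on copying, state tracking, and retrieval tasks.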

Findings

1. Selective RoPE improves language modeling accuracy.

2. It enhances performance on copying, state tracking, and retrieval tasks.

3. Softmax attention implicitly performs rotations on query-key pairs.

Abstract

Position information is essential for language modeling. In softmax transformers, Rotary Position Embeddings (RoPE) encode positions through fixed-angle rotations, while in linear transformers, order is handled via input-dependent (selective) gating that decays past key-value associations. Selectivity has generally been shown to improve language-related tasks. Inspired by this, we introduce Selective RoPE, an input-dependent rotary embedding mechanism that generalizes RoPE and enables rotation by arbitrary angles for both linear and softmax transformers. We show that softmax attention already performs a hidden form of these rotations on query-key pairs, uncovering an implicit positional structure. We further show that in state-space models and gated linear transformers, the real part manages forgetting while the imaginary part…
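
The contrast the abstract draws between fixed-angle and input-dependent rotations can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the paper's parameterization, so the linear projection `w_omega`, the cumulative sum over per-token angle increments, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def rope_frequencies(dim, base=10000.0):
    """Standard RoPE frequencies, one per pair of features."""
    return base ** (-np.arange(0, dim, 2) / dim)          # shape (dim/2,)

def rope_angles(seq_len, dim, base=10000.0):
    """Fixed angles: position index times a constant frequency."""
    positions = np.arange(seq_len)[:, None]               # shape (seq_len, 1)
    return positions * rope_frequencies(dim, base)[None]  # shape (seq_len, dim/2)

def selective_angles(increments):
    """Assumed selective variant: angles accumulate per-token increments."""
    return np.cumsum(increments, axis=0)                  # shape (seq_len, dim/2)

def input_dependent_increments(x, w_omega):
    """Hypothetical projection from token features to angle increments."""
    return x @ w_omega                                    # shape (seq_len, dim/2)

def rotate(v, angles):
    """Rotate each consecutive feature pair of v by the matching angle."""
    v_even, v_odd = v[:, 0::2], v[:, 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(v)
    out[:, 0::2] = v_even * cos - v_odd * sin
    out[:, 1::2] = v_even * sin + v_odd * cos
    return out
```

Under these assumptions, constant increments equal to `rope_frequencies(dim)` give the same query-key scores as fixed RoPE, since only angle differences between positions enter the dot product. Letting the increments depend on the input, as in `input_dependent_increments`, allows the relative rotation between two tokens to vary with content.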
