Jordan-RoPE: Non-Semisimple Relative Positional Encoding via Complex Jordan Blocks
Yaobo Zhang

TL;DR
This paper introduces Jordan-RoPE, a novel non-semisimple relative positional encoding based on complex Jordan blocks, enabling oscillatory-polynomial features for improved modeling of distance-modulated phase interactions in language models.
Contribution
The paper formulates relative positional encoding as a non-semisimple representation built from Jordan blocks, which yields a new distance-modulated phase basis, and reports its effectiveness on specific tasks.
Findings
Jordan-RoPE captures distance-modulated phase interactions.
A scaled-exact variant improves over RoPE on language modeling.
RoPE+ALiBi remains the strongest baseline overall.
Abstract
Relative positional encodings determine which functions of query-key lag can enter the primitive attention logit. RoPE supplies a rotary phase, while ALiBi supplies an additive distance bias. Motivated by group-theoretic views of linear translation-invariant positional encodings, we study a non-semisimple case in which a complex rotary eigenvalue and a nilpotent response live in the same defective Jordan block. The resulting relative operator generates oscillatory-polynomial features such as $\cos(\omega\tau)$, $\sin(\omega\tau)$, $\tau\cos(\omega\tau)$, and $\tau\sin(\omega\tau)$, for causal lag $\tau$. Thus the construction realizes a distance-modulated phase basis $\tau e^{i\omega\tau}$, rather than merely adding a separate distance channel to RoPE. We formulate Exact Jordan-RoPE as a non-semisimple one-parameter representation, give its…
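To make the abstract's central claim concrete, the sketch below exponentiates a single 2x2 complex Jordan block and shows where the oscillatory-polynomial terms come from. It is an illustration under stated assumptions, not the paper's implementation: the block size, the symbols `omega` (rotary frequency) and `tau` (lag), and the function name `jordan_relative_operator` are all chosen here for illustration.

```python
import numpy as np


def jordan_relative_operator(omega: float, tau: float) -> np.ndarray:
    """Return exp(tau * J) for J = [[i*omega, 1], [0, i*omega]].

    Hypothetical helper: a 2x2 block is assumed for illustration only.
    J splits as i*omega*I + N with N nilpotent (N @ N = 0), and the two
    parts commute, so exp(tau * J) = exp(i*omega*tau) * (I + tau * N).
    """
    phase = np.exp(1j * omega * tau)
    unipotent = np.array([[1.0, tau], [0.0, 1.0]], dtype=complex)
    return phase * unipotent


omega = 0.3  # assumed frequency, not a value from the paper
for tau in (0.0, 1.0, 5.0):
    op = jordan_relative_operator(omega, tau)
    # Diagonal entry: pure rotary phase, i.e. cos and sin of omega*tau.
    # Off-diagonal entry: the same phase multiplied by the lag tau.
    print(tau, op[0, 0], op[0, 1])
```

The diagonal entries carry the usual rotary phase, while the off-diagonal entry carries the same phase scaled by the lag, so its real and imaginary parts are the lag times the cosine and sine of the phase. Because the operator depends only on the lag and satisfies the composition rule T(a) T(b) = T(a + b), it behaves as a one-parameter representation, which is the property a relative encoding needs.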
