Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling
Hailing Cheng, Daqi Sun, Xinyu Lu

TL;DR
This paper introduces SIREN-RoPE, a learnable rotary encoding for Transformers that conditions the rotation applied inside attention on heterogeneous signals rather than on ordinal position alone, improving ranking performance.
Contribution
The paper treats the rotation space acted on by RoPE as a learnable, signal-conditioned space instead of a fixed, hand-crafted one, so that positional encoding becomes a dynamic representation that improves downstream task performance.
Findings
Activating the rotation dimension improves ranking calibration.
SIREN-RoPE achieves consistent performance gains with negligible overhead.
The approach introduces heterogeneous signals into the rotation manifold.
Abstract
Every Transformer architecture dedicates enormous capacity to learning rich representations in semantic embedding space -- yet the rotation manifold acted upon by Rotary Positional Embeddings (RoPE) has been treated as a fixed, hand-crafted structure, populated only by discrete ordinal indices. We argue that this rotation space is a largely overlooked second dimension of expressivity in the attention mechanism, one whose systematic exploration may open a new door for attention-based architectures. The analogy to complex numbers is instructive: just as introducing the imaginary axis -- orthogonal to and independent of the real line -- unlocked new algebraic structure once believed impossible, treating the rotation manifold as a learnable, signal-conditioned space opens an orthogonal degree of freedom in attention. In this framing, the token embedding encodes the semantic (real) component…
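To make the idea of a signal-conditioned rotation concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It shows standard RoPE angles next to a hypothetical variant in which each angle also receives an offset driven by a continuous signal such as a timestamp. The function names, the linear form `w * signal`, and the `weights` parameter are assumptions made for illustration; the excerpt above does not specify how SIREN-RoPE maps signals to angles.

```python
import math

def rope_angles(position, dim, base=10000.0):
    """Standard RoPE: one angle per 2-D pair, position * base^(-2i/dim)."""
    return [position * base ** (-2.0 * i / dim) for i in range(dim // 2)]

def signal_angles(position, signal, weights, dim, base=10000.0):
    """Hypothetical signal-conditioned angles (an assumption, not the paper's
    formula): the ordinal RoPE angle plus a learned weight times a signal."""
    ordinal = rope_angles(position, dim, base)
    return [a + w * signal for a, w in zip(ordinal, weights)]

def rotate(vec, angles):
    """Rotate each consecutive (even, odd) pair of vec by its angle."""
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain dot product, standing in for an unnormalized attention score."""
    return sum(x * y for x, y in zip(a, b))

def score(q, q_pos, q_sig, k, k_pos, k_sig, weights):
    """Attention score between a rotated query and a rotated key."""
    dim = len(q)
    rq = rotate(q, signal_angles(q_pos, q_sig, weights, dim))
    rk = rotate(k, signal_angles(k_pos, k_sig, weights, dim))
    return dot(rq, rk)
```

Because each pair is rotated by an angle, the score between a query and a key depends only on the difference of their angles. In this sketch that means shifting both positions by the same offset, or both signals by the same amount, leaves the score unchanged, and setting all `weights` to zero recovers standard RoPE.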
