Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling

Hailing Cheng; Daqi Sun; Xinyu Lu

arXiv:2604.24717·cs.AI·April 28, 2026

TL;DR

This paper introduces SIREN-RoPE, a learnable rotation-based encoding for Transformers. Instead of rotating only by ordinal position indices, it feeds heterogeneous signals into the rotation applied inside attention, and it reports improved ranking performance.

Contribution

The paper treats the rotation space used by positional encoding in Transformers as learnable and signal-conditioned rather than fixed and hand-crafted, and reports improved downstream task performance as a result.

Findings

01. Activating the rotation dimension improves ranking calibration.

02. SIREN-RoPE achieves consistent performance gains with negligible overhead.

03. The approach introduces heterogeneous signals into the rotation manifold.

Abstract

Every Transformer architecture dedicates enormous capacity to learning rich representations in semantic embedding space -- yet the rotation manifold acted upon by Rotary Positional Embeddings (RoPE) has been treated as a fixed, hand-crafted structure, populated only by discrete ordinal indices. We argue that this rotation space is a largely overlooked second dimension of expressivity in the attention mechanism, one whose systematic exploration may open a new door for attention-based architectures. The analogy to complex numbers is instructive: just as introducing the imaginary axis -- orthogonal to and independent of the real line -- unlocked new algebraic structure once believed impossible, treating the rotation manifold as a learnable, signal-conditioned space opens an orthogonal degree of freedom in attention. In this framing, the token embedding encodes the semantic (real) component…
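
To make the "rotation manifold" concrete, here is a minimal sketch of a standard RoPE-style rotation in plain Python. It is not the SIREN-RoPE method, whose details are not given in this excerpt. It only illustrates the property the abstract builds on: the attention score between a rotated query and key depends on the difference of the values that drive the rotation, so those values do not have to be ordinal indices. The function names, the example vectors, and the "hours since an event" signal are all hypothetical.

```python
import math

def rotate(vec, angle_source, base=10000.0):
    """Apply a RoPE-style rotation to an even-length vector.

    `angle_source` is any real number. In standard RoPE it is the ordinal
    token index; passing a continuous signal here is an illustrative
    assumption, not the paper's formulation.
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        freq = base ** (-i / d)      # per-pair frequency, as in standard RoPE
        theta = angle_source * freq  # rotation angle for this 2-D pair
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.7, 0.5]   # hypothetical query vector
k = [1.0, 0.4, -0.6, 0.9]   # hypothetical key vector

# Ordinal positions (5, 2) and (13, 10) share the same offset of 3.
score_a = dot(rotate(q, 5), rotate(k, 2))
score_b = dot(rotate(q, 13), rotate(k, 10))

# A hypothetical continuous signal (for example, hours since some event)
# with the same difference of 3 yields the same score.
score_c = dot(rotate(q, 7.25), rotate(k, 4.25))

# With equal rotation values the rotation cancels and the plain dot product remains.
score_same = dot(rotate(q, 5), rotate(k, 5))
```

Here `score_a`, `score_b`, and `score_c` agree up to floating-point error because only the offset enters the score, and `score_same` matches `dot(q, k)`. Whatever produces the rotation value, learned or hand-crafted, controls how two tokens are compared without changing their embeddings.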
