Do traveling waves make good positional encodings?
Chase van de Geijn, Ayush Paliwal, Timo Lüddecke, Alexander S. Ecker

TL;DR
This paper introduces RollPE, a traveling wave-based positional encoding for transformers that encodes relative position through phase shifts. It outperforms absolute positional embeddings and is comparable to RoPE.
Contribution
The paper proposes RollPE, a traveling wave-based positional encoding implemented as a circular roll of the query and key tensors in self-attention, and provides a mathematical link to existing methods such as RoPE.
Findings
RollPE outperforms traditional absolute positional embeddings.
It is comparable to RoPE in effectiveness.
The method offers a topographic interpretation of positional encoding.
Abstract
Transformers rely on positional encoding to compensate for the inherent permutation invariance of self-attention. Traditional approaches use absolute sinusoidal embeddings or learned positional vectors, while more recent methods emphasize relative encodings to better capture translation equivariances. In this work, we propose RollPE, a novel positional encoding mechanism based on traveling waves, implemented by applying a circular roll operation to the query and key tensors in self-attention. This operation induces a relative shift in phase across positions, allowing the model to compute attention as a function of positional differences rather than absolute indices. We show this simple method significantly outperforms traditional absolute positional embeddings and is comparable to RoPE. We derive a continuous case of RollPE which implicitly imposes a topographic structure on the query…
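The relative-position property described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each query and key vector is circularly shifted along its feature dimension by its own position index, which the abstract does not spell out. The helper names `roll`, `dot`, and `rolled_score` are hypothetical. Under that assumption, the dot product of a rolled query and a rolled key depends only on the difference between the two positions.

```python
import random

def roll(vec, shift):
    """Circularly shift a list to the right by `shift` (taken modulo its length)."""
    s = shift % len(vec)
    return vec[-s:] + vec[:-s] if s else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rolled_score(q, k, pos_q, pos_k):
    """Attention logit after rolling q and k by their own positions.

    Assumption for illustration: the shift amount equals the position index.
    """
    return dot(roll(q, pos_q), roll(k, pos_k))

rng = random.Random(0)
d = 8  # feature dimension (illustrative)
q = [rng.gauss(0, 1) for _ in range(d)]
k = [rng.gauss(0, 1) for _ in range(d)]

# Two position pairs with the same offset (key is 3 steps after query).
a = rolled_score(q, k, 2, 5)
b = rolled_score(q, k, 4, 7)
# The same logit, written purely in terms of the offset.
c = dot(q, roll(k, 3))

print(abs(a - b) < 1e-9, abs(a - c) < 1e-9)
```

In this discrete sketch the shift wraps around, so offsets that differ by a multiple of the feature dimension `d` give the same logit.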
Taxonomy
Topics: Face Recognition and Perception · Neural dynamics and brain function · Functional Brain Connectivity Studies
