LieRE: Lie Rotational Positional Encodings
Sophie Ostmeier, Brian Axelrod, Maya Varma, Michael E. Moseley, Akshay Chaudhari, Curtis Langlotz

TL;DR
LieRE introduces a learnable, high-dimensional rotational positional encoding for transformers, improving their ability to model complex spatial structures in vision tasks while maintaining efficiency.
Contribution
We propose LieRE, a novel generalization of RoPE that learns dense skew-symmetric matrices to encode positional information in high-dimensional spaces.
Findings
LieRE outperforms traditional RoPE in 2D and 3D vision tasks.
LieRE generalizes well to higher resolutions.
It maintains computational efficiency.
Abstract
Transformer architectures rely on position encodings to model the spatial structure of input data. Rotary Position Encoding (RoPE) is a widely used method in language models that encodes relative positions through fixed, block-diagonal rotation matrices applied to key-query interactions. We hypothesize that this inductive bias limits RoPE's effectiveness for modalities with high-dimensional structure. Lie Relative Encodings (LieRE) introduce a principled generalization of RoPE, aimed at increasing the representational capacity of positional encodings in transformers. Instead of fixed 2D rotations, LieRE learns dense skew-symmetric matrices (Lie algebra elements), which are then differentiably mapped to form high-dimensional rotation matrices (Lie group elements). This results in richer, learnable, and continuous encodings of both relative and absolute positional information. We…
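The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names (skew, expm, rotation), the dimensions, the random stand-ins for learned parameters, and the Taylor-series matrix exponential are all assumptions chosen for illustration. The sketch builds one dense skew-symmetric generator per spatial axis, maps a position to a rotation matrix through the matrix exponential, and rotates a query and a key before taking their dot product.

```python
import numpy as np

def skew(params, d):
    """Build a d x d skew-symmetric matrix from d*(d-1)/2 free parameters."""
    A = np.zeros((d, d))
    A[np.triu_indices(d, k=1)] = params
    return A - A.T

def expm(M, squarings=8, terms=18):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series.
    (Illustrative only; a library routine would normally be used instead.)"""
    X = M / (2 ** squarings)
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ X / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def rotation(pos, generators):
    """R(pos) = exp(sum_i pos[i] * A_i), with one generator A_i per spatial axis."""
    A = sum(p * G for p, G in zip(pos, generators))
    return expm(A)

rng = np.random.default_rng(0)
d, n_axes = 8, 2                      # hypothetical head dimension and 2D positions
n_free = d * (d - 1) // 2
# Random values stand in for parameters that would be learned during training.
generators = [skew(rng.normal(scale=0.3, size=n_free), d) for _ in range(n_axes)]

q = rng.normal(size=d)                # query for a token at position (1.0, 2.0)
k = rng.normal(size=d)                # key for a token at position (3.0, 0.5)
R_q = rotation((1.0, 2.0), generators)
R_k = rotation((3.0, 0.5), generators)
score = (R_q @ q) @ (R_k @ k)         # attention logit on the rotated vectors

# The exponential of a skew-symmetric matrix is orthogonal, so R^T R = I.
orthogonality_error = np.abs(R_q.T @ R_q - np.eye(d)).max()

# With a single generator (1D positions), R(p)^T R(p') = R(p' - p) exactly,
# so the logit depends only on the relative offset, as in RoPE.
A = generators[0]
relative_error = np.abs(expm(2.0 * A).T @ expm(5.0 * A) - expm(3.0 * A)).max()
```

Both error values should be at the level of floating-point round-off. Note that the relative-offset identity is checked here only for a single generator; when several dense generators do not commute, it does not hold exactly in general. RoPE corresponds to the special case where each generator is block-diagonal with fixed 2x2 blocks.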
Taxonomy
Topics: Algorithms and Data Compression · Advanced Image and Video Retrieval Techniques · Natural Language Processing Techniques
Methods: Dense Connections · Dropout · Feedforward Network · Linear Layer · Multi-Head Attention · Softmax · Attention Dropout · Attention Is All You Need · Data-efficient Image Transformer · Relative Position Encodings
