ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices

Hao Yu; Tangyu Jiang; Shuning Jia; Shannan Yan; Shunning Liu; Haolong Qian; Guanghao Li; Shuting Dong; Huaisong Zhang; Chun Yuan

arXiv:2506.03737·cs.CV·June 5, 2025


TL;DR

ComRoPE introduces trainable commuting angle matrices to enhance rotary positional encoding in Transformers, improving scalability, robustness, and performance over existing methods, with theoretical analysis and state-of-the-art results on ImageNet-1K.

Contribution

It generalizes RoPE using trainable commuting angle matrices, providing a scalable, robust, and theoretically grounded positional encoding method with superior performance.

Findings

1. Surpasses the previous state-of-the-art method by 1.6% on ImageNet-1K at the training resolution.

2. Surpasses it by 2.9% at higher resolution.

3. Comes with a theoretical guarantee of consistent behavior under position offsets.

Abstract

The Transformer architecture has revolutionized various fields since it was proposed, and its effectiveness largely depends on its ability to encode positional information. Traditional position encoding methods exhibit significant limitations because they lack robustness and flexibility with respect to position. Rotary Positional Encoding (RoPE) was proposed to alleviate these issues; it integrates positional information by rotating the embeddings in the attention mechanism. However, RoPE requires manually defined rotation matrices with a limited transformation space, which constrains the model's capacity. In this work, we propose ComRoPE, which generalizes RoPE by defining it in terms of trainable commuting angle matrices. Specifically, we demonstrate that pairwise commutativity of these matrices is essential for RoPE to achieve scalability and positional robustness. We formally define the…
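
The relative-position property that the abstract ties to commuting angle matrices can be checked numerically in the simplest commuting case. The following is a minimal Python sketch, not the authors' implementation. It assumes the rotation is the matrix exponential of a position-weighted sum of skew-symmetric angle matrices (the truncated abstract does not state the exact definition), and it restricts those matrices to block-diagonal form with 2x2 blocks, which always commute and whose exponential reduces to plane rotations. The names `rotate`, `score`, and `freqs`, and all numeric values, are hypothetical.

```python
import math


def rotate(vec, pos, freqs):
    """Rotate consecutive pairs of `vec` by angles linear in the 2-D position `pos`.

    freqs[b] = (a_b, c_b) are hypothetical angle coefficients of block b for the
    two position axes. Block b is rotated by theta_b = a_b * pos[0] + c_b * pos[1],
    which equals exp(pos[0] * A_1 + pos[1] * A_2) for block-diagonal,
    skew-symmetric (and therefore commuting) A_1 and A_2.
    """
    out = []
    for b, (a, c) in enumerate(freqs):
        theta = a * pos[0] + c * pos[1]
        x, y = vec[2 * b], vec[2 * b + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def score(q, k, pos_q, pos_k, freqs):
    """Dot-product attention score between a rotated query and a rotated key."""
    rq = rotate(q, pos_q, freqs)
    rk = rotate(k, pos_k, freqs)
    return sum(u * v for u, v in zip(rq, rk))


# Illustrative values only; in a trainable variant `freqs` would be learned.
freqs = [(0.7, -0.3), (0.05, 1.1)]
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

base = score(q, k, (3, 5), (8, 2), freqs)
# Shift both positions by the same offset (+100, -40).
shifted = score(q, k, (103, -35), (108, -38), freqs)
assert math.isclose(base, shifted, rel_tol=1e-9, abs_tol=1e-9)
```

Shifting both positions by the same offset leaves the score unchanged, so the score depends only on the relative position; this is the kind of positional robustness the abstract refers to. With angle matrices that do not commute, the exponential of the sum no longer factors into per-axis rotations, so the same check would generally fail.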

Code & Models

Repositories

longin-yu/comrope (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Multimodal Machine Learning Applications

Methods: Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · Attention Is All You Need