URoPE: Universal Relative Position Embedding across Geometric Spaces

Yichen Xie; Depu Meng; Chensheng Peng; Yihan Hu; Quentin Herau; Masayoshi Tomizuka; Wei Zhan

arXiv:2604.18747 · cs.CV · April 22, 2026

TL;DR

URoPE extends rotary position embeddings to cross-view and cross-dimensional geometric spaces, enabling transformers to better handle geometric reasoning in diverse vision tasks.

Contribution

The paper introduces a parameter-free, intrinsics-aware relative position embedding that generalizes RoPE to cross-view and cross-dimensional (2D/3D) settings, improving transformer performance across multiple vision tasks.

Findings

01 URoPE improves transformer performance in view synthesis, 3D detection, and tracking.

02 URoPE is invariant to the choice of global coordinate system and compatible with existing RoPE kernels.

03 Experiments demonstrate URoPE's effectiveness across diverse geometric vision tasks.

Abstract

Relative position embedding has become a standard mechanism for encoding positional information in Transformers. However, existing formulations are typically limited to a fixed geometric space, namely 1D sequences or regular 2D/3D grids, which restricts their applicability to many computer vision tasks that require geometric reasoning across camera views or between 2D and 3D spaces. To address this limitation, we propose URoPE, a universal extension of Rotary Position Embedding (RoPE) to cross-view or cross-dimensional geometric spaces. For each key/value image patch, URoPE samples 3D points along the corresponding camera ray at predefined depth anchors and projects them into the query image plane. Standard 2D RoPE can then be applied using the projected pixel coordinates. URoPE is a parameter-free and intrinsics-aware relative position embedding that is invariant to the choice of…
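The projection step described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera model, treats each patch as a single pixel at its center, and uses hypothetical names throughout (`relative_pose`, `project_key_pixel`, `rope_2d`, the depth anchor values, and the frequency base). How URoPE assigns the different depth anchors to feature channels or heads is not stated in the text above, so the sketch stops at producing one projected 2D coordinate per anchor and applying an ordinary 2D RoPE rotation with one of them.

```python
import numpy as np

def relative_pose(T_world_from_query, T_world_from_key):
    """4x4 transform taking key-camera coordinates to query-camera coordinates."""
    return np.linalg.inv(T_world_from_query) @ T_world_from_key

def project_key_pixel(uv_key, K_key, K_query, T_query_from_key, depths):
    """Sample 3D points on the key pixel's ray at each depth anchor and
    project them into the query image. Returns an array of shape (D, 2).
    Assumption: all points land in front of the query camera (z > 0)."""
    # Back-project the pixel to a ray in the key camera frame (z = 1).
    ray = np.linalg.inv(K_key) @ np.array([uv_key[0], uv_key[1], 1.0])
    pts_key = depths[:, None] * ray[None, :]          # (D, 3), one point per anchor
    R, t = T_query_from_key[:3, :3], T_query_from_key[:3, 3]
    pts_query = pts_key @ R.T + t                     # (D, 3) in the query frame
    proj = pts_query @ K_query.T                      # homogeneous pixel coordinates
    return proj[:, :2] / proj[:, 2:3]                 # perspective divide

def rope_2d(x, uv, base=100.0):
    """Plain 2D RoPE on a 1D feature vector whose length is divisible by 4:
    the first half is rotated by u, the second half by v."""
    half = x.shape[-1] // 2
    freqs = base ** (-np.arange(0, half, 2) / half)   # one frequency per channel pair
    out = np.empty(x.shape[-1], dtype=float)
    for axis, pos in enumerate(uv):
        seg = x[axis * half:(axis + 1) * half]
        cos, sin = np.cos(pos * freqs), np.sin(pos * freqs)
        rot = np.empty(half)
        rot[0::2] = seg[0::2] * cos - seg[1::2] * sin
        rot[1::2] = seg[0::2] * sin + seg[1::2] * cos
        out[axis * half:(axis + 1) * half] = rot
    return out

# Usage with made-up intrinsics and depth anchors.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
depths = np.array([1.0, 5.0, 20.0])
T = relative_pose(np.eye(4), np.eye(4))               # same camera for query and key
uv = project_key_pixel((100.0, 50.0), K, K, T, depths)
k_rot = rope_2d(np.ones(8), uv[0])                    # rotate a key feature by one anchor
```

Because the projection uses only the relative pose between the two cameras and their intrinsics, re-expressing both camera poses in a different world frame leaves the projected coordinates unchanged, which is consistent with the coordinate-system invariance the paper claims.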

Code & Models

Repositories

https://urope-pe.github.io
