Eff-GRot: Efficient and Generalizable Rotation Estimation with Transformers
Fanis Mathioulakis, Gorjan Radevski, Tinne Tuytelaars

TL;DR
Eff-GRot is a transformer-based method that efficiently estimates object rotation from RGB images using reference comparisons, achieving a good balance of accuracy and speed without category-specific training.
Contribution
The paper introduces a novel transformer framework for rotation estimation that is efficient and generalizes across object categories without object- or category-specific training.
Findings
Achieves high accuracy with low latency
Operates without object-category-specific training
Scalable and end-to-end design
Abstract
We introduce Eff-GRot, an approach for efficient and generalizable rotation estimation from RGB images. Given a query image and a set of reference images with known orientations, our method directly predicts the object's rotation in a single forward pass, without requiring object- or category-specific training. At the core of our framework is a transformer that performs a comparison in the latent space, jointly processing rotation-aware representations from multiple references alongside a query. This design enables a favorable balance between accuracy and computational efficiency while remaining simple, scalable, and fully end-to-end. Experimental results show that Eff-GRot offers a promising direction toward more efficient rotation estimation, particularly in latency-sensitive applications.
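The reference-comparison idea in the abstract can be illustrated with a toy sketch. This is not the Eff-GRot architecture: it assumes a single softmax-attention step stands in for the transformer, that feature vectors are already given (no image encoder), and that the prediction is a weighted blend of reference rotations projected back onto a valid rotation matrix. The names `rot_z`, `project_to_so3`, `estimate_rotation`, and `temperature` are hypothetical and chosen only for illustration.

```python
import numpy as np


def rot_z(deg):
    """Hypothetical helper: rotation matrix about the z-axis by `deg` degrees."""
    t = np.deg2rad(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def project_to_so3(m):
    """Return the rotation matrix closest to `m` in Frobenius norm, via SVD."""
    u, _, vt = np.linalg.svd(m)
    # Flip the last singular direction if needed so the determinant is +1.
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0
    return u @ np.diag([1.0, 1.0, d]) @ vt


def estimate_rotation(query_feat, ref_feats, ref_rots, temperature=0.1):
    """Toy stand-in for latent-space comparison (assumption, not the paper's model).

    query_feat: (d,) latent vector of the query image.
    ref_feats:  (n, d) latent vectors of the reference images.
    ref_rots:   (n, 3, 3) known rotations of the references.
    Returns the estimated rotation and the attention weights over references.
    """
    q = query_feat / np.linalg.norm(query_feat)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    logits = (r @ q) / temperature           # cosine similarity, sharpened
    w = np.exp(logits - logits.max())        # numerically stable softmax
    w /= w.sum()
    blended = np.einsum("n,nij->ij", w, ref_rots)
    return project_to_so3(blended), w


# Usage with placeholder features: the query matches the second reference.
ref_feats = np.eye(3, 4)
ref_rots = np.stack([rot_z(0), rot_z(90), rot_z(180)])
R_est, weights = estimate_rotation(ref_feats[1], ref_feats, ref_rots)
```

In this sketch the whole estimate is a single matrix product, a softmax, and one 3x3 SVD. It only mirrors the single-forward-pass, reference-based setup the abstract describes. The learned transformer and rotation-aware representations are not modeled here.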
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Image Retrieval and Classification Techniques
