Rigidity Preserving Image Transformations and Equivariance in Perspective
Lucas Brynte, Georg Bökman, Axel Flinth, Fredrik Kahl

TL;DR
This paper introduces rigidity preserving image transformations, characterizes their properties, and explores how CNNs can be made approximately equivariant to them, which improves results on 3D inference tasks.
Contribution
The paper defines rigidity preserving transformations as the image plane transformations that realize rigid camera motions, analyzes their properties, and proposes ways to approximate equivariance to them in CNNs for better 3D inference.
Findings
Improved accuracy in 6D object pose estimation
Enhanced visual localization performance
Demonstrated benefits of rigidity equivariance in CNNs
Abstract
We characterize the class of image plane transformations which realize rigid camera motions and call these transformations 'rigidity preserving'. In particular, 2D translations of pinhole images are not rigidity preserving. Hence, when using CNNs for 3D inference tasks, it can be beneficial to modify the inductive bias from equivariance towards translations to equivariance towards rigidity preserving transformations. We investigate how equivariance with respect to rigidity preserving transformations can be approximated in CNNs, and test our ideas on both 6D object pose estimation and visual localization. Experimentally, we improve on several competitive baselines.
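The claim that 2D translations of pinhole images do not correspond to a rigid camera motion can be illustrated with a small numerical sketch. It is not taken from the paper. It assumes a standard pinhole camera with a hypothetical intrinsic matrix `K` and uses the textbook fact that a pure camera rotation `R` moves pixels by the homography `K R K^{-1}`, independently of scene depth. All function names are illustrative, and whether this check matches the paper's exact characterization is an assumption.

```python
import numpy as np

def rotation_y(theta):
    """Rotation about the camera's y-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(K, X):
    """Pinhole projection of a 3D point X (camera coordinates) to pixels."""
    x = K @ X
    return x[:2] / x[2]

def rotation_homography(K, R):
    """Image plane transformation induced by rotating the camera by R."""
    return K @ R @ np.linalg.inv(K)

def apply_homography(H, p):
    """Apply a 3x3 homography to a 2D pixel coordinate."""
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

def is_rotation_induced(K, H, tol=1e-9):
    """Check whether K^{-1} H K is a rotation up to projective scale."""
    M = np.linalg.inv(K) @ H @ K
    M = M / np.cbrt(np.linalg.det(M))  # normalize determinant to 1
    return bool(np.allclose(M.T @ M, np.eye(3), atol=tol))

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

R = rotation_y(np.deg2rad(10.0))
H = rotation_homography(K, R)

# Two scene points on the same viewing ray, at different depths.
X_near = np.array([0.2, -0.1, 1.0])
X_far = 5.0 * X_near

# Warping the original pixel with H agrees with re-projecting the rotated
# scene point, regardless of depth.
warped = apply_homography(H, project(K, X_near))
reprojected_near = project(K, R @ X_near)
reprojected_far = project(K, R @ X_far)

# A 2D image translation by (40, 0) pixels, written as a homography.
T = np.array([[1.0, 0.0, 40.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

rotation_ok = is_rotation_induced(K, H)
translation_ok = is_rotation_induced(K, T)
```

Under these assumptions `rotation_ok` is true and `translation_ok` is false: conjugating the translation by `K` gives a shear-like matrix rather than a rotation, so no camera rotation reproduces that pixel shift.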
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Image and Object Detection Techniques
