TL;DR
This paper introduces a lightweight, graph neural network-based approach to visual camera re-localization. It learns from relative poses only and requires neither camera intrinsics nor multiple query frames, while remaining competitive in accuracy and efficiency.
Contribution
The proposed method leverages relative pose supervision and graph neural networks to improve camera re-localization without needing intrinsics or extensive geometric optimization.
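To make "relative pose supervision" concrete, the following is a minimal sketch of the underlying pose algebra only, not the paper's network. It assumes poses are stored as 4x4 camera-to-world matrices; that convention and the helper names (`make_pose`, `relative_pose`, `recover_pose`) are hypothetical choices for illustration. It shows how a relative pose between two cameras is formed and how a query's absolute pose could be recovered from a reference pose plus a relative pose.

```python
import numpy as np

def make_pose(yaw_rad, translation):
    """Build a 4x4 camera-to-world matrix (assumed convention) from a
    rotation about the z-axis and a 3-vector translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0],
                          [s,  c, 0.0],
                          [0.0, 0.0, 1.0]])
    T[:3, 3] = translation
    return T

def relative_pose(T_ref, T_query):
    """Relative transform mapping reference-camera coordinates to
    query-camera coordinates: inv(T_query) @ T_ref."""
    return np.linalg.inv(T_query) @ T_ref

def recover_pose(T_ref, T_rel):
    """Recover the query's camera-to-world pose from a known reference
    pose and the relative pose defined in relative_pose()."""
    return T_ref @ np.linalg.inv(T_rel)

# Hypothetical example poses (values chosen only for illustration).
T_ref = make_pose(0.3, [1.0, 2.0, 0.5])
T_query = make_pose(-0.7, [0.2, -1.0, 1.5])

T_rel = relative_pose(T_ref, T_query)
T_query_recovered = recover_pose(T_ref, T_rel)
```

In this sketch no camera intrinsics appear anywhere: the computation involves only rigid transforms. In a learned setting, `T_rel` would be a network prediction rather than a value computed from ground truth.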
Findings
Matches the accuracy of absolute pose regression methods
Retains fast test-time speed and scene generalization
Effective on both indoor and outdoor benchmarks
Abstract
Visual re-localization means using a single image as input to estimate the camera's location and orientation relative to a pre-recorded environment. The highest-scoring methods are "structure based," and need the query camera's intrinsics as an input to the model, with careful geometric optimization. When intrinsics are absent, methods vie for accuracy by making various other assumptions. This yields fairly good localization scores, but the models are "narrow" in some way, e.g., requiring costly test-time computations, or depth sensors, or multiple query frames. In contrast, our proposed method makes few special assumptions, and is fairly lightweight in training and testing. Our pose regression network learns from only relative poses of training scenes. For inference, it builds a graph connecting the query image to training counterparts and uses a graph neural network (GNN) with image…
Taxonomy
Methods: Graph Neural Network
