TL;DR
This paper critically examines how the choice of pseudo ground truth algorithms influences the evaluation of visual camera re-localisation methods, revealing biases and questioning established performance rankings.
Contribution
It analyzes the impact of reference algorithms on re-localisation benchmarks and highlights potential biases affecting method rankings.
Findings
Evaluation outcomes vary with the reference algorithm used.
Claims about the superiority of learning-based over classical methods are challenged.
How closely the reference algorithm resembles an evaluated method significantly influences that method's measured performance.
Abstract
Benchmark datasets that measure camera pose accuracy have driven progress in visual re-localisation research. To obtain poses for thousands of images, it is common to use a reference algorithm to generate pseudo ground truth. Popular choices include Structure-from-Motion (SfM) and Simultaneous-Localisation-and-Mapping (SLAM) using additional sensors like depth cameras if available. Re-localisation benchmarks thus measure how well each method replicates the results of the reference algorithm. This raises the question of whether the choice of the reference algorithm favours a certain family of re-localisation methods. This paper analyzes two widely used re-localisation datasets and shows that evaluation outcomes indeed vary with the choice of the reference algorithm. We thus question common beliefs in the re-localisation literature, namely that learning-based scene coordinate regression…
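To make concrete what "replicating the results of the reference algorithm" means, here is a minimal sketch of how a pose estimate is typically scored against a pseudo ground truth pose. It is an illustration, not the paper's evaluation code: the function names `pose_errors` and `recall`, the 4x4 camera-to-world matrix convention, metres as the translation unit, and the 5 cm / 5 degree thresholds are all assumptions made for the example.

```python
import numpy as np


def pose_errors(T_est, T_ref):
    """Compare two 4x4 camera poses (assumed camera-to-world, metres).

    Returns (translation error, rotation error in degrees).
    """
    # Translation error: Euclidean distance between camera positions.
    t_err = np.linalg.norm(T_est[:3, 3] - T_ref[:3, 3])

    # Rotation error: angle of the relative rotation between the two poses.
    R_rel = T_est[:3, :3].T @ T_ref[:3, :3]
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    r_err = np.degrees(np.arccos(cos_angle))
    return t_err, r_err


def recall(estimates, references, t_thresh=0.05, r_thresh=5.0):
    """Fraction of estimates within both thresholds of the reference poses.

    The default thresholds (5 cm, 5 degrees) are an assumption for this sketch.
    """
    hits = 0
    for T_est, T_ref in zip(estimates, references):
        t_err, r_err = pose_errors(T_est, T_ref)
        if t_err < t_thresh and r_err < r_thresh:
            hits += 1
    return hits / len(estimates)
```

Calling `recall` on the same estimates with reference poses from two different algorithms (for example, one list from SfM and one from depth-based SLAM) yields two scores for one method. A gap between those scores reflects the choice of reference rather than any change in the method itself.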
