Joint Multi-Condition Representation Modelling via Matrix Factorisation for Visual Place Recognition
Timur Ismagilov, Shakaiba Majeed, Michael Milford, Tan Viet Tuyen Nguyen, Sarvapali D. Ramchurn, Shoaib Ehsan

TL;DR
This paper introduces a training-free, matrix factorization-based method for multi-reference visual place recognition that improves localization accuracy under appearance and viewpoint variation.
Contribution
The paper proposes a descriptor-agnostic approach that models each place jointly from multiple reference descriptors via matrix decomposition, and introduces SotonMV, a structured benchmark for multi-viewpoint VPR.
Findings
Improves Recall@1 by up to ~18% on multi-appearance data.
Outperforms existing multi-reference baselines across various conditions.
Demonstrates strong generalization with lightweight computation.
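For readers unfamiliar with the metric, Recall@1 is the fraction of queries whose top-ranked retrieved place is correct. The sketch below is a minimal illustration that assumes an exact place-index match counts as correct; VPR benchmarks often accept a match within some distance or frame tolerance, and the tolerance used in the paper is not stated in this summary. The function name `recall_at_1` is hypothetical.

```python
import numpy as np

def recall_at_1(predicted_ids, ground_truth_ids):
    """Fraction of queries whose top-1 predicted place equals the true place.

    Assumption: correctness means exact index equality (no tolerance window).
    """
    predicted = np.asarray(predicted_ids)
    truth = np.asarray(ground_truth_ids)
    if predicted.shape != truth.shape:
        raise ValueError("predictions and ground truth must have the same shape")
    return float(np.mean(predicted == truth))
```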
Abstract
We address multi-reference visual place recognition (VPR), where reference sets captured under varying conditions are used to improve localisation performance. While deep learning with large-scale training improves robustness, increasing data diversity and model complexity incur extensive computational cost during training and deployment. Descriptor-level fusion via voting or aggregation avoids training, but often targets multi-sensor setups or relies on heuristics with limited gains under appearance and viewpoint change. We propose a training-free, descriptor-agnostic approach that jointly models places using multiple reference descriptors via matrix decomposition into basis representations, enabling projection-based residual matching. We also introduce SotonMV, a structured benchmark for multi-viewpoint VPR. On multi-appearance data, our method improves Recall@1 by up to ~18% over…
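The idea of factorising a place's reference descriptors into a basis and matching by projection residual can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which decomposition is used, so a truncated SVD stands in for it here, and the function names (`place_basis`, `residual`, `match`) and the `rank` parameter are hypothetical.

```python
import numpy as np

def place_basis(ref_descriptors, rank):
    """Build an orthonormal basis for one place.

    ref_descriptors: (k, d) array holding k reference descriptors of the same
    place captured under different conditions.
    Assumption: truncated SVD is used as the matrix decomposition.
    Returns a (d, r) matrix with orthonormal columns, r = min(rank, k, d).
    """
    D = np.asarray(ref_descriptors, dtype=float).T  # columns are descriptors
    U, _, _ = np.linalg.svd(D, full_matrices=False)
    return U[:, :rank]

def residual(query, basis):
    """Norm of the part of the query not explained by the place basis."""
    projection = basis @ (basis.T @ query)
    return float(np.linalg.norm(query - projection))

def match(query, bases):
    """Index of the place whose basis leaves the smallest residual."""
    return int(np.argmin([residual(query, B) for B in bases]))
```

Because the sketch only consumes descriptor vectors, it is indifferent to which descriptor extractor produced them, which is the sense in which such an approach can be descriptor-agnostic.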
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Face recognition and analysis
