TL;DR
Patch-NetVLAD derives patch-level features from NetVLAD residuals and fuses them across multiple patch sizes, making visual place recognition more robust to appearance and viewpoint changes in real-world environments.
Contribution
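The paper proposes a patch-level feature formulation derived from NetVLAD residuals, together with a multi-scale fusion of patch features of different sizes. The authors report that the fused features are more invariant to condition and viewpoint changes and outperform existing descriptors.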
Findings
Outperforms existing global and local descriptors on challenging datasets
Achieves state-of-the-art results in visual place recognition
Offers a faster, configurable version suitable for real-time applications
Abstract
Visual Place Recognition is a challenging task for robotics and autonomous systems, which must deal with the twin problems of appearance and viewpoint change in an always changing world. This paper introduces Patch-NetVLAD, which provides a novel formulation for combining the advantages of both local and global descriptor methods by deriving patch-level features from NetVLAD residuals. Unlike the fixed spatial neighborhood regime of existing local keypoint features, our method enables aggregation and matching of deep-learned local features defined over the feature-space grid. We further introduce a multi-scale fusion of patch features that have complementary scales (i.e. patch sizes) via an integral feature space and show that the fused features are highly invariant to both condition (season, structure, and illumination) and viewpoint (translation and rotation) changes. Patch-NetVLAD…
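The abstract mentions aggregating patch features at several patch sizes "via an integral feature space". The sketch below illustrates that general idea with a summed-area table over a grid of feature vectors, so that the aggregate for any square patch costs four lookups per dimension regardless of patch size. It is a minimal illustration under stated assumptions, not the authors' implementation: the per-location vectors merely stand in for NetVLAD residuals, plain summation is assumed as the aggregation, and the names `integral_features`, `patch_sum`, and `multi_scale_patches` are hypothetical. Descriptor matching and the weighting used to fuse scales are not covered.

```python
from typing import Dict, List

Vector = List[float]
Grid = List[List[Vector]]  # grid[row][col] is a D-dimensional feature vector


def integral_features(grid: Grid) -> Grid:
    """Build a summed-area table over a grid of feature vectors.

    table[r][c] holds the element-wise sum of grid[0..r-1][0..c-1], so the
    table carries one extra leading row and column of zeros.
    """
    rows, cols, dim = len(grid), len(grid[0]), len(grid[0][0])
    table = [[[0.0] * dim for _ in range(cols + 1)] for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            for k in range(dim):
                table[r + 1][c + 1][k] = (
                    grid[r][c][k]
                    + table[r][c + 1][k]
                    + table[r + 1][c][k]
                    - table[r][c][k]
                )
    return table


def patch_sum(table: Grid, top: int, left: int, size: int) -> Vector:
    """Sum the size x size patch whose top-left cell is (top, left)."""
    bottom, right = top + size, left + size
    dim = len(table[0][0])
    return [
        table[bottom][right][k]
        - table[top][right][k]
        - table[bottom][left][k]
        + table[top][left][k]
        for k in range(dim)
    ]


def multi_scale_patches(
    grid: Grid, sizes: List[int], stride: int = 1
) -> Dict[int, List[Vector]]:
    """Map each patch size to its aggregated patch descriptors (row-major)."""
    table = integral_features(grid)  # built once, reused for every scale
    rows, cols = len(grid), len(grid[0])
    return {
        size: [
            patch_sum(table, r, c, size)
            for r in range(0, rows - size + 1, stride)
            for c in range(0, cols - size + 1, stride)
        ]
        for size in sizes
    }
```

The table is built once and shared across all patch sizes, which is what makes adding further scales cheap in this sketch: each extra scale only adds the per-patch lookups, not another pass over the feature grid.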
