TL;DR
This paper introduces a saliency-weighted convolutional feature framework for instance search. Weighting local features with human visual attention (saliency) models significantly improves retrieval performance without region analysis, spatial verification, or feature fine-tuning.
Contribution
It presents a retrieval framework built on saliency-weighted bags of local convolutional features (BLCF) that outperforms the state of the art on the INSTRE benchmark and performs comparably on Oxford and Paris.
Findings
Outperforms the state of the art on the INSTRE benchmark by a large margin
Achieves comparable results on the Oxford and Paris benchmarks
A saliency model's higher score on saliency benchmarks does not necessarily translate into better instance search performance
Abstract
This work explores attention models to weight the contribution of local convolutional representations for the instance search task. We present a retrieval framework based on bags of local convolutional features (BLCF) that benefits from saliency weighting to build an efficient image representation. The use of human visual attention models (saliency) allows significant improvements in retrieval performance without the need to conduct region analysis or spatial verification, and without requiring any feature fine-tuning. We investigate the impact of different saliency models, finding that higher performance on saliency benchmarks does not necessarily equate to improved performance when used in instance search tasks. The proposed approach outperforms the state-of-the-art on the challenging INSTRE benchmark by a large margin, and provides similar performance on the Oxford and Paris…
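The saliency weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the nearest-centroid assignment, the L2 normalization, and the assumption that the saliency map has already been resized to the spatial size of the feature map are all illustrative choices. Each local convolutional feature is mapped to a visual word, and each spatial location then votes for its word with a weight equal to its saliency.

```python
import numpy as np

def assign_visual_words(features, codebook):
    """Map each local feature in an (H, W, D) map to its nearest centroid in a (K, D) codebook.

    Hypothetical helper: nearest-centroid assignment is an assumption of this sketch.
    """
    h, w, d = features.shape
    flat = features.reshape(-1, d)
    # Squared Euclidean distance between every local feature and every centroid.
    dists = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1).reshape(h, w)

def saliency_weighted_bow(assignments, saliency, vocab_size):
    """Build a bag-of-words histogram in which each location votes with its saliency.

    Assumes `saliency` has the same (H, W) shape as `assignments`.
    """
    hist = np.bincount(
        assignments.ravel(), weights=saliency.ravel(), minlength=vocab_size
    )
    norm = np.linalg.norm(hist)
    # L2-normalize so that a dot product between two histograms is a cosine similarity.
    return hist / norm if norm > 0 else hist

# Toy example: a 2x2 feature map with 2-D features and a 2-word codebook.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
features = np.array([[[0.0, 0.0], [1.0, 1.0]],
                     [[0.9, 1.0], [0.1, 0.0]]])
saliency = np.array([[0.0, 3.0],
                     [0.0, 4.0]])

assignments = assign_visual_words(features, codebook)
bow = saliency_weighted_bow(assignments, saliency, vocab_size=2)
```

In this toy case, locations with zero saliency contribute nothing to the histogram, so background regions are suppressed without any explicit region analysis. Query and database images represented this way could then be ranked by the dot product of their histograms.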
