A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval
Federico Magliani, Tomaso Fontanini, Andrea Prati

TL;DR
This paper introduces a dense-depth representation for VLAD descriptors built on CNN feature maps. Increasing the density of extracted features improves image retrieval performance; the method is tested on multiple public datasets.
Contribution
It proposes a new detector on CNN feature maps to generate more features, improving VLAD-based image retrieval accuracy.
Findings
Improved retrieval performance on Holidays, Oxford5k, Paris6k, and UKB datasets.
Enhanced feature density leads to better aggregation in VLAD descriptors.
Method outperforms existing approaches in benchmark tests.
Abstract
Recent advances in deep learning have improved performance in image retrieval tasks. The many convolutional layers of a Convolutional Neural Network (CNN) yield a hierarchy of features from the evaluated image. At each successive level, the extracted patches are smaller than those of the previous levels and more representative. Following this idea, this paper introduces a new detector applied to the feature maps extracted from a pre-trained CNN. Specifically, this approach increases the number of features in order to improve the performance of aggregation algorithms such as the widely used VLAD embedding. The proposed approach is tested on several public datasets: Holidays, Oxford5k, Paris6k and UKB.
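To make the aggregation step concrete, the following is a minimal sketch of standard VLAD encoding applied to dense local descriptors read off a CNN feature map. It is not the paper's detector: the function names `feature_map_to_descriptors` and `vlad`, the one-descriptor-per-spatial-location sampling, and the plain L2 normalization are all illustrative assumptions, and the codebook is assumed to have been learned beforehand (for example with k-means).

```python
import numpy as np


def feature_map_to_descriptors(fmap):
    """Turn a (C, H, W) feature map into H*W local descriptors of size C.

    Assumption: every spatial location yields one descriptor. The paper's
    detector selects features differently; this is only a dense baseline.
    """
    c, h, w = fmap.shape
    return fmap.reshape(c, h * w).T


def vlad(descriptors, centroids):
    """Standard VLAD: sum of residuals to the nearest centroid, L2-normalized.

    descriptors: (n, d) local features
    centroids:   (k, d) codebook, assumed to be learned beforehand
    returns:     (k * d,) global descriptor
    """
    k, d = centroids.shape
    # Squared Euclidean distance from every descriptor to every centroid.
    diffs = descriptors[:, None, :] - centroids[None, :, :]
    nearest = (diffs ** 2).sum(axis=2).argmin(axis=1)

    v = np.zeros((k, d))
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members):
            # Accumulate residuals of the descriptors assigned to centroid i.
            v[i] = (members - centroids[i]).sum(axis=0)

    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

Under these assumptions, a feature map with more spatial locations (or more sampled locations per map) contributes more residuals per centroid, which is the intuition behind raising feature density before aggregation.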
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Remote-Sensing Image Classification
