TL;DR
This paper introduces a fully automated method for fine-tuning CNNs for image retrieval using 3D model guidance, improving performance without human annotations and achieving state-of-the-art results.
Contribution
It presents a novel, fully automated approach to fine-tuning CNNs for image retrieval, using reconstructed 3D models to select the training data, and introduces a new generalized-mean (GeM) pooling layer.
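As a rough illustration of the GeM pooling idea, the sketch below applies the standard generalized-mean formula to each channel of a convolutional feature map. The function name `gem_pool`, the default exponent `p=3.0`, and the `eps` clamp are illustrative assumptions, not values taken from this summary. In a trainable layer the exponent would be a learned parameter rather than a fixed argument.

```python
import numpy as np

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over the spatial dimensions.

    feature_map: array of shape (K, H, W) with non-negative activations
                 (e.g. after a ReLU). Returns a vector of shape (K,).
    p and eps are illustrative defaults (assumptions, not from the source).
    """
    # Clamp to a small positive value so the power is well defined.
    x = np.clip(feature_map, eps, None)
    # Per channel: (mean of x^p)^(1/p).
    return np.mean(x ** p, axis=(1, 2)) ** (1.0 / p)

# One channel with a 2x2 spatial map.
fmap = np.array([[[1.0, 2.0], [3.0, 4.0]]])
avg_like = gem_pool(fmap, p=1.0)   # p = 1 reduces to average pooling
max_like = gem_pool(fmap, p=50.0)  # large p approaches max pooling
```

With `p = 1` the result equals average pooling, and as `p` grows the result approaches max pooling, so the exponent interpolates between the two common choices.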
Findings
Enhanced retrieval accuracy with geometry-based hard example mining.
Discriminative CNN descriptor whitening outperforms PCA whitening.
State-of-the-art results on Oxford, Paris, and Holidays datasets.
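The hard-negative side of the example mining mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes each training image carries the ID of the 3D model (cluster) it belongs to, that descriptors are L2-normalized so a dot product gives cosine similarity, and that at most one negative is taken per cluster to keep the negatives diverse. The function and parameter names are hypothetical.

```python
import numpy as np

def hard_negatives(query_desc, pool_descs, pool_clusters, query_cluster, k=5):
    """Pick up to k hard negatives for one query (illustrative sketch).

    query_desc:    (D,) L2-normalized query descriptor.
    pool_descs:    (N, D) L2-normalized descriptors of candidate images.
    pool_clusters: (N,) 3D-model (cluster) ID of each candidate.
    query_cluster: cluster ID of the query; its images are never negatives.
    """
    sims = pool_descs @ query_desc      # cosine similarity to the query
    order = np.argsort(-sims)           # most similar candidates first
    chosen = []
    used = {int(query_cluster)}         # clusters we may not draw from
    for idx in order:
        c = int(pool_clusters[idx])
        if c in used:                   # same model as query, or already used
            continue
        chosen.append(int(idx))
        used.add(c)                     # assumption: one negative per cluster
        if len(chosen) == k:
            break
    return chosen

# Toy example: four candidates in three clusters, query from cluster 0.
query = np.array([1.0, 0.0])
pool = np.array([[0.9, 0.436], [0.8, 0.6], [0.7, 0.714], [0.0, 1.0]])
clusters = np.array([0, 1, 1, 2])
negs = hard_negatives(query, pool, clusters, query_cluster=0, k=2)
```

The most similar candidate is skipped because it shares the query's 3D model, which is the role the geometry plays here: similar-looking images are accepted as negatives only when the reconstruction indicates they show a different object.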
Abstract
Image descriptors based on activations of Convolutional Neural Networks (CNNs) have become dominant in image retrieval due to their discriminative power, compactness of representation, and search efficiency. Training of CNNs, either from scratch or fine-tuning, requires a large amount of annotated data, where a high quality of annotation is often crucial. In this work, we propose to fine-tune CNNs for image retrieval on a large collection of unordered images in a fully automated manner. Reconstructed 3D models obtained by the state-of-the-art retrieval and structure-from-motion methods guide the selection of the training data. We show that both hard-positive and hard-negative examples, selected by exploiting the geometry and the camera positions available from the 3D models, enhance the performance of particular-object retrieval. CNN descriptor whitening discriminatively learned from…
Taxonomy
Methods: Dropout · Dense Connections · Max Pooling · Softmax · Convolution · Principal Components Analysis
