TL;DR
This paper introduces an unsupervised method for fine-tuning CNNs for image retrieval by leveraging 3D models from SfM to select training examples, improving retrieval accuracy without manual annotation.
Contribution
It presents a novel unsupervised fine-tuning approach for CNNs in image retrieval using 3D models to automatically select hard positive and negative examples.
Findings
Both hard positive and hard negative training examples improve final retrieval performance.
Unsupervised fine-tuning outperforms baseline methods.
The method improves particular-object retrieval with compact codes.
Abstract
Convolutional Neural Networks (CNNs) achieve state-of-the-art performance in many computer vision tasks. However, this achievement is preceded by extreme manual annotation, needed either to train from scratch or to fine-tune for the target task. In this work, we propose to fine-tune CNNs for image retrieval from a large collection of unordered images in a fully automated manner. We employ state-of-the-art retrieval and Structure-from-Motion (SfM) methods to obtain 3D models, which are used to guide the selection of the training data for CNN fine-tuning. We show that both hard positive and hard negative examples enhance the final performance in particular-object retrieval with compact codes.
