An evaluation of large-scale methods for image instance and class discovery
Matthijs Douze, Hervé Jégou, Jeff Johnson

TL;DR
This paper evaluates methods for discovering groups of related images in large, unannotated collections. It finds that diffusion algorithms such as Markov Clustering are more effective than traditional k-means, especially on very large datasets, when measured by the human labelling effort needed to correct the resulting clusters.
Contribution
It shows that diffusion-based clustering methods, particularly Markov Clustering, outperform traditional approaches such as k-means on large-scale image discovery tasks.
Findings
Markov Clustering outperforms other methods in large-scale scenarios.
With an efficient GPU implementation, the discovery process costs little compared to computing the image descriptors.
Descriptor selection impacts object class discovery.
Abstract
This paper aims at discovering meaningful subsets of related images from large image collections without annotations. We search for groups of images related at different semantic levels, i.e., either instances or visual classes. While k-means is usually considered the gold standard for this task, we evaluate and show the value of diffusion methods that have been neglected by the state of the art, such as the Markov Clustering algorithm. We report results on the ImageNet and the Paris500k instance dataset, both enlarged with images from YFCC100M. We evaluate our methods with a labelling cost that reflects how much effort a human would require to correct the generated clusters. Our analysis highlights several properties. First, when powered with an efficient GPU implementation, the cost of the discovery process is small compared to computing the image descriptors, even for…
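To make the diffusion idea concrete, here is a minimal sketch of the generic Markov Clustering loop (alternating expansion and inflation of a column-stochastic matrix) on a toy graph. It is a dense NumPy illustration only, not the paper's GPU implementation; the function name `markov_clustering`, its parameters, and the six-node graph are assumptions made for this example.

```python
import numpy as np


def markov_clustering(adjacency, expansion=2, inflation=2.0, n_iter=50, tol=1e-6):
    """Toy dense Markov Clustering on a symmetric adjacency matrix.

    Hypothetical helper for illustration; real large-scale use would
    rely on sparse k-NN graphs and pruning, which are omitted here.
    """
    n = adjacency.shape[0]
    # Add self-loops (a common MCL convention) and make columns sum to 1.
    m = adjacency.astype(float) + np.eye(n)
    m /= m.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        prev = m
        # Expansion: take several random-walk steps at once.
        m = np.linalg.matrix_power(m, expansion)
        # Inflation: element-wise power strengthens strong transitions.
        m = np.power(m, inflation)
        m /= m.sum(axis=0, keepdims=True)
        if np.abs(m - prev).max() < tol:
            break
    # Assign each node (column) to the row holding most of its mass.
    return m.argmax(axis=0)


# Toy graph: two disconnected triangles, nodes 0-2 and nodes 3-5.
adjacency = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
labels = markov_clustering(adjacency)
```

Nodes that share a label belong to the same discovered group. Unlike k-means, the number of clusters is not fixed in advance; it emerges from the graph structure and the inflation setting.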
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Remote-Sensing Image Classification
