Combining pretrained CNN feature extractors to enhance clustering of complex natural images
Joris Guerin, Stephane Thiery, Eric Nyiri, Olivier Gibaru, Byron Boots

TL;DR
This paper investigates how combining features from different pretrained CNN architectures can improve image clustering, proposing a multi-view clustering approach and a neural network model that achieves state-of-the-art results.
Contribution
It introduces a multi-view clustering framework using multiple CNN features and a neural network architecture trained end-to-end for enhanced image clustering.
Findings
CNN architecture choice significantly impacts clustering quality
Multi-view clustering with combined features improves results
Proposed method achieves state-of-the-art performance on natural image datasets
Abstract
Recently, a common starting point for solving complex unsupervised image classification tasks is to use generic features, extracted with deep Convolutional Neural Networks (CNN) pretrained on a large and versatile dataset (ImageNet). However, in most research, the CNN architecture for feature extraction is chosen arbitrarily, without justification. This paper aims to provide insight into the use of pretrained CNN features for image clustering (IC). First, extensive experiments are conducted and show that, for a given dataset, the choice of the CNN architecture for feature extraction has a huge impact on the final clustering. These experiments also demonstrate that proper extractor selection for a given IC task is difficult. To solve this issue, we propose to rephrase the IC problem as a multi-view clustering (MVC) problem that considers features extracted from different architectures as…
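To make the multi-view idea concrete, here is a minimal sketch of one naive way to combine feature sets from several extractors before clustering: L2-normalize each view, concatenate, and run k-means. This is an illustrative baseline, not the method proposed in the paper. The random arrays `view_a` and `view_b` are synthetic stand-ins for features that would come from two different pretrained CNNs, and the helper names (`l2_normalize`, `combine_views`, `kmeans`) are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so no single view dominates by magnitude."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def combine_views(views):
    """Concatenate per-view features (each of shape n_samples x d_v) after normalization."""
    return np.hstack([l2_normalize(v) for v in views])


def kmeans(x, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers)

    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-ins for two extractors' features (assumption: real features
# would come from pretrained CNNs; here 3 well-separated groups of 20 samples).
rng = np.random.default_rng(0)
true_labels = np.repeat(np.arange(3), 20)
view_a = 5.0 * np.eye(3, 8)[true_labels] + 0.1 * rng.standard_normal((60, 8))
view_b = 5.0 * np.eye(3, 16)[true_labels] + 0.1 * rng.standard_normal((60, 16))

features = combine_views([view_a, view_b])
pred = kmeans(features, k=3)
```

Concatenation treats all views as equally reliable, which is the simplest possible design choice; dedicated multi-view clustering methods instead try to reconcile the partitions suggested by each view.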
