Automatically Discovering and Learning New Visual Categories with Ranking Statistics
Kai Han, Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Andrea Vedaldi, Andrew Zisserman

TL;DR
This paper introduces a novel approach for discovering new visual categories in image collections by combining self-supervised learning, rank statistics, and joint training on labeled and unlabeled data, outperforming existing methods.
Contribution
It proposes a new method that avoids bias from labeled data, uses rank statistics for clustering, and jointly trains representations on labeled and unlabeled data for better novel category discovery.
Findings
Outperforms current methods on standard benchmarks.
Effectively discovers new classes without labeled examples.
Improves clustering accuracy for unlabelled data.
Abstract
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes. This setting is similar to semi-supervised learning, but significantly harder because there are no labelled examples for the new classes. The challenge, then, is to leverage the information contained in the labelled images in order to learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data. In this work we address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labeled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use rank statistics to transfer the model's knowledge of the labelled classes…
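The abstract excerpt above is cut off before it explains how rank statistics are used. As a rough illustration only: one common reading of the idea is that two unlabelled images are treated as belonging to the same class when the indices of their top-k feature activations coincide. The sketch below makes that assumption. The function names, the toy feature vectors, and the choice k=2 are all hypothetical, and the code is not the authors' implementation.

```python
from itertools import combinations


def top_k_indices(features, k):
    """Return the set of indices of the k largest entries of a feature vector.

    Assumes non-negative activations (e.g. after a ReLU), so ranking by value
    and ranking by magnitude agree.
    """
    order = sorted(range(len(features)), key=lambda i: features[i], reverse=True)
    return frozenset(order[:k])


def pairwise_pseudo_labels(batch, k=2):
    """Assign each pair of samples a pseudo-label: 1 if their top-k index
    sets are identical, 0 otherwise. Order within the top-k is ignored."""
    tops = [top_k_indices(f, k) for f in batch]
    return {
        (i, j): int(tops[i] == tops[j])
        for i, j in combinations(range(len(batch)), 2)
    }


# Toy feature vectors (hypothetical values, for illustration only).
batch = [
    [0.9, 0.1, 0.8, 0.0],   # strongest dimensions: 0 and 2
    [0.7, 0.2, 0.95, 0.1],  # strongest dimensions: 2 and 0
    [0.1, 0.9, 0.0, 0.8],   # strongest dimensions: 1 and 3
]
labels = pairwise_pseudo_labels(batch, k=2)
```

Here the first two samples share the same top-2 dimensions and would be paired as "same class", while the third would not. In a training loop, such pairwise pseudo-labels could serve as targets for a pairwise similarity loss on the unlabelled data.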
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
