Learning Semi-supervised Gaussian Mixture Models for Generalized Category Discovery
Bingchen Zhao, Xin Wen, Kai Han

TL;DR
This paper introduces a semi-supervised Gaussian Mixture Model (GMM) within an EM-like framework for generalized category discovery (GCD). It clusters unlabelled data when the number of classes is unknown, leveraging labelled data and prototypical contrastive learning.
Contribution
It proposes a novel EM-like framework for GCD that requires no prior knowledge of the class number. The framework combines representation learning with class number estimation and uses a stochastic prototype splitting and merging mechanism.
Findings
Achieves state-of-the-art results on image classification and fine-grained recognition datasets.
Effectively estimates the number of classes and clusters unlabelled data.
Demonstrates robustness in generalized category discovery scenarios.
Abstract
In this paper, we address the problem of generalized category discovery (GCD), i.e., given a set of images where part of them are labelled and the rest are not, the task is to automatically cluster the images in the unlabelled data, leveraging the information from the labelled data, while the unlabelled data contain images from the labelled classes and also new ones. GCD is similar to semi-supervised learning (SSL) but is more realistic and challenging, as SSL assumes all the unlabelled images are from the same classes as the labelled ones. We also do not assume the class number in the unlabelled data is known a priori, making the GCD problem even harder. To tackle the problem of GCD without knowing the class number, we propose an EM-like framework that alternates between representation learning and class number estimation. We propose a semi-supervised variant of the Gaussian Mixture…
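To make the idea of a semi-supervised Gaussian mixture concrete, the sketch below shows one common way labels can enter EM: labelled points keep fixed one-hot responsibilities, while unlabelled points receive soft responsibilities over all components, including extra components reserved for new classes. This is an illustrative assumption, not the authors' implementation. It uses fixed features, diagonal covariances, and a component count `k` that is given in advance, whereas the paper learns the representation and estimates the class number. The names `fit_ss_gmm`, `log_gauss_diag`, and `make_toy_data` are hypothetical.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    # Log density of each row of X (n, d) under k diagonal Gaussians -> (n, k).
    diff = X[:, None, :] - means[None, :, :]
    quad = np.sum(diff ** 2 / variances[None, :, :], axis=2)
    log_det = np.sum(np.log(2 * np.pi * variances), axis=1)
    return -0.5 * (quad + log_det[None, :])

def fit_ss_gmm(X_lab, y_lab, X_unl, k, n_iter=50, reg=1e-6):
    # Assumption: components 0..n_known-1 correspond to the labelled classes,
    # and the remaining k - n_known components model new classes.
    n_known = int(y_lab.max()) + 1
    assert k >= n_known >= 1
    X = np.vstack([X_lab, X_unl])
    n_lab, d = X_lab.shape

    # Initialise known components at labelled class means.
    means = np.zeros((k, d))
    for c in range(n_known):
        means[c] = X_lab[y_lab == c].mean(axis=0)
    # Initialise each extra component at the unlabelled point farthest
    # from all means chosen so far (a simple deterministic heuristic).
    for j in range(n_known, k):
        d2 = ((X_unl[:, None, :] - means[None, :j, :]) ** 2).sum(axis=2).min(axis=1)
        means[j] = X_unl[d2.argmax()]
    variances = np.tile(X.var(axis=0) + reg, (k, 1))
    weights = np.full(k, 1.0 / k)

    # Labelled points: responsibilities are clamped to their class.
    R_lab = np.zeros((n_lab, k))
    R_lab[np.arange(n_lab), y_lab] = 1.0

    for _ in range(n_iter):
        # E-step (unlabelled points only): posterior over components.
        log_p = np.log(weights + 1e-12)[None, :] + log_gauss_diag(X_unl, means, variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        R_unl = np.exp(log_p)
        R_unl /= R_unl.sum(axis=1, keepdims=True)

        # M-step: update parameters from labelled and unlabelled points together.
        R = np.vstack([R_lab, R_unl])
        Nk = R.sum(axis=0) + 1e-12
        weights = Nk / Nk.sum()
        means = (R.T @ X) / Nk[:, None]
        diff2 = (X[:, None, :] - means[None, :, :]) ** 2
        variances = np.einsum("nk,nkd->kd", R, diff2) / Nk[:, None] + reg

    return weights, means, variances, R_unl.argmax(axis=1)

def make_toy_data(seed=0, n_lab=20, n_unl=50):
    # Three well-separated 2-D clusters; only the first two have labels.
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
    X_lab = np.vstack([c + 0.5 * rng.standard_normal((n_lab, 2)) for c in centers[:2]])
    y_lab = np.repeat(np.arange(2), n_lab)
    X_unl = np.vstack([c + 0.5 * rng.standard_normal((n_unl, 2)) for c in centers])
    y_unl = np.repeat(np.arange(3), n_unl)
    return X_lab, y_lab, X_unl, y_unl
```

Calling `fit_ss_gmm(X_lab, y_lab, X_unl, k=3)` on the toy data returns the mixture parameters and a hard cluster assignment for each unlabelled point. In this toy setting the third component is free to capture the unlabelled-only cluster, which mirrors the GCD setup in which unlabelled data contain both labelled and new classes.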
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning in Bioinformatics · Advanced Image and Video Retrieval Techniques
Methods: Contrastive Learning
