Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery

Mingxuan Liu; Subhankar Roy; Zhun Zhong; Nicu Sebe; Elisa Ricci

arXiv:2303.15975 · cs.CV · August 26, 2024 · 1 cites

TL;DR

This paper demonstrates that large-scale self-supervised pre-trained models can effectively discover new classes in unlabeled data continuously and without supervision, outperforming more complex existing methods.

Contribution

It introduces a simple, robust baseline using a frozen pre-trained model and linear classifier for incremental novel class discovery without labeled data.
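The baseline structure described above (a frozen backbone feeding a trainable linear classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the tiny `backbone` is a stand-in for a real self-supervised pre-trained model, the sizes `FEAT_DIM` and `NUM_NOVEL` are hypothetical, and `mutual_info_loss` is a generic label-free clustering objective swapped in for illustration rather than the objective used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

FEAT_DIM, NUM_NOVEL = 32, 5  # hypothetical sizes, chosen for illustration

# Stand-in for a self-supervised pre-trained backbone (assumption:
# a real setup would load pre-trained weights here instead).
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, FEAT_DIM), nn.ReLU())
for p in backbone.parameters():
    p.requires_grad_(False)  # freeze: the backbone is never updated
backbone.eval()

# The only learnable part: a linear classifier over the novel classes.
head = nn.Linear(FEAT_DIM, NUM_NOVEL)
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)

def mutual_info_loss(logits):
    """Label-free placeholder objective: confident per-sample predictions
    (low conditional entropy) spread across classes (high marginal entropy)."""
    probs = F.softmax(logits, dim=1)
    cond_ent = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
    marginal = probs.mean(dim=0)
    marg_ent = -(marginal * torch.log(marginal + 1e-8)).sum()
    return cond_ent - marg_ent

# A batch of unlabelled images (random tensors standing in for real data).
images = torch.randn(16, 3, 8, 8)

backbone_before = [p.detach().clone() for p in backbone.parameters()]
head_before = head.weight.detach().clone()

with torch.no_grad():  # no gradients flow through the frozen backbone
    feats = backbone(images)

loss = mutual_info_loss(head(feats))
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Only the head should have moved after the update.
backbone_unchanged = all(
    torch.equal(a, b) for a, b in zip(backbone_before, backbone.parameters())
)
head_changed = not torch.equal(head_before, head.weight.detach())
```

Because features are computed under `torch.no_grad()` and only `head.parameters()` are handed to the optimizer, each training step touches just the linear layer, which is what keeps a baseline of this shape cheap to train.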

Findings

01 Pre-trained models outperform state-of-the-art methods in class discovery.

02 Simple baselines are effective and resilient in long-term learning scenarios.

03 Open-source code facilitates reproducibility and further research.

Abstract

Discovering novel concepts in unlabelled datasets and in a continuous manner is an important desideratum of lifelong learners. In the literature such problems have been partially addressed under very restricted settings, where novel classes are learned by jointly accessing a related labelled set (e.g., NCD) or by leveraging only a supervisedly pre-trained model (e.g., class-iNCD). In this work we challenge the status quo in class-iNCD and propose a learning paradigm where class discovery occurs continuously and truly unsupervisedly, without needing any related labelled set. In detail, we propose to exploit the richer priors from strong self-supervised pre-trained models (PTM). To this end, we propose simple baselines, composed of a frozen PTM backbone and a learnable linear classifier, that are not only simple to implement but also resilient under longer learning scenarios. We conduct…

Code & Models

Repositories

oatmealliu/msc-incd (PyTorch, official)

Taxonomy

Topics: Text and Document Classification Technologies · Topic Modeling · Natural Language Processing Techniques