Supervised Knowledge May Hurt Novel Class Discovery Performance
Ziyun Li, Jona Otholt, Ben Dai, Di Hu, Christoph Meinel, Haojin Yang

TL;DR
This paper introduces a new metric called transfer flow to measure the semantic similarity between labeled and unlabeled datasets, and finds that supervised knowledge can sometimes harm novel class discovery performance, especially when the two datasets are semantically dissimilar.
Contribution
The paper proposes the transfer flow metric, builds a benchmark on ImageNet, and demonstrates that supervised knowledge may negatively impact NCD when datasets are semantically dissimilar.
Findings
Transfer flow aligns with hierarchical class structure.
Supervised knowledge can hurt NCD performance when the semantic similarity between the labeled and unlabeled sets is low.
Pseudo transfer flow effectively predicts when supervised knowledge is beneficial.
Abstract
Novel class discovery (NCD) aims to infer novel categories in an unlabeled dataset by leveraging prior knowledge of a labeled set comprising disjoint but related classes. Given that most existing literature focuses primarily on utilizing supervised knowledge from a labeled set at the methodology level, this paper considers the question: Is supervised knowledge always helpful at different levels of semantic relevance? To proceed, we first establish a novel metric, so-called transfer flow, to measure the semantic similarity between labeled/unlabeled datasets. To show the validity of the proposed metric, we build up a large-scale benchmark with various degrees of semantic similarities between labeled/unlabeled datasets on ImageNet by leveraging its hierarchical class structure. The results based on the proposed benchmark show that the proposed transfer flow is in line with the hierarchical…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Digital Imaging for Blood Diseases
