Open-World Semi-Supervised Learning
Kaidi Cao, Maria Brbic, Jure Leskovec

TL;DR
This paper introduces open-world semi-supervised learning, a setting in which a model must either assign each unlabeled test instance to a known class or recognize it as belonging to a new class, because real-world unlabeled data often contains classes absent from the labeled set.
Contribution
The paper proposes ORCA, an end-to-end deep learning method with an uncertainty adaptive margin to handle class distribution mismatch in open-world semi-supervised learning.
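To make the idea of an uncertainty adaptive margin concrete, here is a minimal sketch of one way such a mechanism could look. It is not taken from the paper: the summary above does not give the formula, so the function names, the weight `lam`, the definition of uncertainty as one minus the top predicted probability, and the sign convention (the margin is added to the true-class logit, so higher uncertainty relaxes the supervised loss) are all assumptions made for illustration.

```python
import math


def mean_uncertainty(unlabeled_probs):
    """Average of (1 - max class probability) over unlabeled predictions.

    Assumption of this sketch: uncertainty is measured by how far the
    most confident prediction is from 1.
    """
    return sum(1.0 - max(p) for p in unlabeled_probs) / len(unlabeled_probs)


def margin_cross_entropy(logits, label, margin):
    """Cross-entropy with `margin` added to the true-class logit.

    A positive margin makes the labeled example easier to satisfy,
    which lowers the loss on seen classes.
    """
    shifted = list(logits)
    shifted[label] += margin
    peak = max(shifted)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in shifted))
    return log_sum - shifted[label]


def adaptive_margin_loss(logits, label, unlabeled_probs, lam=1.0):
    """Supervised loss whose margin scales with uncertainty on unlabeled data.

    `lam` is a hypothetical weight controlling the margin strength.
    """
    margin = lam * mean_uncertainty(unlabeled_probs)
    return margin_cross_entropy(logits, label, margin)


# Uncertain predictions on unlabeled data (early in training, say).
uncertain = [[0.4, 0.3, 0.3], [0.5, 0.25, 0.25]]
# Confident predictions on unlabeled data (later in training, say).
confident = [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01]]

logits, label = [1.0, 0.5, 0.2], 0
early_loss = adaptive_margin_loss(logits, label, uncertain)
late_loss = adaptive_margin_loss(logits, label, confident)
print(early_loss, late_loss)
```

Under these assumptions, the supervised loss on a labeled example is smaller while predictions on unlabeled data are still uncertain, and it approaches plain cross-entropy as those predictions become confident. The intent is that seen classes are not learned much faster than novel ones.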
Findings
ORCA outperforms baselines on image classification datasets.
Achieves a 25% improvement over baselines on seen classes.
Achieves a 96% improvement over baselines on novel classes.
Abstract
A fundamental limitation of applying semi-supervised learning in real-world settings is the assumption that unlabeled test data contains only classes previously encountered in the labeled training data. However, this assumption rarely holds for data in-the-wild, where instances belonging to novel classes may appear at testing time. Here, we introduce a novel open-world semi-supervised learning setting that formalizes the notion that novel classes may appear in the unlabeled test data. In this novel setting, the goal is to solve the class distribution mismatch between labeled and unlabeled data, where at the test time every input instance either needs to be classified into one of the existing classes or a new unseen class needs to be initialized. To tackle this challenging problem, we propose ORCA, an end-to-end deep learning approach that introduces uncertainty adaptive margin mechanism…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Advanced Neural Network Applications
