Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning
Junkai Huang, Chaowei Fang, Weikai Chen, Zhenhua Chai, Xiaolin Wei, Pengxu Wei, Liang Lin, Guanbin Li

TL;DR
This paper introduces a novel open-set semi-supervised learning method that leverages all unlabeled data through a warm-up phase and a cross-modal matching strategy to effectively utilize out-of-distribution samples for improved feature learning.
Contribution
It proposes a new training mechanism combining warm-up training and cross-modal matching to exploit OOD data rather than discard it, enhancing SSL performance.
Findings
Significantly improves open-set SSL accuracy
Outperforms state-of-the-art methods by large margins
Effective OOD detection via cross-modal matching
Abstract
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data. While the mainstream technique seeks to completely filter out the OOD samples for semi-supervised learning (SSL), we propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning while avoiding its adverse impact on the SSL. We achieve this goal by first introducing a warm-up training that leverages all the unlabeled data, including both the in-distribution (ID) and OOD samples. Specifically, we perform a pretext task that enforces our feature extractor to obtain a high-level semantic understanding of the training images, leading to more discriminative features that can benefit the downstream tasks. Since the OOD samples are inevitably detrimental to SSL,…
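The abstract as shown here does not name the pretext task used during warm-up. As a purely illustrative sketch, assuming rotation prediction (a common label-free pretext task, not confirmed by this text), the following shows how every unlabeled image, ID or OOD alike, can yield supervised training pairs without class labels. The helper names `rotate90` and `make_rotation_batch` are hypothetical.

```python
def rotate90(image):
    """Rotate a 2-D image (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def make_rotation_batch(images):
    """Build (rotated_image, rotation_label) pairs for a pretext task.

    Assumption: rotation prediction stands in for the unspecified pretext
    task. Label k means the image was rotated by k * 90 degrees clockwise.
    No class labels are needed, so ID and OOD images are treated the same.
    """
    batch = []
    for image in images:
        rotated = [list(row) for row in image]  # copy; label 0 is the original
        for k in range(4):
            batch.append((rotated, k))
            rotated = rotate90(rotated)
    return batch


# Toy 2x2 "image" standing in for an unlabeled sample.
unlabeled = [[[1, 2], [3, 4]]]
pairs = make_rotation_batch(unlabeled)
```

A feature extractor trained to predict the rotation label from the rotated image has to pick up on object shape and orientation, which is the kind of high-level semantic signal the abstract attributes to the warm-up phase.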
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
