Boosting Semi-Supervised Learning by bridging high and low-confidence predictions

Khanh-Binh Nguyen; Joon-Sung Yang

arXiv:2308.07509 · cs.CV · August 16, 2023

TL;DR

This paper introduces ReFixMatch, a semi-supervised learning method that leverages all unlabeled data by bridging high and low-confidence predictions, significantly improving accuracy on large-scale benchmarks like ImageNet.

Contribution

ReFixMatch is a novel SSL approach that effectively utilizes low-confidence predictions, addressing confirmation bias and the Matthew effect to enhance model generalization.

Findings

1. Achieves 41.05% top-1 accuracy on ImageNet with 100k labels.

2. Outperforms FixMatch and other state-of-the-art SSL methods.

3. Effectively utilizes all unlabeled data during training.

Abstract

Pseudo-labeling is a crucial technique in semi-supervised learning (SSL), where artificial labels are generated for unlabeled data by a trained model, allowing for the simultaneous training of labeled and unlabeled data in a supervised setting. However, several studies have identified three main issues with pseudo-labeling-based approaches. Firstly, these methods heavily rely on predictions from the trained model, which may not always be accurate, leading to a confirmation bias problem. Secondly, the trained model may be overfitted to easy-to-learn examples, ignoring hard-to-learn ones, resulting in the "Matthew effect" where the already strong become stronger and the weak weaker. Thirdly, most of the low-confidence predictions of unlabeled data are discarded due to the use of a high threshold, leading to an underutilization of unlabeled data during training. To address these…
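The third issue, discarding low-confidence predictions under a high threshold, can be illustrated with a minimal sketch of generic confidence-thresholded pseudo-labeling. This is not the ReFixMatch method, whose details are not given on this page. The function names, the toy logits, and the 0.95 threshold are assumptions chosen for illustration only.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label_mask(batch_logits, threshold=0.95):
    """Assign a hard pseudo-label to each unlabeled sample and flag
    which samples pass the confidence threshold.

    `threshold=0.95` is an assumed value for this sketch, not a
    setting taken from the page above.
    """
    labels, mask = [], []
    for logits in batch_logits:
        probs = softmax(logits)
        confidence = max(probs)
        labels.append(probs.index(confidence))  # argmax class
        mask.append(confidence >= threshold)    # keep only confident samples
    utilization = sum(mask) / len(mask) if mask else 0.0
    return labels, mask, utilization


# Hypothetical logits for four unlabeled samples over three classes.
batch = [
    [8.0, 0.0, 0.0],  # very confident in class 0
    [1.0, 0.9, 0.8],  # nearly uniform, low confidence
    [0.0, 6.0, 0.0],  # very confident in class 1
    [2.0, 1.5, 0.0],  # moderately confident in class 0
]
labels, mask, utilization = pseudo_label_mask(batch)
print(labels, mask, utilization)
```

In this toy batch only the two sharply peaked samples pass the threshold, so half of the unlabeled data contributes nothing to the unsupervised loss. That unused fraction is the underutilization the abstract describes.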

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and Data Classification

Methods: FixMatch