RegMixMatch: Optimizing Mixup Utilization in Semi-Supervised Learning

Haorong Han; Jidong Yuan; Chixuan Wei; Zhongyang Yu

arXiv:2412.10741 · cs.LG · April 18, 2025

TL;DR

RegMixMatch is a semi-supervised learning framework that optimizes how Mixup is applied to both high- and low-confidence samples, improving the purity of artificial labels and achieving state-of-the-art results.

Contribution

It introduces semi-supervised RegMixup and class-aware Mixup techniques to better utilize all unlabeled data and reduce confirmation bias in SSL.

Findings

1. Achieves state-of-the-art results on SSL benchmarks.
2. Effectively utilizes low-confidence samples with class-aware Mixup.
3. Improves label purity and reduces confirmation bias.

Abstract

Consistency regularization and pseudo-labeling have significantly advanced semi-supervised learning (SSL). Prior works have effectively employed Mixup for consistency regularization in SSL. However, our findings indicate that applying Mixup for consistency regularization may degrade SSL performance by compromising the purity of artificial labels. Moreover, most pseudo-labeling-based methods use a thresholding strategy to exclude low-confidence data and thereby mitigate confirmation bias; this approach, however, limits the utility of unlabeled samples. To address these challenges, we propose RegMixMatch, a novel framework that optimizes the use of Mixup with both high- and low-confidence samples in SSL. First, we introduce semi-supervised RegMixup, which addresses the reduced purity of artificial labels by training on both mixed samples and clean samples. Second, we develop a…
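For readers unfamiliar with the two building blocks the abstract refers to, the following is a minimal sketch of standard Mixup and of confidence thresholding on pseudo-labels. It is not the RegMixMatch method itself, whose details are not given in this summary. The function names `mixup` and `split_by_confidence`, the Beta parameter `alpha=1.0`, and the threshold `0.95` are illustrative assumptions.

```python
import random


def mixup(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Standard Mixup: convex combination of two inputs and their label vectors.

    lam is drawn from Beta(alpha, alpha); alpha=1.0 is an assumed default.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


def split_by_confidence(probs, threshold=0.95):
    """Partition predictions on unlabeled data by maximum class probability.

    Returns (high, low) index lists. Thresholding-based pseudo-labeling
    typically trains only on `high`; the 0.95 value is an assumption.
    """
    high, low = [], []
    for i, p in enumerate(probs):
        (high if max(p) >= threshold else low).append(i)
    return high, low


if __name__ == "__main__":
    rng = random.Random(0)
    # Mix two toy samples with one-hot labels.
    x, y, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng=rng)
    print("lambda:", lam, "mixed label:", y)
    # The second prediction falls below the threshold and would be discarded.
    print(split_by_confidence([[0.97, 0.03], [0.60, 0.40]]))
```

The mixed label `y` is soft even when both source labels are one-hot, which is the sense in which Mixup can lower the purity of artificial labels. The `low` index list is the portion of unlabeled data that a thresholding strategy leaves unused.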

Taxonomy

Topics: Machine Learning and Data Classification · Speech Recognition and Synthesis · Text and Document Classification Technologies

Methods: Mixup