AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data
Amin Banitalebi-Dehkordi, Pratik Gujjar, and Yong Zhang

TL;DR
AuxMix introduces a semi-supervised learning method that effectively utilizes unconstrained unlabeled data by leveraging self-supervised features and entropy regularization, improving performance when auxiliary data differs from labeled data.
Contribution
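The paper proposes AuxMix, an SSL algorithm for auxiliary unlabeled data whose class distribution may differ from that of the labeled set. It uses self-supervised features to mask auxiliary samples that are not semantically similar to the labeled data, and maximizes the predicted entropy for those dissimilar samples.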
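The masking idea can be illustrated with a minimal sketch. It is not the authors' implementation: the use of cosine similarity, the max-over-labeled-samples rule, the `threshold` value, and the function names are all assumptions made for illustration. The summary only states that self-supervised features are used to mask semantically dissimilar auxiliary data.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def similarity_mask(aux_features, labeled_features, threshold=0.5):
    """Hypothetical masking rule: 1 if an auxiliary sample's best match
    among the labeled features reaches the threshold, else 0."""
    mask = []
    for feature in aux_features:
        best = max(cosine_similarity(feature, ref) for ref in labeled_features)
        mask.append(1 if best >= threshold else 0)
    return mask


# Toy features standing in for self-supervised embeddings.
labeled = [[1.0, 0.0]]
auxiliary = [[0.9, 0.1], [0.0, 1.0]]
mask = similarity_mask(auxiliary, labeled)
```

In this toy case the first auxiliary vector points in nearly the same direction as the labeled one and is kept, while the orthogonal second vector is masked out.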
Findings
Achieved a 5% accuracy improvement over existing baselines with a ResNet-50 model on CIFAR10 with Tiny-ImageNet auxiliary data.
Demonstrated robustness of AuxMix across multiple datasets.
Conducted ablation studies confirming effectiveness of key components.
Abstract
Semi-supervised learning (SSL) has seen great strides when labeled data is scarce but unlabeled data is abundant. Critically, most recent work assumes that such unlabeled data is drawn from the same distribution as the labeled data. In this work, we show that state-of-the-art SSL algorithms suffer a degradation in performance in the presence of unlabeled auxiliary data that does not necessarily possess the same class distribution as the labeled set. We term this problem Auxiliary-SSL and propose AuxMix, an algorithm that leverages self-supervised learning tasks to learn generic features in order to mask auxiliary data that are not semantically similar to the labeled set. We also propose to regularize learning by maximizing the predicted entropy for dissimilar auxiliary samples. We show an improvement of 5% over existing baselines on a ResNet-50 model when trained on the CIFAR10 dataset…
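The entropy regularization mentioned in the abstract can be sketched as a loss term that is minimized during training. This is a hedged illustration, not the paper's formulation: the function names, the 0/1 `mask` convention (0 marks a dissimilar auxiliary sample), and the averaging over the batch are assumptions. Minimizing the negative entropy pushes predictions on dissimilar samples toward the uniform distribution.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_regularizer(batch_logits, mask):
    """Mean negative entropy over samples flagged as dissimilar (mask == 0).

    Minimizing this value maximizes the predicted entropy on those samples.
    Returns 0.0 when the batch has no dissimilar samples.
    """
    terms = [
        -entropy(softmax(logits))
        for logits, keep in zip(batch_logits, mask)
        if keep == 0
    ]
    return sum(terms) / len(terms) if terms else 0.0


# A uniform prediction already has maximal entropy; a peaked one is penalized.
uniform_loss = entropy_regularizer([[0.0, 0.0, 0.0, 0.0]], [0])
peaked_loss = entropy_regularizer([[10.0, 0.0, 0.0, 0.0]], [0])
```

Here `uniform_loss` is lower than `peaked_loss`, so a gradient step on this term would flatten confident predictions on auxiliary samples that were judged dissimilar.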
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
