Semi-Supervised Learning with Meta-Gradient
Xin-Yu Zhang, Taihong Xiao, Haolin Jia, Ming-Ming Cheng, Ming-Hsuan Yang

TL;DR
This paper introduces a meta-gradient-based semi-supervised learning algorithm that improves generalization by optimizing pseudo labels through a learn-to-generalize regularization term. It targets settings where only a small amount of labeled data is available.
Contribution
The paper proposes a meta-learning approach to semi-supervised learning built on a learn-to-generalize regularization term, aimed at the overfitting and weak generalization that consistency-based methods show when labels are scarce.
Findings
Reported to perform favorably against state-of-the-art methods on SVHN, CIFAR, and ImageNet.
Effectively reduces overfitting with limited labeled data.
Provides theoretical convergence analysis for the proposed method.
Abstract
In this work, we propose a simple yet effective meta-learning algorithm in semi-supervised learning. We notice that most existing consistency-based approaches suffer from overfitting and limited model generalization ability, especially when training with only a small number of labeled data. To alleviate this issue, we propose a learn-to-generalize regularization term by utilizing the label information and optimize the problem in a meta-learning fashion. Specifically, we seek the pseudo labels of the unlabeled data so that the model can generalize well on the labeled data, which is formulated as a nested optimization problem. We address this problem using the meta-gradient that bridges between the pseudo label and the regularization term. In addition, we introduce a simple first-order approximation to avoid computing higher-order derivatives and provide theoretic convergence analysis.…
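The nested optimization described in the abstract can be illustrated with a toy sketch. It assumes a scalar model f(x) = w * x and squared-error losses, chosen only so that the meta-gradient has a closed form. The function names, the choice to re-initialize pseudo labels from the current predictions at every step, and the step sizes alpha and beta are all assumptions made for illustration, not details taken from the paper. Because the mixed second derivative is available in closed form here, the sketch does not reproduce the paper's first-order approximation.

```python
# Toy sketch of a meta-gradient pseudo-label update.
# Assumptions: scalar model f(x) = w * x, squared-error losses,
# hypothetical function names (none are taken from the paper).

def grad_unlabeled(w, x_u, pseudo):
    """d/dw of (1/2n) * sum_i (w * x_i - pseudo_i)^2."""
    n = len(x_u)
    return sum((w * x - p) * x for x, p in zip(x_u, pseudo)) / n

def labeled_loss(w, x_l, y_l):
    """(1/2m) * sum_j (w * x_j - y_j)^2 on the labeled set."""
    m = len(x_l)
    return sum((w * x - y) ** 2 for x, y in zip(x_l, y_l)) / (2 * m)

def grad_labeled(w, x_l, y_l):
    """d/dw of labeled_loss."""
    m = len(x_l)
    return sum((w * x - y) * x for x, y in zip(x_l, y_l)) / m

def inner_step(w, x_u, pseudo, alpha):
    """One gradient step on the unlabeled loss with the given pseudo labels."""
    return w - alpha * grad_unlabeled(w, x_u, pseudo)

def meta_gradient(w, x_u, pseudo, x_l, y_l, alpha):
    """Gradient of the labeled loss after the inner step, w.r.t. each pseudo label.

    Chain rule: dL_l(w') / dpseudo_i = L_l'(w') * dw'/dpseudo_i,
    and for this toy model dw'/dpseudo_i = alpha * x_i / n.
    """
    w_virtual = inner_step(w, x_u, pseudo, alpha)
    g = grad_labeled(w_virtual, x_l, y_l)
    n = len(x_u)
    return [g * alpha * x / n for x in x_u]

def train(x_l, y_l, x_u, steps=50, alpha=0.5, beta=1.0):
    """Alternate pseudo-label refinement (outer) and model update (inner)."""
    w = 0.0
    for _ in range(steps):
        # Assumption: start pseudo labels from the current predictions.
        pseudo = [w * x for x in x_u]
        mg = meta_gradient(w, x_u, pseudo, x_l, y_l, alpha)
        # Outer update: move pseudo labels to lower the post-step labeled loss.
        pseudo = [p - beta * m for p, m in zip(pseudo, mg)]
        # Inner update: fit the model to the refined pseudo labels.
        w = inner_step(w, x_u, pseudo, alpha)
    return w

# Example data generated from y = 2 * x; the unlabeled inputs carry no targets.
x_labeled, y_labeled = [1.0, 2.0], [2.0, 4.0]
x_unlabeled = [0.5, 1.5, 2.5, 3.0]
w_final = train(x_labeled, y_labeled, x_unlabeled)
print(w_final)
```

In this sketch the labeled data never updates the weight directly. It influences training only through the pseudo labels, which is the "bridge" role the abstract assigns to the meta-gradient. A real implementation would use a neural network and automatic differentiation or an approximation in place of the closed-form derivative.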
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
