Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data
Colin Wei, Kendrick Shen, Yining Chen, Tengyu Ma

TL;DR
This paper provides a theoretical framework for understanding how self-training with deep neural networks can succeed in semi-supervised learning, domain adaptation, and unsupervised learning, under realistic assumptions.
Contribution
It introduces a unified theoretical analysis of self-training with deep networks, extending previous linear model results to nonlinear neural networks.
Findings
High accuracy achievable under expansion and minimal overlap assumptions
Minimizers of population objectives built on self-training and input consistency are effective solutions under these assumptions
Sample complexity bounds are polynomial in key neural network parameters
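The expansion assumption named in the first finding can be sketched formally. The abstract describes it as requiring that a low-probability subset of the data expand to a neighborhood with large probability relative to the subset. One way to write this, where the symbols $P$ (data distribution), $\mathcal{N}(V)$ (neighborhood of a set $V$), $a$ (probability threshold), and $c$ (expansion factor) are notation assumed here rather than taken from this summary, is:

```latex
% Sketch of an (a, c)-expansion condition; notation is assumed, not quoted.
% Every sufficiently small set V must have a neighborhood that is
% at least c times as probable as V itself, with c > 1.
\[
  P\bigl(\mathcal{N}(V)\bigr) \;\ge\; \min\bigl\{\, c \cdot P(V),\; 1 \,\bigr\}
  \qquad \text{for all } V \text{ with } P(V) \le a, \quad c > 1 .
\]
```

Read this way, a small set of misclassified examples cannot be isolated: its neighborhood must reach into a larger mass of examples, which is what lets a consistency-based objective spread correct labels.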
Abstract
Self-training algorithms, which train a model to fit pseudolabels predicted by another previously learned model, have been very successful for learning with unlabeled data using neural networks. However, the current theoretical understanding of self-training only applies to linear models. This work provides a unified theoretical analysis of self-training with deep networks for semi-supervised learning, unsupervised domain adaptation, and unsupervised learning. At the core of our analysis is a simple but realistic "expansion" assumption, which states that a low-probability subset of the data must expand to a neighborhood with large probability relative to the subset. We also assume that neighborhoods of examples in different classes have minimal overlap. We prove that under these assumptions, the minimizers of population objectives based on self-training and input-consistency…
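The pseudolabeling loop described in the first sentence of the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it swaps the deep network for a nearest-centroid classifier, omits any input-consistency regularization, and uses synthetic two-cluster data. The function names `fit_centroids`, `predict`, and `self_train` are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Stand-in 'model' (assumption): one mean vector per class."""
    return np.stack([X[y == k].mean(axis=0) for k in range(n_classes)])

def predict(centroids, X):
    """Assign each point to the class of its closest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def self_train(X_lab, y_lab, X_unl, n_classes, rounds=5):
    """Repeatedly pseudolabel X_unl with the current model, then refit."""
    # Initial model trained on the small labeled set only.
    centroids = fit_centroids(X_lab, y_lab, n_classes)
    for _ in range(rounds):
        # Pseudolabels come from the previously learned model.
        pseudo = predict(centroids, X_unl)
        X_all = np.concatenate([X_lab, X_unl])
        y_all = np.concatenate([y_lab, pseudo])
        # The next model is fit to true labels plus pseudolabels.
        centroids = fit_centroids(X_all, y_all, n_classes)
    return centroids

# Synthetic data (assumption): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[-3.0, 0.0], scale=1.0, size=(200, 2))
X1 = rng.normal(loc=[3.0, 0.0], scale=1.0, size=(200, 2))
X_unl = np.concatenate([X0, X1])
y_true = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])

# One labeled example per class.
X_lab = np.array([[-3.0, 0.0], [3.0, 0.0]])
y_lab = np.array([0, 1])

centroids = self_train(X_lab, y_lab, X_unl, n_classes=2)
accuracy = (predict(centroids, X_unl) == y_true).mean()
print(f"accuracy on unlabeled points: {accuracy:.3f}")
```

The clusters here are deliberately far apart, so the example only shows the mechanics of the loop. It says nothing about when self-training succeeds in harder settings, which is the question the paper's assumptions address.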
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Machine Learning and Algorithms
