Unsupervised optimal deep transfer learning for classification under general conditional shift
Junjun Lang, Yukun Liu

TL;DR
This paper introduces a novel transfer learning method under a general conditional shift assumption, leveraging deep neural networks to improve classification accuracy across different data distributions, especially in high-dimensional settings.
Contribution
It proposes a new General Conditional Shift (GCS) assumption, a DNN-based estimator of the source conditional probabilities, and a pseudo-maximum likelihood approach for estimating the target label distribution, and reports minimax optimal convergence rates.
Findings
Achieves minimax optimal convergence rates.
Effectively handles high-dimensional data with low-dimensional structure.
Demonstrates superior performance in simulations and on an Alzheimer's disease dataset.
Abstract
Classifiers trained solely on labeled source data may yield misleading results when applied to unlabeled target data drawn from a different distribution. Transfer learning can rectify this by transferring knowledge from source to target data, but its effectiveness frequently relies on stringent assumptions, such as label shift. In this paper, we introduce a novel General Conditional Shift (GCS) assumption, which encompasses label shift as a special scenario. Under GCS, we demonstrate that both the target distribution and the shift function are identifiable. To estimate the conditional probabilities for source data, we propose leveraging deep neural networks (DNNs). Subsequent to transferring the DNN estimator, we estimate the target label distribution utilizing a pseudo-maximum likelihood approach. Ultimately, by incorporating these estimates and circumventing…
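As an illustration of the "estimate source conditional probabilities, then fit the target label distribution by likelihood" idea, here is a minimal sketch restricted to plain label shift, the special case the abstract mentions. It is not the paper's GCS estimator: it uses the classical EM procedure for label shift, and the exact class posteriors of a toy two-Gaussian model stand in for a trained DNN. The function name `em_target_priors` and all simulation settings are assumptions made for this example.

```python
import numpy as np

def em_target_priors(source_posteriors, source_priors, n_iter=500, tol=1e-8):
    """EM estimate of target class priors under pure label shift.

    source_posteriors: (n, K) array of p_source(y=k | x_i) on unlabeled target points.
    source_priors:     (K,) array of class proportions in the source data.
    """
    post = np.asarray(source_posteriors, dtype=float)
    pi_s = np.asarray(source_priors, dtype=float)
    pi_t = pi_s.copy()  # start from the source priors
    for _ in range(n_iter):
        # E-step: reweight source posteriors by the current prior ratio, then renormalize
        weighted = post * (pi_t / pi_s)
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: new priors are the average responsibilities
        new_pi = resp.mean(axis=0)
        converged = np.max(np.abs(new_pi - pi_t)) < tol
        pi_t = new_pi
        if converged:
            break
    return pi_t

# Toy setting (assumed): class 0 ~ N(-1, 1), class 1 ~ N(+1, 1).
# Source priors are balanced; target priors are shifted to (0.8, 0.2).
rng = np.random.default_rng(0)
true_pi_t = np.array([0.8, 0.2])
n = 5000
y = rng.choice(2, size=n, p=true_pi_t)        # hidden target labels
x = rng.normal(loc=2.0 * y - 1.0, scale=1.0)  # observed target features

# Exact source posterior p_source(y=1 | x) for balanced priors is sigmoid(2x);
# in practice this would come from a classifier trained on source data.
p1 = 1.0 / (1.0 + np.exp(-2.0 * x))
post = np.column_stack([1.0 - p1, p1])

pi_hat = em_target_priors(post, np.array([0.5, 0.5]))
print("estimated target priors:", pi_hat)
```

The estimated priors can then be used to reweight the source posteriors into target posteriors by multiplying by `pi_hat / source_priors` and renormalizing each row. Under the more general GCS assumption, the shift is not captured by a prior ratio alone, so this sketch only covers the special case.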
Taxonomy
Topics: Face and Expression Recognition · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning
