Meta-learning Transferable Representations with a Single Target Domain
Hong Liu, Jeff Z. HaoChen, Colin Wei, Tengyu Ma

TL;DR
This paper investigates the limitations of fine-tuning and joint training in transfer learning, introduces Meta Representation Learning (MeRLin) to learn transferable features, and demonstrates its superior performance on vision and NLP benchmarks.
Contribution
The paper proposes MeRLin, a novel meta-learning approach that explicitly learns transferable features, with theoretical guarantees and empirical improvements over existing methods.
Findings
MeRLin outperforms state-of-the-art transfer learning methods on vision and NLP benchmarks.
Pre-training may not incentivize learning transferable features.
Joint training may simultaneously learn source-specific features and overfit to the target.
Abstract
Recent works found that fine-tuning and joint training---two popular approaches for transfer learning---do not always improve accuracy on downstream tasks. First, we aim to understand more about when and why fine-tuning and joint training can be suboptimal or even harmful for transfer learning. We design semi-synthetic datasets where the source task can be solved by either source-specific features or transferable features. We observe that (1) pre-training may not have incentive to learn transferable features and (2) joint training may simultaneously learn source-specific features and overfit to the target. Second, to improve over fine-tuning and joint training, we propose Meta Representation Learning (MeRLin) to learn transferable features. MeRLin meta-learns representations by ensuring that a head fit on top of the representations with target training data also performs well on target…
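The meta-learning idea in the abstract (fit a head on target training data on top of a representation, then score that head on target data) can be illustrated with a small sketch. This is not the authors' implementation. The linear feature map `W`, the closed-form ridge-regression head, the held-out target split, and the names `fit_head` and `meta_objective` are all assumptions made for illustration, and the toy data is synthetic.

```python
import numpy as np

def fit_head(feats, y, lam=0.1):
    """Inner step (assumed): closed-form ridge-regression head on target training features."""
    d = feats.shape[1]
    return np.linalg.solve(feats.T @ feats + lam * np.eye(d), feats.T @ y)

def meta_objective(W, x_train, y_train, x_heldout, y_heldout, lam=0.1):
    """Outer score (assumed): error of the fitted head on a held-out target split."""
    head = fit_head(x_train @ W, y_train, lam)   # head sees only the training split
    preds = (x_heldout @ W) @ head               # evaluate on the held-out split
    return float(np.mean((preds - y_heldout) ** 2))

# Synthetic target task: the label depends only on input coordinate 0.
rng = np.random.default_rng(0)
x = rng.normal(size=(60, 2))
y = x[:, 0]
x_train, y_train = x[:20], y[:20]
x_heldout, y_heldout = x[20:], y[20:]

# Two hypothetical one-dimensional representations.
W_transferable = np.array([[1.0], [0.0]])  # keeps the coordinate that predicts y
W_spurious = np.array([[0.0], [1.0]])      # keeps a coordinate unrelated to y

loss_transferable = meta_objective(W_transferable, x_train, y_train, x_heldout, y_heldout)
loss_spurious = meta_objective(W_spurious, x_train, y_train, x_heldout, y_heldout)
print(loss_transferable < loss_spurious)
```

In this toy setting the representation that keeps the predictive coordinate receives the lower score, which is the signal a meta-learner would use when updating the representation. A full method would optimize `W` through this objective rather than compare two fixed choices.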
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
