Understanding Transfer Learning and Gradient-Based Meta-Learning Techniques
Mike Huisman, Aske Plaat, Jan N. van Rijn

TL;DR
This paper compares transfer learning (finetuning a pre-trained network), MAML, and Reptile, and finds that finetuning often outperforms both meta-learning methods on out-of-distribution tasks because it learns more diverse features.
Contribution
It provides an in-depth analysis of why finetuning can outperform meta-learning techniques like MAML and Reptile on out-of-distribution tasks.
Findings
Finetuning yields more diverse and discriminative features than MAML and Reptile.
MAML and Reptile specialize in fast adaptation for similar data distributions.
Out-of-distribution generalization favors finetuning due to feature diversity.
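The difference between the three training procedures compared in these findings can be illustrated with a toy sketch. This is not the authors' code. It assumes a linear model with squared-error loss, uses the first-order approximation of MAML rather than full second-order MAML, and every function name and hyperparameter in it is hypothetical.

```python
import numpy as np

def loss_grad(w, X, y):
    """Gradient of the mean squared error of a linear model X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def inner_adapt(w, X, y, lr=0.05, steps=5):
    """Task-specific adaptation: a few plain gradient steps starting from w."""
    w = w.copy()
    for _ in range(steps):
        w = w - lr * loss_grad(w, X, y)
    return w

def pretrain_step(w, tasks, lr=0.05):
    """Finetuning baseline, pre-training phase (assumed form):
    one gradient step on data pooled across tasks, with no inner loop."""
    X = np.vstack([np.vstack([t[0], t[2]]) for t in tasks])
    y = np.concatenate([np.concatenate([t[1], t[3]]) for t in tasks])
    return w - lr * loss_grad(w, X, y)

def reptile_step(w, task, meta_lr=0.1):
    """Reptile: move the initialization toward the task-adapted weights."""
    Xs, ys, _, _ = task
    return w + meta_lr * (inner_adapt(w, Xs, ys) - w)

def fomaml_step(w, task, meta_lr=0.1):
    """First-order MAML (a simplification of MAML): adapt on the support
    set, then update the initialization with the query-set gradient
    evaluated at the adapted weights."""
    Xs, ys, Xq, yq = task
    w_adapted = inner_adapt(w, Xs, ys)
    return w - meta_lr * loss_grad(w_adapted, Xq, yq)

# Hypothetical toy task: tuple (X_support, y_support, X_query, y_query).
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
X = rng.normal(size=(20, 2))
y = X @ w_true
task = (X[:10], y[:10], X[10:], y[10:])

w0 = np.zeros(2)
w_pre = pretrain_step(w0, [task])
w_rep = reptile_step(w0, task)
w_maml = fomaml_step(w0, task)
```

In this sketch the pre-training step optimizes the weights directly for performance on pooled data, while the Reptile and first-order MAML steps optimize the starting point from which a few adaptation steps are taken. At test time all three adapt to a new task in the same way, with `inner_adapt`.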
Abstract
Deep neural networks can yield good performance on various tasks but often require large amounts of data to train them. Meta-learning received considerable attention as one approach to improve the generalization of these networks from a limited amount of data. Whilst meta-learning techniques have been observed to be successful at this in various scenarios, recent results suggest that when evaluated on tasks from a different data distribution than the one used for training, a baseline that simply finetunes a pre-trained network may be more effective than more complicated meta-learning techniques such as MAML, which is one of the most popular meta-learning techniques. This is surprising as the learning behaviour of MAML mimics that of finetuning: both rely on re-using learned features. We investigate the observed performance differences between finetuning, MAML, and another meta-learning…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Advanced Neural Network Applications
Methods: Model-Agnostic Meta-Learning
