How Fine-Tuning Allows for Effective Meta-Learning
Kurtland Chua, Qi Lei, Jason D. Lee

TL;DR
This paper provides a theoretical analysis of how fine-tuning in meta-learning, exemplified by MAML, can effectively leverage shared representations across tasks, with risk bounds demonstrating its advantages over frozen representations.
Contribution
The paper introduces a theoretical framework for analyzing MAML-like algorithms, deriving risk bounds for fine-tuning, and comparing their effectiveness to non-fine-tuning methods.
Findings
Risk bounds show fine-tuning can leverage shared structure effectively.
Guarantees instantiated for logistic regression and neural networks.
Existence of settings where, in the worst case, fine-tuning outperforms methods that rely on frozen representations.
Abstract
Representation learning has been widely studied in the context of meta-learning, enabling rapid learning of new tasks through shared representations. Recent works such as MAML have explored using fine-tuning-based metrics, which measure the ease by which fine-tuning can achieve good performance, as proxies for obtaining representations. We present a theoretical framework for analyzing representations derived from a MAML-like algorithm, assuming the available tasks use approximately the same underlying representation. We then provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure. The upper bound applies to general function classes, which we demonstrate by instantiating the guarantees of our framework in the logistic regression and neural network settings. In contrast, we establish…
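To make the contrast between fine-tuning and a frozen representation concrete, here is a minimal sketch in the logistic regression setting. It is not the paper's algorithm or experimental setup. It assumes a linear representation `B` (a d x k matrix) shared approximately across tasks, a task-specific head `w`, and plain gradient descent on the target task. All names (`adapt`, `logistic_loss`, `B_shared`, `B_task`) and all constants are hypothetical choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(B, w, X, y):
    """Mean logistic loss of the predictor x -> sigmoid(x^T B w)."""
    p = sigmoid(X @ B @ w)
    eps = 1e-12  # guards against log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def adapt(B, X, y, steps=300, lr=0.1, fine_tune=True):
    """Fit a task head w by gradient descent, starting from w = 0.

    If fine_tune is True the representation B is updated as well;
    otherwise B stays frozen and only w is trained.
    """
    B = B.copy()
    w = np.zeros(B.shape[1])
    n = len(y)
    for _ in range(steps):
        r = sigmoid(X @ B @ w) - y           # per-example residual, shape (n,)
        grad_w = B.T @ (X.T @ r) / n         # gradient with respect to the head
        grad_B = np.outer(X.T @ r, w) / n    # gradient with respect to the representation
        w -= lr * grad_w
        if fine_tune:
            B -= lr * grad_B
    return B, w

# Synthetic target task (assumed for illustration): its true representation
# is a perturbation of the shared one, so the two are only approximately equal.
rng = np.random.default_rng(0)
d, k, n = 10, 2, 40
B_shared, _ = np.linalg.qr(rng.normal(size=(d, k)))  # d x k, orthonormal columns
B_task = B_shared + 0.3 * rng.normal(size=(d, k))
w_task = rng.normal(size=k)
X = rng.normal(size=(n, d))
y = (X @ B_task @ w_task > 0).astype(float)

B_frozen, w_frozen = adapt(B_shared, X, y, fine_tune=False)
B_tuned, w_tuned = adapt(B_shared, X, y, fine_tune=True)

print("initial loss:   ", logistic_loss(B_shared, np.zeros(k), X, y))
print("frozen B loss:  ", logistic_loss(B_frozen, w_frozen, X, y))
print("fine-tuned loss:", logistic_loss(B_tuned, w_tuned, X, y))
```

The printed values are training losses on a single synthetic task, so they only illustrate the mechanics of the two adaptation modes. They say nothing about the risk bounds or the worst-case comparison that the paper establishes.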
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Logistic Regression · Model-Agnostic Meta-Learning
