An Information-Theoretic Analysis of the Impact of Task Similarity on Meta-Learning
Sharu Theresa Jose, Osvaldo Simeone

TL;DR
This paper introduces new information-theoretic bounds on the meta-generalization gap in meta-learning. The bounds explicitly account for task similarity, measured via KL and JS divergences, as well as the number of tasks and the number of data samples per task.
Contribution
The paper derives novel bounds on the meta-generalization gap that depend explicitly on task relatedness, the number of tasks, and the number of data samples per task.
Findings
The bounds explicitly incorporate task similarity measures based on the KL and JS divergences.
Task relatedness significantly affects the meta-generalization gap.
The bounds are illustrated with a ridge regression example.
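To make the two similarity measures concrete, here is a minimal sketch of the KL and JS divergences for discrete distributions. It is not taken from the paper: the function names and the toy "task" distributions are assumptions chosen only for illustration, and the paper applies these divergences to per-task data distributions rather than to hand-picked probability vectors.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                # p puts mass where q has none: the divergence is unbounded.
                return math.inf
            total += pi * math.log(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2) nats."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)

# Hypothetical per-task data distributions over a binary outcome.
task_a = [0.5, 0.5]
task_b = [0.9, 0.1]
task_c = [1.0, 0.0]

if __name__ == "__main__":
    print("KL(a || b) =", kl_divergence(task_a, task_b))
    print("KL(a || c) =", kl_divergence(task_a, task_c))
    print("JS(a, c)   =", js_divergence(task_a, task_c))
```

The KL divergence is asymmetric and can be infinite when one task assigns zero probability to an outcome the other can produce, whereas the JS divergence stays finite and symmetric, so the two measures can rank task similarity differently.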
Abstract
Meta-learning aims at optimizing the hyperparameters of a model class or training algorithm from the observation of data from a number of related tasks. Following the setting of Baxter [1], the tasks are assumed to belong to the same task environment, which is defined by a distribution over the space of tasks and by per-task data distributions. The statistical properties of the task environment thus dictate the similarity of the tasks. The goal of the meta-learner is to ensure that the hyperparameters obtain a small loss when applied for training of a new task sampled from the task environment. The difference between the resulting average loss, known as meta-population loss, and the corresponding empirical loss measured on the available data from related tasks, known as meta-generalization gap, is a measure of the generalization capability of the meta-learner. In this paper, we present…
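The quantities named in the abstract can be written out as follows. This is only a sketch: the notation ($u$ for the hyperparameters, $P_T$ for the task distribution, $P_{Z|\tau}$ for the per-task data distribution, $N$ tasks with $m$ samples each, and the per-task losses $\ell$ and $\hat{\ell}$) is assumed here for illustration and may differ from the paper's own.

```latex
% Meta-population loss: expected loss on a new task drawn from the environment,
% after training on m samples with hyperparameters u.
\mathcal{L}_g(u) = \mathbb{E}_{\tau \sim P_T}\,
                   \mathbb{E}_{Z^m \sim P^m_{Z|\tau}}
                   \big[ \ell(u; \tau, Z^m) \big]

% Empirical (meta-training) loss: average over the N observed tasks.
\mathcal{L}_t(u \mid Z^m_{1:N}) = \frac{1}{N} \sum_{i=1}^{N} \hat{\ell}(u; Z^m_i)

% Meta-generalization gap: the difference between the two.
\Delta\mathcal{L}(u \mid Z^m_{1:N}) = \mathcal{L}_g(u) - \mathcal{L}_t(u \mid Z^m_{1:N})
```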
