Conditional Mutual Information-Based Generalization Bound for Meta Learning

Arezou Rezazadeh; Sharu Theresa Jose; Giuseppe Durisi; Osvaldo Simeone

arXiv:2010.10886·cs.LG·February 9, 2021

TL;DR

This paper develops an information-theoretic generalization bound for meta-learning using conditional mutual information, providing a new way to quantify how well a meta-learner can generalize from limited task data.

Contribution

It extends the CMI framework to meta-learning, deriving an explicit bound based on a meta-supersample and demonstrating its advantages over previous bounds.

Findings

01

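The bound is explicit in two CMI terms, which measure how much the meta-learner output and the base-learner output reveal about which training data were selected, given the entire meta-supersample.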

02

A numerical example illustrates the merits of the bound relative to prior information-theoretic bounds for meta-learning.

03

The approach offers a new perspective on meta-learning generalization performance.

Abstract

Meta-learning optimizes an inductive bias (typically in the form of the hyperparameters of a base-learning algorithm) by observing data from a finite number of related tasks. This paper presents an information-theoretic bound on the generalization performance of any given meta-learner, which builds on the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020). In the proposed extension to meta-learning, the CMI bound involves a training meta-supersample obtained by first sampling $2N$ independent tasks from the task environment, and then drawing $2M$ independent training samples for each sampled task. The meta-training data fed to the meta-learner is modelled as being obtained by randomly selecting $N$ tasks from the available $2N$ tasks and $M$ training samples per task from the available $2M$ training samples per task. The resulting bound is…
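
The sampling procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes the pairwise selection used in the Steinke and Zakynthinou framework (one uniform bit per pair), which the abstract does not spell out. The Gaussian task environment and the names `draw_meta_supersample` and `select_meta_training_set` are hypothetical.

```python
import random


def draw_meta_supersample(n, m, rng):
    """Draw 2N tasks (stored as N pairs) and 2M samples per task (as M pairs)."""
    supersample = []
    for _ in range(n):
        task_pair = []
        for _ in range(2):
            # Hypothetical task environment: a task is just a Gaussian mean.
            mu = rng.gauss(0.0, 1.0)
            # 2M independent samples for this task, stored as M pairs.
            pairs = [(rng.gauss(mu, 1.0), rng.gauss(mu, 1.0)) for _ in range(m)]
            task_pair.append(pairs)
        supersample.append(task_pair)
    return supersample


def select_meta_training_set(supersample, rng):
    """Pick one task per task pair and one sample per sample pair.

    Assumption: selection uses independent uniform bits, one per pair.
    """
    # One bit per task pair: which of the two tasks is used for meta-training.
    task_bits = [rng.randrange(2) for _ in supersample]
    # One bit per sample pair, for every one of the 2N tasks.
    sample_bits = [
        [[rng.randrange(2) for _ in task] for task in task_pair]
        for task_pair in supersample
    ]
    meta_train = []
    for i, task_pair in enumerate(supersample):
        j = task_bits[i]
        chosen = [pair[sample_bits[i][j][k]] for k, pair in enumerate(task_pair[j])]
        meta_train.append(chosen)
    return task_bits, sample_bits, meta_train


rng = random.Random(0)
supersample = draw_meta_supersample(n=3, m=4, rng=rng)
task_bits, sample_bits, meta_train = select_meta_training_set(supersample, rng)
print(len(meta_train), len(meta_train[0]))  # N tasks, M samples per task
```

In this sketch, `task_bits` and `sample_bits` play the role of the random selection variables: a CMI-style bound asks how much a learner's output reveals about these bits once the whole meta-supersample is known.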
