Information-Theoretic Bounds on Transfer Generalization Gap Based on Jensen-Shannon Divergence

Sharu Theresa Jose; Osvaldo Simeone

arXiv:2010.09484 · cs.LG · January 26, 2021

TL;DR

This paper derives novel information-theoretic bounds on the transfer generalization gap using generalized Jensen-Shannon divergences, capturing domain shift and sensitivity of the transfer learner, applicable even with unbounded loss functions.

Contribution

The paper introduces new bounds based on generalized Jensen-Shannon divergences that handle domain shift and unbounded losses, improving upon previous KL divergence-based bounds.

Findings

01. The bounds are tighter when the domain shift is significant.

02. The bounds apply to unbounded loss functions, provided their cumulant generating functions are bounded.

03. A numerical example demonstrates the effectiveness of the bounds.

Abstract

In transfer learning, training and testing data sets are drawn from different data distributions. The transfer generalization gap is the difference between the population loss on the target data distribution and the training loss. The training data set generally includes data drawn from both source and target distributions. This work presents novel information-theoretic upper bounds on the average transfer generalization gap that capture (i) the domain shift between the target data distribution $P'_Z$ and the source distribution $P_Z$ through a two-parameter family of generalized $(\alpha_1, \alpha_2)$-Jensen-Shannon (JS) divergences; and (ii) the sensitivity of the transfer learner output $W$ to each individual sample $Z_i$ of the data set via the mutual information $I(W; Z_i)$. For $\alpha_1 \in (0, 1)$, the $(\alpha_1, \alpha_2)$-JS divergence can be bounded even when the support of…
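The contrast between KL-based and JS-based measures of domain shift can be illustrated with a small sketch. It uses the standard (symmetric, equal-weight) JS divergence, not the paper's two-parameter $(\alpha_1, \alpha_2)$ family, whose exact definition is not given in this excerpt. The two discrete distributions and the function names are hypothetical, chosen so that neither support contains the other.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as equal-length lists.

    Returns math.inf when p puts mass on an outcome where q has none.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with zero mass under p contribute nothing
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def js_divergence(p, q):
    """Standard JS divergence: average KL of p and q to their equal mixture."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical target and source distributions over three outcomes.
# Each has an outcome that the other assigns zero probability.
target = [0.5, 0.5, 0.0]
source = [0.0, 0.5, 0.5]

kl_value = kl_divergence(source, target)  # infinite: supports do not match
js_value = js_divergence(target, source)  # finite, at most log(2)

print("KL:", kl_value)
print("JS:", js_value, "upper limit log(2):", math.log(2))
```

Because the mixture places mass wherever either distribution does, both KL terms inside the JS divergence stay finite, which is why the standard JS divergence never exceeds $\log 2$ while the KL divergence diverges for this pair.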

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Gaussian Processes and Bayesian Inference · Machine Learning and Algorithms