Information-Theoretic Bounds on Transfer Generalization Gap Based on Jensen-Shannon Divergence
Sharu Theresa Jose, Osvaldo Simeone

TL;DR
This paper derives novel information-theoretic bounds on the transfer generalization gap using generalized Jensen-Shannon divergences, capturing domain shift and sensitivity of the transfer learner, applicable even with unbounded loss functions.
Contribution
It introduces new bounds based on generalized Jensen-Shannon divergence that handle domain shift and unbounded losses, improving upon previous KL divergence-based bounds.
Findings
Bounds are tighter when domain shift is significant.
Applicable to unbounded loss functions with bounded cumulant generating functions.
Numerical example demonstrates the bounds' effectiveness.
Abstract
In transfer learning, training and testing data sets are drawn from different data distributions. The transfer generalization gap is the difference between the population loss on the target data distribution and the training loss. The training data set generally includes data drawn from both source and target distributions. This work presents novel information-theoretic upper bounds on the average transfer generalization gap that capture (i) the domain shift between the target data distribution and the source distribution through a two-parameter family of generalized $(\alpha_1,\alpha_2)$-Jensen-Shannon (JS) divergences; and (ii) the sensitivity of the transfer learner output $W$ to each individual sample $Z_i$ of the data set via the mutual information $I(W;Z_i)$. For $\alpha_1 \in (0,1)$, the $(\alpha_1,\alpha_2)$-JS divergence can be bounded even when the support of…
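The abstract's point that a JS-type divergence can stay bounded under support mismatch, where the KL divergence does not, can be illustrated on a toy discrete example. The sketch below is only an illustration under stated assumptions: the function `generalized_js` uses one plausible two-parameter form (a weighted sum of KL divergences to an `alpha1`-mixture), which may differ from the paper's exact definition, and the distributions `target` and `source` are made up for this example. With `alpha1 = alpha2 = 0.5` it reduces to the standard JS divergence.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as lists.

    Returns math.inf when p puts mass where q has none.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf  # support of p not contained in support of q
        total += pi * math.log(pi / qi)
    return total


def generalized_js(p, q, alpha1=0.5, alpha2=0.5):
    """Assumed two-parameter JS-type divergence (hypothetical form):

        alpha2 * KL(p || m) + (1 - alpha2) * KL(q || m),
        m = alpha1 * p + (1 - alpha1) * q.

    For alpha1 in (0, 1) the mixture m covers the support of both p and q,
    so both KL terms are finite.
    """
    mix = [alpha1 * pi + (1.0 - alpha1) * qi for pi, qi in zip(p, q)]
    return alpha2 * kl_divergence(p, mix) + (1.0 - alpha2) * kl_divergence(q, mix)


# Made-up target and source distributions with only partially overlapping support.
target = [0.5, 0.5, 0.0]
source = [0.0, 0.5, 0.5]

kl_value = kl_divergence(target, source)   # infinite: target has mass where source has none
js_value = generalized_js(target, source)  # finite: standard JS divergence at (0.5, 0.5)

print("KL(target || source) =", kl_value)
print("JS(target, source)   =", js_value)
```

In this toy case the KL divergence is infinite, while the symmetric JS divergence is finite and cannot exceed log 2 nats.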
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Gaussian Processes and Bayesian Inference · Machine Learning and Algorithms
