Learning under Distribution Mismatch and Model Misspecification

Saeed Masiha, Amin Gohari, Mohammad Hossein Yassaee, and Mohammad Reza Aref

arXiv:2102.05695·cs.IT·August 11, 2022·1 cites

Open Access

TL;DR

This paper investigates the impact of distribution mismatch and model misspecification on learning algorithms, connecting generalization error with rate-distortion theory to derive improved bounds and exploring auxiliary loss functions for better error control.

Contribution

It introduces a novel connection between generalization error and rate-distortion theory, providing improved bounds and methods to handle distribution mismatch and model misspecification.

Findings

01

The rate-distortion-based bound strictly improves on the earlier bound by Xu and Raginsky, even when there is no distribution mismatch.

02

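The connection between generalization error and rate-distortion theory works in both directions: bounds from rate-distortion theory yield new bounds on the generalization error, and vice versa.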

03

Auxiliary loss functions can be used to bound generalization error.

Abstract

We study learning algorithms when there is a mismatch between the distributions of the training and test datasets. The effect of this mismatch on the generalization error and model misspecification are quantified. Moreover, we provide a connection between the generalization error and rate-distortion theory, which allows one to use bounds from rate-distortion theory to derive new bounds on the generalization error and vice versa. In particular, the rate-distortion-based bound strictly improves over the earlier bound by Xu and Raginsky even when there is no mismatch. We also discuss how "auxiliary loss functions" can be used to obtain upper bounds on the generalization error.
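For context, the abstract refers to the bound by Xu and Raginsky without restating it. In its standard form (background recalled here, not a result of this paper), if the loss is sigma-subgaussian under the data distribution for every hypothesis, the absolute expected generalization error of an algorithm trained on n i.i.d. samples S with output W is at most sqrt(2 * sigma^2 * I(S;W) / n), where I(S;W) is the mutual information in nats. The sketch below only evaluates that baseline numerically; the function name and the example values are illustrative assumptions, and it does not implement the rate-distortion-based bound from the paper.

```python
import math


def xu_raginsky_bound(sigma: float, mutual_info_nats: float, n: int) -> float:
    """Evaluate sqrt(2 * sigma^2 * I(S;W) / n).

    sigma            -- subgaussian parameter of the loss (assumed known)
    mutual_info_nats -- I(S;W) in nats between training set S and output W
    n                -- number of i.i.d. training samples
    """
    if sigma < 0 or mutual_info_nats < 0 or n <= 0:
        raise ValueError("need sigma >= 0, mutual information >= 0, and n >= 1")
    return math.sqrt(2.0 * sigma ** 2 * mutual_info_nats / n)


if __name__ == "__main__":
    # Illustrative values only: with I(S;W) held fixed, the bound decays as 1/sqrt(n).
    for n in (100, 1_000, 10_000):
        print(n, xu_raginsky_bound(sigma=1.0, mutual_info_nats=2.0, n=n))
```

The bound is informative only when I(S;W) grows more slowly than n; a comparison against the paper's improved bound would need the expressions from the full text.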

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Machine Learning and ELM