Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization
Kartik Ahuja, Ethan Caballero, Dinghuai Zhang, Jean-Christophe Gagnon-Audet, Yoshua Bengio, Ioannis Mitliagkas, Irina Rish

TL;DR
This paper investigates the limitations of the invariance principle in out-of-distribution generalization for classification tasks and proposes combining it with an information bottleneck constraint to improve robustness.
Contribution
The paper shows that invariance alone is insufficient for OOD generalization in classification tasks and introduces an approach that combines invariance with an information bottleneck constraint to improve generalization.
Findings
Invariance alone is insufficient for OOD generalization in classification.
Linear classification needs much stronger restrictions on distribution shifts than linear regression; without them, OOD generalization is impossible.
Combining invariance with an information bottleneck improves OOD robustness.
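As a rough illustration of how an invariance penalty and an information bottleneck penalty can be combined into one training objective, here is a minimal sketch. It rests on several assumptions that are not stated in this summary: the invariance term is an IRMv1-style squared gradient with respect to a dummy scalar multiplier, the bottleneck term is approximated by the total variance of the representation, and squared error stands in for a classification loss to keep the arithmetic short. The names irm_penalty, ib_penalty, combined_objective, lam_irm and lam_ib are hypothetical.

```python
from statistics import fmean, pvariance


def irm_penalty(preds, labels):
    # Gradient of mean((w * p - y)^2) with respect to a dummy scalar w,
    # evaluated at w = 1, then squared (IRMv1-style penalty, assumed here).
    grad = fmean([2.0 * (p - y) * p for p, y in zip(preds, labels)])
    return grad ** 2


def ib_penalty(features):
    # Assumed proxy for the bottleneck: sum of per-dimension population
    # variances of the representation across the samples.
    columns = list(zip(*features))
    return sum(pvariance(col) for col in columns)


def combined_objective(featurize, classify, envs, lam_irm=1.0, lam_ib=0.1):
    # envs: list of (inputs, labels) pairs, one pair per training environment.
    total = 0.0
    for xs, ys in envs:
        feats = [featurize(x) for x in xs]
        preds = [classify(f) for f in feats]
        risk = fmean([(p - y) ** 2 for p, y in zip(preds, ys)])
        total += risk + lam_irm * irm_penalty(preds, ys) + lam_ib * ib_penalty(feats)
    return total / len(envs)


# Toy usage: an identity representation and a fixed linear predictor.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
featurize = lambda x: list(x)
classify = lambda f: f[0] - f[1]
ys = [classify(featurize(x)) for x in xs]
print(combined_objective(featurize, classify, [(xs, ys)], lam_ib=0.5))
```

In this toy case the predictor fits the labels exactly, so the risk and the invariance penalty both vanish and the objective is driven only by the variance term. That is the intended role of the bottleneck in this sketch: among predictors that look equally invariant, it prefers the one with the more compressed representation.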
Abstract
The invariance principle from causality is at the heart of notable approaches such as invariant risk minimization (IRM) that seek to address out-of-distribution (OOD) generalization failures. Despite the promising theory, invariance principle-based approaches fail in common classification tasks, where invariant (causal) features capture all the information about the label. Are these failures due to the methods failing to capture the invariance? Or is the invariance principle itself insufficient? To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD. In contrast to the linear regression tasks, we show that for linear classification tasks we need much stronger restrictions on the distribution shifts, or otherwise OOD generalization is impossible. Furthermore, even with…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Adversarial Robustness in Machine Learning
Methods: Linear Regression
