Does Invariant Risk Minimization Capture Invariance?
Pritish Kamath, Akilesh Tangella, Danica J. Sutherland, and Nathan Srebro

TL;DR
This paper critically evaluates Invariant Risk Minimization (IRM), revealing its limitations in capturing true invariances and its fragility to sampling, which can impair generalization in simple problems.
Contribution
It demonstrates that the practical linear IRM can fail to capture invariances and may learn sub-optimal predictors, highlighting gaps between linear and non-linear IRM formulations.
Findings
Linear IRM can fail to capture invariances.
IRM may learn sub-optimal predictors.
IRM is highly sensitive to sampling variability.
Abstract
We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. (2019) can fail to capture "natural" invariances, at least when used in its practical "linear" form, and even on very simple problems which directly follow the motivating examples for IRM. This can lead to worse generalization on new environments, even when compared to unconstrained ERM. The issue stems from a significant gap between the linear variant (as in their concrete method IRMv1) and the full non-linear IRM formulation. Additionally, even when capturing the "right" invariances, we show that it is possible for IRM to learn a sub-optimal predictor, due to the loss function not being invariant across environments. The issues arise even when measuring invariance on the population distributions, but are exacerbated by the fact that IRM is extremely fragile to sampling.
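To make the "linear" form mentioned in the abstract concrete, here is a minimal sketch of the IRMv1-style objective: the sum over environments of each environment's risk plus a penalty equal to the squared gradient of that risk with respect to a scalar multiplier w on the predictor's output, evaluated at w = 1. The squared loss, the NumPy implementation, and the names `irmv1_penalty`, `irmv1_objective`, and `lam` are assumptions made for illustration; this is not the authors' code.

```python
import numpy as np


def irmv1_penalty(preds, targets):
    """Squared gradient of the squared-loss risk w.r.t. a scalar w, at w = 1.

    The risk is R(w) = mean((w * preds - targets)**2), so
    dR/dw at w = 1 equals mean(2 * (preds - targets) * preds).
    """
    preds = np.asarray(preds, dtype=float)
    targets = np.asarray(targets, dtype=float)
    grad = np.mean(2.0 * (preds - targets) * preds)
    return grad ** 2


def irmv1_objective(env_preds, env_targets, lam=1.0):
    """Sum over environments of (risk + lam * penalty).

    env_preds and env_targets are sequences with one array per environment.
    lam is a hypothetical name for the penalty weight.
    """
    total = 0.0
    for preds, targets in zip(env_preds, env_targets):
        preds = np.asarray(preds, dtype=float)
        targets = np.asarray(targets, dtype=float)
        risk = np.mean((preds - targets) ** 2)
        total += risk + lam * irmv1_penalty(preds, targets)
    return total
```

In this sketch, an environment with predictions [1, 1] and targets [0, 0] has risk 1 and penalty 4, while a predictor that fits an environment exactly has penalty 0. The penalty only checks that a single scalar rescaling of the output cannot lower the risk in any environment, which is a much weaker condition than the full non-linear IRM constraint; this is the gap between the two formulations that the abstract refers to.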
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference
