Nonlinear Invariant Risk Minimization: A Causal Approach
Chaochao Lu, Yuhuai Wu, José Miguel Hernández-Lobato, Bernhard Schölkopf

TL;DR
This paper introduces iCaRL, a novel nonlinear invariant risk minimization method that leverages causal representation learning to improve out-of-distribution generalization in machine learning models.
Contribution
It proposes a new approach for nonlinear invariant risk minimization based on causal assumptions, enabling identification of causal factors and OOD generalization beyond linear models.
Findings
Outperforms baseline methods on synthetic and real datasets
Identifies all direct causes of the target in nonlinear settings
Provides theoretical guarantees for causal representation recovery
Abstract
Due to spurious correlations, machine learning systems often fail to generalize to environments whose distributions differ from the ones used at training time. Prior work addressing this, either explicitly or implicitly, attempted to find a data representation that has an invariant relationship with the target. This is done by leveraging a diverse set of training environments to reduce the effect of spurious features and build an invariant predictor. However, these methods have generalization guarantees only when both data representation and classifiers come from a linear model class. We propose invariant Causal Representation Learning (iCaRL), an approach that enables out-of-distribution (OOD) generalization in the nonlinear setting (i.e., nonlinear representations and nonlinear classifiers). It builds upon a practical and general assumption: the prior over the data representation…
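As a toy illustration of the abstract's opening point (and not of the iCaRL method itself), the sketch below simulates how a predictor that leans on a spurious feature can fit the training environments well and still degrade when that correlation shifts. Everything in it is an assumption made for illustration: the data-generating process, the environment strengths `a`, and the helper names `make_env`, `fit_ols`, and `mse` do not come from the paper.

```python
import random

def make_env(n, a, rng):
    """Sample n points from a hypothetical environment.

    x1 is a direct cause of y; x2 is an effect of y whose strength `a`
    changes between environments, so it is only spuriously predictive.
    """
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        y = x1 + rng.gauss(0.0, 1.0)
        x2 = a * y + rng.gauss(0.0, 1.0)
        rows.append((x1, x2, y))
    return rows

def fit_ols(rows, use_x2):
    """Least squares without intercept; returns weights (w1, w2)."""
    s11 = sum(x1 * x1 for x1, _, _ in rows)
    s1y = sum(x1 * y for x1, _, y in rows)
    if not use_x2:
        return (s1y / s11, 0.0)
    s12 = sum(x1 * x2 for x1, x2, _ in rows)
    s22 = sum(x2 * x2 for _, x2, _ in rows)
    s2y = sum(x2 * y for _, x2, y in rows)
    # Solve the 2x2 normal equations in closed form.
    det = s11 * s22 - s12 * s12
    w1 = (s22 * s1y - s12 * s2y) / det
    w2 = (s11 * s2y - s12 * s1y) / det
    return (w1, w2)

def mse(rows, w):
    """Mean squared error of the linear predictor w1*x1 + w2*x2."""
    return sum((y - w[0] * x1 - w[1] * x2) ** 2 for x1, x2, y in rows) / len(rows)

rng = random.Random(0)
# Two training environments where x2 tracks y, one test environment where it flips.
train = make_env(5000, 1.0, rng) + make_env(5000, 0.8, rng)
test = make_env(5000, -1.0, rng)

w_all = fit_ols(train, use_x2=True)      # uses the spurious feature
w_causal = fit_ols(train, use_x2=False)  # uses only the causal feature

print("train MSE (all features):", mse(train, w_all))
print("train MSE (causal only): ", mse(train, w_causal))
print("test  MSE (all features):", mse(test, w_all))
print("test  MSE (causal only): ", mse(test, w_causal))
```

Under these assumptions, the fit that uses both features should look better on the pooled training data and worse on the shifted test environment, while the causal-only fit should stay roughly stable. This is a linear toy; the paper's concern is the nonlinear setting, which this sketch does not address.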
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification
