Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
Jivat Neet Kaur, Emre Kiciman, Amit Sharma

TL;DR
This paper argues that out-of-distribution generalization requires modeling the data-generating process. It introduces a causal framework for characterizing distribution shifts and proposes an adaptive algorithm that improves accuracy across diverse distribution shifts.
Contribution
The paper provides a formal causal characterization of distribution shifts and develops Causally Adaptive Constraint Minimization (CACM), an adaptive method that selects the appropriate independence constraints based on the data-generating process.
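The idea that the right independence constraint depends on the data-generating process can be illustrated with a toy simulation. The sketch below is not the paper's canonical causal graph or the CACM algorithm. It assumes a hypothetical setup with one causal feature `x_c`, a binary label `y` caused by `x_c`, and a binary spurious attribute `a` that is either independent of the label or generated from it. The function names and all numeric settings are illustrative assumptions.

```python
import numpy as np


def simulate(attribute_depends_on_label, n=20000, seed=0):
    """Draw a toy dataset (hypothetical setup, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    x_c = rng.normal(size=n)  # causal feature
    # The label is caused by x_c plus independent noise.
    y = (x_c + 0.5 * rng.normal(size=n) > 0).astype(int)
    if attribute_depends_on_label:
        # The attribute copies the label, flipped 10% of the time.
        flip = rng.random(n) < 0.1
        a = np.where(flip, 1 - y, y)
    else:
        # The attribute is a fair coin, unrelated to x_c and y.
        a = rng.integers(0, 2, size=n)
    return x_c, y, a


def marginal_corr(x, a):
    """Correlation between the feature and the attribute over all samples."""
    return float(np.corrcoef(x, a)[0, 1])


def conditional_corr(x, a, y):
    """Mean absolute feature-attribute correlation within each label group."""
    per_label = [abs(np.corrcoef(x[y == k], a[y == k])[0, 1]) for k in (0, 1)]
    return float(np.mean(per_label))


if __name__ == "__main__":
    for depends in (False, True):
        x_c, y, a = simulate(attribute_depends_on_label=depends)
        print(
            f"attribute depends on label: {depends} | "
            f"marginal corr: {marginal_corr(x_c, a):.3f} | "
            f"corr given label: {conditional_corr(x_c, a, y):.3f}"
        )
```

In this toy setup, when the attribute is generated from the label, the causal feature is correlated with the attribute marginally but is approximately uncorrelated with it within each label group. A regularizer that forced the learned representation to be unconditionally independent of the attribute would therefore penalize the causal feature itself, while a constraint conditioned on the label would not. When the attribute is independent of the label, both constraints are compatible with the causal feature.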
Findings
CACM outperforms fixed-constraint methods on multiple datasets.
Modeling causal relationships improves out-of-distribution generalization.
Incorrect constraints lead to poorer performance on unseen domains.
Abstract
Recent empirical studies on domain generalization (DG) have shown that DG algorithms that perform well on some distribution shifts fail on others, and no state-of-the-art DG algorithm performs consistently well on all shifts. Moreover, real-world data often has multiple distribution shifts over different attributes; hence we introduce multi-attribute distribution shift datasets and find that the accuracy of existing DG algorithms falls even further. To explain these results, we provide a formal characterization of generalization under multi-attribute shifts using a canonical causal graph. Based on the relationship between spurious attributes and the classification label, we obtain realizations of the canonical causal graph that characterize common distribution shifts and show that each shift entails different independence constraints over observed variables. As a result, we prove that…
Taxonomy
Topics: Machine Learning and Data Classification · AI-based Problem Solving and Planning
