Generalization in anti-causal learning
Niki Kilbertus, Giambattista Parascandolo, Bernhard Schölkopf

TL;DR
This paper argues that incorporating causal models into supervised learning enhances generalization by enabling hypothesis validation, especially in anti-causal tasks where traditional methods often fail beyond standard i.i.d. settings.
Contribution
It introduces a framework emphasizing the importance of causal models for hypothesis search and validation in anti-causal learning, challenging current inference-only approaches.
Findings
Anti-causal tasks require causal models for effective generalization.
Incorporating causal validation improves robustness against adversarial attacks.
Theoretical arguments and evidence from the literature support integrating causal models into supervised learning.
Abstract
The ability to learn and act in novel situations is still a prerogative of animate intelligence, as current machine learning methods mostly fail when moving beyond the standard i.i.d. setting. What is the reason for this discrepancy? Most machine learning tasks are anti-causal, i.e., we infer causes (labels) from effects (observations). Typically, in supervised learning we build systems that try to directly invert causal mechanisms. Instead, in this paper we argue that strong generalization capabilities crucially hinge on searching and validating meaningful hypotheses, requiring access to a causal model. In such a framework, we want to find a cause that leads to the observed effect. Anti-causal models are used to drive this search, but a causal model is required for validation. We investigate the fundamental differences between causal and anti-causal tasks, discuss implications for…
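The search-and-validate idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy mechanism, the names `PROTOTYPES`, `causal_model`, `anti_causal_proposals`, and `infer_with_validation`, and the tolerance `tol` are all assumptions made for illustration. An imperfect anti-causal scorer proposes candidate causes, and a forward causal model checks whether each candidate actually reproduces the observation.

```python
import numpy as np

# Hypothetical toy causal mechanism: each cause (label) y generates an effect x
# that is a fixed prototype plus small noise. Not taken from the paper.
PROTOTYPES = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])

def causal_model(y):
    """Forward direction (cause -> effect): expected observation for cause y."""
    return PROTOTYPES[y]

def anti_causal_proposals(x, k=2):
    """Inverse direction (effect -> cause): a cheap, imperfect scorer that
    ranks candidate causes. Here it only looks at the first coordinate of x."""
    distance = np.abs(PROTOTYPES[:, 0] - x[0])
    return np.argsort(distance, kind="stable")[:k]  # closest candidates first

def infer_with_validation(x, tol=1.0):
    """Search with the anti-causal model, validate with the causal model."""
    best, best_err = None, np.inf
    for y in anti_causal_proposals(x):
        err = np.linalg.norm(causal_model(y) - x)  # does cause y explain x?
        if err < best_err:
            best, best_err = int(y), err
    # Return None when no proposed cause reproduces the observation.
    return best if best_err <= tol else None

x = np.array([0.1, 3.9])
print(anti_causal_proposals(x)[0])   # top proposal of the anti-causal scorer alone
print(infer_with_validation(x))      # proposal that survives causal validation
print(infer_with_validation(np.array([10.0, 10.0])))  # nothing validates
```

In this toy setting the anti-causal scorer alone ranks the wrong cause first for `x`, while the validation step selects the candidate whose forward prediction matches the observation and abstains on an input that no candidate explains.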
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning
