Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge
Shahine Bouabid, Jake Fawkes, Dino Sejdinovic

TL;DR
This paper introduces collider regression, a framework that uses the independences implied by a collider structure in a causal DAG as an inductive bias for regression, constraining the hypothesis space to improve predictive performance.
Contribution
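The paper presents a framework for incorporating probabilistic causal knowledge from a collider into a regression problem. When the hypothesis space is a reproducing kernel Hilbert space, it proves a strictly positive generalisation benefit under mild assumptions and provides closed-form estimators of the empirical risk minimiser.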
Findings
Collider regression improves predictive performance on synthetic data.
The method also shows performance gains on climate model data.
A strictly positive generalisation benefit is proven under mild assumptions when the hypothesis space is a reproducing kernel Hilbert space.
Abstract
A directed acyclic graph (DAG) provides valuable prior knowledge that is often discarded in regression tasks in machine learning. We show that the independences arising from the presence of collider structures in DAGs provide meaningful inductive biases, which constrain the regression hypothesis space and improve predictive performance. We introduce collider regression, a framework to incorporate probabilistic causal knowledge from a collider in a regression problem. When the hypothesis space is a reproducing kernel Hilbert space, we prove a strictly positive generalisation benefit under mild assumptions and provide closed-form estimators of the empirical risk minimiser. Experiments on synthetic and climate model data demonstrate performance gains of the proposed methodology.
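To make the inductive bias concrete, the sketch below assumes a collider of the form X1 → X2 ← Y, where Y is the response and (X1, X2) are the covariates. In that graph Y is marginally independent of X1, so by the tower property E[E[Y | X1, X2] | X1] = E[Y], a constraint the regressor can be made to satisfy. The graph, the synthetic data, and the helper names (`rbf_kernel`, `krr_fit`, `collider_projected_predict`) are illustrative assumptions, and the post-hoc projection used here is a simple heuristic, not the closed-form estimator derived in the paper.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def krr_fit(x, y, lam=1e-2):
    """Kernel ridge regression: return the dual weights for targets y."""
    n = len(x)
    gram = rbf_kernel(x, x)
    return np.linalg.solve(gram + n * lam * np.eye(n), y)


def collider_projected_predict(x1_tr, x2_tr, y_tr, x1_te, x2_te, lam=1e-2):
    """Return (plain, projected) test predictions.

    Assumed graph (hypothetical): X1 -> X2 <- Y, hence E[f(X1, X2) | X1]
    should equal E[Y]. The projection subtracts an estimate of
    E[f(X1, X2) | X1] from a plain kernel ridge fit on centred targets.
    """
    x_tr = np.column_stack([x1_tr, x2_tr])
    x_te = np.column_stack([x1_te, x2_te])
    y_mean = y_tr.mean()

    # Step 1: plain kernel ridge regression of centred Y on (X1, X2).
    alpha = krr_fit(x_tr, y_tr - y_mean, lam)
    f_tr = rbf_kernel(x_tr, x_tr) @ alpha
    f_te = rbf_kernel(x_te, x_tr) @ alpha

    # Step 2: estimate g(x1) = E[f(X1, X2) | X1 = x1] by regressing the
    # fitted training values on X1 alone.
    beta = krr_fit(x1_tr[:, None], f_tr, lam)
    g_te = rbf_kernel(x1_te[:, None], x1_tr[:, None]) @ beta

    # Step 3: remove the part of f that is predictable from X1.
    return f_te + y_mean, f_te - g_te + y_mean


# Synthetic collider data: X1 and Y are independent, X2 depends on both.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=2 * n)
y = rng.normal(size=2 * n)
x2 = x1 + y + 0.5 * rng.normal(size=2 * n)

plain, projected = collider_projected_predict(
    x1[:n], x2[:n], y[:n], x1[n:], x2[n:]
)
print("plain KRR test MSE:    ", np.mean((plain - y[n:]) ** 2))
print("projected KRR test MSE:", np.mean((projected - y[n:]) ** 2))
```

Whether the projected predictor beats the plain one on a given draw depends on the sample size, kernel, and regularisation chosen here; the sketch only illustrates how the collider constraint can be imposed on a fitted function.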
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Multi-Criteria Decision Making · Machine Learning and Data Classification
