Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Shahine Bouabid; Jake Fawkes; Dino Sejdinovic

arXiv:2301.11214 · stat.ML · June 22, 2023

TL;DR

This paper introduces collider regression, a framework that uses the independences implied by collider structures in a causal DAG as an inductive bias for regression, constraining the hypothesis space and improving predictive performance.

Contribution

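The paper presents a framework for incorporating probabilistic causal knowledge from a collider into a regression problem. When the hypothesis space is a reproducing kernel Hilbert space, it proves a strictly positive generalisation benefit under mild assumptions and provides closed-form estimators of the empirical risk minimiser.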

Findings

01 Collider regression improves predictive performance on synthetic data.

02 The method also yields performance gains on climate model data.

03 When the hypothesis space is a reproducing kernel Hilbert space, the paper proves a strictly positive generalisation benefit under mild assumptions.

Abstract

A directed acyclic graph (DAG) provides valuable prior knowledge that is often discarded in regression tasks in machine learning. We show that the independences arising from the presence of collider structures in DAGs provide meaningful inductive biases, which constrain the regression hypothesis space and improve predictive performance. We introduce collider regression, a framework to incorporate probabilistic causal knowledge from a collider in a regression problem. When the hypothesis space is a reproducing kernel Hilbert space, we prove a strictly positive generalisation benefit under mild assumptions and provide closed-form estimators of the empirical risk minimiser. Experiments on synthetic and climate model data demonstrate performance gains of the proposed methodology.
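
As a rough illustration of how a collider can constrain a regression, consider a hypothetical graph X1 -> X2 <- Y (the variable names are assumptions, not taken from this page). In such a graph X1 and Y are marginally independent, so by the tower property the regression function f(x1, x2) = E[Y | X1 = x1, X2 = x2] satisfies E[f(X1, X2) | X1] = E[Y | X1] = E[Y]. The sketch below enforces this constraint in the simplest possible way: it fits an ordinary least-squares model and then removes the part of its predictions that is explained by X1. It is a linear stand-in for intuition only, not the RKHS estimator described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear-Gaussian collider X1 -> X2 <- Y (an assumption for illustration).
# X1 and Y are drawn independently, and X2 depends on both.
n = 50
x1 = rng.normal(size=n)
y = rng.normal(size=n)
x2 = x1 + y + 0.5 * rng.normal(size=n)


def ols_fit(features, target):
    """Return least-squares coefficients, with an intercept column prepended."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def ols_predict(coef, features):
    """Evaluate a fitted model (intercept first) on the given features."""
    design = np.column_stack([np.ones(features.shape[0]), features])
    return design @ coef


# Step 1: unconstrained regression of Y on (X1, X2).
features = np.column_stack([x1, x2])
f_hat = ols_predict(ols_fit(features, y), features)

# Step 2: estimate E[f_hat(X1, X2) | X1] with a linear regression of f_hat on X1.
g_hat = ols_predict(ols_fit(x1, f_hat), x1)

# Step 3: subtract that estimate and restore the marginal mean of Y, so the
# corrected predictions carry no in-sample linear dependence on X1.
f_proj = f_hat - g_hat + y.mean()

# In-sample covariance between the corrected predictions and X1.
residual_cov = np.mean((f_proj - f_proj.mean()) * (x1 - x1.mean()))
print("covariance with X1 after correction:", residual_cov)
```

The correction in step 3 only removes linear dependence on X1, which is adequate here because the simulated data are linear and Gaussian. A nonlinear setting would need a more flexible estimate of the conditional expectation in step 2.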

Code & Models

Repositories

shahineb/collider-regression (PyTorch, official)

Videos

Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge · SlidesLive

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Multi-Criteria Decision Making · Machine Learning and Data Classification