Sample Selection Bias in Evaluation of Prediction Performance of Causal Models

James P. Long; Min Jin Ha

arXiv:2106.01921·stat.ML·October 27, 2021

TL;DR

This paper examines how sample selection bias affects the evaluation of causal models' prediction performance. It shows that biased training sets can produce overly optimistic assessments and proposes evaluating models on a less-biased set instead.

Contribution

It identifies sample selection bias as a likely key driver of the measured prediction performance of causal models and proposes a less-biased evaluation set for more accurate performance assessment.

Findings

01

Sample selection bias inflates causal model performance estimates.

02

Causal models perform similarly to or worse than standard estimators on the less-biased evaluation set.

03

Simulations without bias show different performance patterns, informing future evaluations.

Abstract

Causal models are notoriously difficult to validate because they make untestable assumptions regarding confounding. New scientific experiments offer the possibility of evaluating causal models using prediction performance. Prediction performance measures are typically robust to violations of causal assumptions. However, prediction performance does depend on the selection of training and test sets. Biased training sets can lead to optimistic assessments of model performance. In this work, we revisit the prediction performance of several recently proposed causal models tested on a genetic perturbation data set of Kemmeren. We find that sample selection bias is likely a key driver of model performance. We propose using a less-biased evaluation set for assessing prediction performance and compare models on this new set. In this setting, the causal models have similar or worse performance…
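The general mechanism the abstract describes, that how samples are selected can make a model look better than it is, can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the paper's procedure or data: every candidate has a true effect of zero, a hypothetical "model" flags the candidates with the largest noisy scores, and the function name `hit_rates`, the threshold, and all sizes are made up for illustration. Scoring the flagged candidates against the same noisy measurements that selected them gives a high apparent hit rate, while scoring them against independent measurements gives roughly the base rate.

```python
import random


def hit_rates(n_candidates=2000, top_k=50, threshold=1.0, seed=0):
    """Toy illustration of selection bias in evaluation (hypothetical setup).

    Returns (biased_rate, unbiased_rate): the fraction of flagged candidates
    counted as hits when scored on the selection data versus on independent data.
    """
    rng = random.Random(seed)

    # Assumption: every candidate has true effect 0, so both sets of
    # measurements are pure standard-normal noise.
    selection_scores = [rng.gauss(0.0, 1.0) for _ in range(n_candidates)]
    independent_scores = [rng.gauss(0.0, 1.0) for _ in range(n_candidates)]

    # The "model" flags the top_k candidates by their selection scores.
    ranked = sorted(range(n_candidates),
                    key=lambda i: selection_scores[i],
                    reverse=True)
    flagged = ranked[:top_k]

    # Biased evaluation: reuse the measurements that drove the selection.
    biased = sum(selection_scores[i] > threshold for i in flagged) / top_k

    # Less-biased evaluation: score the same candidates on independent data.
    unbiased = sum(independent_scores[i] > threshold for i in flagged) / top_k

    return biased, unbiased


biased_rate, unbiased_rate = hit_rates()
print(f"hit rate on selection data:   {biased_rate:.2f}")
print(f"hit rate on independent data: {unbiased_rate:.2f}")
```

Because no candidate has a real effect here, any gap between the two rates is produced entirely by the selection step, which is why the choice of evaluation set matters when comparing models.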

Code & Models

Repositories

longjp/causal-bias-code (Official)