Statistical Learning for Heterogeneous Treatment Effects: Pretraining, Prognosis, and Prediction

Maximilian Schuessler; Erik Sverdrup; Robert Tibshirani

arXiv:2505.00310 · stat.ML · June 23, 2025

TL;DR

This paper introduces pretraining strategies that leverage the relationship between prognostic factors and treatment effect heterogeneity to improve the estimation of conditional average treatment effects using the R-learner framework.

Contribution

The paper proposes a pretraining approach that exploits the frequent overlap between factors that are prognostic of the outcome and factors that drive treatment effect heterogeneity, with the aim of improving conditional average treatment effect (CATE) estimation.

Findings

01. Pretraining improves the accuracy of CATE estimates.

02. The approach reduces false discovery rates in heterogeneity detection.

03. Models show higher power in identifying treatment effect heterogeneity.

Abstract

Robust estimation of heterogeneous treatment effects is a fundamental challenge for optimal decision-making in domains ranging from personalized medicine to educational policy. In recent years, predictive machine learning has emerged as a valuable toolbox for causal estimation, enabling more flexible effect estimation. However, accurately estimating conditional average treatment effects (CATE) remains a major challenge, particularly in the presence of many covariates. In this article, we propose pretraining strategies that leverage a phenomenon in real-world applications: factors that are prognostic of the outcome are frequently also predictive of treatment effect heterogeneity. In medicine, for example, components of the same biological signaling pathways frequently influence both baseline risk and treatment response. Specifically, we demonstrate our approach within the R-learner…
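
For readers unfamiliar with the R-learner framework named in the abstract, the following is a minimal sketch of the generic residual-on-residual idea only. It does not reproduce the paper's pretraining strategy. Everything in it is an illustrative assumption: the simulated data, the known treatment probability of 0.5 (a randomized trial), the linear form of the CATE, and the hypothetical helper names `fit_two_terms`, `simulate`, and `r_learner_linear`. It uses only the Python standard library.

```python
import random


def fit_two_terms(u, v, y):
    """Least-squares fit of y ~ a*u + b*v (no intercept); returns (a, b)."""
    suu = sum(ui * ui for ui in u)
    svv = sum(vi * vi for vi in v)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    suy = sum(ui * yi for ui, yi in zip(u, y))
    svy = sum(vi * yi for vi, yi in zip(v, y))
    det = suu * svv - suv * suv
    return (svv * suy - suv * svy) / det, (suu * svy - suv * suy) / det


def simulate(n, seed=0):
    """Toy randomized trial (assumed): baseline 3x, true CATE tau(x) = 1 + 2x."""
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    w = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    y = [3 * xi + wi * (1 + 2 * xi) + rng.gauss(0, 1) for xi, wi in zip(x, w)]
    return x, w, y


def r_learner_linear(x, w, y, e=0.5, n_folds=2):
    """R-learner with a linear CATE model tau(x) = t0 + t1*x."""
    n = len(x)
    # Step 1: cross-fitted estimate of m(x) = E[Y | X = x] via a linear fit.
    m_hat = [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        a, b = fit_two_terms(
            [1.0] * len(train), [x[i] for i in train], [y[i] for i in train]
        )
        for i in range(k, n, n_folds):  # predict only on the held-out fold
            m_hat[i] = a + b * x[i]
    # Step 2: residualize the outcome and the treatment.
    y_res = [yi - mi for yi, mi in zip(y, m_hat)]
    w_res = [wi - e for wi in w]
    # Step 3: minimize sum((y_res - w_res * (t0 + t1*x))^2) over (t0, t1).
    return fit_two_terms(w_res, [wr * xi for wr, xi in zip(w_res, x)], y_res)


x, w, y = simulate(4000)
tau0, tau1 = r_learner_linear(x, w, y)
print(f"estimated CATE: tau(x) = {tau0:.2f} + {tau1:.2f} * x (truth: 1 + 2x)")
```

In this toy setup the covariate `x` is both prognostic (it shifts the baseline outcome) and predictive of the treatment effect, which is the kind of overlap the abstract describes. With many covariates, the final least-squares step would typically be replaced by a regularized or otherwise more flexible regression.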
