Honesty in Causal Forests: When It Helps and When It Hurts
Yanfang Hou, Carlos Fernández-Loría

TL;DR
This paper examines honest estimation in causal forests and shows that, while it reduces overfitting, it can also hinder the detection of treatment effect heterogeneity, especially with rich data, so it should be used judiciously.
Contribution
It provides empirical evidence that honest estimation can impair individual treatment effect accuracy and frames honesty as a form of regularization requiring context-dependent use.
Findings
Honest estimation can require up to 25% more data to match the performance of non-honest models.
Honesty reduces overfitting but may cause underfitting in rich data scenarios.
The choice of honesty should be guided by application goals and empirical testing.
Abstract
Causal forests estimate how treatment effects vary across individuals, guiding personalized interventions in areas like marketing, operations, and public policy. A standard modeling practice with this method is honest estimation: dividing the data into two samples, one to define subgroups and another to estimate treatment effects within them. This is intended to reduce overfitting and is the default in many software packages. But is it the right choice? In this paper, we show that honest estimation can reduce the accuracy of individual-level treatment effect estimates, especially when there are substantial differences in how individuals respond to treatment, and the data is rich enough to uncover those differences. The core issue is a classic bias-variance trade-off: honesty lowers the risk of overfitting but increases the risk of underfitting, because it limits the data available to…
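The sample-splitting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code and not a causal forest: it is a single-split toy on simulated data, assuming a randomized binary treatment and one covariate. The helper names (`leaf_effect`, `best_split`, `fit_stump`) and the simulated data are hypothetical and exist only for illustration. With `honest=True`, one half of the data chooses the subgroup boundary and the other half estimates the effect in each subgroup; with `honest=False`, the same data does both.

```python
import numpy as np

def leaf_effect(y, w):
    """Difference in mean outcomes between treated (w == 1) and control (w == 0)."""
    return y[w == 1].mean() - y[w == 0].mean()

def best_split(x, y, w, candidates):
    """Pick the threshold on x that maximizes the gap between the two leaf effects."""
    gaps = [
        abs(leaf_effect(y[x <= c], w[x <= c]) - leaf_effect(y[x > c], w[x > c]))
        for c in candidates
    ]
    return candidates[int(np.argmax(gaps))]

def fit_stump(x, y, w, honest, rng):
    """Fit a one-split 'tree'; returns (threshold, left effect, right effect)."""
    candidates = np.quantile(x, np.linspace(0.2, 0.8, 13))
    if honest:
        # Honest: disjoint halves for choosing the split and for estimation.
        idx = rng.permutation(len(x))
        split_idx, est_idx = idx[: len(x) // 2], idx[len(x) // 2:]
    else:
        # Non-honest (adaptive): the full sample is used for both steps.
        split_idx = est_idx = np.arange(len(x))
    c = best_split(x[split_idx], y[split_idx], w[split_idx], candidates)
    xe, ye, we = x[est_idx], y[est_idx], w[est_idx]
    left = leaf_effect(ye[xe <= c], we[xe <= c])
    right = leaf_effect(ye[xe > c], we[xe > c])
    return c, left, right

# Simulated data (assumption): true effect is 1 when x > 0.5 and 0 otherwise.
rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(size=n)
w = rng.integers(0, 2, size=n)
y = (x > 0.5) * w + rng.normal(size=n)

adaptive = fit_stump(x, y, w, honest=False, rng=rng)
honest = fit_stump(x, y, w, honest=True, rng=rng)
print("adaptive (threshold, left, right):", adaptive)
print("honest   (threshold, left, right):", honest)
```

The trade-off the abstract describes is visible in the structure of `fit_stump`: the honest variant estimates each subgroup's effect on data that played no role in choosing the split, but both steps see only half of the observations.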
Taxonomy
Topics: Advanced Causal Inference Techniques · Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference
