Improving uplift model evaluation on RCT data
Björn Bokelmann, Stefan Lessmann

TL;DR
This paper analyzes the high variance of uplift model evaluation metrics on RCT data and proposes variance reduction methods based on outcome adjustment, demonstrating their effectiveness through theoretical and empirical results.
Contribution
It introduces variance reduction techniques for uplift evaluation metrics and provides theoretical and empirical evidence of their benefits on RCT data.
Findings
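Variance reduction through outcome adjustment makes uplift evaluation more stable.
The adjusted metrics give more reliable signals than traditional metrics such as the Qini curve and the transformed outcome mean squared error when the data are noisy.
The benefits are validated empirically on real-world datasets.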
Abstract
Estimating treatment effects is one of the most challenging and important tasks of data analysts. In many applications, like online marketing and personalized medicine, treatment needs to be allocated to the individuals where it yields a high positive treatment effect. Uplift models help select the right individuals for treatment and maximize the overall treatment effect (uplift). A major challenge in uplift modeling concerns model evaluation. Previous literature suggests methods like the Qini curve and the transformed outcome mean squared error. However, these metrics suffer from variance: their evaluations are strongly affected by random noise in the data, which renders their signals, to a certain degree, arbitrary. We theoretically analyze the variance of uplift evaluation metrics and derive possible methods of variance reduction, which are based on statistical adjustment of the…
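To make the idea of outcome adjustment concrete, the following is a minimal sketch on synthetic data, not the authors' implementation. It assumes a simulated RCT with known treatment probability 0.5, a linear baseline outcome, and a simple linear fit of the outcome on the covariate (estimated on a separate sample) as the adjustment. All function names, the data-generating process, and the two candidate models are hypothetical choices made for illustration. The sketch compares how much the transformed outcome MSE gap between a good and a bad uplift model fluctuates across repeated samples, with and without the adjustment.

```python
import numpy as np

P_TREAT = 0.5  # assumed known treatment probability of the simulated RCT


def simulate_rct(n, rng):
    """Draw one synthetic RCT sample (hypothetical data-generating process)."""
    x = rng.normal(size=n)                       # single covariate
    w = rng.binomial(1, P_TREAT, size=n)         # randomized treatment flag
    tau = 0.5 + 0.5 * x                          # true individual treatment effect
    y = 5.0 * x + w * tau + rng.normal(size=n)   # strong baseline, noisy outcome
    return x, w, y, tau


def transformed_outcome(y, w, p=P_TREAT):
    """Transformed outcome; its conditional mean given x equals the uplift in an RCT."""
    return y * (w - p) / (p * (1.0 - p))


def to_mse(y_star, tau_hat):
    """Transformed outcome mean squared error of uplift predictions."""
    return np.mean((y_star - tau_hat) ** 2)


def mse_gap_samples(n_reps=200, n=2000, seed=0):
    """Collect MSE(bad) - MSE(good) over repeated samples, raw vs. adjusted."""
    rng = np.random.default_rng(seed)
    raw_gaps, adj_gaps = [], []
    for _ in range(n_reps):
        # Fit the adjustment on an independent sample so it does not
        # depend on the evaluation data.
        x_fit, _, y_fit, _ = simulate_rct(n, rng)
        coef = np.polyfit(x_fit, y_fit, deg=1)

        x, w, y, tau = simulate_rct(n, rng)
        good, bad = tau, 1.0 - tau               # oracle model vs. a wrong model
        y_adj = y - np.polyval(coef, x)          # subtract the predicted outcome

        for y_used, gaps in ((y, raw_gaps), (y_adj, adj_gaps)):
            y_star = transformed_outcome(y_used, w)
            gaps.append(to_mse(y_star, bad) - to_mse(y_star, good))
    return np.array(raw_gaps), np.array(adj_gaps)


if __name__ == "__main__":
    raw, adj = mse_gap_samples()
    print("raw gap:      mean %.3f, std %.3f" % (raw.mean(), raw.std()))
    print("adjusted gap: mean %.3f, std %.3f" % (adj.mean(), adj.std()))
```

Because treatment is randomized, subtracting any function of the covariates from the outcome leaves the expected transformed outcome unchanged, so both versions target the same MSE gap. In this synthetic setup the adjusted gap should fluctuate far less, since the adjustment removes the large baseline component of the outcome that otherwise dominates the noise.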
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Consumer Market Behavior and Pricing
