A Large Scale Benchmark for Individual Treatment Effect Prediction and Uplift Modeling

Eustache Diemert; Artem Betlei; Christophe Renaudin; Massih-Reza Amini; Théophane Gregoir; Thibaud Rahier

arXiv:2111.10106 · stat.ML · November 22, 2021

TL;DR

This paper introduces a large-scale, publicly available benchmark dataset with 13.9 million samples for advancing individual treatment effect prediction and uplift modeling, enabling more robust evaluation and comparison of causal inference methods.

Contribution

The paper releases a dataset roughly 210x larger than previously available ones for ITE prediction, formalizes the uplift modeling task, and provides baseline evaluations to facilitate future research.

Findings

01 The dataset contains 13.9 million samples collected from several randomized control trials (RCTs).

02 Baseline methods show significant performance differences.

03 Sanity checks validate the dataset's suitability for causal inference tasks.

Abstract

Individual Treatment Effect (ITE) prediction is an important area of research in machine learning which aims at explaining and estimating the causal impact of an action at the granular level. It represents a problem of growing interest in multiple sectors of application such as healthcare, online advertising or socioeconomics. To foster research on this topic we release a publicly available collection of 13.9 million samples collected from several randomized control trials, scaling up previously available datasets by a healthy 210x factor. We provide details on the data collection and perform sanity checks to validate the use of this data for causal inference tasks. First, we formalize the task of uplift modeling (UM) that can be performed with this data, along with the relevant evaluation metrics. Then, we propose synthetic response surfaces and heterogeneous treatment assignment…
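
To make the notion of uplift concrete, here is a minimal sketch that is not taken from the paper. It assumes treatment was assigned at random, as in a randomized control trial, so that within each group of similar units the difference between the mean outcome of treated units and the mean outcome of control units estimates the average effect of the treatment for that group. The function name `uplift_by_segment` and its arguments are hypothetical and chosen only for illustration; they are not part of the released benchmark code.

```python
import numpy as np


def uplift_by_segment(segment, treatment, outcome):
    """Estimate uplift per segment as mean(outcome | treated) - mean(outcome | control).

    Hypothetical helper for illustration only. The estimate is only
    meaningful when treatment assignment is randomized.
    """
    segment = np.asarray(segment)
    treatment = np.asarray(treatment).astype(bool)
    outcome = np.asarray(outcome, dtype=float)

    result = {}
    for s in np.unique(segment):
        in_segment = segment == s
        treated = outcome[in_segment & treatment]
        control = outcome[in_segment & ~treatment]
        # Skip segments that lack either a treated or a control unit.
        if treated.size == 0 or control.size == 0:
            continue
        result[s.item()] = treated.mean() - control.mean()
    return result


# Toy data: two segments, four units each, binary treatment and outcome.
segment = [0, 0, 0, 0, 1, 1, 1, 1]
treatment = [1, 1, 0, 0, 1, 1, 0, 0]
outcome = [1, 1, 0, 1, 0, 1, 0, 1]
estimates = uplift_by_segment(segment, treatment, outcome)
```

On this toy data, segment 0 has a positive estimated uplift and segment 1 has none. Uplift models generalize this idea by replacing the hand-picked segments with a learned function of the features.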

Code & Models

Repositories

criteo-research/large-scale-ite-um-benchmark (official)

Taxonomy

Topics: Advanced Causal Inference Techniques · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)