A churn prediction dataset from the telecom sector: a new benchmark for uplift modeling
Théo Verhelst, Denis Mercier, Jeevan Shrestha, Gianluca Bontempi

TL;DR
This paper introduces a new publicly available dataset from Orange Belgium for uplift modeling in churn prediction, providing a challenging benchmark to evaluate causal impact estimation methods in telecom customer retention.
Contribution
It presents the first public dataset for uplift modeling in churn prediction, enabling standardized evaluation and comparison of causal inference techniques in telecom.
Findings
First public dataset for uplift modeling in churn prediction
Dataset's unique characteristics increase challenge level
Enables standardized evaluation of uplift modeling methods
Abstract
Uplift modeling, also known as individual treatment effect (ITE) estimation, is an important approach for data-driven decision making that aims to identify the causal impact of an intervention on individuals. This paper introduces a new benchmark dataset for uplift modeling focused on churn prediction, drawn from Orange Belgium, a Belgian telecom company. Churn, in this context, refers to customers terminating their subscription to the telecom service. This is the first publicly available dataset that makes it possible to evaluate the efficiency of uplift modeling on the churn prediction problem. Moreover, its unique characteristics make it more challenging than the few other public uplift datasets.
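To make the notion of uplift concrete, here is a minimal sketch that estimates it per customer segment as the difference in churn rate between treated and untreated customers. It is an illustration only, not the paper's method: it assumes the retention action was assigned at random, and the function name `uplift_by_segment`, the record layout, and the toy records are all hypothetical and unrelated to the Orange Belgium data.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment as P(churn | treated) - P(churn | control).

    `records` is an iterable of (segment, treated, churned) tuples, where
    `treated` and `churned` are 0 or 1. This layout is hypothetical.
    A negative value means the treatment is associated with less churn.
    """
    # counts[segment][treated] = [number of customers, number who churned]
    counts = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
    for segment, treated, churned in records:
        cell = counts[segment][treated]
        cell[0] += 1
        cell[1] += churned

    uplift = {}
    for segment, cells in counts.items():
        (n_control, churn_control), (n_treated, churn_treated) = cells[0], cells[1]
        if n_control == 0 or n_treated == 0:
            continue  # no comparison possible without both groups
        uplift[segment] = churn_treated / n_treated - churn_control / n_control
    return uplift


# Made-up toy records: (segment, treated, churned)
toy = [
    ("prepaid", 1, 0), ("prepaid", 1, 0), ("prepaid", 1, 1), ("prepaid", 1, 0),
    ("prepaid", 0, 1), ("prepaid", 0, 1), ("prepaid", 0, 0), ("prepaid", 0, 0),
    ("postpaid", 1, 0), ("postpaid", 1, 1),
    ("postpaid", 0, 0), ("postpaid", 0, 1),
]
result = uplift_by_segment(toy)
```

On these toy records the treated "prepaid" customers churn less often than the untreated ones, so their estimated uplift is negative, while "postpaid" shows no difference. Uplift models aim to produce this kind of estimate at the level of individual customers rather than coarse segments, which is why benchmark datasets are needed to compare them.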
Taxonomy
Topics: Firm Innovation and Growth · Customer churn and segmentation
