Double Machine Learning at Scale to Predict Causal Impact of Customer Actions
Sushant More, Priya Kotwal, Sujith Chappidi, Dinesh Mandalapu, Chris Khawand

TL;DR
This paper demonstrates a scalable implementation of Double Machine Learning (DML) that estimates the causal impact of customer actions across millions of customers, improving accuracy and computation time over traditional methods.
Contribution
It introduces a scalable, Spark-based causal ML library with flexible configuration, enabling fast, large-scale causal impact estimation for multiple customer actions.
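To make the idea of JSON-driven configuration concrete, here is a minimal sketch of how one configuration entry per customer action might be parsed and validated. The paper summary does not give the library's schema, so every key name (`action`, `outcome`, `n_folds`, `outcome_model`, `propensity_model`), the `DmlConfig` class, and the `load_configs` helper are hypothetical and are not the library's API.

```python
import json
from dataclasses import dataclass

# Hypothetical configuration: one JSON object per customer action.
# None of these keys come from the paper; they only illustrate the idea.
CONFIGS_JSON = """
[
  {"action": "example_action_a", "outcome": "spend_365d", "n_folds": 5,
   "outcome_model": "linear", "propensity_model": "logistic"},
  {"action": "example_action_b", "outcome": "spend_365d", "n_folds": 3,
   "outcome_model": "gbt", "propensity_model": "logistic"}
]
"""

@dataclass(frozen=True)
class DmlConfig:
    action: str
    outcome: str
    n_folds: int
    outcome_model: str
    propensity_model: str

def load_configs(text):
    """Parse a JSON array into validated per-action configurations."""
    configs = [DmlConfig(**entry) for entry in json.loads(text)]
    for cfg in configs:
        # Cross-fitting needs at least two folds to hold data out.
        if cfg.n_folds < 2:
            raise ValueError(f"{cfg.action}: n_folds must be >= 2")
    return configs

configs = load_configs(CONFIGS_JSON)
for cfg in configs:
    print(cfg.action, cfg.outcome_model, cfg.n_folds)
```

Keeping the model choice in data rather than code is one plausible way to add a new action without changing the estimation pipeline, which may be what lets this kind of design scale to hundreds of actions.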
Findings
2.2% gain in validation metrics over baseline methods
2.5X faster computation time
Scalable application across hundreds of actions and millions of customers
Abstract
Causal impact (CI) estimates for customer actions are broadly used across the industry to inform both short- and long-term investment decisions of various types. In this paper, we apply the double machine learning (DML) methodology to estimate CI values across hundreds of customer actions of business interest and hundreds of millions of customers. We operationalize DML through a causal ML library based on Spark with a flexible, JSON-driven model configuration approach to estimate CI at scale (i.e., across hundreds of actions and millions of customers). We outline the DML methodology and implementation, and the associated benefits over the traditional potential-outcomes-based CI model. We show population-level as well as customer-level CI values along with confidence intervals. The validation metrics show a 2.2% gain over the baseline methods and a 2.5X gain in computational time. Our contribution is to…
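As a rough illustration of the DML idea named in the abstract, the sketch below estimates a population-level effect and a confidence interval with a cross-fitted partially linear model on simulated data. It is not the paper's Spark implementation: the simulated data, the function names `fit_linear` and `dml_plr`, and the use of plain least squares for both nuisance models (outcome and treatment) are assumptions made to keep the example small and self-contained.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns a predict function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef

def dml_plr(X, d, y, n_folds=5, seed=0):
    """Cross-fitted DML for the partially linear model y = theta*d + g(X) + e."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    res_y = np.empty(len(y))
    res_d = np.empty(len(y))
    for k, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        # Nuisance models are fit on the other folds and applied out of fold.
        res_y[test_idx] = y[test_idx] - fit_linear(X[train_idx], y[train_idx])(X[test_idx])
        res_d[test_idx] = d[test_idx] - fit_linear(X[train_idx], d[train_idx])(X[test_idx])
    # Residual-on-residual regression gives the effect estimate.
    theta = np.sum(res_d * res_y) / np.sum(res_d ** 2)
    # Standard error from the orthogonal score, then a 95% normal interval.
    psi = res_d * (res_y - theta * res_d)
    se = np.sqrt(np.mean(psi ** 2) / np.mean(res_d ** 2) ** 2 / len(y))
    return theta, (theta - 1.96 * se, theta + 1.96 * se)

# Simulated customers: 3 features, a binary action, true effect of 2.0.
rng = np.random.default_rng(42)
n = 5000
X = rng.normal(size=(n, 3))
d = (X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n) > 0).astype(float)
y = 2.0 * d + X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

theta, ci = dml_plr(X, d, y)
print(f"estimated effect: {theta:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

In practice the two nuisance models would usually be flexible ML learners rather than linear fits; the cross-fitting loop and the residual-on-residual step stay the same.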
Taxonomy
MethodsLib
