A Loss-Function for Causal Machine-Learning
I-Sheng Yang

TL;DR
This paper introduces a universally applicable loss function for causal machine learning. It lets models such as neural networks be trained directly to predict true causal effects, even though the data contain no point-wise true values for those effects.
Contribution
It proposes a novel mean-square-error based loss function for causal inference, allowing direct gradient descent training without meta-learner strategies.
Findings
The loss function is applicable to various models including deep neural networks.
Gradient descent can be performed directly on the proposed loss function.
The method provides a standard for evaluating causal prediction models.
Abstract
Causal machine-learning is about predicting the net-effect (true-lift) of treatments. Given data from a treatment group and a control group, it is similar to a standard supervised-learning problem. Unfortunately, there is no similarly well-defined loss function, because the data lack point-wise true values. Many advances in modern machine-learning are not directly applicable due to the absence of such a loss function. We propose a novel method to define a loss function in this context, which is equal to mean-square-error (MSE) in a standard regression problem. Our loss function is universally applicable, thus providing a general standard to evaluate the quality of any model/strategy that predicts the true-lift. We demonstrate that despite its novel definition, one can still perform gradient descent directly on this loss function to find the best fit. This leads to a new way to…
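To make the "no point-wise true values" problem concrete, the following is a minimal sketch on synthetic data. It is not the paper's loss function, whose definition the abstract does not spell out. The binned comparison below is a simple stand-in chosen for illustration, and every name in it (`simulate`, `binned_lift_mse`, `n_bins`, the assumed effect `0.5 * x`) is hypothetical.

```python
import numpy as np

def simulate(n=20000, seed=0):
    """Synthetic randomized experiment (illustrative assumption).

    The true lift is known here only because we generate it ourselves;
    in real data each unit is observed under treatment OR control, never both.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)                 # single feature
    treated = rng.integers(0, 2, size=n).astype(bool)  # random assignment
    true_lift = 0.5 * x                                # assumed ground-truth effect
    baseline = 1.0 + x ** 2                            # outcome without treatment
    y = baseline + treated * true_lift + rng.normal(0.0, 0.1, size=n)
    return x, treated, y, true_lift

def binned_lift_mse(pred, treated, y, n_bins=10):
    """Stand-in lift loss (NOT the paper's definition).

    Sort units by predicted lift, cut them into equal-size bins, and compare
    the mean predicted lift in each bin with the observed difference of
    treatment and control outcome means in that bin.
    """
    order = np.argsort(pred)
    sq_errors = []
    for idx in np.array_split(order, n_bins):
        t = treated[idx]
        if t.all() or not t.any():
            continue  # a bin needs both groups to estimate a lift
        observed = y[idx][t].mean() - y[idx][~t].mean()
        sq_errors.append((pred[idx].mean() - observed) ** 2)
    return float(np.mean(sq_errors))

x, treated, y, true_lift = simulate()
good = binned_lift_mse(0.5 * x, treated, y)    # prediction matching the assumed lift
bad = binned_lift_mse(-0.5 * x, treated, y)    # prediction with the wrong sign
print(f"good model: {good:.5f}, bad model: {bad:.5f}")
```

The stand-in separates a correct prediction from a wrong-signed one, but a calibration-style check like this has a known weakness: a constant prediction equal to the average lift also scores close to zero, because every bin then looks like the whole population. It also depends on a sort-and-bin step that is not differentiable, which hints at why a loss that supports direct gradient descent, as the abstract claims, is not trivial to define.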
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Bayesian Modeling and Causal Inference
