Post Launch Evaluation of Policies in a High-Dimensional Setting
Shima Nassiri, Mohsen Bayati, Joe Cooprider

TL;DR
This paper investigates using synthetic control methods combined with machine learning to evaluate policies in large-scale settings, addressing interpolation bias and machine learning bias to improve accuracy in high-dimensional data.
Contribution
It introduces a two-phase approach with nearest neighbor matching and supervised learning, along with de-biasing techniques, for more accurate policy evaluation in high-dimensional, large-scale experiments.
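The two-phase idea can be illustrated with a minimal sketch. This summary does not give the paper's actual algorithm, so everything below is an assumption made for illustration: the function names (`nearest_donors`, `ridge_fit`, `estimate_effect`) are hypothetical, ridge regression stands in for the unspecified supervised learner, and the bias correction (subtracting the mean residual on a held-out slice of the pre-treatment period) is a simple stand-in rather than the authors' de-biasing technique.

```python
import numpy as np

def nearest_donors(treated_pre, control_pre, k):
    """Phase 1 (assumed form): pick the k control units whose pre-period
    outcome trajectories are closest to the treated unit in Euclidean distance."""
    dists = np.linalg.norm(control_pre - treated_pre, axis=1)
    return np.argsort(dists)[:k]

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def estimate_effect(treated, controls, t0, k=10, alpha=1.0, holdout=5):
    """Estimate the average post-period effect for one treated unit.

    treated:  shape (T,), outcomes of the treated unit
    controls: shape (N, T), outcomes of unaffected units
    t0:       index of the first treated period
    holdout:  number of final pre-periods reserved for the bias estimate
    """
    fit_end = t0 - holdout
    # Phase 1: match on the fitting window only, so the holdout stays unseen.
    donors = nearest_donors(treated[:fit_end], controls[:, :fit_end], k)
    # Phase 2: time periods are samples, donor outcomes are features.
    X = controls[donors].T
    w, b = ridge_fit(X[:fit_end], treated[:fit_end], alpha)
    pred = X @ w + b
    # Assumed de-biasing step: mean residual on the held-out pre-periods,
    # where the true effect is zero by construction.
    bias = np.mean(treated[fit_end:t0] - pred[fit_end:t0])
    counterfactual = pred[t0:] + bias
    return np.mean(treated[t0:] - counterfactual)
```

Calling `estimate_effect(treated, controls, t0)` returns a single number: the average gap between the observed post-treatment outcomes and the bias-corrected counterfactual. At the scale the paper targets, the brute-force distance computation in `nearest_donors` would likely need to be replaced by an approximate nearest-neighbor index.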
Findings
Improved counterfactual estimation accuracy in large-scale experiments.
Identification of machine learning bias affecting treatment effect estimates.
Effective de-biasing techniques for high-dimensional policy evaluation.
Abstract
A/B tests, also known as randomized controlled experiments (RCTs), are the gold standard for evaluating the impact of new policies, products, or decisions. However, these tests can be costly in time and resources, and they can expose users, customers, or other test subjects (units) to inferior options. This paper explores practical considerations in applying methodologies inspired by "synthetic control" as an alternative to traditional A/B testing in settings with up to hundreds of millions of units, a scale common in modern applications such as e-commerce and ride-sharing platforms. This method is particularly valuable when the treatment affects only a subset of units, leaving many units unaffected. In these scenarios, synthetic control methods leverage data from unaffected units to estimate counterfactual outcomes for…
Taxonomy
Topics: Evaluation and Performance Assessment
