Where to Experiment? Site Selection Under Distribution Shift via Optimal Transport and Wasserstein DRO
Adam Bouyamourn

TL;DR
This paper introduces a novel approach for selecting experimental sites under distribution shift using optimal transport and Wasserstein DRO, aiming to minimize estimation errors and improve robustness in experimental design.
Contribution
It formulates site selection as an optimal transport problem, develops theoretical bounds for estimation errors, and proposes a Wasserstein DRO-based robust site selection method with a data-driven uncertainty radius.
Findings
The proposed methods outperform random and stratified sampling when covariates are prognostic.
The approach is effective for moderate-to-large problems and when covariates inform treatment effects.
Simulation and real data reanalysis validate the robustness and efficiency of the proposed methods.
Abstract
How should researchers select experimental sites when the deployment population differs from observed data? I formulate the problem of experimental site selection as an optimal transport problem, developing methods to minimize downstream estimation error by choosing sites that minimize the Wasserstein distance between population and sample covariate distributions. I develop new theoretical upper bounds on PATE and CATE estimation errors, and show that these different objectives lead to different site selection strategies. I extend this approach by using Wasserstein Distributionally Robust Optimization to develop a site selection procedure robust to adversarial perturbations of covariate information: a specific model of distribution shift. I also propose a novel data-driven procedure for selecting the uncertainty radius of the Wasserstein DRO problem, which allows the user to benchmark…
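The core idea in the abstract, choosing sites so that the sample covariate distribution is close to the population distribution in Wasserstein distance, can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a single scalar covariate per site, treats the full set of candidate sites as the population, and searches all subsets by brute force. The function names `wasserstein_1d` and `select_sites` are hypothetical and introduced only for illustration.

```python
from itertools import combinations

import numpy as np


def wasserstein_1d(u, v):
    """Exact 1-Wasserstein distance between two 1D empirical distributions.

    Computed as the integral of |F_u - F_v| over the pooled support.
    """
    u = np.sort(np.asarray(u, dtype=float))
    v = np.sort(np.asarray(v, dtype=float))
    grid = np.sort(np.concatenate([u, v]))
    deltas = np.diff(grid)
    # Empirical CDFs evaluated at the left endpoint of each interval.
    cdf_u = np.searchsorted(u, grid[:-1], side="right") / u.size
    cdf_v = np.searchsorted(v, grid[:-1], side="right") / v.size
    return float(np.sum(np.abs(cdf_u - cdf_v) * deltas))


def select_sites(site_covariates, k):
    """Pick k sites whose covariates are closest to the population in W1.

    Illustrative assumption: one scalar covariate per site, and the
    population is the full list of candidate sites. Brute force is only
    feasible for small problems.
    """
    x = np.asarray(site_covariates, dtype=float)
    best_subset, best_dist = None, np.inf
    for subset in combinations(range(x.size), k):
        d = wasserstein_1d(x[list(subset)], x)
        if d < best_dist:
            best_subset, best_dist = subset, d
    return best_subset, best_dist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    covariates = rng.normal(size=12)  # synthetic site-level covariate
    chosen, dist = select_sites(covariates, k=4)
    print("selected sites:", chosen, "W1 to population:", round(dist, 4))
```

A subset chosen this way tends to spread across the covariate range rather than cluster, which is the intuition behind matching distributions instead of sampling sites at random. The robust (Wasserstein DRO) variant described in the abstract is not covered by this sketch.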
Taxonomy
Topics: Advanced Causal Inference Techniques · Health Systems, Economic Evaluations, Quality of Life · Risk and Portfolio Optimization
