Doubly-Robust Estimation of Counterfactual Policy Mean Embeddings

Houssam Zenati; Bariscan Bozkurt; Arthur Gretton

arXiv:2506.02793·stat.ML·October 29, 2025

TL;DR

This paper introduces Counterfactual Policy Mean Embedding (CPME), a nonparametric kernel framework for estimating, and testing hypotheses about, the entire distribution of outcomes under counterfactual policies. Alongside a plug-in estimator, it gives a doubly robust estimator with improved convergence rates.

Contribution

The paper proposes the CPME framework with both plug-in and doubly robust estimators, and develops a doubly robust kernel test statistic for hypothesis testing in distributional off-policy evaluation.

Findings

01. Doubly robust estimator improves convergence rates.

02. Kernel test statistic achieves asymptotic normality.

03. Numerical simulations show CPME outperforms existing methods.

Abstract

Estimating the distribution of outcomes under counterfactual policies is critical for decision-making in domains such as recommendation, advertising, and healthcare. We propose and analyze a novel framework, Counterfactual Policy Mean Embedding (CPME), that represents the entire counterfactual outcome distribution in a reproducing kernel Hilbert space (RKHS), enabling flexible and nonparametric distributional off-policy evaluation. We introduce both a plug-in estimator and a doubly robust estimator; the latter enjoys improved convergence rates by correcting for bias in both the outcome embedding and propensity models. Building on this, we develop a doubly robust kernel test statistic for hypothesis testing, which achieves asymptotic normality and thus enables computationally efficient testing and straightforward construction of confidence intervals. Our framework also supports sampling…
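
To make the difference between the plug-in and doubly robust estimators concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes discrete actions, a known logging policy, a Gaussian kernel on contexts and outcomes, a delta kernel on actions, and a kernel ridge regression for the outcome embedding. The function names (`rbf`, `cpme_weights`, `mmd2`), the regularisation `lam`, and the bandwidth `gamma` are all hypothetical choices made for illustration. Both estimators are represented as weight vectors `w`, so that the estimated embedding is `sum_j w[j] * k_Y(y_j, .)`.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def cpme_weights(X, A, pi_target, pi_logging, lam=1e-2, gamma=1.0):
    """Weights over logged outcomes for plug-in and doubly robust embeddings.

    X: (n, d) contexts; A: (n,) integer actions;
    pi_target, pi_logging: (n, n_actions) action probabilities per context.
    Illustrative sketch only (assumed kernels and ridge regression).
    """
    n, n_actions = pi_target.shape
    Kx = rbf(X, X, gamma)
    K = Kx * (A[:, None] == A[None, :])      # product kernel on (action, context)
    R = np.linalg.inv(K + n * lam * np.eye(n))

    # Plug-in: average the fitted outcome embedding over actions drawn
    # from the target policy at each logged context.
    w_plugin = np.zeros(n)
    for a in range(n_actions):
        k_a = Kx * (A[:, None] == a)         # k((a_j, x_j), (a, x_i)), shape (n, n)
        w_plugin += (R @ k_a) @ pi_target[:, a] / n

    # Doubly robust: add the importance-weighted residual
    # ratio_i * (phi(y_i) - fitted embedding at (a_i, x_i)).
    idx = np.arange(n)
    ratio = pi_target[idx, A] / pi_logging[idx, A]
    w_dr = w_plugin + ratio / n - (R @ K) @ ratio / n
    return w_plugin, w_dr

def mmd2(w1, w2, Y, gamma=1.0):
    """Squared RKHS distance between two embeddings given as weight vectors."""
    d = w1 - w2
    return float(d @ rbf(Y, Y, gamma) @ d)

# Synthetic logged data: uniform logging policy over two actions.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
pi_logging = np.full((n, 2), 0.5)
A = rng.integers(0, 2, size=n)
Y = X[:, :1] + A[:, None] + 0.1 * rng.normal(size=(n, 1))

# Two hypothetical target policies that favour different actions.
pi_a = np.tile([0.9, 0.1], (n, 1))
pi_b = np.tile([0.1, 0.9], (n, 1))
_, w_a = cpme_weights(X, A, pi_a, pi_logging)
_, w_b = cpme_weights(X, A, pi_b, pi_logging)
print("squared distance between embeddings:", mmd2(w_a, w_b, Y))
```

The squared distance printed at the end is only a descriptive quantity in this sketch. The paper's test statistic and its asymptotic normality result are not reproduced here.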

Videos

Doubly-Robust Estimation of Counterfactual Policy Mean Embeddings · SlidesLive

Taxonomy

Topics: Probability and Risk Models · Economic Policies and Impacts · Risk and Portfolio Optimization