The Adaptive Doubly Robust Estimator for Policy Evaluation in Adaptive Experiments and a Paradox Concerning Logging Policy

Masahiro Kato; Shota Yasui; Kenichiro McAlinn

arXiv:2010.03792 · cs.LG · June 22, 2021

TL;DR

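This paper introduces an adaptive doubly robust estimator for policy evaluation from the dependent samples produced by adaptive experiments. It also reports an empirical paradox: the proposed estimator, which estimates the logging policy, tends to outperform estimators that use the true logging policy, and the asymptotic-efficiency argument that explains this effect for i.i.d. samples does not carry over to dependent samples.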

Contribution

The paper proposes a DR estimator for dependent samples from adaptive experiments and introduces adaptive-fitting, a variant of sample-splitting, to obtain asymptotic normality even when the nuisance estimators are non-Donsker.

Findings

01 The proposed estimator tends to perform better than estimators that use the true logging policy.

02 Simulation studies confirm this paradoxical performance advantage.

03 Traditional explanations based on asymptotic efficiency do not account for the phenomenon when samples are dependent.

Abstract

The doubly robust (DR) estimator, which relies on two nuisance parameters, the conditional mean outcome and the logging policy (the probability of choosing an action), is crucial in causal inference. This paper proposes a DR estimator for dependent samples obtained from adaptive experiments. To obtain an asymptotically normal semiparametric estimator from dependent samples with non-Donsker nuisance estimators, we propose adaptive-fitting as a variant of sample-splitting. We also report an empirical paradox: our proposed DR estimator tends to perform better than other estimators that use the true logging policy. While a similar phenomenon is known for estimators with i.i.d. samples, traditional explanations based on asymptotic efficiency cannot elucidate our case with dependent samples. We confirm this hypothesis through simulation studies.
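The adaptive-fitting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a context-free setting with a small number of discrete actions, and it uses running per-action means for the conditional mean outcome and add-one smoothed action frequencies for the logging policy. The function name `adaptive_dr_value` and the `clip` parameter are hypothetical. The one property it tries to show is that the nuisance estimates used to score round t are built only from rounds before t, so no round is scored with an estimate that has already seen it.

```python
import numpy as np


def adaptive_dr_value(actions, rewards, eval_policy, n_actions, clip=0.05):
    """DR-style value estimate with nuisances fitted on past rounds only.

    Illustrative assumptions (not from the paper): no contexts, running
    means for the outcome model, smoothed frequencies for the logging policy.
    """
    actions = np.asarray(actions)
    rewards = np.asarray(rewards, dtype=float)
    eval_policy = np.asarray(eval_policy, dtype=float)

    counts = np.zeros(n_actions)  # times each action was chosen so far
    sums = np.zeros(n_actions)    # cumulative reward per action so far
    scores = np.empty(len(actions))

    for t, (a, y) in enumerate(zip(actions, rewards)):
        # Outcome model from rounds < t: per-action running mean (0 if unseen).
        f_hat = np.divide(sums, counts, out=np.zeros(n_actions), where=counts > 0)
        # Logging-policy estimate from rounds < t: add-one smoothed frequencies,
        # clipped from below so the importance weight stays bounded.
        g_hat = np.maximum((counts + 1.0) / (t + n_actions), clip)
        # DR score: importance-weighted residual plus the model-based value.
        scores[t] = eval_policy[a] * (y - f_hat[a]) / g_hat[a] + eval_policy @ f_hat
        # Only now does round t enter the nuisance estimates.
        counts[a] += 1
        sums[a] += y

    return scores.mean()
```

For example, `adaptive_dr_value([0, 1, 0, 1], [1, 0, 1, 0], [1.0, 0.0], 2)` evaluates a policy that always plays action 0 on four logged rounds. Replacing `g_hat` with the known action probabilities of the experiment would give the "true logging policy" variant that the paradox compares against.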

Taxonomy

Topics: Statistical Methods and Inference · Advanced Statistical Process Monitoring · Statistical Methods in Clinical Trials