Off-Policy Evaluation Under Nonignorable Missing Data

Han Wang; Yang Xu; Wenbin Lu; Rui Song

arXiv:2507.06961·stat.ML·July 10, 2025

Open Access

TL;DR

This paper studies how missing data affects off-policy evaluation (OPE) under monotone missingness. It shows that value estimates remain unbiased under ignorable missingness but can be biased under nonignorable missingness, and it proposes an inverse probability weighted value estimator with accompanying statistical inference.

Contribution

It provides a theoretical analysis of how missing data affects OPE and introduces an inverse probability weighted value estimator intended to retain consistent value estimation when missingness is nonignorable.

Findings

01

Unbiased OPE estimates under ignorable missingness.

02

OPE estimates can be biased under nonignorable (informative) missingness.

03

Proposed estimator improves reliability of value inference.

Abstract

Off-Policy Evaluation (OPE) aims to estimate the value of a target policy using offline data collected from potentially different policies. In real-world applications, however, logged data often suffers from missingness. While OPE has been extensively studied in the literature, how missing data affects OPE results is not yet well understood theoretically. In this paper, we investigate OPE in the presence of monotone missingness and theoretically demonstrate that the value estimates remain unbiased under ignorable missingness but can be biased under nonignorable (informative) missingness. To retain the consistency of value estimation, we propose an inverse probability weighted value estimator and conduct statistical inference to quantify the uncertainty of the estimates. Through a series of numerical experiments, we empirically demonstrate that our proposed estimator yields a more…
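The contrast the abstract draws can be illustrated with a toy simulation. The sketch below is not the paper's estimator: it assumes a one-step (bandit) setting rather than sequential data with monotone dropout, a known uniform behavior policy, and known observation probabilities, whereas in practice those probabilities would have to be estimated under additional assumptions. The function names (`simulate`, `complete_case_value`, `ipw_value`) and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n, seed=0):
    """Toy logged data in which reward missingness depends on the reward itself."""
    rng = np.random.default_rng(seed)
    # Assumed behavior policy: pick action 0 or 1 uniformly at random.
    actions = rng.integers(0, 2, size=n)
    # Assumed reward model: Bernoulli(0.7) for action 1, Bernoulli(0.3) for action 0.
    rewards = rng.binomial(1, np.where(actions == 1, 0.7, 0.3)).astype(float)
    # Nonignorable missingness: high rewards are more likely to be recorded.
    obs_prob = np.where(rewards == 1.0, 0.9, 0.5)
    observed = rng.random(n) < obs_prob
    # Assumed target policy: action 1 with probability 0.8, action 0 otherwise.
    target_prob = np.where(actions == 1, 0.8, 0.2)
    ratio = target_prob / 0.5  # importance ratio: target / behavior
    return rewards, observed, obs_prob, ratio


def complete_case_value(rewards, observed, ratio):
    """Importance-sampling estimate that simply drops the missing records."""
    return float(np.mean(ratio[observed] * rewards[observed]))


def ipw_value(rewards, observed, obs_prob, ratio):
    """Importance sampling with each observed record reweighted by 1 / P(observed).

    Missing records receive weight zero, so obs_prob is only used where the
    reward was actually recorded.
    """
    weights = observed / obs_prob
    return float(np.mean(weights * ratio * rewards))


rewards, observed, obs_prob, ratio = simulate(200_000, seed=0)
true_value = 0.8 * 0.7 + 0.2 * 0.3  # value of the target policy in this toy model
print("true value:   ", true_value)
print("complete-case:", complete_case_value(rewards, observed, ratio))
print("IPW:          ", ipw_value(rewards, observed, obs_prob, ratio))
```

In this toy model the target policy's true value is 0.62. The complete-case estimate converges to roughly 0.558 / 0.7 ≈ 0.80 because low rewards are dropped more often, while the reweighted estimate targets 0.62.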

Taxonomy

Topics: Advanced Causal Inference Techniques · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research