Model-Free and Model-Based Policy Evaluation when Causality is Uncertain

David Bruns-Smith

arXiv:2204.00956·cs.LG·April 5, 2022

TL;DR

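This paper studies off-policy evaluation in the presence of unobserved confounders. It proposes finite-horizon worst-case bounds for confounders drawn i.i.d. each period, shows that a model-based approach with robust MDPs gives sharper lower bounds, and finds that persistent confounders make the problem far harder, with existing techniques producing extremely conservative bounds.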

Contribution

It introduces finite-horizon worst-case bounds for off-policy evaluation under unobserved confounders that are drawn i.i.d. each period, and demonstrates that a model-based approach with robust MDPs can yield sharper lower bounds by exploiting domain knowledge about the dynamics.

Findings

01

Robust bounds depend on confounder persistence.

02

A model-based approach with robust MDPs gives sharper lower bounds when domain knowledge about the dynamics is available.

03

Persistent confounders make off-policy evaluation significantly more challenging.

Abstract

When decision-makers can directly intervene, policy evaluation algorithms give valid causal estimates. In off-policy evaluation (OPE), there may exist unobserved variables that both impact the dynamics and are used by the unknown behavior policy. These "confounders" will introduce spurious correlations and naive estimates for a new policy will be biased. We develop worst-case bounds to assess sensitivity to these unobserved confounders in finite horizons when confounders are drawn iid each period. We demonstrate that a model-based approach with robust MDPs gives sharper lower bounds by exploiting domain knowledge about the dynamics. Finally, we show that when unobserved confounders are persistent over time, OPE is far more difficult and existing techniques produce extremely conservative bounds.
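To make the bias described in the abstract concrete, here is a minimal sketch of a one-step confounded problem. It is an illustration only and does not come from the paper: the binary confounder, the behavior-policy probabilities, the reward function, and every name in the code are hypothetical assumptions. The action has no causal effect on the reward, yet the naive estimate computed from logged data suggests otherwise.

```python
# Hypothetical one-step example (not taken from the paper).
# U is an unobserved binary confounder, A is a binary action, R is the reward.

P_U = {0: 0.5, 1: 0.5}           # assumed distribution of the confounder
P_A1_GIVEN_U = {0: 0.1, 1: 0.9}  # assumed behavior policy P(A=1 | U)


def reward(u, a):
    # Assumption: the reward depends only on the confounder,
    # so the action has no causal effect at all.
    return float(u)


def joint(u, a):
    # Observational probability P(U=u, A=a) under the behavior policy.
    p_a = P_A1_GIVEN_U[u] if a == 1 else 1.0 - P_A1_GIVEN_U[u]
    return P_U[u] * p_a


def naive_value(a):
    # E[R | A=a]: what an analyst who ignores U would estimate from logged data.
    numerator = sum(joint(u, a) * reward(u, a) for u in P_U)
    denominator = sum(joint(u, a) for u in P_U)
    return numerator / denominator


def true_value(a):
    # E[R | do(A=a)]: the value of always playing a, averaging over U.
    return sum(P_U[u] * reward(u, a) for u in P_U)


naive = naive_value(1)
truth = true_value(1)
print(f"naive estimate: {naive:.2f}, interventional value: {truth:.2f}")
```

Under these assumed numbers the naive estimate for always choosing action 1 is about 0.9 while the interventional value is 0.5. The gap comes entirely from the behavior policy choosing action 1 more often when the confounder is favorable, which is the kind of spurious correlation that sensitivity bounds are meant to account for.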

Code & Models

Repositories

hetankevin/mdpmix

Videos

Model-Free and Model-Based Policy Evaluation when Causality is Uncertain · slideslive

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Advanced Causal Inference Techniques · Economic Policies and Impacts