Optimal Policy Adaptation under Covariate Shift
Xueqing Liu, Qinwei Yang, Zhaoqing Tian, Ruocheng Guo, and Peng Wu

TL;DR
This paper develops a causal inference-based framework for learning optimal policies under covariate shift, providing efficient estimators and theoretical guarantees, with experiments showing improved reward estimation and policy performance.
Contribution
The paper introduces a semiparametric approach to policy learning under covariate shift, including efficient estimators and bounds on bias and generalization error, advancing transfer learning in policy optimization.
Findings
The proposed estimator achieves higher reward estimation accuracy.
The learned policy closely approximates the optimal policy.
The approach provides theoretical guarantees on bias and generalization error.
Abstract
Transfer learning of prediction models has been extensively studied, while the corresponding policy learning approaches are rarely discussed. In this paper, we propose principled approaches for learning the optimal policy in the target domain by leveraging two datasets: one with full information from the source domain and the other from the target domain with only covariates. First, under the setting of covariate shift, we formulate the problem from a perspective of causality and present the identifiability assumptions for the reward induced by a given policy. Then, we derive the efficient influence function and the semiparametric efficiency bound for the reward. Based on this, we construct a doubly robust and semiparametric efficient estimator for the reward and then learn the optimal policy by optimizing the estimated reward. Moreover, we theoretically analyze the bias and the…
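To make the idea of a doubly robust reward estimate under covariate shift concrete, here is a minimal sketch in Python. It is a generic textbook-style form for a binary action, not necessarily the estimator derived in the paper: a direct (outcome-model) term averaged over target covariates, plus a correction term averaged over source data and weighted by a density ratio and an inverse propensity. The function name `dr_policy_value` and the arguments `mu0`, `mu1`, `propensity`, and `density_ratio` are hypothetical names introduced for illustration; in practice these nuisance functions would be fitted models, whereas here they are passed in as plain callables.

```python
import numpy as np


def dr_policy_value(x_src, a_src, y_src, x_tgt, policy,
                    mu0, mu1, propensity, density_ratio):
    """Generic doubly robust estimate of a policy's reward in the target domain.

    Assumptions (illustrative, not taken from the paper):
      - binary action a in {0, 1};
      - policy(x) returns the action chosen for each row of x;
      - mu0(x), mu1(x) are outcome-model predictions of E[Y | X=x, A=a];
      - propensity(x) is P(A=1 | X=x) in the source domain;
      - density_ratio(x) is p_target(x) / p_source(x).
    """
    x_src = np.asarray(x_src, dtype=float)
    x_tgt = np.asarray(x_tgt, dtype=float)
    a_src = np.asarray(a_src).astype(int)
    y_src = np.asarray(y_src, dtype=float)

    # Direct term: predicted outcome under the policy, averaged over target covariates.
    pi_tgt = np.asarray(policy(x_tgt)).astype(int)
    direct = np.where(pi_tgt == 1, mu1(x_tgt), mu0(x_tgt)).mean()

    # Correction term: reweighted source residuals where the logged action
    # agrees with the policy's action.
    pi_src = np.asarray(policy(x_src)).astype(int)
    e1 = np.asarray(propensity(x_src), dtype=float)
    p_obs = np.where(a_src == 1, e1, 1.0 - e1)       # probability of the logged action
    mu_obs = np.where(a_src == 1, mu1(x_src), mu0(x_src))
    follows = (a_src == pi_src).astype(float)
    weights = np.asarray(density_ratio(x_src), dtype=float) * follows / p_obs
    correction = (weights * (y_src - mu_obs)).mean()

    return float(direct + correction)


if __name__ == "__main__":
    # Synthetic illustration with oracle nuisance functions (all values made up).
    rng = np.random.default_rng(0)
    x_s = rng.normal(0.0, 1.0, size=2000)            # source covariates
    x_t = rng.normal(0.5, 1.0, size=2000)            # shifted target covariates
    a_s = rng.integers(0, 2, size=2000)              # randomized logging, e(x) = 0.5
    y_s = a_s * x_s + rng.normal(0.0, 0.1, size=2000)
    value = dr_policy_value(
        x_s, a_s, y_s, x_t,
        policy=lambda x: x > 0,
        mu0=lambda x: np.zeros_like(x),
        mu1=lambda x: x,
        propensity=lambda x: np.full_like(x, 0.5),
        # Ratio of N(0.5, 1) to N(0, 1) densities.
        density_ratio=lambda x: np.exp(0.5 * x - 0.125),
    )
    print(value)
```

The estimate stays consistent if either the outcome models or the pair of weights (propensity and density ratio) is correctly specified, which is the sense of "doubly robust" here. Learning a policy would then amount to searching a policy class for the one that maximizes this estimated reward.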
Taxonomy
Topics: Economic Policies and Impacts
