Discussion of Kallus (2020) and Mo, Qi, and Liu (2020): New Objectives for Policy Learning

Sijia Li; Xiudi Li; Alex Luedtke

arXiv:2010.04805·stat.ML·October 13, 2020·1 cites

TL;DR

This paper analyzes recent innovative objective functions for policy learning, emphasizing the importance of value function curvature and proposing more efficient methods for using calibration data in robust policy optimization.

Contribution

It introduces two ways to account for the curvature of the value function within the retargeting framework, and describes more efficient approaches for leveraging calibration data when learning distributionally robust policies.

Findings

01 Curvature of the value function significantly impacts policy learning.

02 Proposed methods enhance the efficiency of policy optimization.

03 Better use of calibration data improves robustness of policies.

Abstract

We discuss the thought-provoking new objective functions for policy learning that were proposed in "More efficient policy learning via optimal retargeting" by Nathan Kallus and "Learning optimal distributionally robust individualized treatment rules" by Weibin Mo, Zhengling Qi, and Yufeng Liu. We show that it is important to take the curvature of the value function into account when working within the retargeting framework, and we introduce two ways to do so. We also describe more efficient approaches for leveraging calibration data when learning distributionally robust policies.

Taxonomy

Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Health Systems, Economic Evaluations, Quality of Life