Semiparametric Efficiency in Policy Learning with General Treatments
Yue Fang, Geert Ridder, Haitian Xie

TL;DR
This paper develops a unified semiparametric efficiency framework for policy learning with general treatments. It derives efficiency bounds, studies the asymptotic properties of common policy estimators, and shows how estimator inefficiency affects welfare regret.
Contribution
The paper introduces an efficiency theory for policy learning with discrete, continuous, or mixed treatments, including new results on estimator efficiency and on the asymptotic distribution of welfare regret.
Findings
Inverse propensity weighting (IPW) is efficient when the propensity score is estimated, but not when the true propensity score is used.
Efficiency bounds are established for randomized policies with known and estimated propensity scores.
Theoretical results are supported by simulations and real-world applications in job training and savings programs.
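The first finding can look counterintuitive, so a toy Monte Carlo sketch may help. It is not the paper's setting or code: it assumes a binary treatment with a constant propensity score of 0.5, a normally distributed treated outcome, and the target E[Y(1)]. All names (`ipw_estimates`, `simulate`, `p`, `mu`, `sigma`) are hypothetical. The sketch compares the sampling variance of IPW using the true propensity with IPW using the estimated propensity (here, simply the sample share of treated units).

```python
import numpy as np


def ipw_estimates(n, rng, p=0.5, mu=2.0, sigma=1.0):
    """Return two IPW estimates of E[Y(1)] from one simulated sample.

    Assumed toy design: D ~ Bernoulli(p), Y(1) ~ N(mu, sigma^2).
    """
    d = rng.random(n) < p                 # treatment indicator
    y1 = rng.normal(mu, sigma, size=n)    # potential outcome under treatment
    y = np.where(d, y1, 0.0)              # untreated outcomes get zero weight anyway
    p_hat = d.mean()                      # estimated (constant) propensity score
    est_true = np.mean(d * y / p)         # IPW with the true propensity
    est_hat = np.mean(d * y / p_hat)      # IPW with the estimated propensity
    return est_true, est_hat


def simulate(reps=2000, n=500, seed=0):
    """Monte Carlo variance of both estimators across `reps` samples."""
    rng = np.random.default_rng(seed)
    draws = np.array([ipw_estimates(n, rng) for _ in range(reps)])
    return draws.var(axis=0)


var_true, var_hat = simulate()
print(f"variance with true propensity:      {var_true:.5f}")
print(f"variance with estimated propensity: {var_hat:.5f}")
```

Under these assumptions the estimated-propensity version reduces to the treated-sample mean, whose variance is sigma^2 / (p n), while the true-propensity version has variance (sigma^2 / p + mu^2 (1 - p) / p) / n, which is larger whenever mu is nonzero. The simulation should reflect that gap.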
Abstract
Recent literature on policy learning has primarily focused on regret bounds of the learned policy. We provide a new perspective by developing a unified semiparametric efficiency framework for policy learning, allowing for general treatments that are discrete, continuous, or mixed. We provide a characterization of the failure of pathwise differentiability for parameters arising from deterministic policies. We then establish efficiency bounds for pathwise differentiable parameters in randomized policies, both when the propensity score is known and when it must be estimated. Building on the convolution theorem, we introduce a notion of efficiency for the asymptotic distribution of welfare regret, showing that inefficient policy estimators not only inflate the variance of the asymptotic regret but also shift its mean upward. We derive the asymptotic theory of several common policy…
Taxonomy
Topics: Advanced Causal Inference Techniques · Advanced Bandit Algorithms Research · Statistical Methods and Inference
