Policy Learning under Unobserved Confounding: A Robust and Efficient Approach
Zequn Jin, Gaoqian Xu, Xi Zheng, Yahong Zhou

TL;DR
This paper introduces a robust policy learning method that accounts for unobserved confounding using the marginal sensitivity model, enabling reliable decision-making from observational data even with potential hidden biases.
Contribution
It develops a framework that combines the marginal sensitivity model (MSM) with distributionally robust optimization, yielding closed-form welfare criteria and asymptotic guarantees for policies learned under unobserved confounding.
Findings
The method achieves robust policy performance in simulations.
It effectively handles unobserved confounding in empirical applications.
It provides theoretical regret bounds for the proposed policies.
Abstract
This paper develops a robust and efficient method for policy learning from observational data in the presence of unobserved confounding, complementing existing instrumental variable (IV) based approaches. We employ the marginal sensitivity model (MSM) to relax the commonly used yet restrictive unconfoundedness assumption by introducing a sensitivity parameter that captures the extent of selection bias induced by unobserved confounders. Building on this framework, we consider two distributionally robust welfare criteria, defined as the worst-case welfare and policy improvement functions, evaluated over an uncertainty set of counterfactual distributions characterized by the MSM. Closed-form expressions for both welfare criteria are derived. Leveraging these identification results, we construct doubly robust scores and estimate the robust policies by maximizing the proposed criteria. Our…
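To make the role of the sensitivity parameter concrete, here is a minimal sketch of how the MSM is commonly formulated in the sensitivity-analysis literature; it is not taken from the paper, whose notation and estimators may differ. The assumption is that the odds ratio between the nominal propensity e(x) and the true propensity e(x, u) lies in [1/Λ, Λ], which bounds the true inverse propensity weight 1/e(x, u). The function name `msm_weight_bounds` and its arguments are hypothetical.

```python
def msm_weight_bounds(e_x, lam):
    """Bounds on the true inverse propensity weight 1/e(x, u) under the MSM.

    Assumes the standard formulation (not necessarily the paper's):
    the odds ratio of e(x) to e(x, u) lies in [1/lam, lam], with lam >= 1.
    """
    if not 0.0 < e_x < 1.0:
        raise ValueError("nominal propensity must lie strictly in (0, 1)")
    if lam < 1.0:
        raise ValueError("sensitivity parameter must be at least 1")
    # 1/e - 1 is the inverse odds of treatment under the nominal propensity.
    inverse_odds = 1.0 / e_x - 1.0
    lower = 1.0 + inverse_odds / lam
    upper = 1.0 + inverse_odds * lam
    return lower, upper


# lam = 1 recovers unconfoundedness: the interval collapses to 1/e(x).
no_confounding = msm_weight_bounds(0.5, 1.0)
# A larger lam widens the set of weights a worst-case criterion must cover.
with_confounding = msm_weight_bounds(0.5, 2.0)
```

Under this formulation, a worst-case welfare criterion ranges over all weights inside these intervals, so a larger Λ produces a more conservative policy evaluation.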
