
TL;DR
This paper introduces Nonparametric Bayesian Policy Learning (NBPL), a flexible framework for uncertainty-aware treatment choice that places a Dirichlet process prior on the reduced-form distribution, is implemented via the Bayesian bootstrap, and comes with theoretical guarantees and empirical applications.
Contribution
It develops a nonparametric Bayesian approach for treatment policy learning, establishing minimax-optimal regret convergence and consistent model comparison.
Findings
Posterior welfare regret converges at the minimax-optimal rate.
Posterior model comparison across policy classes is pointwise consistent.
Empirical applications demonstrate NBPL's practical utility.
Abstract
I propose Nonparametric Bayesian Policy Learning (NBPL) as a framework for uncertainty-aware treatment choice. I consider a decision-maker (DM) seeking to select an expected welfare-maximizing treatment rule using observable characteristics. A key observation is that, for a given welfare criterion and policy class, uncertainty about welfare-relevant objects is entirely induced by uncertainty about a reduced-form distribution. I assume the DM places a nonparametric Dirichlet process prior on this reduced-form parameter and uses the resulting posterior to conduct inference on optimal treatment assignments, optimal welfare, and comparisons across policy classes. The NBPL framework is flexible, and its implementation via the Bayesian bootstrap is highly tractable. I establish two main theoretical properties of NBPL. First, posterior welfare regret under NBPL converges at the minimax-optimal…
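The abstract notes that NBPL can be implemented via the Bayesian bootstrap. The sketch below is one way that idea might look in practice; it is not the paper's implementation. It assumes a binary treatment with a known propensity score, an inverse-propensity-weighted welfare criterion, and a policy class of threshold rules on a single covariate. The function name `bayesian_bootstrap_policy`, its parameters, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def bayesian_bootstrap_policy(x, d, y, propensity, thresholds, n_draws=500, seed=0):
    """Posterior draws of the optimal threshold rule and its welfare.

    Assumptions (not from the paper): welfare is the inverse-propensity-weighted
    mean outcome, and each candidate policy treats units with x >= threshold.
    """
    rng = np.random.default_rng(seed)
    n = len(y)

    # IPW scores for the treated and untreated potential outcomes.
    g1 = y * d / propensity
    g0 = y * (1 - d) / (1 - propensity)

    # treat[k, i] = 1 if candidate policy k treats unit i.
    treat = (x[None, :] >= thresholds[:, None]).astype(float)

    # Per-unit welfare contribution under each policy, shape (K, n).
    scores = treat * g1 + (1 - treat) * g0

    # Bayesian bootstrap: Dirichlet(1, ..., 1) weights over the n observations,
    # one weight vector per posterior draw, shape (n_draws, n).
    w = rng.dirichlet(np.ones(n), size=n_draws)

    # Weighted welfare of every policy under every draw, shape (n_draws, K).
    welfare = w @ scores.T

    # For each draw, pick the welfare-maximizing policy in the class.
    best = welfare.argmax(axis=1)
    return thresholds[best], welfare.max(axis=1), welfare


# Simulated data (hypothetical): treatment helps only when x > 0.5.
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(0.0, 1.0, size=n)
d = rng.binomial(1, 0.5, size=n)
y = 4.0 * (x - 0.5) * d + 0.5 * rng.normal(size=n)

thresholds = np.linspace(0.05, 0.95, 19)
opt_t, opt_w, welfare = bayesian_bootstrap_policy(x, d, y, 0.5, thresholds)

print("posterior mean of optimal threshold:", opt_t.mean())
print("95% credible interval for optimal welfare:", np.quantile(opt_w, [0.025, 0.975]))
```

Each row of `welfare` is one posterior draw of the welfare of every candidate rule, so the same array can in principle be reused to compare policy classes by restricting the columns to each class and comparing the per-draw maxima.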
