Nonparametric Bayesian Policy Learning

Haonan Ye

arXiv:2605.17068 · econ.EM · May 19, 2026

TL;DR

This paper introduces Nonparametric Bayesian Policy Learning (NBPL), a flexible framework for uncertainty-aware treatment decision-making using Bayesian nonparametrics, with theoretical guarantees and empirical applications.

Contribution

The paper develops a nonparametric Bayesian approach to treatment policy learning. It shows that posterior welfare regret converges at the minimax-optimal rate and that posterior model comparison across policy classes is consistent.

Findings

1. Posterior welfare regret converges at the minimax-optimal rate.

2. Posterior model comparison across policy classes is pointwise consistent.

3. Empirical applications demonstrate NBPL's practical utility.

Abstract

I propose Nonparametric Bayesian Policy Learning (NBPL) as a framework for uncertainty-aware treatment choice. I consider a decision-maker (DM) seeking to select an expected welfare-maximizing treatment rule using observable characteristics. A key observation is that, for a given welfare criterion and policy class, uncertainty about welfare-relevant objects is entirely induced by uncertainty about a reduced-form distribution. I assume the DM places a nonparametric Dirichlet process prior on this reduced-form parameter and uses the resulting posterior to conduct inference on optimal treatment assignments, optimal welfare, and comparisons across policy classes. The NBPL framework is flexible, and its implementation via the Bayesian bootstrap is highly tractable. I establish two main theoretical properties of NBPL. First, posterior welfare regret under NBPL converges at the minimax-optimal…
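The Bayesian bootstrap step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a utilitarian welfare criterion estimated by inverse propensity weighting, a known propensity score `e`, and a policy class of threshold rules on a single covariate. The function name `bayesian_bootstrap_policy` and all of its parameters are hypothetical. Each posterior draw reweights the observations with Dirichlet(1, ..., 1) weights, which corresponds to the limiting Dirichlet process posterior with a vanishing prior, and then re-solves the welfare maximization over the policy class.

```python
import numpy as np

def bayesian_bootstrap_policy(x, d, y, e, thresholds, n_draws=1000, seed=0):
    """Posterior draws of the optimal threshold rule and its welfare.

    Assumptions (not taken from the paper): IPW utilitarian welfare,
    known propensity score e, rules of the form "treat if x >= t".
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    # Inverse-propensity-weighted outcome of each unit under each action.
    score_treat = y * d / e
    score_control = y * (1 - d) / (1 - e)
    # assign[k, i] = 1 if threshold rule k treats unit i.
    assign = (x[None, :] >= thresholds[:, None]).astype(float)
    # Per-unit welfare contribution under each rule, shape (K, n).
    contrib = assign * score_treat + (1 - assign) * score_control
    # Bayesian bootstrap: Dirichlet(1, ..., 1) weights, shape (n_draws, n).
    w = rng.dirichlet(np.ones(n), size=n_draws)
    # Weighted welfare of every rule in every draw, shape (n_draws, K).
    welfare = w @ contrib.T
    best = welfare.argmax(axis=1)
    return thresholds[best], welfare.max(axis=1), welfare
```

The first two return values are posterior draws of the optimal threshold and of the optimal welfare, so their quantiles give credible intervals. The third holds the welfare draws of every rule in the class. How the paper compares policy classes is not shown here.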
