Bayesian Inference of Contextual Bandit Policies via Empirical Likelihood

Jiangrong Ouyang; Mingming Gong; Howard Bondell

arXiv:2602.10608 · stat.ML · February 12, 2026

Open Access

TL;DR

This paper introduces a Bayesian inference method, built on empirical likelihood, for the joint analysis of multiple contextual bandit policies. It targets small-sample settings, where it provides accurate uncertainty quantification for policy values and for policy comparisons.

Contribution

The paper develops an empirical likelihood-based Bayesian inference approach for the joint analysis of multiple contextual bandit policies, designed to remain robust and to give accurate uncertainty measurements in finite samples.

Findings

01 The method performs well in Monte Carlo simulations.

02 It is robust to small sample sizes.

03 It provides full uncertainty quantification for both policy value evaluation and policy comparison.

Abstract

Policy inference plays an essential role in the contextual bandit problem. In this paper, we use empirical likelihood to develop a Bayesian inference method for the joint analysis of multiple contextual bandit policies in finite sample regimes. The proposed inference method is robust to small sample sizes and is able to provide accurate uncertainty measurements for policy value evaluation. In addition, it allows for flexible inferences on policy comparison with full uncertainty quantification. We demonstrate the effectiveness of the proposed inference method using Monte Carlo simulations and its application to an adolescent body mass index data set.
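
To make the idea in the abstract concrete, the following is a minimal sketch of how an empirical likelihood can stand in for a parametric likelihood in a Bayesian update of a single policy value. It is not the authors' algorithm: the inverse-propensity-weighted estimating equation, the flat prior, the grid approximation, the simulated logged data, and all function names (`log_el_ratio`, `el_posterior`) are assumptions made for illustration, and the paper's joint treatment of multiple policies is not reproduced here.

```python
import math
import random


def log_el_ratio(z, theta, iters=200):
    """Log empirical likelihood ratio for the constraint E[z] = theta.

    Maximizes sum(log(n * p_i)) subject to sum(p_i) = 1 and
    sum(p_i * (z_i - theta)) = 0. The solution has the form
    p_i = 1 / (n * (1 + lam * g_i)) with g_i = z_i - theta.
    """
    g = [zi - theta for zi in z]
    g_min, g_max = min(g), max(g)
    if g_min >= 0.0 or g_max <= 0.0:
        return -math.inf  # theta lies outside the convex hull of the data
    # lam must keep 1 + lam * g_i > 0 for every i.
    lo, hi = -1.0 / g_max, -1.0 / g_min
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps

    def score(lam):
        # Strictly decreasing in lam on (lo, hi); its root is the multiplier.
        return sum(gi / (1.0 + lam * gi) for gi in g)

    for _ in range(iters):  # plain bisection on the score equation
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return -sum(math.log1p(lam * gi) for gi in g)


def el_posterior(z, grid):
    """Grid posterior: flat prior (an assumption) times empirical likelihood."""
    logw = [log_el_ratio(z, t) for t in grid]
    m = max(logw)
    w = [math.exp(l - m) if l > -math.inf else 0.0 for l in logw]
    total = sum(w)
    return [wi / total for wi in w]


# Simulated logged bandit data (hypothetical): binary context and action,
# uniform behavior policy, target policy pi(x) = x.
random.seed(0)
n = 200
z = []
for _ in range(n):
    x = random.randint(0, 1)
    a = random.randint(0, 1)               # behavior propensity 0.5
    p_reward = 0.7 if a == x else 0.3
    r = 1.0 if random.random() < p_reward else 0.0
    weight = (1.0 if a == x else 0.0) / 0.5  # inverse propensity weight
    z.append(weight * r)

z_bar = sum(z) / n
grid = [0.01 * k for k in range(1, 200)]   # candidate policy values in (0, 2)
post = el_posterior(z, grid)
post_mean = sum(t * p for t, p in zip(grid, post))

# 95% equal-tailed credible interval read off the grid posterior.
cdf, lower, upper = 0.0, None, None
for t, p in zip(grid, post):
    cdf += p
    if lower is None and cdf >= 0.025:
        lower = t
    if upper is None and cdf >= 0.975:
        upper = t

print("IPS estimate:", round(z_bar, 3))
print("posterior mean:", round(post_mean, 3))
print("95% credible interval:", (round(lower, 2), round(upper, 2)))
```

The grid approximation keeps the sketch dependency-free; with more than one policy the multiplier becomes a vector and a sampler such as MCMC would be the more natural choice, though how the paper handles this is not stated in the abstract.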

Taxonomy

Topics: Advanced Bandit Algorithms Research · Advanced Causal Inference Techniques · Gaussian Processes and Bayesian Inference