Bayesian Inference of Contextual Bandit Policies via Empirical Likelihood
Jiangrong Ouyang, Mingming Gong, Howard Bondell

TL;DR
This paper introduces a Bayesian inference method, based on empirical likelihood, for jointly analyzing multiple contextual bandit policies. It targets small-sample settings and provides accurate uncertainty quantification for policy evaluation and policy comparison.
Contribution
It develops a novel empirical likelihood-based Bayesian inference approach for joint policy analysis in contextual bandits, enhancing robustness and uncertainty measurement in finite samples.
Findings
The method performs well in Monte Carlo simulations.
It remains effective in small-sample regimes.
It provides full uncertainty quantification for policy value evaluation and policy comparison.
Abstract
Policy inference plays an essential role in the contextual bandit problem. In this paper, we use empirical likelihood to develop a Bayesian inference method for the joint analysis of multiple contextual bandit policies in finite sample regimes. The proposed inference method is robust to small sample sizes and is able to provide accurate uncertainty measurements for policy value evaluation. In addition, it allows for flexible inferences on policy comparison with full uncertainty quantification. We demonstrate the effectiveness of the proposed inference method using Monte Carlo simulations and its application to an adolescent body mass index data set.
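To make the idea in the abstract concrete, the following is a minimal sketch of Bayesian inference with an empirical likelihood for the value of a single policy. It is a simplified illustration and not the authors' procedure: it handles one policy rather than several jointly, and the inverse-propensity-weighted estimating equation, the flat prior on a grid, the simulated data, and the names `log_el_ratio` and `el_posterior` are all assumptions introduced here.

```python
import numpy as np

def log_el_ratio(g, iters=200):
    """Log empirical likelihood ratio under the constraint sum_i p_i * g_i = 0.

    The optimal weights are p_i = 1 / (n * (1 + lam * g_i)), where lam solves
    sum_i g_i / (1 + lam * g_i) = 0. That sum is strictly decreasing in lam,
    so bisection on the feasible interval finds the root.
    """
    g = np.asarray(g, dtype=float)
    if g.min() >= 0.0 or g.max() <= 0.0:
        return -np.inf  # zero lies outside the convex hull of g: likelihood is 0
    lo, hi = -1.0 / g.max(), -1.0 / g.min()  # keeps 1 + lam * g_i > 0 for all i
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if np.sum(g / (1.0 + lam * g)) > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return -np.sum(np.log1p(lam * g))

def el_posterior(values, grid):
    """Grid posterior over the policy value: flat prior times empirical likelihood."""
    log_post = np.array([log_el_ratio(values - theta) for theta in grid])
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Simulated logged data (hypothetical): two actions, uniform logging policy.
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)                    # contexts
a = rng.integers(0, 2, size=n)            # logged actions, propensity 0.5 each
target_a = (x > 0).astype(int)            # deterministic target policy
r = rng.binomial(1, np.where(a == target_a, 0.7, 0.3))  # binary rewards

# Inverse-propensity-weighted rewards; their mean estimates the policy value.
v = (a == target_a) / 0.5 * r

grid = np.linspace(0.01, 1.0, 100)
post = el_posterior(v, grid)

# Posterior mean and an equal-tailed 95% credible interval read off the grid.
post_mean = float(np.sum(grid * post))
cdf = np.cumsum(post)
lower = grid[np.searchsorted(cdf, 0.025)]
upper = grid[np.searchsorted(cdf, 0.975)]
print(f"posterior mean {post_mean:.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions, comparing two policies would amount to stacking one estimating equation per policy and summarising the posterior of the difference in values; the scalar bisection above would then need to be replaced by a multivariate solver.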
Taxonomy
Topics: Advanced Bandit Algorithms Research · Advanced Causal Inference Techniques · Gaussian Processes and Bayesian Inference
