Empirical Bayesian Multi-Bandit Learning
Xia Jiang, Rong J.B. Zhu

TL;DR
This paper introduces a hierarchical Bayesian framework for multi-task bandit learning, estimating covariance structures to improve decision-making and achieve lower regret in complex environments.
Contribution
It proposes an empirical Bayesian approach for estimating the prior covariance matrix shared across bandits, and two algorithms built on it, ebmTS and ebmUCB, for which the paper reports regret bounds and improved empirical performance.
Findings
The proposed algorithms outperform existing methods on synthetic datasets.
They achieve lower cumulative regret in real-world experiments.
They demonstrate an effective balance of exploration and exploitation.
Abstract
Multi-task learning in contextual bandits has attracted significant research interest due to its potential to enhance decision-making across multiple related tasks by leveraging shared structures and task-specific heterogeneity. In this article, we propose a novel hierarchical Bayesian framework for learning in various bandit instances. This framework captures both the heterogeneity and the correlations among different bandit instances through a hierarchical Bayesian model, enabling effective information sharing while accommodating instance-specific variations. Unlike previous methods that overlook the learning of the covariance structure across bandits, we introduce an empirical Bayesian approach to estimate the covariance matrix of the prior distribution. This enhances both the practicality and flexibility of learning across multi-bandits. Building on this approach, we develop two…
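As a rough illustration of the empirical Bayesian idea in the abstract, the sketch below estimates a shared Gaussian prior from per-bandit parameter estimates and then forms one bandit's posterior under that prior. It is not the authors' estimator, nor ebmTS or ebmUCB: it assumes linear-Gaussian rewards, uses a simple method-of-moments estimate, and the names `estimate_prior`, `posterior`, `theta_hats`, `noise_covs`, and `floor` are hypothetical.

```python
import numpy as np


def estimate_prior(theta_hats, noise_covs, floor=1e-6):
    """Method-of-moments estimate of a shared prior N(mu, Sigma).

    Assumed inputs (illustrative, not from the paper):
      theta_hats: (M, d) per-bandit parameter estimates, e.g. least squares.
      noise_covs: (M, d, d) sampling covariance of each estimate.
    """
    theta_hats = np.asarray(theta_hats, dtype=float)
    noise_covs = np.asarray(noise_covs, dtype=float)
    mu_hat = theta_hats.mean(axis=0)
    # Each estimate has covariance Sigma + V_j, so the spread of the
    # estimates overstates Sigma by the average noise covariance.
    total_cov = np.atleast_2d(np.cov(theta_hats, rowvar=False, ddof=1))
    raw = total_cov - noise_covs.mean(axis=0)
    raw = (raw + raw.T) / 2.0
    # Floor the eigenvalues so the result is a valid covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(raw)
    sigma_hat = (eigvecs * np.maximum(eigvals, floor)) @ eigvecs.T
    return mu_hat, sigma_hat


def posterior(mu_hat, sigma_hat, X, y, noise_var=1.0):
    """Gaussian posterior for one bandit under the estimated prior.

    X: (n, d) observed contexts, y: (n,) observed rewards.
    """
    prior_prec = np.linalg.inv(sigma_hat)
    post_cov = np.linalg.inv(prior_prec + X.T @ X / noise_var)
    post_mean = post_cov @ (prior_prec @ mu_hat + X.T @ y / noise_var)
    return post_mean, post_cov
```

With no data for a bandit, `posterior` returns the estimated prior itself, which is how information from the other bandits carries over to a new one. A Thompson-sampling-style step would then draw a parameter vector from the returned Gaussian and act greedily on that draw.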
