Empirical Bayesian Multi-Bandit Learning

Xia Jiang; Rong J.B. Zhu

arXiv:2510.26284 · cs.LG · November 7, 2025

TL;DR

This paper introduces a hierarchical Bayesian framework for multi-task bandit learning. It estimates the covariance structure of the prior shared across bandit instances, so that related instances can share information, and it reports lower cumulative regret than existing methods.

Contribution

The paper proposes an empirical Bayesian approach for learning the prior covariance matrix across bandits, and two algorithms built on it, ebmTS and ebmUCB. Both come with regret bounds and outperform existing methods in the reported experiments.

Findings

01. The algorithms outperform existing methods on synthetic datasets.

02. They achieve lower cumulative regret in real-world experiments.

03. They demonstrate an effective balance between exploration and exploitation.

Abstract

Multi-task learning in contextual bandits has attracted significant research interest due to its potential to enhance decision-making across multiple related tasks by leveraging shared structures and task-specific heterogeneity. In this article, we propose a novel hierarchical Bayesian framework for learning in various bandit instances. This framework captures both the heterogeneity and the correlations among different bandit instances through a hierarchical Bayesian model, enabling effective information sharing while accommodating instance-specific variations. Unlike previous methods that overlook the learning of the covariance structure across bandits, we introduce an empirical Bayesian approach to estimate the covariance matrix of the prior distribution. This enhances both the practicality and flexibility of learning across multi-bandits. Building on this approach, we develop two…
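The truncated abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a linear-Gaussian hierarchy in which each bandit instance j has a parameter theta_j drawn from a shared prior N(mu, Sigma) and rewards x^T theta_j plus Gaussian noise of known variance. The prior is fitted with a simple method-of-moments estimator, which is a stand-in chosen for illustration and not necessarily the paper's estimator, and the sampling step is generic Thompson sampling rather than ebmTS itself. The function names `estimate_prior`, `posterior`, and `thompson_choose` are hypothetical, and each instance is assumed to have enough observations for its Gram matrix to be invertible.

```python
import numpy as np


def estimate_prior(Xs, ys, noise_var, floor=1e-6):
    """Method-of-moments estimate of a shared prior N(mu, Sigma).

    Xs, ys: per-instance feature matrices (n_j x d, d >= 2) and reward vectors.
    Assumes every X.T @ X is invertible (illustrative simplification).
    """
    thetas, samp_covs = [], []
    for X, y in zip(Xs, ys):
        gram_inv = np.linalg.inv(X.T @ X)
        thetas.append(gram_inv @ X.T @ y)        # least-squares estimate
        samp_covs.append(noise_var * gram_inv)   # its sampling covariance
    thetas = np.array(thetas)
    mu = thetas.mean(axis=0)
    # Spread of the estimates = prior covariance + average sampling noise,
    # so subtract the noise part to isolate the prior covariance.
    raw = np.cov(thetas, rowvar=False) - np.mean(samp_covs, axis=0)
    # Clip eigenvalues so the result is a valid positive definite covariance.
    vals, vecs = np.linalg.eigh((raw + raw.T) / 2)
    sigma = (vecs * np.maximum(vals, floor)) @ vecs.T
    return mu, sigma


def posterior(X, y, mu, sigma, noise_var):
    """Gaussian posterior over one instance's parameter under the fitted prior."""
    prior_prec = np.linalg.inv(sigma)
    cov = np.linalg.inv(prior_prec + X.T @ X / noise_var)
    cov = (cov + cov.T) / 2                      # remove round-off asymmetry
    mean = cov @ (prior_prec @ mu + X.T @ y / noise_var)
    return mean, cov


def thompson_choose(rng, arms, mean, cov):
    """Sample a parameter from the posterior and pick the best arm under it.

    arms: (K x d) matrix whose rows are the arm feature vectors.
    """
    theta = rng.multivariate_normal(mean, cov)
    return int(np.argmax(arms @ theta))
```

In this sketch, an instance with few observations is pulled toward the shared mean `mu` along the directions where the estimated `sigma` is small, which is one way to picture how a learned covariance governs information sharing across instances.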
